This reverts commit 8806237234.
Fixes: 8806237234 ("vhost: introduce new VhostOps vhost_set_config_call")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit 081f864f56.
Fixes: 081f864f56 ("virtio: add support for configure interrupt")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit f7220a7ce2.
Fixes: f7220a7ce2 ("vhost: add support for configure interrupt")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit 497679d510.
Fixes: 497679d510 ("virtio-net: add support for configure interrupt")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds the support of the '-cpu rv128' option to
qemu-system-riscv64 so that we can indicate that we want to run rv128
executables.
Still, there is no support for 128-bit insns at that stage so qemu fails
miserably (as expected) if launched with this option.
Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20220106210108.138226-8-frederic.petrot@univ-grenoble-alpes.fr
[ Changed by AF
- Rename CPU to "x-rv128"
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Addition of div and rem on 128-bit integers, using the 128/64->128 divu and
64x64->128 mulu in host-utils.
These operations will be used within div/rem helpers in the 128-bit riscv
target.
Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20220106210108.138226-4-frederic.petrot@univ-grenoble-alpes.fr
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Adding defines to handle signed 64-bit and unsigned 128-bit quantities in
memory accesses.
Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20220106210108.138226-3-frederic.petrot@univ-grenoble-alpes.fr
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Renaming defines for quad in their various forms so that their signedness is
now explicit.
Done using git grep as suggested by Philippe, with a bit of hand edition to
keep assignments aligned.
Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20220106210108.138226-2-frederic.petrot@univ-grenoble-alpes.fr
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Linux supports up to 32 cores for both 32-bit and 64-bit RISC-V, so
let's set that as the maximum for the virt board.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/435
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20220105213937.1113508-9-alistair.francis@opensource.wdc.com>
New virtio mem options.
A vhost-user cleanup.
Control over smbios entry point type.
Config interrupt support for vdpa.
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmHY2zEPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpCiEH/jv5tHUffDdGz5M2pN7FTWPQ9UAMQZXbn5AS
PPVutOI/B+ILYBuNjYLvMGeq6ymG4/0DM940/jkQwCWD4ku1OG0ReM5T5klUR8lY
df5y1SCDv3Yoq0vxpQCnssKqbgm8Kf9tnAFjni7Lvbu3oo6DCq77m6MWEapLoEUu
IkM+l60NKmHAClnE6RF4KobLa5srIlDTho1iBXH5S39CRF1LvP9NgnYzl7nqiEkq
ZYQEqkKO5XGxZji9banZPJD2kxt1iL7s24QI6OJG2Lz8Hf86b0Yo7XJpmw4ShP9h
Vl1SL3m/HhHSMBuXOb7w/EkCm59b7whXCmoyYBF/GqaxtZkvVnM=
=4VIN
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pci,pc: features,fixes,cleanups
New virtio mem options.
A vhost-user cleanup.
Control over smbios entry point type.
Config interrupt support for vdpa.
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 07 Jan 2022 04:30:41 PM PST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (55 commits)
tests: acpi: Add updated TPM related tables
acpi: tpm: Add missing device identification objects
tests: acpi: prepare for updated TPM related tables
virtio/vhost-vsock: don't double close vhostfd, remove redundant cleanup
hw/scsi/vhost-scsi: don't double close vhostfd on error
hw/scsi/vhost-scsi: don't leak vqs on error
docs: reSTify virtio-balloon-stats documentation and move to docs/interop
hw/i386/pc: Add missing property descriptions
acpihp: simplify acpi_pcihp_disable_root_bus
tests: acpi: SLIC: update expected blobs
tests: acpi: add SLIC table test
tests: acpi: whitelist expected blobs before changing them
acpi: fix QEMU crash when started with SLIC table
intel-iommu: correctly check passthrough during translation
virtio-mem: Set "unplugged-inaccessible=auto" for the 7.0 machine on x86
virtio-mem: Support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
linux-headers: sync VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
MAINTAINERS: Add a separate entry for acpi/VIOT tables
virtio: signal after wrapping packed used_idx
virtio-mem: Support "prealloc=on" option
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we signal the VM that reading
unplugged memory is not supported. We have to fail feature negotiation
in case the guest does not support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE.
First, VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE is required to properly handle
memory backends (or architectures) without support for the shared zeropage
in the hypervisor cleanly. Without the shared zeropage, even reading an
unpopulated virtual memory location can populate real memory and
consequently consume memory in the hypervisor. We have a guaranteed shared
zeropage only on MAP_PRIVATE anonymous memory.
Second, we want VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE to be the default
long-term as even populating the shared zeropage can be problematic: for
example, without THP support (possible) or without support for the shared
huge zeropage with THP (unlikely), the PTE page tables to hold the shared
zeropage entries can consume quite some memory that cannot be reclaimed
easily.
Third, there are other optimizations+features (e.g., protection of
unplugged memory, reducing the total memory slot size and bitmap sizes)
that will require VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE.
We really only support x86 targets with virtio-mem for now (and
Linux similarly only support x86), but that might change soon, so prepare
for different targets already.
Add a new "unplugged-inaccessible" tristate property for x86 targets:
- "off" will keep VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE unset and legacy
guests working.
- "on" will set VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE and stop legacy guests
from using the device.
- "auto" selects the default based on support for the shared zeropage.
Warn in case the property is set to "off" and we don't have support for the
shared zeropage.
For existing compat machines, the property will default to "off", to
not change the behavior but eventually warn about a problematic setup.
Short-term, we'll set the property default to "auto" for new QEMU machines.
Mid-term, we'll set the property default to "on" for new QEMU machines.
Long-term, we'll deprecate the parameter and disallow legacy
guests completely.
The property has to match on the migration source and destination. "auto"
will result in the same VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE setting as long
as the qemu command line (esp. memdev) match -- so "auto" is good enough
for migration purposes and the parameter doesn't have to be migrated
explicitly.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134039.29670-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's synchronize the new feature flag, available in Linux since
v5.16-rc1.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134039.29670-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For scarce memory resources, such as hugetlb, we want to be able to
prealloc such memory resources in order to not crash later on access. On
simple user errors we could otherwise easily run out of memory resources
an crash the VM -- pretty much undesired.
For ordinary memory devices, such as DIMMs, we preallocate memory via the
memory backend for such use cases; however, with virtio-mem we're dealing
with sparse memory backends; preallocating the whole memory backend
destroys the whole purpose of virtio-mem.
Instead, we want to preallocate memory when actually exposing memory to the
VM dynamically, and fail plugging memory gracefully + warn the user in case
preallocation fails.
A common use case for hugetlb will be using "reserve=off,prealloc=off" for
the memory backend and "prealloc=on" for the virtio-mem device. This
way, no huge pages will be reserved for the process, but we can recover
if there are no actual huge pages when plugging memory. Libvirt is
already prepared for this.
Note that preallocation cannot protect from the OOM killer -- which
holds true for any kind of preallocation in QEMU. It's primarily useful
only for scarce memory resources such as hugetlb, or shared file-backed
memory. It's of little use for ordinary anonymous memory that can be
swapped, KSM merged, ... but we won't forbid it.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-9-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The ITS code has to check whether various parameters passed in
commands are in-bounds, where the limit is defined in terms of the
number of bits that are available for the parameter. (For example,
the GITS_TYPER.Devbits ID register field specifies the number of
DeviceID bits minus 1, and device IDs passed in the MAPTI and MAPD
command packets must fit in that many bits.)
Currently we have off-by-one bugs in many of these bounds checks.
The typical problem is that we define a max_foo as 1 << n. In
the Devbits example, we set
s->dt.max_ids = 1UL << (GITS_TYPER.Devbits + 1).
However later when we do the bounds check we write
if (devid > s->dt.max_ids) { /* command error */ }
which incorrectly permits a devid of 1 << n.
These bugs will not cause QEMU crashes because the ID values being
checked are only used for accesses into tables held in guest memory
which we access with address_space_*() functions, but they are
incorrect behaviour of our emulation.
Fix them by standardizing on this pattern:
* bounds limits are named num_foos and are the 2^n value
(equal to the number of valid foo values)
* bounds checks are either
if (fooid < num_foos) { good }
or
if (fooid >= num_foos) { bad }
In this commit we fix the handling of the number of IDs
in the device table and the collection table, and the number
of commands that will fit in the command queue.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
The TableDesc struct defines properties of the in-guest-memory tables
which the guest tells us about by writing to the GITS_BASER<n>
registers. This struct currently has a union 'maxids', but all the
fields of the union have the same type (uint32_t) and do the same
thing (record one-greater-than the maximum ID value that can be used
as an index into the table).
We're about to add another table type (the GICv4 vPE table); rather
than adding another specifically-named union field for that table
type with the same type as the other union fields, remove the union
entirely and just have a 'uint32_t max_ids' struct field.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
AST2600 Display Port MCU introduces 0x18000000~0x1803FFFF as it's memory
and io address. If guest machine try to access DPMCU memory, it will
cause a fatal error.
Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20211210083034.726610-1-troy_lee@aspeedtech.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Let's sense support and use it for preallocation. MADV_POPULATE_WRITE
does not require a SIGBUS handler, doesn't actually touch page content,
and avoids context switches; it is, therefore, faster and easier to handle
than our current approach.
While MADV_POPULATE_WRITE is, in general, faster than manual
prefaulting, and especially faster with 4k pages, there is still value in
prefaulting using multiple threads to speed up preallocation.
More details on MADV_POPULATE_WRITE can be found in the Linux commits
4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault
page tables") and eb2faa513c24 ("mm/madvise: report SIGBUS as -EFAULT for
MADV_POPULATE_(READ|WRITE)"), and in the man page proposal [1].
This resolves the TODO in do_touch_pages().
In the future, we might want to look into using fallocate(), eventually
combined with MADV_POPULATE_READ, when dealing with shared file/fd
mappings and not caring about memory bindings.
[1] https://lkml.kernel.org/r/20210816081922.5155-1-david@redhat.com
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The i440fx and Q35 machine types are both hardcoded to use the
legacy SMBIOS 2.1 (32-bit) entry point. This is a sensible
conservative choice because SeaBIOS only supports SMBIOS 2.1
EDK2, however, can also support SMBIOS 3.0 (64-bit) entry points,
and QEMU already uses this on the ARM virt machine type.
This adds a property to allow the choice of SMBIOS entry point
versions For example to opt in to 64-bit SMBIOS entry point:
$QEMU -machine q35,smbios-entry-point-type=64
Based on a patch submitted by Daniel Berrangé.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20211026151100.1691925-4-ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
This prepares for exposing the SMBIOS entry point type as a
machine property on x86.
Based on a patch from Daniel P. Berrangé.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20211026151100.1691925-3-ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Rename the enums to match the naming style used by QAPI, and to
use "32" and "64" instead of "20" and "31". This will allow us
to more easily move the enum to the QAPI schema later.
About the naming choice: "SMBIOS 2.1 entry point"/"SMBIOS 3.0
entry point" and "32-bit entry point"/"64-bit entry point" are
synonymous in the SMBIOS specification. However, the phrases
"32-bit entry point" and "64-bit entry point" are used more often.
The new names also avoid confusion between the entry point format
and the actual SMBIOS version reported in the entry point
structure. For example: currently the 32-bit entry point
actually report SMBIOS 2.8 support, not 2.1.
Based on portions of a patch submitted by Daniel P. Berrangé.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20211026151100.1691925-2-ehabkost@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Move the pci_intx() definition to the PCI header file, so that it can
be called from other PCI files. It is used by the next patch.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Message-Id: <20211116170133.724751-3-fbarrat@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Add functions to support configure interrupt in virtio_net
The functions are config_pending and config_mask, while
this input idx is VIRTIO_CONFIG_IRQ_IDX will check the
function of configure interrupt.
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-9-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add functions to support configure interrupt.
The configure interrupt process will start in vhost_dev_start
and stop in vhost_dev_stop.
Also add the functions to support vhost_config_pending and
vhost_config_mask, for masked_config_notifier, we only
use the notifier saved in vq 0.
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-8-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add the functions to support the configure interrupt in virtio
The function virtio_config_guest_notifier_read will notify the
guest if there is an configure interrupt.
The function virtio_config_set_guest_notifier_fd_handler is
to set the fd hander for the notifier
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-7-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces new VhostOps vhost_set_config_call. This function allows the
vhost to set the event fd to kernel
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-5-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To support configure interrupt for vhost-vdpa
Introduce VIRTIO_CONFIG_IRQ_IDX -1 as configure interrupt's queue index,
Then we can reuse the functions guest_notifier_mask and guest_notifier_pending.
Add the check of queue index in these drivers, if the driver does not support
configure interrupt, the function will just return
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-2-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This requires extra work for each target, but adds the
common syscall code, and the necessary flag in CPUState.
Reviewed-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211227150127.2659293-4-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Add 7.0 machine types for arm/i440fx/q35/s390x/spapr.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211217143948.289995-1-cohuck@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Remove qemu_run_machine_init_done_notifiers() since no implementation
and user.
Fixes: f66dc8737c ("vl: move all generic initialization out of vl.c")
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220104024136.1433545-1-xiaoyao.li@intel.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The new Cluster-Aware Scheduling support has landed in Linux 5.16,
which has been proved to benefit the scheduling performance (e.g.
load balance and wake_affine strategy) on both x86_64 and AArch64.
So now in Linux 5.16 we have four-level arch-neutral CPU topology
definition like below and a new scheduler level for clusters.
struct cpu_topology {
int thread_id;
int core_id;
int cluster_id;
int package_id;
int llc_id;
cpumask_t thread_sibling;
cpumask_t core_sibling;
cpumask_t cluster_sibling;
cpumask_t llc_sibling;
}
A cluster generally means a group of CPU cores which share L2 cache
or other mid-level resources, and it is the shared resources that
is used to improve scheduler's behavior. From the point of view of
the size range, it's between CPU die and CPU core. For example, on
some ARM64 Kunpeng servers, we have 6 clusters in each NUMA node,
and 4 CPU cores in each cluster. The 4 CPU cores share a separate
L2 cache and a L3 cache tag, which brings cache affinity advantage.
In virtualization, on the Hosts which have pClusters (physical
clusters), if we can design a vCPU topology with cluster level for
guest kernel and have a dedicated vCPU pinning. A Cluster-Aware
Guest kernel can also make use of the cache affinity of CPU clusters
to gain similar scheduling performance.
This patch adds infrastructure for CPU cluster level topology
configuration and parsing, so that the user can specify cluster
parameter if their machines support it.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Message-Id: <20211228092221.21068-3-wangyanan55@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[PMD: Added '(since 7.0)' to @clusters in qapi/machine.json]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
All methods related to MachineState are prefixed with "machine_".
smp_parse() does not need to be an exception. Rename it and
const'ify the SMPConfiguration argument, since it doesn't need
to be modified.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Tested-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211216132015.815493-9-philmd@redhat.com>
@pin is an input where we connect a device output.
Rename it @input_pin to simplify the documentation.
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20211218130437.1516929-5-f4bug@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
qdev_connect_gpio_out_named() is described as qdev_connect_gpio_out(),
and referring to itself in an endless loop, which is confusing. Fix.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Message-Id: <20211218130437.1516929-4-f4bug@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
qdev_init_gpio_out_named() is described as qdev_init_gpio_out(),
and referring to itself in an endless loop, which is confusing. Fix.
Reported-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Message-Id: <20211218130437.1516929-3-f4bug@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Add empty lines to have a clearer distinction between different
functions declarations.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Message-Id: <20211218130437.1516929-2-f4bug@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
ld*_dma() returns a MemTxResult type. Do not discard
it, return it to the caller.
Update the few callers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-24-philmd@redhat.com>
st*_dma() returns a MemTxResult type. Do not discard
it, return it to the caller.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-23-philmd@redhat.com>
Let devices specify transaction attributes when calling ld*_pci_dma().
Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-22-philmd@redhat.com>
Let devices specify transaction attributes when calling st*_pci_dma().
Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-21-philmd@redhat.com>
dma_memory_read() returns a MemTxResult type. Do not discard
it, return it to the caller.
Update the few callers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-19-philmd@redhat.com>
dma_memory_write() returns a MemTxResult type. Do not discard
it, return it to the caller.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-18-philmd@redhat.com>
Let devices specify transaction attributes when calling ld*_dma().
Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-17-philmd@redhat.com>
Let devices specify transaction attributes when calling st*_dma().
Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-16-philmd@redhat.com>
Let devices specify transaction attributes when calling
dma_buf_read().
Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-13-philmd@redhat.com>
Let devices specify transaction attributes when calling
dma_buf_write().
Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-12-philmd@redhat.com>
Let devices specify transaction attributes when calling pci_dma_rw().
Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-10-philmd@redhat.com>
DMA operations are run on any kind of buffer, not arrays of
uint8_t. Convert dma_buf_read/dma_buf_write functions to take
a void pointer argument and save us pointless casts to uint8_t *.
Remove this pointless casts in the megasas device model.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-9-philmd@redhat.com>
Let devices specify transaction attributes when calling
dma_memory_map().
Patch created mechanically using spatch with this script:
@@
expression E1, E2, E3, E4;
@@
- dma_memory_map(E1, E2, E3, E4)
+ dma_memory_map(E1, E2, E3, E4, MEMTXATTRS_UNSPECIFIED)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211223115554.3155328-7-philmd@redhat.com>
Let devices specify transaction attributes when calling
dma_memory_rw().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211223115554.3155328-5-philmd@redhat.com>
We will add the MemTxAttrs argument to dma_memory_rw() in
the next commit. Since dma_memory_rw_relaxed() is only used
by dma_memory_rw(), modify it first in a separate commit to
keep the next commit easier to review.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211223115554.3155328-4-philmd@redhat.com>
Let devices specify transaction attributes when calling
dma_memory_set().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211223115554.3155328-3-philmd@redhat.com>
Let devices specify transaction attributes when calling
dma_memory_valid().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211223115554.3155328-2-philmd@redhat.com>
It's unused now (except for permission handling)[*]. The only reasonable
user of it was block-stream job, recently updated to use own blk. And
other block jobs prefer to use own source node related objects.
So, the arguments of dropping the field are:
- block jobs prefer not to use it
- block jobs usually has more then one node to operate on, and better
to operate symmetrically (for example has both source and target
blk's in specific block-job state structure)
*: BlockJob.blk is used to keep some permissions. We simply move
permissions to block-job child created in block_job_create() together
with blk.
In mirror, we just should not care anymore about restoring state of
blk. Most probably this code could be dropped long ago, after dropping
bs->job pointer. Now it finally goes away together with BlockJob.blk
itself.
iotest 141 output is updated, as "bdrv_has_blk(bs)" check in
qmp_blockdev_del() doesn't fail (we don't have blk now). Still, new
error message looks even better.
In iotest 283 we need to add a job id, otherwise "Invalid job ID"
happens now earlier than permission check (as permissions moved from
blk to block-job node).
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Nikita Lapshin <nikita.lapshin@virtuozzo.com>
We are going to drop BlockJob.blk. So let's retrieve block job context
from underlying job instead of main node.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Nikita Lapshin <nikita.lapshin@virtuozzo.com>
Add a new chardev backend which allows D-Bus client to handle the
chardev stream & events.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Add an option to use direct connections instead of via the bus. Clients
are accepted with QMP add_client.
This allows to provide the D-Bus display without a bus. It also
simplifies the testing setup (some CI have issues to setup a D-Bus bus
in a container).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
The "dbus" display backend exports the QEMU consoles and other
UI-related interfaces over D-Bus.
By default, the connection is established on the session bus, but you
can specify a different bus with the "addr" option.
The backend takes the "org.qemu" service name, while still allowing
further instances to queue on the same name (so you can lookup all the
available instances too). It accepts any number of clients at this
point, although this is expected to evolve with options to restrict
clients, or only accept p2p via fd passing.
The interface is intentionally very close to the internal QEMU API,
and can be introspected or interacted with busctl/dfeet etc:
$ ./qemu-system-x86_64 -name MyVM -display dbus
$ busctl --user introspect org.qemu /org/qemu/Display1/Console_0
org.qemu.Display1.Console interface - - -
.RegisterListener method h - -
.SetUIInfo method qqiiuu - -
.DeviceAddress property s "pci/0000/01.0" emits-change
.Head property u 0 emits-change
.Height property u 480 emits-change
.Label property s "VGA" emits-change
.Type property s "Graphic" emits-change
.Width property u 640 emits-change
[...]
See the interfaces XML source file and Sphinx docs for the generated API
documentations.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Add a new DisplayScanout structure to save the current scanout details.
This allows to attach later UI backends and set the scanout.
Introduce displaychangelistener_display_console() helper function to
handle the dpy_gfx_switch/gl_scanout() & dpy_gfx_update() calls.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
This will allow to have one GL context but a variable number of
listeners.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
GraphicHw.gl_flushed was introduced to notify the
device (vhost-user-gpu) that the GL resources (the display scanout) are
no longer needed.
It was decoupled from QEMU own gl-blocking mechanism, but that
difference isn't helping. Instead, we can reuse QEMU gl-blocking and
notify virtio_gpu_gl_flushed() when unblocking (to unlock
vhost-user-gpu).
An extra block/unblock is added arount dpy_gl_update() so existing
backends that don't block will have the flush event handled. It will
also help when there are no backends associated.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
The next patch will make use of this function to dissociate
DisplayChangeListener from GL context.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
A remote client, such as Spice, will already avoid flooding the stream
by delaying the resize requests.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Use a QemuClipboardNotify union type for extendable clipboard events.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Move linux-user safe-syscall.S and safe-syscall-error.c to common-user
so that bsd-user can also use it. Also move safe-syscall.h to
include/user/. Since there is nothing here that is related to the guest,
as opposed to the host, build it once.
Reviewed-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Commit 739e95f574 ("scsi: Replace scsi_bus_new() with
scsi_bus_init(), scsi_bus_init_named()") forgot to rename
scsi_bus_init() in the function documentation string.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20211122104744.1051554-1-f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
"if (tcg_enabled())" allows elision of the code inside it; we only need
the prototype to exist, so that the code compile even for the --disable-tcg
case.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* General cleanup for Mac machines (Peter)
* Fixes for FPU exceptions (Lucas)
* Support for new ISA31 instructions (Matheus)
* Fixes for ivshmem (Daniel)
* Cleanups for PowerNV PHB (Christophe and Cedric)
* Updates of PowerNV and pSeries documentation (Leonardo and Daniel)
* Fixes for PowerNV (Daniel)
* Large cleanup of FPU implementation (Richard)
* Removal of SoftTLBs support for PPC74x CPUs (Fabiano)
* Fixes for exception models in MPCx and 60x CPUs (Fabiano)
* Removal of 401/403 CPUs (Cedric)
* Deprecation of taihu machine (Thomas)
* Large rework of PPC405 machine (Cedric)
* Fixes for VSX instructions (Victor and Matheus)
* Fix for e6500 CPU (Fabiano)
* Initial support for PMU (Daniel)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmG8xt8ACgkQUaNDx8/7
7KG3Dg/9EXK3GslNgUNRvB1pgRSimnrUirGUiDmZPXxevIbsoPsYaXmUcD1zOnlb
zXiCzQ2Bvi8ZUjT1uScP7dkFCdzs6gXYbTEcTzscX3k2VnTjXHXhQ3cnb0uModP5
U1QzrjV7K/q1usJW5OVSGZS1PoWOqWuZNdcp0mIUWcJHhSaYtUGGPohp7rH0JSug
ncmkRA0KLgIX8eg8swyfJxrw9wCcXlFIcmwHipB8S/Dd/gUpmFEoaQsmugSJNYZe
zi8Fd4jfzlRXVwb8EUSiOiaXSd/WKjEcQx/usbzzaBacbktk/nfy+rligUMryCpO
vGFM5blxEX5SXD3Cd0vcFwYhCZImphD8K+Sxe6Us69rsUH11hJS+q29/Puk1MkHt
DTubqB3k4BheiatOV1zeUMlbRm5svUhGj3VstFZYZeZ3Oh47Jsx3XH4hoytUuc/1
lP9UGkaf3nIx12vSqBA/3Crc7zalWX5OhaUV5RG30+jxd8zHOKcasKbd22710DNz
4WybQLb3bpUr091mWMKcaAkP6bxcE8S+mR4LE2kdELboAnkB+OgSmrdZ3slceaCv
btV8qjNl4f8lBvyFQVxZ5bn05+TfxUXFlFxXipxf1fI64bYwRnyQQ3yRxMHipRYK
CRta1akVgIgcBbeeRHBZLA12UgTQJY6WIoDaZMz9NxIDHJnX/jw=
=APFd
-----END PGP SIGNATURE-----
Merge tag 'pull-ppc-20211217' of https://github.com/legoater/qemu into staging
ppc 7.0 queue:
* General cleanup for Mac machines (Peter)
* Fixes for FPU exceptions (Lucas)
* Support for new ISA31 instructions (Matheus)
* Fixes for ivshmem (Daniel)
* Cleanups for PowerNV PHB (Christophe and Cedric)
* Updates of PowerNV and pSeries documentation (Leonardo and Daniel)
* Fixes for PowerNV (Daniel)
* Large cleanup of FPU implementation (Richard)
* Removal of SoftTLBs support for PPC74x CPUs (Fabiano)
* Fixes for exception models in MPCx and 60x CPUs (Fabiano)
* Removal of 401/403 CPUs (Cedric)
* Deprecation of taihu machine (Thomas)
* Large rework of PPC405 machine (Cedric)
* Fixes for VSX instructions (Victor and Matheus)
* Fix for e6500 CPU (Fabiano)
* Initial support for PMU (Daniel)
# gpg: Signature made Fri 17 Dec 2021 09:20:31 AM PST
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* tag 'pull-ppc-20211217' of https://github.com/legoater/qemu: (101 commits)
ppc/pnv: Use QOM hierarchy to scan PEC PHB4 devices
ppc/pnv: Move realize of PEC stacks under the PEC model
ppc/pnv: Remove "system-memory" property from PHB4 PEC
ppc/pnv: Compute the PHB index from the PHB4 PEC model
ppc/pnv: Introduce a num_stack class attribute
ppc/pnv: Introduce a "chip" property under the PHB4 model
ppc/pnv: Introduce version and device_id class atributes for PHB4 devices
ppc/pnv: Introduce a num_pecs class attribute for PHB4 PEC devices
ppc/pnv: Use QOM hierarchy to scan PHB3 devices
ppc/pnv: Move mapping of the PHB3 CQ regions under pnv_pbcq_realize()
ppc/pnv: Drop the "num-phbs" property
ppc/pnv: Use the chip class to check the index of PHB3 devices
ppc/pnv: Introduce a "chip" property under PHB3
PPC64/TCG: Implement 'rfebb' instruction
target/ppc/power8-pmu.c: add PM_RUN_INST_CMPL (0xFA) event
target/ppc: enable PMU instruction count
target/ppc: enable PMU counter overflow with cycle events
target/ppc: PMU: update counters on MMCR1 write
target/ppc: PMU: update counters on PMCs r/w
target/ppc: PMU basic cycle count for pseries TCG
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Each PEC device of the POWER9 chip has a predefined number of stacks,
equivalent of a root port complex:
PEC0 -> 1 stack
PEC1 -> 2 stacks
PEC2 -> 3 stacks
Introduce a class attribute to hold these values and remove the
"num-stacks" property.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211213132830.108372-11-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
And check the PEC index using the chip class.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211213132830.108372-10-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
It prepares ground for PHB5 which has different values.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211213132830.108372-9-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
POWER9 processor comes with 3 PHB4 PEC (PCI Express Controller) and
each PEC can have several PHBs :
* PEC0 provides 1 PHB (PHB0)
* PEC1 provides 2 PHBs (PHB1 and PHB2)
* PEC2 provides 3 PHBs (PHB3, PHB4 and PHB5)
A num_pecs class attribute represents better the logic units of the
POWER9 chip. Use that instead of num_phbs which fits POWER8 chips.
This will ease adding support for user created devices.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211213132830.108372-8-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This change will help us move the mapping of XSCOM regions under the
PHB3 realize routine, which will be necessary for user created PHB3
devices.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211213132830.108372-3-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
These variants take a float64 as input, compute the result to
infinite precision (as we do with FloatParts), round the result
to the precision and dynamic range of float32, and then return
the result in the format of float64.
This is the operation PowerPC requires for its float32 operations.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-28-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
PowerPC has this flag, and it's easier to compute it here
than after the fact.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-8-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
PowerPC has this flag, and it's easier to compute it here
than after the fact.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-7-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
PowerPC has this flag, and it's easier to compute it here
than after the fact.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-6-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
PowerPC has these flags, and it's easier to compute them here
than after the fact.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-5-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
PowerPC has this flag, and it's easier to compute it here
than after the fact.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-4-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
PowerPC has this flag, and it's easier to compute it here
than after the fact.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-3-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
We will shortly have more than 8 bits of exceptions.
Repack the existing flags into low bits and reformat to hex.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20211119160502.17432-2-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
When experimenting raising GLIB_VERSION_MIN_REQUIRED to 2.68
(Fedora 34 provides GLib 2.68.1) we get:
hw/virtio/virtio-crypto.c:245:24: error: 'g_memdup' is deprecated: Use 'g_memdup2' instead [-Werror,-Wdeprecated-declarations]
...
g_memdup() has been updated by g_memdup2() to fix eventual security
issues (size argument is 32-bit and could be truncated / wrapping).
GLib recommends to copy their static inline version of g_memdup2():
https://discourse.gnome.org/t/port-your-module-from-g-memdup-to-g-memdup2-now/5538
Our glib-compat.h provides a comment explaining how to deal with
these deprecated declarations (see commit e71e8cc035
"glib: enforce the minimum required version and warn about old APIs").
Following this comment suggestion, implement the g_memdup2_qemu()
wrapper to g_memdup2(), and use the safer equivalent inlined when
we are using pre-2.68 GLib.
Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210903174510.751630-3-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The DTSM is a mask that specifies which I/O Address Translation designation
types are supported. Today QEMU only supports DT=1.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Message-Id: <20211203142706.427279-5-mjrosato@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The current default PCI group being used can technically collide with a
real group ID passed from a hostdev. Let's instead use a group ID that
comes from a special pool (0xF0-0xFF) that is architected to be reserved
for simulated devices.
Fixes: 28dc86a072 ("s390x/pci: use a PCI Group structure")
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Message-Id: <20211203142706.427279-2-mjrosato@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
* add support for KVM_GUESTDBG_BLOCKIRQ (Maxim)
* update linux-headers to Linux 5.16 (myself)
* configure cleanups (myself)
* lsi53c895a assertion failure fix (Philippe)
* fix incorrect description for die-id (Yanan)
* support for NUMA in SGX enclave memory (Yang Zhong)
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmG5yEgUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOFkwf9Glt3tnHfO/gWZectiMAjjM8vg3cR
yEsWU0fa0iu8MO8NiOW0b6DT0yosZ8rl2SYSZBQGqwa/Lq2MZLO2v0JG4LblBom7
WVmzNOG4t0OFoE2gwMlZ2J+ppaHV5zN1Yc6lMXOVQcefY0pXoyze+0Fq/cwKf2+p
6WDZTuOpjtxMwt9n71b93kNAHagk422MpNi1xi+o37hNsm/Zh3BTtaSJ7WCkCmWq
YmwwQZgyB/V0Ah42wchuf5zxEnQPMbjaQSu8jQU4iyCB7+2biBuh3Cqw5eR8tqC1
cGNINGlo2Orl9ASyEunHyuDj8p0wN8J37wNuBo4kghhrTVsVKQYF0wmsqw==
=U1Xy
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* improve compatibility for macOS scripts/entitlement.sh (Evan)
* add support for KVM_GUESTDBG_BLOCKIRQ (Maxim)
* update linux-headers to Linux 5.16 (myself)
* configure cleanups (myself)
* lsi53c895a assertion failure fix (Philippe)
* fix incorrect description for die-id (Yanan)
* support for NUMA in SGX enclave memory (Yang Zhong)
# gpg: Signature made Wed 15 Dec 2021 02:49:44 AM PST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [unknown]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
configure: remove dead variables
doc: Add the SGX numa description
numa: Support SGX numa in the monitor and Libvirt interfaces
numa: Enable numa for SGX EPC sections
kvm: add support for KVM_GUESTDBG_BLOCKIRQ
gdbstub, kvm: let KVM report supported singlestep flags
gdbstub: reject unsupported flags in handle_set_qemu_sstep
linux-headers: update to 5.16-rc1
virtio-gpu: do not byteswap padding
scripts/entitlement.sh: Use backward-compatible cp flags
qapi/machine.json: Fix incorrect description for die-id
tests/qtest: Add fuzz-lsi53c895a-test
hw/scsi/lsi53c895a: Do not abort when DMA requested and no data queued
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* ITS: error reporting cleanup
* aspeed: improve documentation
* Fix STM32F2XX USART data register readout
* allow emulated GICv3 to be disabled in non-TCG builds
* fix exception priority for singlestep, misaligned PC, bp, etc
* Correct calculation of tlb range invalidate length
* npcm7xx_emc: fix missing queue_flush
* virt: Add VIOT ACPI table for virtio-iommu
* target/i386: Use assert() to sanity-check b1 in SSE decode
* Don't include qemu-common unnecessarily
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmG5xekZHHBldGVyLm1h
eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3ramD/0WL8YV70sW5B/tHdb+/em1
xTBuABUUj5QDvKnxNoPIBwJI0vgmzwhAonYzcKKEUvlbL97crkgt6xSPvVxv2nf5
wnSYMKTDEC11AuYVdEyIMm5KLc88mq1w78pTYkFSUJmujCpfqLAsyXdEastIPHfN
MdrwkpQ3wVmMeMcNBTq2yCxiGlz7x/myeJtDU9ihgPTcsgXa8BzziK6qCZHAOGCL
0/ljXDbVTJtLYUki9IqptPs8QUtlqOBt3rLplxHfKRKpmjiuD+xFlQ4GuIOBX+AL
tQWgEyyiR9FnYpY1t3fWVtuKgjYXzlbh1A6cwdsK3Q68+qfi7Yr+lPryjwrmOkx7
/Yupq+QB/xgK4nxF4ydDXLvqI3h6GjaF2U9qujK3H9DyMOEYJDpaX1TZMphtWI89
9u7kLO6DNE00oUoiX+6Aty0qQtXv12SSaNpJmFON87/WLJJamHuiS6NiZp/r4ORU
51ds2LPGJAKAy9duqmZJ/81WlNjmHmurq1v+FIl29XInc4a2SpwEUM0rsTrrQTaD
16Qh2OZCnlYEg9nh6B54FQe8xP+pp69Gn/BRFhcwW9fPq4/pHSrwKEkI6lE+Yuiq
+Fe8r0DbZczfhjcGdoUlIgMj+WSVY9Q8Opztsmv/kjZqxt0VvfdmAVp0odl5KdB4
cKAeYciNSgq2bGd+N4kuHA==
=KuTi
-----END PGP SIGNATURE-----
Merge tag 'pull-target-arm-20211215' of https://git.linaro.org/people/pmaydell/qemu-arm into staging
target-arm queue:
* ITS: error reporting cleanup
* aspeed: improve documentation
* Fix STM32F2XX USART data register readout
* allow emulated GICv3 to be disabled in non-TCG builds
* fix exception priority for singlestep, misaligned PC, bp, etc
* Correct calculation of tlb range invalidate length
* npcm7xx_emc: fix missing queue_flush
* virt: Add VIOT ACPI table for virtio-iommu
* target/i386: Use assert() to sanity-check b1 in SSE decode
* Don't include qemu-common unnecessarily
# gpg: Signature made Wed 15 Dec 2021 02:39:37 AM PST
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
* tag 'pull-target-arm-20211215' of https://git.linaro.org/people/pmaydell/qemu-arm: (33 commits)
tests/acpi: add expected blob for VIOT test on virt machine
tests/acpi: add expected blobs for VIOT test on q35 machine
tests/acpi: add test case for VIOT
tests/acpi: allow updates of VIOT expected data files
hw/arm/virt: Use object_property_set instead of qdev_prop_set
hw/arm/virt: Reject instantiation of multiple IOMMUs
hw/arm/virt: Remove device tree restriction for virtio-iommu
hw/arm/virt-acpi-build: Add VIOT table for virtio-iommu
hw/net: npcm7xx_emc fix missing queue_flush
target/arm: Correct calculation of tlb range invalidate length
hw/arm: Don't include qemu-common.h unnecessarily
target/rx/cpu.h: Don't include qemu-common.h
target/hexagon/cpu.h: don't include qemu-common.h
include/hw/i386: Don't include qemu-common.h in .h files
target/i386: Use assert() to sanity-check b1 in SSE decode
tests/tcg: Add arm and aarch64 pc alignment tests
target/arm: Suppress bp for exceptions with more priority
target/arm: Assert thumb pc is aligned
target/arm: Take an exception if PC is misaligned
target/arm: Split compute_fsr_fsc out of arm_deliver_fault
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Hi
This are the reviewed patches for the freeze period:
- colo: fix/optimize several things (rao, chen)
- shutdown qio channels correctly when an error happens (li)
- serveral multifd patches for the zero series (me)
Please apply.
Thanks, Juan.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmG5xCkACgkQ9IfvGFhy
1yODJA/8CcxfASdxCdiTGcoOQz43259Awp61vg44qqdGz1R0aBQNCCYIL0BFlHJo
Ta6m1pZfi8CPWCVkReSvkMp2B3RUSwyX1DhI2WxhTmEhY0Kpa4nS2S5NGbA3hCEJ
3GFN/zDshnCJB47BXNZAz6GoNfY6pYlcGhCEDduWNcbTMAy9AdvTWq/XllGoKCfu
A3JXNg0P5PqNzdR2AVr24M5gwqNOR3ral36rr5vLWZF5CNd71jLpMj5jmDGnpiA5
QCQyLcOf7muSXOOFbIZPCmDh3Kvd5voDuG3UbL+2wycLny/8gpc4GoheYHaVaO4F
87hcwbLZL99/3BOLhXIuumchA3ZOiJruH/VWbSbY6vnuz7/j5EZMQycCSQQuV/+T
zYK3jDqIUDTiTiRj+d68bHlfAloNF42kQVAbhF9Uk9YDfZsoOSIkRNl+BP/N+YCQ
OyXoZZ/U1rSjqRw0pTn8A9VvJ5SAsKNEDizkEnvEtjW9mWEq/h/Vym17Nw50V0CK
JxVNOkg4PBjoy0WdXrF10YHf/R5bOanNUm3tFSMTarAuEu1Oc/I5e7GrffxoVqS+
Sj+0QS//fVJdzynybA4SpXLhZ5n8HixYT3SfihveevItZateePtJ7gkP3RugwZ9V
/eUMuC3PwFeYwQ5EC+dkGfku+of3UE+TOzBP8VdM9ahRwRSh1X8=
=3ge7
-----END PGP SIGNATURE-----
Merge tag 'migration-20211214-pull-request' of https://gitlab.com/juan.quintela/qemu into staging
Migration Pull request
Hi
This are the reviewed patches for the freeze period:
- colo: fix/optimize several things (rao, chen)
- shutdown qio channels correctly when an error happens (li)
- serveral multifd patches for the zero series (me)
Please apply.
Thanks, Juan.
# gpg: Signature made Wed 15 Dec 2021 02:32:09 AM PST
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [full]
* tag 'migration-20211214-pull-request' of https://gitlab.com/juan.quintela/qemu:
multifd: Make zlib compression method not use iovs
multifd: Make zstd compression method not use iovs
COLO: Move some trace code behind qemu_mutex_unlock_iothread()
multifd: Shut down the QIO channels to avoid blocking the send threads when they are terminated.
multifd: Fill offset and block for reception
multifd: remove used parameter from send_recv_pages() method
multifd: remove used parameter from send_prepare() method
multifd: The variable is only used inside the loop
multifd: Add missing documention
multifd: Rename used field to num
migration: Never call twice qemu_target_page_size()
multifd: Delete useless operation
dump: Remove is_zero_page()
migration: Remove is_zero_range()
migration/colo: Optimize COLO primary node start code path
Fixed a QEMU hang when guest poweroff in COLO mode
migration/colo: More accurate update checkpoint time
migration/ram.c: Remove the qemu_mutex_lock in colo_flush_ram_cache.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
drive_get_next() is basically a bad idea. It returns the "next" block
backend of a certain interface type. "Next" means bus=0,unit=N, where
subsequent calls count N up from zero, per interface type.
This lets you define unit numbers implicitly by execution order. If the
order changes, or new calls appear "in the middle", unit numbers change.
ABI break. Hard to spot in review.
The previous commits eliminated all uses. Drop the function.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20211117163409.3587705-14-armbru@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
The qemu-common.h header is not supposed to be included from any
other header files, only from .c files (as documented in a comment at
the start of it).
include/hw/i386/x86.h and include/hw/i386/microvm.h break this rule.
In fact, the include is not required at all, so we can just drop it
from both files.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20211129200510.1233037-2-peter.maydell@linaro.org
When the PVM guest poweroff, the COLO thread may wait a semaphore
in colo_process_checkpoint().So, we should wake up the COLO thread
before migration shutdown.
Signed-off-by: Lei Rao <lei.rao@intel.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Scan the PCI devices to find bridge and set PCI_SECONDARY_BUS and
PCI_SUBORDINATE_BUS (algorithm from seabios)
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20211208130350.10178-2-lvivier@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The basic SGX did not enable numa for SGX EPC sections, which
result in all EPC sections located in numa node 0. This patch
enable SGX numa function in the guest and the EPC section can
work with RAM as one numa node.
The Guest kernel related log:
[ 0.009981] ACPI: SRAT: Node 0 PXM 0 [mem 0x180000000-0x183ffffff]
[ 0.009982] ACPI: SRAT: Node 1 PXM 1 [mem 0x184000000-0x185bfffff]
The SRAT table can normally show SGX EPC sections menory info in different
numa nodes.
The SGX EPC numa related command:
......
-m 4G,maxmem=20G \
-smp sockets=2,cores=2 \
-cpu host,+sgx-provisionkey \
-object memory-backend-ram,size=2G,host-nodes=0,policy=bind,id=node0 \
-object memory-backend-epc,id=mem0,size=64M,prealloc=on,host-nodes=0,policy=bind \
-numa node,nodeid=0,cpus=0-1,memdev=node0 \
-object memory-backend-ram,size=2G,host-nodes=1,policy=bind,id=node1 \
-object memory-backend-epc,id=mem1,size=28M,prealloc=on,host-nodes=1,policy=bind \
-numa node,nodeid=1,cpus=2-3,memdev=node1 \
-M sgx-epc.0.memdev=mem0,sgx-epc.0.node=0,sgx-epc.1.memdev=mem1,sgx-epc.1.node=1 \
......
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20211101162009.62161-2-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
[Extracted from Maxim's patch into a separate commit. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20211111110604.207376-5-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In Linux 5.16, the padding of struct virtio_gpu_ctrl_hdr has become a
single-byte field followed by a uint8_t[3] array of padding bytes,
and virtio_gpu_ctrl_hdr_bswap does not compile anymore.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211111110604.207376-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Here we introduce a new compiler flag to disable the checking of exit
request (icount_decr.u32). This is useful when we want to ensure the
next block cannot be preempted by an asynchronous event.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211129140932.4115115-2-alex.bennee@linaro.org>
- Fixes to image streaming job and block layer reconfiguration to make
iotest 030 pass again
- docs: Deprecate incorrectly typed device_add arguments
- file-posix: Fix alignment after reopen changing O_DIRECT
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEy2LXoO44KeRfAE00ofpA0JgBnN8FAmGTqp8SHGhyZWl0ekBy
ZWRoYXQuY29tAAoJEKH6QNCYAZzfE3MP/jP0AB8jjZOktZN0hGWetOcGPIIg8Bvs
D2/mMQiyFcqir3aqHhVvf4Ah2vLCKAGAB3tfq1RHEJdPLateOv1/L3XrSNgv8vSV
fnKzZlVxOsiZnieAQ5i3+mr4+2dYTr0+joVB6Ydo9DNsL1vRotFC/eouoKaRVcSj
N78Bo44US0/wnurW4AwXW2UP94j6telEo9uc8+WXkLTvyPC9oiDb1gb11OoBDj6z
T2wS8SzsNS7QHBwkC6kG/oWsC0NhyYwMoED17Ws5hKGA0QJ67p7/4ZFmjUyORio/
ho3335gg2Mm5qVOYLI7yOU7yOqnMWD1cx1wUCWNr+N00YoWlnsc78XQcy88DbekL
Rvju29BBLuhcZcYo3Blx/3YAVt1TfpkjJvP1bzs/Uw74fX20x/Hmq4x7rP1DbNNA
u6TYECdl99xlODRzk82Xy8du9rVz+MVUCUmKrVDWxYYs0sKO3SxrozZgDinF3Ti7
SyKDf6Cje8ozhTpo7urUQH62SRtifIYpvejVMLrT6vvPh5Rzsm7/OmsSUBFIRQtv
Aj3Z7Ma/kNSM07iaYBNAMFE+U7TBNrPikn0Oi3HfHu8pi7LUZ3ZOCegHbPgQ8C6V
72oW4juNdp3LHdi6eA+zstyM8pi50UkoZmtrWjrixZJsjVJaGFf9ayOLEWmjuxL2
pGogJtAFLqW/
=ltwK
-----END PGP SIGNATURE-----
Merge tag 'pull-block-2021-11-16' of https://gitlab.com/hreitz/qemu into staging
Block patches for 6.2.0-rc1:
- Fixes to image streaming job and block layer reconfiguration to make
iotest 030 pass again
- docs: Deprecate incorrectly typed device_add arguments
- file-posix: Fix alignment after reopen changing O_DIRECT
# gpg: Signature made Tue 16 Nov 2021 01:57:03 PM CET
# gpg: using RSA key CB62D7A0EE3829E45F004D34A1FA40D098019CDF
# gpg: issuer "hreitz@redhat.com"
# gpg: Good signature from "Hanna Reitz <hreitz@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: CB62 D7A0 EE38 29E4 5F00 4D34 A1FA 40D0 9801 9CDF
* tag 'pull-block-2021-11-16' of https://gitlab.com/hreitz/qemu:
file-posix: Fix alignment after reopen changing O_DIRECT
softmmu/qdev-monitor: fix use-after-free in qdev_set_id()
docs: Deprecate incorrectly typed device_add arguments
iotests/030: Unthrottle parallel jobs in reverse
block: Let replace_child_noperm free children
block: Let replace_child_tran keep indirect pointer
transactions: Invoke clean() after everything else
block: Restructure remove_file_or_backing_child()
block: Pass BdrvChild ** to replace_child_noperm
block: Drop detached child from ignore list
block: Unite remove_empty_child and child_free
block: Manipulate children list in .attach/.detach
stream: Traverse graph after modification
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Invoke the transaction drivers' .clean() methods only after all
.commit() or .abort() handlers are done.
This makes it easier to have nested transactions where the top-level
transactions pass objects to lower transactions that the latter can
still use throughout their commit/abort phases, while the top-level
transaction keeps a reference that is released in its .clean() method.
(Before this commit, that is also possible, but the top-level
transaction would need to take care to invoke tran_add() before the
lower-level transaction does. This commit makes the ordering
irrelevant, which is just a bit nicer.)
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20211111120829.81329-8-hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211115145409.176785-8-kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
pci power management fixes
acpi hotplug fixes
misc other fixes
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmGSh40PHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpH2AH/RLY+ONL98GT4D+hi8MGCOhN669jrPbAqJ1L
l9p2GxDoXSD4PCDReU3VCzLtRsxsv/camgMx/DnDxaZtdgKm8SlXGJutMtNHpTY6
PHZLMLoixZSsKi6Tm3xGut/FSzsTZl4Gc3rqwaryHQLqptNO+XQJSBmP+oEAdjAd
nKVepHRveOTVVBBzCmoNpFA+BEXRTItdfG0ZKprPkXUobc2jeV7ymkTX9s2OBLEf
/pE49tZj1K4ab8g4+RY4cFEFoDZbXZ55Aq3ck5LAb47qIr/1cXPVP7PxINmasy4Y
H+oTUpWLBM7rFbdP/GBFANu5HkEQ5pnjeosWYOKsopE4UFCyDxc=
=y9C8
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
pci,pc,virtio: bugfixes
pci power management fixes
acpi hotplug fixes
misc other fixes
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 15 Nov 2021 05:15:09 PM CET
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
* tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu:
pcie: expire pending delete
pcie: fast unplug when slot power is off
pcie: factor out pcie_cap_slot_unplug()
pcie: add power indicator blink check
pcie: implement slot power control for pcie root ports
pci: implement power state
vdpa: Check for existence of opts.vhostdev
vdpa: Replace qemu_open_old by qemu_open at
virtio: use virtio accessor to access packed event
virtio: use virtio accessor to access packed descriptor flags
tests: bios-tables-test update expected blobs
hw/i386/acpi-build: Deny control on PCIe Native Hot-plug in _OSC
bios-tables-test: Allow changes in DSDT ACPI tables
hw/acpi/ich9: Add compat prop to keep HPC bit set for 6.1 machine type
pcie: rename 'native-hotplug' to 'x-native-hotplug'
hw/mem/pc-dimm: Restrict NUMA-specific code to NUMA machines
vhost: Fix last vq queue index of devices with no cvq
vhost: Rename last_index to vq_index_end
softmmu/qdev-monitor: fix use-after-free in qdev_set_id()
net/vhost-vdpa: fix memory leak in vhost_vdpa_get_max_queue_pairs()
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Our GICv3 QOM interface includes an array property
redist-region-count which allows board models to specify that the
registributor registers are not in a single contiguous range, but
split into multiple pieces. We implemented this for KVM, but
currently the TCG GICv3 model insists that there is only one region.
You can see the limit being hit with a setup like:
qemu-system-aarch64 -machine virt,gic-version=3 -smp 124
Add support for split regions to the TCG GICv3. To do this we switch
from allocating a simple array of MemoryRegions to an array of
GICv3RedistRegion structs so that we can use the GICv3RedistRegion as
the opaque pointer in the MemoryRegion read/write callbacks. Each
GICv3RedistRegion contains the MemoryRegion, a backpointer allowing
the read/write callback to get hold of the GICv3State, and an index
which allows us to calculate which CPU's redistributor is being
accessed.
Note that arm_gicv3_kvm always passes in NULL as the ops argument
to gicv3_init_irqs_and_mmio(), so the only MemoryRegion read/write
callbacks we need to update to handle this new scheme are the
gicv3_redist_read/write functions used by the emulated GICv3.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The GICv3 devices have an array property redist-region-count.
Currently we check this for errors (bad values) in
gicv3_init_irqs_and_mmio(), just before we use it. Move this error
checking to the arm_gicv3_common_realize() function, where we
sanity-check all of the other base-class properties. (This will
always be before gicv3_init_irqs_and_mmio() is called, because
that function is called in the subclass realize methods, after
they have called the parent-class realize.)
The motivation for this refactor is:
* we would like to use the redist_region_count[] values in
arm_gicv3_common_realize() in a subsequent patch, so we need
to have already done the sanity-checking first
* this removes the only use of the Error** argument to
gicv3_init_irqs_and_mmio(), so we can remove some error-handling
boilerplate
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Add an expire time for pending delete, once the time is over allow
pressing the attention button again.
This makes pcie hotplug behave more like acpi hotplug, where one can
try sending an 'device_del' monitor command again in case the guest
didn't respond to the first attempt.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20211111130859.1171890-7-kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This allows to power off pci devices. In "off" state the devices will
not be visible. No pci config space access, no pci bar access, no dma.
Default state is "on", so this patch (alone) should not change behavior.
Use case: Allows hotplug controllers implement slot power. Hotplug
controllers doing so should set the inital power state for devices in
the ->plug callback.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20211111130859.1171890-2-kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To solve issues [1-2] the Hot Plug Capable bit in PCIe Slots will be
turned on, while the switch to ACPI Hot-plug will be done in the
DSDT table.
Introducing 'x-keep-native-hpc' property disables the HPC bit only
in 6.1 and as a result keeps the forced 'reserve-io' on
pcie-root-ports in 6.1 too.
[1] https://gitlab.com/qemu-project/qemu/-/issues/641
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2006409
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20211112110857.3116853-3-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* force_rcu notifiers
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGMQFwUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPDcAf8CBwO73zxJd0Z3eHzgiSlDavb+ORy
vXkgGNMHBavaO+QuKzWrzuz42+6r+BW3mlMLEWGyEUGq7ZLbAwTGQ+zT+La8J+TG
xK872G8skl1j9Xb1TL7t/DeT9ja4MlZbB0LehFa/GIgh2V6mFjXTzH05PH5p9hd0
M8JGiLtrPEcIv4Df+T3pxbuQy45FqD4hLtEZJW4mKUm2oywxwHOLFty5+VVRxw5h
Rl5Xuf5UfhAdmmBIyIjhVcVGJf+I2Fg7M+6uf62RQ2SlVdg2ufanEL2uCYYPt4sD
kDbybursvyqf1IW4LF0vP2KznQE2Hckj6FeACYw32HrlQT6UzX7nbu2TdA==
=70MJ
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* Fixes for SGX
* force_rcu notifiers
# gpg: Signature made Wed 10 Nov 2021 10:57:48 PM CET
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
sgx: Reset the vEPC regions during VM reboot
numa: avoid crash with SGX and "info numa"
accel/tcg: Register a force_rcu notifier
rcu: Introduce force_rcu notifier
target/i386: sgx: mark device not user creatable
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The doc of this field pointed out that last_index is the last vq index.
This is misleading, since it's actually one past the end of the vqs.
Renaming and modifying comment.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211104085625.2054959-2-eperezma@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The drain_rcu_call() function can be blocked as long as an RCU reader
stays in a read-side critical section. This is typically what happens
when a TCG vCPU is executing a busy loop. It can deadlock the QEMU
monitor as reported in https://gitlab.com/qemu-project/qemu/-/issues/650 .
This can be avoided by allowing drain_rcu_call() to enforce an RCU grace
period. Since each reader might need to do specific actions to end a
read-side critical section, do it with notifiers.
Prepare ground for this by adding a notifier list to the RCU reader
struct and use it in wait_for_readers() if drain_rcu_call() is in
progress. An API is added for readers to register their notifiers.
This is largely based on a draft from Paolo Bonzini.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211109183523.47726-2-groug@kaod.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Recent commit 6952026120 "monitor: Tidy up find_device_state()"
assumed the function's argument is "the device's ID or QOM path" (as
documented for device_del). It's actually either an absolute QOM
path, or a QOM path relative to /machine/peripheral/. Such a relative
path is a device ID when it doesn't contain a slash. When it does,
the function now always fails. Broke iotest 200, which uses relative
path "vda/virtio-backend".
It fails because object_resolve_path_component() resolves just one
component, not a relative path.
The obvious function to resolve relative paths is
object_resolve_path(). It picks a parent automatically. Too much
magic, we want to specify the parent. Create new
object_resolve_path_at() for that, and use it in find_device_state().
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20211019085711.86377-1-armbru@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
This will be used to implement PowerPC's dctfixqq.
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211029192417.400707-7-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211029192417.400707-6-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This will be used to implement PowerPC's dcffixqq.
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211029192417.400707-2-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The single backtick markup in ReST is the "default role". Currently,
Sphinx's default role is called "content". Sphinx suggests you can use
the "Any" role instead to turn any single-backtick enclosed item into a
cross-reference.
This is useful for things like autodoc for Python docstrings, where it's
often nicer to reference other types with `foo` instead of the more
laborious :py:meth:`foo`. It's also useful in multi-domain cases to
easily reference definitions from other Sphinx domains, such as
referencing C code definitions from outside of kerneldoc comments.
Before we do that, though, we'll need to turn all existing usages of the
"content" role to inline verbatim markup wherever it does not correctly
resolve into a cross-refernece by using double backticks instead.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20211004215238.1523082-2-jsnow@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEoDKM/7k6F6eZAf59TLbY7tPocTgFAmGFFc0ACgkQTLbY7tPo
cTifABAA285RPpgwuI5RIo9rA6OfnPEAEC5z+NUX/g5X5hnMUXnfIGrPVG1mRU8r
yPqM0mAoLoYcVQcpGJZCUxOnpm0cTmYJDPIjZ28D8wtQtwEpCDagapckj9L87Sec
1HgGDoYqdtQoG1HvHRCwR5a35gt9Qi7JwdedT6DkCVy2Tk1NcrR2nfNIl+P/lsBy
zHZbPwPpPtHP1GirMGL8eq8tDLq6ymMG4wAVytFOHJkfkUxlsK7oB6640jKxGAzB
j5BW/XyrD2nbPNhtu7JdoKHr3MuP4rCSejllSqzNu0gPsxN7/eLbKj74bW8eaRO1
FZRneQgmgHwipKDgGFve9RBDlRWpMSNdkuM4h8VVW3ZSb1bWgFFmwStcRpas3Vcw
Cvg4V7pmO/980Dq7gnbwlmK+QTM2bTEfsUCR74pekgimqxT//85HK10i5/5fRWCo
cjQUz5809XlFMv7IDjb1wv/uUmIoUAzVSBYxWQF3CzTNlRaHl77K5C5NXgLohxl5
m+/drFZIVFWNJP8ERa8Y2W/Mlf6jwU9r35wy0UsGOHm4skYHETa/TSvkDGfMsWhu
QVRG/XmpBjbKk08tGCvPlrS8zdbT8XMR86WsBQj8kpouawd+Uvo+lRitLfv3+oXd
VnYyXck8Wx/atodOQC7IeFFWyKqkqhKVf4spqYKg9s12YA+LM3U=
=O6Wg
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kraxel/tags/egl-20211105-pull-request' into staging
gtk: a collection of egl fixes.
# gpg: Signature made Fri 05 Nov 2021 07:30:21 AM EDT
# gpg: using RSA key A0328CFFB93A17A79901FE7D4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full]
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" [full]
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full]
* remotes/kraxel/tags/egl-20211105-pull-request:
ui/gtk-egl: blitting partial guest fb to the proper scanout surface
ui/gtk: gd_draw_event returns FALSE when no cairo surface is bound
ui/gtk-egl: guest fb texture needs to be regenerated when reinitializing egl
ui/gtk-egl: make sure the right context is set as the current
ui/gtk-egl: un-tab and re-tab should destroy egl surface and context
virtio-gpu: splitting one extended mode guest fb into n-scanouts
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When guest is running Linux/X11 with extended multiple displays mode enabled,
the guest shares one scanout resource each time containing whole surface
rather than sharing individual display output separately. This extended frame
is properly splited and rendered on the corresponding scanout surfaces but
not in case of blob-resource (zero copy).
This code change lets the qemu split this one large surface data into multiple
in case of blob-resource as well so that each sub frame then can be blitted
properly to each scanout.
v2: resizing qemu console in virtio_gpu_update_dmabuf to scanout's width and
height
v3: updating stub function of virtio_gpu_update_dmabuf to match the type
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Message-Id: <20211104065153.28897-5-dongwon.kim@intel.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Currently we make the assumption that the guest frontend loads all
op code bytes sequentially. This mostly holds up for regular fixed
encodings but some architectures like s390x like to re-read the
instruction which causes weirdness to occur. Rather than changing the
frontends make the plugin API a little more ergonomic and able to
handle the re-read case.
Stuff will still get strange if we read ahead of the opcode but so far
no front ends have done that and this patch asserts the case so we can
catch it early if they do.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211026102234.3961636-21-alex.bennee@linaro.org>
* DMA support in the multiboot option ROM
* Rename default-bus-bypass-iommu
* Deprecate -watchdog and cleanup -watchdog-action
* HVF fix for <PAGE_SIZE regions
* Support TSC scaling for AMD nested virtualization
* Fix for ESP fuzzing bug
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGBUeEUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOh+Qf+OMRhRiv6dYjbK/5zXrx81AgxYAY3
dBUSr8v16LyrMl1U3DZWzhD+MzQsC83m/Xsh4lGxlHDWtkK9QQA5xDG95JZdY26i
MGCbbjnFHISbyBQV9Y724gPfPjOOODuoFbzafSx6VLITOcyv1ye0cm7TOjOPB+tt
E4c3JqTZ7g8a5yMe8ItkVhz5pPY+oVw8dxMNRp6Sup5Dbfx0DjacIwLasLsHfPL7
qBADfqB20ovHUzLjXu7oWgEd4KxJ6kiSCaJJu/KD36hg0wB8+WVP1o43j4PkczHT
QjU7eZaeaTrN5Cf34ttPge6QReMi5SFNCaA9O9/HLqrQgdEtt/diZWuqjQ==
=a2mC
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Build system fixes and cleanups
* DMA support in the multiboot option ROM
* Rename default-bus-bypass-iommu
* Deprecate -watchdog and cleanup -watchdog-action
* HVF fix for <PAGE_SIZE regions
* Support TSC scaling for AMD nested virtualization
* Fix for ESP fuzzing bug
# gpg: Signature made Tue 02 Nov 2021 10:57:37 AM EDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* remotes/bonzini/tags/for-upstream: (27 commits)
configure: fix --audio-drv-list help message
configure: Remove the check for the __thread keyword
Move the l2tpv3 test from configure to meson.build
meson: remove unnecessary coreaudio test program
meson: remove pointless warnings
meson.build: Allow to disable OSS again
meson: bump submodule to 0.59.3
qtest/am53c974-test: add test for cancelling in-flight requests
esp: ensure in-flight SCSI requests are always cancelled
KVM: SVM: add migration support for nested TSC scaling
hw/i386: fix vmmouse registration
watchdog: remove select_watchdog_action
vl: deprecate -watchdog
watchdog: add information from -watchdog help to -device help
hw/i386: Rename default_bus_bypass_iommu
hvf: Avoid mapping regions < PAGE_SIZE as ram
configure: do not duplicate CPU_CFLAGS into QEMU_LDFLAGS
configure: remove useless NPTL probe
target/i386: use DMA-enabled multiboot ROM for new-enough QEMU machine types
optionrom: add a DMA-enabled multiboot ROM
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add gdb-xml for MVE
More uses of tcg_constant_* in target/arm
Fix parameter naming for default-bus-bypass-iommu
Ignore cache operations to mmio in HVF
-----BEGIN PGP SIGNATURE-----
iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmGBgjkdHHJpY2hhcmQu
aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8sAAgAsHaW2sHH/W4TzCwl
DfqFar4u047Q+ZtQHjNehGHF9Bxp4NS4A0qL52vk0hVoqeWlyF1N29MOnewgVDqY
q1x+uxJtG9xjTse7oEEshEEFF/7J8eB8dN4E78TFn/6IhvVhGiUeeRu29s44Ot6N
E2KABcXfd+4gEdqhepLGEbi5n0TnA8ARmmeffZNWVEbsxQjHnMQQYmqGmllB3xV3
qPpnp3avvD1015zMwrLVmlDO+tSRr/1bed7k3k26ebga2B/zitxcpXFNCDlgePx0
LNT5QYvBDpE7HOruGQjf4iXPJHfYw5VMtopK7K++rY9KWiJgBVSjQUcB462sdCPk
wNAp0g==
=vlZ5
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/rth/tags/pull-arm-20211102-2' into staging
Add nuvoton sd module for NPCM7XX
Add gdb-xml for MVE
More uses of tcg_constant_* in target/arm
Fix parameter naming for default-bus-bypass-iommu
Ignore cache operations to mmio in HVF
# gpg: Signature made Tue 02 Nov 2021 02:23:53 PM EDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* remotes/rth/tags/pull-arm-20211102-2:
hvf: arm: Ignore cache operations on MMIO
hw/arm/virt: Rename default_bus_bypass_iommu
target/arm: Use tcg_constant_i32() in gen_rev16()
target/arm: Use tcg_constant_i64() in do_sat_addsub_64()
target/arm: Use the constant variant of store_cpu_field() when possible
target/arm: Introduce store_cpu_field_constant() helper
target/arm: Use tcg_constant_i32() in op_smlad()
target/arm: Advertise MVE to gdb when present
tests/qtest/libqos: add SDHCI commands
hw/arm: Attach MMC to quanta-gbs-bmc
hw/arm: Add Nuvoton SD module to board
hw/sd: add nuvoton MMC
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
virtio-iommu support for x86/ACPI.
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmGAefYPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpiCUH/2pIs3FmOGIasEqn4BnqXr4dHMReUO5Ghg0v
cXle4ZUrbg7Qpnxh07CwMuUpJV3Qv+xtVK7hzbD13nnxrkTZuKzBRV1AthkA1Hly
zIKOxnEgV497LaXoaSOtqAx48fuznk5XOHju91usgu4mehJ0qe2gcwb4H8uWGkQi
hrsR7a9woP0M4H/jvb3+aQRCJKMscj8ReabM1ulOugNpPdNI/jIKtBvZBtTxAqtQ
CH9/DJLfVmzDRYdeBpnF06A+tXm4uU1Q5BmpmF9qaymk/PzthN54gdnDd6zH405Z
Tmjp9UA2xfEYDmKzuTCBdPmoUe6OI7mU9o0WbB5MGYx5RRRBETw=
=R7DD
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,pci,virtio: features, fixes
virtio-iommu support for x86/ACPI.
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 01 Nov 2021 07:36:22 PM EDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
* remotes/mst/tags/for_upstream:
hw/i386: fix vmmouse registration
pci: Export pci_for_each_device_under_bus*()
pci: Define pci_bus_dev_fn/pci_bus_fn/pci_bus_ret_fn
hw/i386/pc: Allow instantiating a virtio-iommu device
hw/i386/pc: Move IOMMU singleton into PCMachineState
hw/i386/pc: Remove x86_iommu_get_type()
hw/acpi: Add VIOT table
vhost-vdpa: Set discarding of RAM broken when initializing the backend
qtest: fix 'expression is always false' build failure in qtest_has_accel()
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
to linux-user/host/arch/host-signal.h
- Replace TCGCPUOps.tlb_fill with TCGCPUOps.record_sigsegv for user-only
- Add TCGCPUOps.record_sigbus for user-only
- Remove a lot of target-specific cpu_loop handling for signals,
now accomplished with generic code.
-----BEGIN PGP SIGNATURE-----
iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmGBG6YdHHJpY2hhcmQu
aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+6Lwf6A03bNhdXQjSjwUOD
tW0qGiZ5jtomitHdR04xawkGYWyAoojg7QMm8ZsOgGaGHVRaycmIXaOhIbX7S9/I
aV0nSn4GHhCWtvFUFvsYdptN/sf1AtuG6L1T4uzJ3qog1G6aKbdbRl85aWWR4w6P
+YTsQIQaU16k7vG7WwItMyga2fcAgT1qkofCHokw5CCJfni6mfNF1Ku8cWSVHSwF
5KXs7G5nGLHLKEAtrE+R7xzJPPwwmWR3BNHeTWSd0YDEMWEMxSzLZEHnnTeos4EF
Hl1WNNO7d8W6KWud4nsMBuPumIMtJwDtqUqSyW4DfvO+Dw+dgvAqoxt8bV/jtJp+
QCLE0g==
=Vv/u
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20211102' into staging
- Split out host signal handing from accel/tcg/user-exec.c
to linux-user/host/arch/host-signal.h
- Replace TCGCPUOps.tlb_fill with TCGCPUOps.record_sigsegv for user-only
- Add TCGCPUOps.record_sigbus for user-only
- Remove a lot of target-specific cpu_loop handling for signals,
now accomplished with generic code.
# gpg: Signature made Tue 02 Nov 2021 07:06:14 AM EDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* remotes/rth/tags/pull-tcg-20211102: (60 commits)
linux-user: Handle BUS_ADRALN in host_signal_handler
tcg: Add helper_unaligned_{ld,st} for user-only sigbus
accel/tcg: Report unaligned load/store for user-only
accel/tcg: Report unaligned atomics for user-only
target/sparc: Set fault address in sparc_cpu_do_unaligned_access
target/sparc: Split out build_sfsr
target/sparc: Remove DEBUG_UNALIGNED
target/sh4: Set fault address in superh_cpu_do_unaligned_access
target/s390x: Implement s390x_cpu_record_sigbus
linux-user/ppc: Remove POWERPC_EXCP_ALIGN handling
target/ppc: Restrict ppc_cpu_do_unaligned_access to sysemu
target/ppc: Set fault address in ppc_cpu_do_unaligned_access
target/ppc: Move SPR_DSISR setting to powerpc_excp
target/microblaze: Do not set MO_ALIGN for user-only
linux-user/hppa: Remove EXCP_UNALIGN handling
target/arm: Implement arm_cpu_record_sigbus
target/alpha: Implement alpha_cpu_record_sigbus
linux-user: Add cpu_loop_exit_sigbus
hw/core: Add TCGCPUOps.record_sigbus
accel/tcg: Restrict TCGCPUOps::tlb_fill() to sysemu
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Without this the struct has the wrong size: sizeof() evaluates
to 16 instead of 13. In most cases the bug is hidden by the
fact that guests submits a buffer which is exactly 13 bytes
long, so the padding added by the compiler is simply ignored.
But sometimes guests submit a larger buffer and expect a short
transfer, which does not work properly with the wrong struct
size.
Cc: vintagepc404@protonmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Fixes: a917d384ac ("SCSI TCQ support.")
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210906045523.1259629-1-kraxel@redhat.com>
Any extra draw call for the same blob resource representing guest scanout
before the previous drawing is not finished can break synchronous draw
sequence. To prevent this, drawing is now done only once for each draw
submission (when draw_submitted == true).
v2:
- removed mutex
- updated commit msg
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Message-Id: <20210924225105.24930-1-dongwon.kim@intel.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Allows edk2 detect virtio-mmio devices and pcie ecam.
See comment in hw/i386/microvm-dt.c for more details.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20211014193617.2475578-1-kraxel@redhat.com>
This is a counterpart to the HMP "info opcount" command. It is being
added with an "x-" prefix because this QMP command is intended as an
ad hoc debugging tool and will thus not be modelled in QAPI as fully
structured data, nor will it have long term guaranteed stability.
The existing HMP command is rewritten to call the QMP command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This is a counterpart to the HMP "info jit" command. It is being
added with an "x-" prefix because this QMP command is intended as an
ad hoc debugging tool and will thus not be modelled in QAPI as fully
structured data, nor will it have long term guaranteed stability.
The existing HMP command is rewritten to call the QMP command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This is a counterpart to the HMP "info ramblock" command. It is being
added with an "x-" prefix because this QMP command is intended as an
adhoc debugging tool and will thus not be modelled in QAPI as fully
structured data, nor will it have long term guaranteed stability.
The existing HMP command is rewritten to call the QMP command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This is a counterpart to the HMP "info rdma" command. It is being
added with an "x-" prefix because this QMP command is intended as an
adhoc debugging tool and will thus not be modelled in QAPI as fully
structured data, nor will it have long term guaranteed stability.
The existing HMP command is rewritten to call the QMP command.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This provides a foundation on which to convert simple HMP commands to
use QMP. The QMP implementation will generate formatted text targeted
for human consumption, returning it in the HumanReadableText data type.
The HMP command handler will simply print out the formatted string
within the HumanReadableText data type. Since this will be an entirely
formulaic action in the case of HMP commands taking no arguments, a
custom command handler is provided.
Thus instead of registering a 'cmd' callback for the HMP command, a
'cmd_info_hrt' callback is provided, which will simply be a pointer
to the QMP implementation.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This turns the pattern
if (err) {
hmp_handle_error(mon, err);
return;
}
into
if (hmp_handle_error(mon, err)) {
return;
}
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This command was turned into a no-op four years ago in
commit 0c8465440d
Author: Peter Xu <peterx@redhat.com>
Date: Fri Dec 29 15:31:04 2017 +0800
hmp: obsolete "info ioapic"
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Instead of invoking select_watchdog_action from both HMP and command line,
go directly from HMP to QMP and use QemuOpts as the intermediary for the
command line.
This makes -watchdog-action explicitly a shortcut for "-action watchdog",
so that "-watchdog-action" and "-action watchdog" override each other
based on the position on the command line; previously, "-action watchdog"
always won.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This removes a parameter from x86_load_linux, and will avoid code
duplication between the linux and multiboot cases once multiboot
starts to support DMA.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As qemu guidelines:
Unless a pointer is used to modify the pointed-to storage, give it the
"const" attribute.
In the particular case of iova_tree_find it allows to enforce what is
requested by its comment, since the compiler would shout in case of
modifying or freeing the const-qualified returned pointer.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211013182713.888753-2-eperezma@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hi
this includes pending bits of migration patches.
- virtio-mem support by David Hildenbrand
- dirtyrate improvements by Hyman Huang
- fix rdma wrid by Li Zhijian
- dump-guest-memory fixes by Peter Xu
Pleas apply.
Thanks, Juan.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmGAZEAACgkQ9IfvGFhy
1yPMlxAAx3HRMTCqlluM6B28TKHpGmg7O87g6F0U5fRZNJEro+8p08zYC1Yo2HNm
Po7dd++lZxcGPKrq7q1IKPH+wbQ5Yg/3jCeruXP2GRq3AKo9MyUK4WKd2BKRZbnl
q2oioUSLKYmsUqyl6YI/8nlgyDvmdGet8+GHxhmG5fVNGabWnGhwJDlCbOh1LAqb
cqACvahXuIVj3X7nMbz3e3Xy4YY/hJqJb3+e0DrQwlPDQRLDhadlQ7zv9vJ75BeY
Lt0/jnYI223m5LuiTecjv1S9AQjQpqJZq9N2K9miXmd3jtVkm2iqHdXZDK/Sr5oO
TE5OCf8xtFEcZ2KNwxQYMW+gkx2Gj6aoxIobu3HJ5kELErmvVhdnM7rkLmSHf8WB
Un/O55xUE/Hyg4G/oZOjAwk6eHS7RM+fIBq5wDGn5MNyYpBXid6JhWxSKv0i/gFX
8JA5i8wyzkUD23c8Ez+Ms6nmIL9LJS7xpVx9jqV2fNBdf+15opHg2ufnB5NnQ9y8
JJkzPjW2xKh5EsznY8iDeTztN7Im9Bn+4VcNl53Okugh5QFlTOtcAE21EjPrhv0K
XC6PJmDnSZenhJkhgXeDzUe4wZu9wvAjH/R/yTVrW2jT51Azebw3dtreX8F/Dqap
n+T+jupShCrrNFw0tCWsuLu+OZJrSwA83tFo+6DfH/idi0CJoJs=
=8B3Y
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/juanquintela/tags/migration-20211031-pull-request' into staging
Migration Pull request
Hi
this includes pending bits of migration patches.
- virtio-mem support by David Hildenbrand
- dirtyrate improvements by Hyman Huang
- fix rdma wrid by Li Zhijian
- dump-guest-memory fixes by Peter Xu
Pleas apply.
Thanks, Juan.
# gpg: Signature made Mon 01 Nov 2021 06:03:44 PM EDT
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [full]
* remotes/juanquintela/tags/migration-20211031-pull-request:
migration/dirtyrate: implement dirty-bitmap dirtyrate calculation
memory: introduce total_dirty_pages to stat dirty pages
migration/ram: Handle RAMBlocks with a RamDiscardManager on background snapshots
migration/ram: Factor out populating pages readable in ram_block_populate_pages()
migration: Simplify alignment and alignment checks
migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination
virtio-mem: Drop precopy notifier
migration/ram: Handle RAMBlocks with a RamDiscardManager on the migration source
virtio-mem: Implement replay_discarded RamDiscardManager callback
memory: Introduce replay_discarded callback for RamDiscardManager
dump-guest-memory: Block live migration
migration: Add migrate_add_blocker_internal()
migration: Make migration blocker work for snapshots too
migration/dirtyrate: implement dirty-ring dirtyrate calculation
migration/dirtyrate: move init step of calculation to main thread
migration/dirtyrate: adjust order of registering thread
migration/dirtyrate: introduce struct and adjust DirtyRateStat
memory: make global_dirty_tracking a bitmask
KVM: introduce dirty_pages and kvm_dirty_ring_enabled
migration/rdma: Fix out of order wrid
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Between the submission of a request and the unplug, other devices
with larger limits may have been queued new requests without flushing
the batch.
Using the new `dev_max_batch` parameter, laio_io_unplug() can check
if the batch exceeds the device limit to flush the current batch.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20211026162346.253081-4-sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This new parameter can be used by block devices to limit the
Linux AIO batch size more than the limit set by the AIO context.
file-posix backend supports this, passing its `aio-max-batch` option
previously added.
Add an helper function to calculate the maximum batch size.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20211026162346.253081-3-sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
To be called from tcg generated code on hosts that support
unaligned accesses natively, in response to an access that
is supposed to be aligned.
Reviewed-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is a new interface to be provided by the os emulator for
raising SIGBUS on fault. Use the new record_sigbus target hook.
Reviewed-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add a new user-only interface for updating cpu state before
raising a signal. This will take the place of do_unaligned_access
for user-only and should result in less boilerplate for each guest.
Reviewed-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have replaced tlb_fill with record_sigsegv for user mode.
Move the declaration to restrict it to system emulation.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is a new interface to be provided by the os emulator for
raising SIGSEGV on fault. Use the new record_sigsegv target hook.
Reviewed by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add a new user-only interface for updating cpu state before
raising a signal. This will replace tlb_fill for user-only
and should result in less boilerplate for each guest.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that all of the linux-user hosts have been converted
to host-signal.h, drop the compatibility code.
Reviewed by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
They're actually more commonly used than the helper without _under_bus, because
most callers do have the pci bus on hand. After exporting we can switch a lot
of the call sites to use these two helpers.
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20211028043129.38871-3-peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
They're used in quite a few places of pci.[ch] and also in the rest of the code
base. Define them so that it doesn't need to be defined all over the places.
The pci_bus_fn is similar to pci_bus_dev_fn that only takes a PCIBus* and an
opaque. The pci_bus_ret_fn is similar to pci_bus_fn but it allows to return a
void* pointer.
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20211028043129.38871-2-peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We're about to support a third vIOMMU for x86, virtio-iommu which
doesn't inherit X86IOMMUState. Move the IOMMU singleton into
PCMachineState, so it can be shared between all three vIOMMUs.
The x86_iommu_get_default() helper is still needed by KVM and IOAPIC to
fetch the default IRQ-remapping IOMMU. Since virtio-iommu doesn't
support IRQ remapping, this interface doesn't need to change for the
moment. We could later replace X86IOMMUState with an "IRQ remapping
IOMMU" interface if necessary.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20211026182024.2642038-4-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To generate the IOMMU ACPI table, acpi-build.c can use base QEMU types
instead of a special IommuType value.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20211026182024.2642038-3-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
introduce global var total_dirty_pages to stat dirty pages
along with memory_global_dirty_log_sync.
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Migration code now properly handles RAMBlocks which are indirectly managed
by a RamDiscardManager. No need for manual handling via the free page
optimization interface, let's get rid of it.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Introduce replay_discarded callback similar to our existing
replay_populated callback, to be used my migration code to never migrate
discarded memory.
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
An internal version that removes -only-migratable implications. It can be used
for temporary migration blockers like dump-guest-memory.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
since dirty ring has been introduced, there are two methods
to track dirty pages of vm. it seems that "logging" has
a hint on the method, so rename the global_dirty_log to
global_dirty_tracking would make description more accurate.
dirty rate measurement may start or stop dirty tracking during
calculation. this conflict with migration because stop dirty
tracking make migration leave dirty pages out then that'll be
a problem.
make global_dirty_tracking a bitmask can let both migration and
dirty rate measurement work fine. introduce GLOBAL_DIRTY_MIGRATION
and GLOBAL_DIRTY_DIRTY_RATE to distinguish what current dirty
tracking aims for, migration or dirty rate.
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Message-Id: <9c9388657cfa0301bd2c1cfa36e7cf6da4aeca19.1624040308.git.huangy81@chinatelecom.cn>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
dirty_pages is used to calculate dirtyrate via dirty ring, when
enabled, kvm-reaper will increase the dirty pages after gfns
being dirtied.
kvm_dirty_ring_enabled shows if kvm-reaper is working. dirtyrate
thread could use it to check if measurement can base on dirty
ring feature.
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Message-Id: <fee5fb2ab17ec2159405fc54a3cff8e02322f816.1624040308.git.huangy81@chinatelecom.cn>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Now that we check sysbus device types during device creation, we
can remove the check in the machine init done notifier.
This was the only thing done by this notifier, so we remove the
whole sysbus_notifier structure of the MachineState.
Note: This notifier was checking all /peripheral and /peripheral-anon
sysbus devices. Now we only check those added by -device cli option or
device_add qmp command when handling the command/option. So if there
are some devices added in one of these containers manually (eg in
machine C code), these will not be checked anymore.
This use case does not seem to appear apart from
hw/xen/xen-legacy-backend.c (it uses qdev_set_id() and in this case,
not for a sysbus device, so it's ok).
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20211029142258.484907-4-damien.hedde@greensocs.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Right now the allowance check for adding a sysbus device using
-device cli option (or device_add qmp command) is done well after
the device has been created. It is done during the machine init done
notifier: machine_init_notify() in hw/core/machine.c
This new function will allow us to do the check at the right time and
issue an error if it fails.
Also make device_is_dynamic_sysbus() use the new function.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20211029142258.484907-2-damien.hedde@greensocs.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
We are going to introduce an unit test for the parser smp_parse()
in hw/core/machine.c, but now machine.c is only built in softmmu.
In order to solve the build dependency on the smp parsing code and
avoid building unrelated stuff for the unit tests, move the tested
code from machine.c into a separate file, i.e., machine-smp.c and
build it in common field.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211026034659.22040-2-wangyanan55@huawei.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
This is the major portion of handle_cpu_signal which is specific
to tcg, handling the page protections for the translations.
Most of the rest will migrate to linux-user/ shortly.
Reviewed-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Pass guest address to handle_sigsegv_accerr_write.
Split out a function to adjust the raw signal pc into a
value that could be passed to cpu_restore_state.
Reviewed-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Adjust pc in place; return MMUAccessType.
This function is very simple and provides no advantage. Call sites
become simpler without it so just write it in line and drop the
separate function.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <a98d1f7f94e91a42796b7d91e9153a7eaa3d1c44.1635541329.git.balaton@eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <92902ba34fdf2c8c62232365fbb6531b1036d557.1635541329.git.balaton@eik.bme.hu>
[PMD: Use g_strdup() to initialize DeviceState::id]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
New option parameters unstable-input and unstable-output set policy
for unstable interfaces just like deprecated-input and
deprecated-output set policy for deprecated interfaces (see commit
6dd75472d5 "qemu-options: New -compat to set policy for deprecated
interfaces"). This is intended for testing users of the management
interfaces. It is experimental.
For now, this covers only syntactic aspects of QMP, i.e. stuff tagged
with feature 'unstable'. We may want to extend it to cover semantic
aspects, or the command line.
Note that there is no good way for management application to detect
presence of these new option parameters: they are not visible output
of query-qmp-schema or query-command-line-options. Tolerable, because
it's meant for testing. If running with -compat fails, skip the test.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Message-Id: <20211028102520.747396-10-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Doc comments fixed up]
The code to check policy for handling deprecated input is triplicated.
Factor it out into compat_policy_input_ok() before I mess with it in
the next commit.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211028102520.747396-9-armbru@redhat.com>
[Policy code moved from qmp-dispatch.c to qapi-util.c to make visitors
link without qmp-dispatch.o]
- Use a shared PLIC config helper function
- Fixup the OpenTitan PLIC configuration
- Add support for the experimental J extension
- Update the fmin/fmax handling
- Fixup VS interrupt forwarding
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE9sSsRtSTSGjTuM6PIeENKd+XcFQFAmF7nNMACgkQIeENKd+X
cFQhegf/U3L/SOPCU5uICn67TZHRUeyzH1ebw6p9hHkGbhUq2hLsg2N5yqIPusbM
Y/uouTHciRXqSNiqNle24wvdORxBPdwkE+hplyU3os3wvIelU+8HAhBIrFsJPOVV
G3kuMoc7rKPhjbwSjSIQcrfDA52pT3wQJUfza3bvVZ1VoI4jb+I2yopRLVq7S0qA
d/Hl5QoUC/CcSrpubDp8AtN/lQWnmFlFq4vfbaFg/NJK3+lCR1JWc8RxfRJ3Y9T0
V3AZP8m8+dTubjoMAbNkuIlRLKtvCfa+qFe9WFwxLKul6sbM/qONVDFQJAiHczyG
Pjkg77ZXRXvu1McMN/rF4tz5k9dQOA==
=5EIe
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/alistair23/tags/pull-riscv-to-apply-20211029-1' into staging
Fifth RISC-V PR for QEMU 6.2
- Use a shared PLIC config helper function
- Fixup the OpenTitan PLIC configuration
- Add support for the experimental J extension
- Update the fmin/fmax handling
- Fixup VS interrupt forwarding
# gpg: Signature made Fri 29 Oct 2021 12:03:47 AM PDT
# gpg: using RSA key F6C4AC46D4934868D3B8CE8F21E10D29DF977054
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [full]
* remotes/alistair23/tags/pull-riscv-to-apply-20211029-1:
target/riscv: change the api for RVF/RVD fmin/fmax
softfloat: add APIs to handle alternative sNaN propagation for fmax/fmin
target/riscv: remove force HS exception
target/riscv: fix VS interrupts forwarding to HS
target/riscv: Allow experimental J-ext to be turned on
target/riscv: Implement address masking functions required for RISC-V Pointer Masking extension
target/riscv: Support pointer masking for RISC-V for i/c/f/d/a types of instructions
target/riscv: Print new PM CSRs in QEMU logs
target/riscv: Add J extension state description
target/riscv: Support CSRs required for RISC-V PM extension except for the h-mode
target/riscv: Add CSR defines for RISC-V PM extension
target/riscv: Add J-extension into RISC-V
hw/riscv: opentitan: Fixup the PLIC context addresses
hw/riscv: virt: Use the PLIC config helper function
hw/riscv: microchip_pfsoc: Use the PLIC config helper function
hw/riscv: sifive_u: Use the PLIC config helper function
hw/riscv: boot: Add a PLIC config string function
hw/riscv: virt: Don't use a macro for the PLIC configuration
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The code to check enumeration value policy can see special feature
flag 'deprecated' in QEnumLookup member flags[value]. I want to make
feature flag 'unstable' visible there as well, so I can add policy for
it.
Instead of extending flags[], replace it by @special_features (a
bitset of QapiSpecialFeature), because that's how special features get
passed around elsewhere.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Message-Id: <20211028102520.747396-8-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
The code to check command policy can see special feature flag
'deprecated' as command flag QCO_DEPRECATED. I want to make feature
flag 'unstable' visible there as well, so I can add policy for it.
To let me make it visible, add member @special_features (a bitset of
QapiSpecialFeature) to QmpCommand, and adjust the generator to pass it
through qmp_register_command(). Then replace "QCO_DEPRECATED in
@flags" by QAPI_DEPRECATED in @special_features", and drop
QCO_DEPRECATED.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Message-Id: <20211028102520.747396-7-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The generated visitor functions call visit_deprecated_accept() and
visit_deprecated() when visiting a struct member with special feature
flag 'deprecated'. This makes the feature flag visible to the actual
visitors. I want to make feature flag 'unstable' visible there as
well, so I can add policy for it.
To let me make it visible, replace these functions by
visit_policy_reject() and visit_policy_skip(), which take the member's
special features as an argument. Note that the new functions have the
opposite sense, i.e. the return value flips.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20211028102520.747396-6-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[Unbreak forward visitor]
New enum QapiSpecialFeature enumerates the special feature flags.
New helper gen_special_features() returns code to represent a
collection of special feature flags as a bitset.
The next few commits will put them to use.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-Id: <20211028102520.747396-5-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-Id: <20211028102520.747396-4-armbru@redhat.com>
For "fmax/fmin ft0, ft1, ft2" and if one of the inputs is sNaN,
The original logic:
Return NaN and set invalid flag if ft1 == sNaN || ft2 == sNan.
The alternative path:
Set invalid flag if ft1 == sNaN || ft2 == sNaN.
Return NaN only if ft1 == NaN && ft2 == NaN.
The IEEE 754 spec allows both implementation and some architecture such
as riscv choose different defintions in two spec versions.
(riscv-spec-v2.2 use original version, riscv-spec-20191213 changes to
alternative)
Signed-off-by: Chih-Min Chao <chihmin.chao@sifive.com>
Signed-off-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211021160847.2748577-2-frank.chang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20211022060133.3045020-4-alistair.francis@opensource.wdc.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20211022060133.3045020-3-alistair.francis@opensource.wdc.com
Add a generic function that can create the PLIC strings.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20211022060133.3045020-2-alistair.francis@opensource.wdc.com
Using a macro for the PLIC configuration doesn't make the code any
easier to read. Instead it makes it harder to figure out what is going
on, so let's remove it.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20211022060133.3045020-1-alistair.francis@opensource.wdc.com
These will be used to implement new decimal floating point
instructions from Power ISA 3.1.
The remainder is now returned directly by divu128/divs128,
freeing up phigh to receive the high 64 bits of the quotient.
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211025191154.350831-4-luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move udiv_qrnnd() from include/fpu/softfloat-macros.h to host-utils,
so it can be reused by divu128().
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211025191154.350831-3-luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In preparation for changing the divu128/divs128 implementations
to allow for quotients larger than 64 bits, move the div-by-zero
and overflow checks to the callers.
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211025191154.350831-2-luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Addition of not and xor on 128-bit integers.
Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
Message-Id: <20211025122818.168890-3-frederic.petrot@univ-grenoble-alpes.fr>
[rth: Split out logical operations.]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This copies the code implementing the policy from qapi/qmp-dispatch.c
to qapi/qobject-input-visitor.c. Tolerable, but if we acquire more
copies, we should look into factoring them out.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
Message-Id: <20211025042405.3762351-5-armbru@redhat.com>
The next commit needs to access compat policy from the generic visitor
core. Move it there from qobject input and output visitor.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20211025042405.3762351-4-armbru@redhat.com>
- Vector extension bug fixes
- Bit manipulation extension bug fix
- Support vhost-user and numa mem options on all boards
- Rationalise XLEN and operand lengths
- Bump the OpenTitan FPGA support
- Remove the Ibex PLIC
- General code cleanup
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE9sSsRtSTSGjTuM6PIeENKd+XcFQFAmFyvkoACgkQIeENKd+X
cFRvOQgAmSgFMu7BbiuLJ7W/0kK/a8vM53+HAMFyFvYiTS7ae2NRmxKyeR/OhFLf
qm36G/SbBhymWAvJysACa7mjvn104WnXkMdhLYGimkne/65IbnQOImzpXk81WP/n
45p/y0DdCs5pP2JNf9aIVxMMzgH2Lo4IgKxLFz+qpnphxtZGjdpocrCz837BOZQH
p61hadfi3rTc06w0Auq8A4Zr+Rp/gpoGzXmB0ujfRGfqi+brg40TV4EAX33e3BJi
249922MbFy2KqqO8H8So9rTZiK4gb/KJ0vGp61nwPcGsf1JPWXpeeb2g6L1syXHW
8J9VezmnIs8bs2SXDvkf0aZpEvDrIA==
=AKR1
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/alistair23/tags/pull-riscv-to-apply-20211022-2' into staging
Fourth RISC-V PR for QEMU 6.2
- Vector extension bug fixes
- Bit manipulation extension bug fix
- Support vhost-user and numa mem options on all boards
- Rationalise XLEN and operand lengths
- Bump the OpenTitan FPGA support
- Remove the Ibex PLIC
- General code cleanup
# gpg: Signature made Fri 22 Oct 2021 06:36:10 AM PDT
# gpg: using RSA key F6C4AC46D4934868D3B8CE8F21E10D29DF977054
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [full]
* remotes/alistair23/tags/pull-riscv-to-apply-20211022-2: (33 commits)
hw/riscv: spike: Use MachineState::ram and MachineClass::default_ram_id
hw/riscv: sifive_u: Use MachineState::ram and MachineClass::default_ram_id
hw/riscv: sifive_e: Use MachineState::ram and MachineClass::default_ram_id
hw/riscv: shakti_c: Use MachineState::ram and MachineClass::default_ram_id
hw/riscv: opentitan: Use MachineState::ram and MachineClass::default_ram_id
hw/riscv: microchip_pfsoc: Use MachineState::ram and MachineClass::default_ram_id
hw/intc: sifive_plic: Cleanup the irq_request function
hw/intc: sifive_plic: Cleanup the realize function
hw/intc: sifive_plic: Move the properties
hw/intc: Remove the Ibex PLIC
hw/riscv: opentitan: Update to the latest build
target/riscv: Compute mstatus.sd on demand
target/riscv: Use riscv_csrrw_debug for cpu_dump
target/riscv: Use gen_shift*_per_ol for RVB, RVI
target/riscv: Use gen_unary_per_ol for RVB
target/riscv: Adjust trans_rev8_32 for riscv64
target/riscv: Use gen_arith_per_ol for RVM
target/riscv: Replace DisasContext.w with DisasContext.ol
target/riscv: Replace is_32bit with get_xl/get_xlen
target/riscv: Properly check SEW in amo_op
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since commit 12b6e9b27d ("disas: Clean up CPUDebug initialization")
the disassemble_info->bfd_endian enum is set for all targets in
target_disas(). We can directly call print_insn_nios2() and simplify.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210807110939.95853-3-f4bug@amsat.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
* New fp5280g2-bmc board (John)
* Small cleanup in Aspeed SMC model (Cedric)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmFybmYACgkQUaNDx8/7
7KEqzRAAigdJJDS+F7uVYN9srdu8GtbXKRaGYgBcaPIGMzC42Ot9hE46UMUPPbf/
72QXynB09IvtkFgkX7xCaR3Lu1II7Ag+vf/VrjPltX1F05tdj9hTjPflJnPxWbBo
2nTU1vlUPXseaD5HQV/0ygYzqvbqyjCIhjSn0RKoMgKwrimBo52UByemOzh1unbV
nS4eCtQZlUj8ZPiwKViXk1XHscVyQY/7NhTTgdAUzkU74eIlf+op/mMRl3RWMOma
TsarphUoAfzCiYYZwg7IkWZWhTZt/zNR8OslNRfMbRFDfzFicCK3iu9L45jLaUix
W+N8z0xt1QxjWjzAAqHWVaNQion9zZS1Y219pI7rS76FI7m5CNDQ394lD3Pz3fqk
iW1JOtJACL9majRnZ58P7ScdVk68R/2Kt9VVcR93mkL4uAdrdxiWFScMssI1GmwR
CQ6m/Ktupw2YPqx87qNprsjpJriEx7ooU5SGdKpX7eLf2v5pcQRXLArfHOmL6mCe
xK0GGGTyz8E+GqzV8KciZpz9sEz0FoxPyqTjqboArvRPNCGnBvSSBISi24KXPXj3
MnOXiqLQ9BCAQRsSUMvWtof/X6DeAH7eQ8Pfk3+GxmnSsSInT6pOfbDESajMy7SY
ADji7W87JzIPqXO46JCgKU5yXmxaNfUKrJHm5nH0nlmQDUVT21E=
=qbBN
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/clg/tags/pull-aspeed-20211022' into staging
Aspeed patches :
* New fp5280g2-bmc board (John)
* Small cleanup in Aspeed SMC model (Cedric)
# gpg: Signature made Fri 22 Oct 2021 12:55:18 AM PDT
# gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1
* remotes/clg/tags/pull-aspeed-20211022:
speed/sdhci: Add trace events
aspeed/smc: Use a container for the flash mmio address space
aspeed: Add support for the fp5280g2-bmc board
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Because AddressSpaces must not be sysbus-mapped, commit e9c568dbc2
("hw/arm/aspeed: Do not sysbus-map mmio flash region directly, use
alias") introduced an alias for the flash mmio region.
Using a container is cleaner.
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-Id: <20211018132609.160008-5-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Generate DBG2 table
Switch to ssize_t for elf loader return type
Fixed sbsa cpu type error message typo
Only initialize required submodules for edk2
Dont create device-tree node for empty NUMA node
-----BEGIN PGP SIGNATURE-----
iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmFxhbgdHHJpY2hhcmQu
aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/Z6gf/Wt/iI+cwM8+eNYgV
fFrvkN2YUBS4B7QBVs7axbUmHs+/B8PIyBjo9WLmkMTdgnj13JiXmvE879PZwryr
zlh/2KqKBq+87L5CMOfPXtePizlZI/TuxJL0koIMW/L85ABEYOMyaDp0bDqhJHHb
UNBxvlxHgBmZUpBkmJKWoOKxQ9Jz8M5jpoUjKFCxhMHqbzAQOjS0B0IR6E4n8R4o
pKiWStc2TB7aGtVRY/Jm4EaSy9n3g77K40Yxkn6txN8Eqm/QxCLq9uwEZM2YMR5h
IJUxdUVNT31mCd2EhSAlHqpCrKuFaGCb+t8fQxLkd73AT5Unm5jJl5alfsO8khgV
TgipUA==
=boXv
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/rth/tags/pull-arm-20211021' into staging
Introduce cpu topology support
Generate DBG2 table
Switch to ssize_t for elf loader return type
Fixed sbsa cpu type error message typo
Only initialize required submodules for edk2
Dont create device-tree node for empty NUMA node
# gpg: Signature made Thu 21 Oct 2021 08:22:32 AM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* remotes/rth/tags/pull-arm-20211021:
tests/data/acpi/virt: Update the empty expected file for PPTT
hw/arm/virt-acpi-build: Generate PPTT table
tests/data/acpi/virt: Add an empty expected file for PPTT
hw/acpi/aml-build: Add PPTT table
hw/acpi/aml-build: Add Processor hierarchy node structure
hw/arm/virt: Add cpu-map to device tree
device_tree: Add qemu_fdt_add_path
hw/arm/virt: Only describe cpu topology since virt-6.2
bios-tables-test: Generate reference table for virt/DBG2
hw/arm/virt_acpi_build: Generate DBG2 table
tests/acpi: Add void table for virt/DBG2 bios-tables-test
hw/elf_ops.h: switch to ssize_t for elf loader return type
hw/arm/sbsa-ref: Fixed cpu type error message typo.
roms/edk2: Only initialize required submodules
roms/edk2: Only init brotli submodule to build BaseTools
hw/arm/virt: Don't create device-tree node for empty NUMA node
tests/acpi: Generate reference blob for IORT rev E.b
hw/arm/virt-acpi-build: IORT upgrade up to revision E.b
tests/acpi: Get prepared for IORT E.b revision upgrade
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add the Processor Properties Topology Table (PPTT) used to
describe CPU topology information to ACPI guests.
Note, a DT-boot Linux guest with a non-flat CPU topology will
see socket and core IDs being sequential integers starting
from zero, which is different from ACPI-boot Linux guest,
e.g. with -smp 4,sockets=2,cores=2,threads=1
a DT boot produces:
cpu: 0 package_id: 0 core_id: 0
cpu: 1 package_id: 0 core_id: 1
cpu: 2 package_id: 1 core_id: 0
cpu: 3 package_id: 1 core_id: 1
an ACPI boot produces:
cpu: 0 package_id: 36 core_id: 0
cpu: 1 package_id: 36 core_id: 1
cpu: 2 package_id: 96 core_id: 2
cpu: 3 package_id: 96 core_id: 3
This is due to several reasons:
1) DT cpu nodes do not have an equivalent field to what the PPTT
ACPI Processor ID must be, i.e. something equal to the MADT CPU
UID or equal to the UID of an ACPI processor container. In both
ACPI cases those are platform dependant IDs assigned by the
vendor.
2) While QEMU is the vendor for a guest, if the topology specifies
SMT (> 1 thread), then, with ACPI, it is impossible to assign a
core-id the same value as a package-id, thus it is not possible
to have package-id=0 and core-id=0. This is because package and
core containers must be in the same ACPI namespace and therefore
must have unique UIDs.
3) ACPI processor containers are not mandatorily required for PPTT
tables to be used and, due to the limitations of which IDs are
selected described above in (2), they are not helpful for QEMU,
so we don't build them with this patch. In the absence of them,
Linux assigns its own unique IDs. The maintainers have chosen not
to use counters from zero, but rather ACPI table offsets, which
explains why the numbers are so much larger than with DT.
4) When there is no SMT (threads=1) the core IDs for ACPI boot guests
match the logical CPU IDs, because these IDs must be equal to the
MADT CPU UID (as no processor containers are present), and QEMU
uses the logical CPU ID for these MADT IDs.
So in summary, with QEMU as the vendor for the guests, we simply
use sequential integers starting from zero for the non-leaf nodes
but with ID-valid flag unset, so that guest will ignore them and
use table offsets as unique container IDs. And we use logical CPU
IDs for the leaf nodes with the ID-valid flag set, which will be
consistent with MADT.
Currently the implementation of PPTT generation complies with ACPI
specification 5.2.29 (Revision 6.3). The 6.3 spec can be found at:
https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Co-developed-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20211020142125.7516-6-wangyanan55@huawei.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
qemu_fdt_add_path() works like qemu_fdt_add_subnode(), except it
also adds all missing subnodes from the given path. We'll use it
in a coming patch where we will add cpu-map to the device tree.
And we also tweak an error message of qemu_fdt_add_subnode().
Co-developed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20211020142125.7516-3-wangyanan55@huawei.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
On existing older machine types, without cpu topology described
in ACPI or DT, the guest will populate one by default. With the
topology described, it will read the information and set up its
topology as instructed, but that may not be the same as what was
getting used by default. It's possible that an user application
has a dependency on the default topology and if the default one
gets changed it will probably behave differently.
Based on above consideration we'd better only describe topology
information to the guest on 6.2 and later machine types.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20211020142125.7516-2-wangyanan55@huawei.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
and use them to set and test the ASSERTED bit of LSI sources.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211004212141.432954-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Until now, int was used as the return type for all the ELF
loader related functions. The returned value is the sum of all loaded
program headers "MemSize" fields.
Because of the overflow check in elf_ops.h, trying to load an ELF bigger
than INT_MAX will fail. Switch to ssize_t to remove this limitation.
Signed-off-by: Luc Michel <lmichel@kalray.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20211014194325.19917-1-lmichel@kalray.eu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add a new auxmode GPIO that is updated when port B bit 6 is changed indicating
whether the hardware is configured for A/UX mode.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20211020134131.4392-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch implements the control virtqueue support for vhost. This
requires virtio-net to figure out the datapath queue pairs and control
virtqueue via is_datapath and pass the number of those two types
of virtqueues to vhost_net_start()/vhost_net_stop().
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211020045600.16082-10-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces a new field in the vhost_dev structure to record
the last virtqueue index for the virtio device. This will be useful
for the vhost backends with 1:N model to start or stop the device
after all the vhost_dev structures were started or stopped.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211020045600.16082-9-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Most of the time, "queues" really means queue pairs. So this patch
switch to use "queue_pairs" to avoid confusion.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211020045600.16082-8-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We assume there's no cvq in the past, this is not true when we need
control virtqueue support for vhost-user backends. So this patch
implements the control virtqueue support for vhost-net. As datapath,
the control virtqueue is also required to be coupled with the
NetClientState. The vhost_net_start/stop() are tweaked to accept the
number of datapath queue pairs plus the the number of control
virtqueue for us to start and stop the vhost device.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211020045600.16082-7-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces a boolean for the device has control queue which
can accepts control command via network queue.
The first user would be the control virtqueue support for vhost.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211020045600.16082-6-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Vhost-vdpa uses one device multiqueue queue (pairs) model. So we need
to classify the one time request (e.g SET_OWNER) and make sure those
request were only called once per device.
This is used for multiqueue support.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211020045600.16082-3-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce a random number generator (RNG) backend that communicates
with a vhost-user server to retrieve entropy. That way other VMM
that comply with the vhost user protocl can use the same vhost-user
daemon without having to write yet another RNG driver.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Message-Id: <20211012205904.4106769-2-mathieu.poirier@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Drop base_name and turn generic_name into
"virtio-iommu-pci". This is more in line with
other modern-only devices.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20211013191755.767468-3-eric.auger@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Check vdpa device range before updating memory regions so we don't add
any outside of it, and report the invalid change if any.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20211014141236.923287-4-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Other functions in the VT82xx chips need to raise ISA interrupts. Keep
a reference to them in the device state and add via_isa_set_irq() to
allow setting their state.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <778c04dc2c8affac060b8edf9e8d7dab3c3e04eb.1634259980.git.balaton@eik.bme.hu>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This reverts commit 1b36e4f5a5.
Despite a comment saying why cpu_common_props cannot be placed in
a file that is compiled once, it was moved anyway. Revert that.
Since then, Property is not defined in hw/core/cpu.h, so it is now
easier to declare a function to install the properties rather than
the Property array itself.
Cc: Eduardo Habkost <ehabkost@redhat.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Make bytes argument int64_t to be consistent with modern block-layer.
Callers should be OK with it as type becomes wider.
What is inside functions?
- Conversion from int64_t to size_t. Still, we
can't have a buffer larger than SIZE_MAX, therefore bytes should not be
larger than SIZE_MAX as well. Add an assertion.
- Passing to blk_co_pwritev() / blk_co_preadv() which already has
int64_t bytes argument.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211007175243.642516-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: spelling fix]
Signed-off-by: Eric Blake <eblake@redhat.com>
We already have this marker for the blk_co_flush function declaration in
block/block-backend.c. Add it in the header too.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211007175243.642516-1-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: wording tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
1. Convert bytes in BlkAioEmAIOCB:
aio->bytes is only passed to already int64_t interfaces, and set in
blk_aio_prwv, which is updated here.
2. For all updated functions the parameter type becomes wider so callers
are safe.
3. In blk_aio_prwv we only store bytes to BlkAioEmAIOCB, which is
updated here.
4. Other updated functions are wrappers on blk_aio_prwv.
Note that blk_aio_preadv and blk_aio_pwritev become safer: before this
commit, it's theoretically possible to pass qiov with size exceeding
INT_MAX, which than converted to int argument of blk_aio_prwv. Now it's
converted to int64_t which is a lot better. Still add assertions.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211006131718.214235-11-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: tweak assertion and grammar]
Signed-off-by: Eric Blake <eblake@redhat.com>
Function is updated so that parameter type becomes wider, so all
callers should be OK with it.
Look at blk_co_copy_range() itself: bytes is passed only to
blk_check_byte_request() and bdrv_co_copy_range(), which already have
int64_t bytes parameter, so we are OK.
Note that requests exceeding INT_MAX are still restricted by
blk_check_byte_request().
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211006131718.214235-10-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
Convert blk_pdiscard, blk_pwrite_compressed, blk_pwrite_zeroes.
These are just wrappers for functions with int64_t argument, so allow
passing int64_t as well. Parameter type becomes wider so all callers
should be OK with it.
Note that requests exceeding INT_MAX are still restricted by
blk_check_byte_request().
Note also that we don't (and are not going to) convert blk_pwrite and
blk_pread: these functions return number of bytes on success, so to
update them, we should change return type to int64_t as well, which
will lead to investigating and updating all callers which is too much.
So, blk_pread and blk_pwrite remain unchanged.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211006131718.214235-9-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
We updated blk_do_pdiscard() and its wrapper blk_co_pdiscard(). Both
functions are updated so that the parameter type becomes wider, so all
callers should be OK with it.
Look at blk_do_pdiscard(): bytes is passed only to
blk_check_byte_request() and bdrv_co_pdiscard(), which already have
int64_t bytes parameter, so we are OK.
Note that requests exceeding INT_MAX are still restricted by
blk_check_byte_request().
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211006131718.214235-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
We convert blk_do_pwritev_part() and some wrappers:
blk_co_pwritev_part(), blk_co_pwritev(), blk_co_pwrite_zeroes().
All functions are converted so that the parameter type becomes wider, so
all callers should be OK with it.
Look at blk_do_pwritev_part() body:
bytes is passed to:
- trace_blk_co_pwritev (we update it here)
- blk_check_byte_request, throttle_group_co_io_limits_intercept,
bdrv_co_pwritev_part - all already have int64_t argument.
Note that requests exceeding INT_MAX are still restricted by
blk_check_byte_request().
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211006131718.214235-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
For both updated functions, the type of bytes becomes wider, so all callers
should be OK with it.
blk_co_preadv() only passes its arguments to blk_do_preadv().
blk_do_preadv() passes bytes to:
- trace_blk_co_preadv, which is updated too
- blk_check_byte_request, throttle_group_co_io_limits_intercept,
bdrv_co_preadv, which are already int64_t.
Note that requests exceeding INT_MAX are still restricted by
blk_check_byte_request().
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211006131718.214235-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
QDicts are both what QMP natively uses and what the keyval parser
produces. Going through QemuOpts isn't useful for either one, so switch
the main device creation function to QDicts. By sharing more code with
the -object/object-add code path, we can even reduce the code size a
bit.
This commit doesn't remove the detour through QemuOpts from any code
path yet, but it allows the following commits to do so.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211008133442.141332-15-kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of accessing the global QemuOptsList, which really belong to the
command line parser and shouldn't be accessed from devices, store a
pointer to the QemuOpts in a new VirtIONet field.
This is not the final state, but just an intermediate step to get rid of
QemuOpts in devices. It will later be replaced with an options QDict.
Before this patch, two "primary" devices could be hidden for the same
standby device, but only one of them would actually be enabled and the
other one would be kept hidden forever, so this doesn't make sense.
After this patch, configuring a second primary device is an error.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211008133442.141332-13-kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
hide_device() is used for virtio-net failover, where the standby virtio
device delays creation of the primary device. It only makes sense to
have a single primary device for each standby device. Adding a second
one should result in an error instead of hiding it and never using it
afterwards.
Prepare for this by adding an Error parameter to the hide_device()
callback where virtio-net is informed about adding a primary device.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211008133442.141332-12-kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qdev_set_id() is mostly used when the user adds a device (using
-device cli option or device_add qmp command). This commit adds
an error parameter to handle the case where the given id is
already taken.
Also document the function and add a return value in order to
be able to capture success/failure: the function now returns the
id in case of success, or NULL in case of failure.
The commit modifies the 2 calling places (qdev-monitor and
xen-legacy-backend) to add the error object parameter.
Note that the id is, right now, guaranteed to be unique because
all ids came from the "device" QemuOptsList where the id is used
as key. This addition is a preparation for a future commit which
will relax the uniqueness.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211008133442.141332-10-kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
DeviceState.id is a pointer to a string that is stored in the QemuOpts
object DeviceState.opts and freed together with it. We want to create
devices without going through QemuOpts in the future, so make this a
separately allocated string.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211008133442.141332-9-kwolf@redhat.com>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Some network backends (vhost-user and vhost-vdpa) work only with
specific devices. At startup, they second guess what the command line
option handling will do and error out if they think a non-virtio device
will attach to them.
This second guessing is not only ugly, it can lead to wrong error
messages ('-device floppy,netdev=foo' should complain about an unknown
property, not about the wrong kind of network device being attached) and
completely ignores hotplugging.
Add a callback where backends can check compatibility with a device when
it actually tries to attach, even on hotplug.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211008133442.141332-2-kwolf@redhat.com>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add cpu_ld/st_mmu memory primitives.
Move helper_ld/st memory helpers out of tcg.h.
Canonicalize alignment flags in MemOp.
-----BEGIN PGP SIGNATURE-----
iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmFnG/0dHHJpY2hhcmQu
aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/P8Qf/TIb+nP/q4ZesoHV5
hNuKIMcGMiIWjP7YkuXg7H8n4QQxSK+nKXI3qlWCTIVtKOQFC3jkqNnxV8ncHUyS
RW6ePEcmJfb+yv20MnDLObxMcAq6mIkHtOjARQcvcHiXxMNEZdIvJ8f8/qrkYib1
RRJarqIGlYFJvGyfbplq/JA/WYcJleIElEUx7JPSewz38Kk0gDIH2+BR2TBFrWAD
TDfh+GvlHeX8IYU19rWnt7pFv8TVPVQODqJBtlRPEYnl+LGdpJPCP2ATUAggWHiA
hucYKsuMWXXXhGx2nsurkpSNrBfGe6OHybOE5d1ARqmq0MnyHJat+ryh6qTx3Z9w
oZKi+Q==
=QpK0
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20211013' into staging
Use MO_128 for 16-byte atomic memory operations.
Add cpu_ld/st_mmu memory primitives.
Move helper_ld/st memory helpers out of tcg.h.
Canonicalize alignment flags in MemOp.
# gpg: Signature made Wed 13 Oct 2021 10:48:45 AM PDT
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* remotes/rth/tags/pull-tcg-20211013:
tcg: Canonicalize alignment flags in MemOp
tcg: Move helper_*_mmu decls to tcg/tcg-ldst.h
target/arm: Use cpu_*_mmu instead of helper_*_mmu
target/sparc: Use cpu_*_mmu instead of helper_*_mmu
target/s390x: Use cpu_*_mmu instead of helper_*_mmu
target/mips: Use 8-byte memory ops for msa load/store
target/mips: Use cpu_*_data_ra for msa load/store
accel/tcg: Move cpu_atomic decls to exec/cpu_ldst.h
accel/tcg: Add cpu_{ld,st}*_mmu interfaces
target/hexagon: Implement cpu_mmu_index
target/s390x: Use MO_128 for 16 byte atomics
target/ppc: Use MO_128 for 16 byte atomics
target/i386: Use MO_128 for 16 byte atomics
target/arm: Use MO_128 for 16 byte atomics
memory: Log access direction for invalid accesses
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These functions have been replaced by cpu_*_mmu as the
most proper interface to use from target code.
Hide these declarations from code that should not use them.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The previous placement in tcg/tcg.h was not logical.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These functions are much closer to the softmmu helper
functions, in that they take the complete MemOpIdx,
and from that they may enforce required alignment.
The previous cpu_ldst.h functions did not have alignment info,
and so did not enforce it. Retain this by adding MO_UNALN to
the MemOp that we create in calling the new functions.
Note that we are not yet enforcing alignment for user-only,
but we now have the information with which to do so.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move qmp_query_sgx() and hmp_info_sgx() from target/i386/monitor.c
to hw/i386/sgx.c, removing the sgx_get_info() indirection and the
"hw/i386/sgx.h" header.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211007175612.496366-5-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move qmp_query_sgx_capabilities() from target/i386/monitor.c to
hw/i386/sgx.c, removing the sgx_get_capabilities() indirection.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211007175612.496366-4-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211007175612.496366-3-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While being conditionally used for TARGET_I386 in hmp-commands-info.hx,
hmp_info_sev() is declared for all targets. Reduce its declaration
to target including "monitor/hmp-target.h". This is a minor cleanup.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211007161716.453984-23-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
"sysemu/sev.h" is only used from x86-specific files. Let's move it
to include/hw/i386, and merge it with target/i386/sev.h.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211007161716.453984-16-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- don't override the test compiler when specified
- split some multiarch tests by guest OS
- add riscv64 docker image and cross-compile tests
- drop release tarball test from Travis
- skip check-patch on master repo
- fix passing of TEST_TARGETS to cirrus
- fix missing symbols in plugins
- ensure s390x insn start ops precede plugin instrumentation
- refactor plugin instruction boundary detection
- update github repo lockdown
- add a debian-native test image for multi-arch builds
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmFlVsQACgkQ+9DbCVqe
KkQP4wf/WErJTOCALRjH3ebLasdOAC4O9BZhH5vMx39o8jwbap2/dZT70IVSgEPj
2bePVnCTRTkgNqcQR/3nsvTkIxpzxR8HAtwbv0XdDBo6b+7090st2z+jHf6ZgFdV
bVqNE0nDAScsUPW2xpgQ4UwlJHMI8QucMt+ptPM5lmRnxPvHij9MeodergPooqt/
joI+eUtsnT6bQQTzJA4dJpHunQofjPyvtviYae3PvPSQIITUz461JQRr0kJZO6Ql
VHuBmuupfuAGijPSTsVPKAFAYkd2UkMKnvAmx2hzKDAVL/QmB0bE90BdAde7d6+X
3/wR/jVE8QpSlP1nwVERdy++YU0oZw==
=0UIz
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stsquad/tags/pull-for-6.2-121021-2' into staging
Some testing and plugin updates:
- don't override the test compiler when specified
- split some multiarch tests by guest OS
- add riscv64 docker image and cross-compile tests
- drop release tarball test from Travis
- skip check-patch on master repo
- fix passing of TEST_TARGETS to cirrus
- fix missing symbols in plugins
- ensure s390x insn start ops precede plugin instrumentation
- refactor plugin instruction boundary detection
- update github repo lockdown
- add a debian-native test image for multi-arch builds
# gpg: Signature made Tue 12 Oct 2021 02:35:00 AM PDT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
* remotes/stsquad/tags/pull-for-6.2-121021-2:
tests/docker: add a debian-native image and make available
.github: move repo lockdown to the v2 configuration
accel/tcg: re-factor plugin_inject_cb so we can assert insn_idx is valid
target/s390x: move tcg_gen_insn_start to s390x_tr_insn_start
plugins/: Add missing functions to symbol list
gitlab: fix passing of TEST_TARGETS env to cirrus
gitlab: skip the check-patch job on the upstream repo
travis.yml: Remove the "Release tarball" job
gitlab: Add cross-riscv64-system, cross-riscv64-user
tests/docker: promote debian-riscv64-cross to a full image
tests/tcg: move some multiarch files and make conditional
tests/tcg/sha1: remove endian include
configure: don't override the selected host test compiler if defined
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Coverity doesn't know enough about how we have arranged our plugin TCG
ops to know we will always have incremented insn_idx before injecting
the callback. Let us assert it for the benefit of Coverity and protect
ourselves from accidentally breaking the assumption and triggering
harder to grok errors deeper in the code if we attempt a negative
indexed array lookup.
However to get to this point we re-factor the code and remove the
second hand instruction boundary detection in favour of scanning the
full set of ops and using the existing INDEX_op_insn_start to cleanly
detect when the instruction has started. As we no longer need the
plugin specific list of ops we delete that.
My initial benchmarks shows no discernible impact of dropping the
plugin specific ops list.
Fixes: Coverity 1459509
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210917162332.3511179-12-alex.bennee@linaro.org>
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Delevoryas <pdel@fb.com>
Message-Id: <20211005052604.1674891-3-pdel@fb.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This model implements enough behaviour to do basic functionality tests
such as device initialisation and read out of dummy sample values. The
sample value generation strategy is similar to the STM ADC already in
the tree.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
[clg : support for multiple engines (AST2600) ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[pdel : refactored engine register struct fields to regs[] array field]
[pdel : added guest-error checking for upper-8 channel regs in AST2600]
[pdel : allow 16-bit reads of the channel data registers]
Signed-off-by: Peter Delevoryas <pdel@fb.com>
Message-Id: <20211005052604.1674891-2-pdel@fb.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The gpio array is declared as a dense array:
qemu_irq gpios[ASPEED_GPIO_NR_PINS];
(AST2500 has 228, AST2400 has 216, AST2600 has 208)
However, this array is used like a matrix of GPIO sets
(e.g. gpio[NR_SETS][NR_PINS_PER_SET] = gpio[8][32])
size_t offset = set * GPIOS_PER_SET + gpio;
qemu_set_irq(s->gpios[offset], !!(new & mask));
This can result in an out-of-bounds access to "s->gpios" because the
gpio sets do _not_ have the same length. Some of the groups (e.g.
GPIOAB) only have 4 pins. 228 != 8 * 32 == 256.
To fix this, I converted the gpio array from dense to sparse, to that
match both the hardware layout and this existing indexing code.
Fixes: 4b7f956862 ("hw/gpio: Add basic Aspeed GPIO model for AST2400 and AST2500")
Signed-off-by: Peter Delevoryas <pdel@fb.com>
Message-Id: <20211008033501.934729-2-pdel@fb.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Introduce an AspeedI2CBus SysBusDevice model and attach the associated
memory region and IRQ to the newly instantiated objects.
Before this change, the I2C bus IRQs were all attached to the
SysBusDevice model of the I2C controller. Adapt the AST2600 SoC
realize routine to take into account this change.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The AST2400 SPI controller has a transitional HW interface and it
stores the address width currently in use in a different register than
all the other SMC controllers. It needs special handling when working
in 4B mode.
Make it clear through a class handler. This also removes another use
of the segments array.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This simplifies the reset handler and has the benefit to remove some
"bad" use of the segments array as an identifier of the controller model.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
AspeedSMCFlash is a small structure representing the AHB memory window
through which the contents of a flash device can be accessed with MMIOs.
Introduce an AspeedSMCFlash SysBusDevice model and attach the associated
memory region to the newly instantiated objects.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
'cs' is a more appropriate name to index SPI flash devices.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
AspeedSMCFlash::size is only used to compute the initial size of the
boot_rom region. Not very useful, so directly call memory_region_size()
instead.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
There is no need to keep a reference of the flash qdev in the AspeedSMCFlash
state: the SPI bus takes ownership and will release its resources. Remove
AspeedSMCFlash::flash.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The characteristics of the Aspeed controllers are described in a
AspeedSMCController structure which is redundant with the
AspeedSMCClass. Move all attributes under the class and adapt the code
to use class attributes instead.
This is a large change but it is functionally equivalent.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The MacOS driver expects a 60.15Hz vertical blank interrupt to be generated by
the framebuffer which in turn schedules the mouse driver via the Vertical Retrace
Manager.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20211007221253.29024-13-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The monitor modes table is found by experimenting with the Monitors Control
Panel in MacOS and analysing the reads/writes. From this it can be found that
the mode is controlled by writes to the DAFB_MODE_CTRL1 and DAFB_MODE_CTRL2
registers.
Implement the first block of DAFB registers as a register array including the
existing sense register, the newly discovered control registers above, and also
the DAFB_MODE_VADDR1 and DAFB_MODE_VADDR2 registers which are used by NetBSD to
determine the current video mode.
These experiments also show that the offset of the start of video RAM and the
stride can change depending upon the monitor mode, so update macfb_draw_graphic()
and both the BI_MAC_VADDR and BI_MAC_VROW bootinfo for the q800 machine
accordingly.
Finally update macfb_common_realize() so that only the resolution and depth
supported by the display type can be specified on the command line, and add an
error hint showing the list of supported resolutions and depths if the user tries
to specify an invalid display mode.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20211007221253.29024-10-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Since the available resolutions and colour depths are determined by the attached
display type, add a qdev property to allow the display type to be specified.
The main resolutions of interest are high resolution 1152x870 with 8-bit colour
and SVGA resolution up to 800x600 with 24-bit colour so update the q800 machine
to allow high resolution mode if specified and otherwise fall back to SVGA.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20211007221253.29024-9-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The MacOS toolbox ROM uses the monitor sense to detect the display type and then
offer a fixed set of resolutions and colour depths accordingly. Implement the
monitor sense using information found in Apple Technical Note HW26: "Macintosh
Quadra Built-In Video" along with some local experiments.
Since the default configuration is 640 x 480 with 8-bit colour then hardcode
the sense register to return MACFB_DISPLAY_VGA for now.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20211007221253.29024-8-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
v2: add small fix by Stefano, Hanna's series fixed
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEi5wmzbL9FHyIDoahVh8kwfGfefsFAmFfEVMACgkQVh8kwfGf
eftAVA//WLtOaiVYPSjEl5EK80kry39VknZkQyeYUzyV7JNr/FRMlbJaIF2sOjH5
KPRpfwBiuijOc8R0s34HY0BpyweRd1rbypHblZkO7EO4XwHx1FLF5kNHF6yV7wPL
c9W564sZpc6Z96wSMgC4Is9QHJ6JbO4TJNJsG8v/PHEqGQV/yCYkgBox4loJckww
uSAZ7l63IWA8uPSq/rOu34bREKN9s0kHkvFq0JNWk2HtOBLDiRDUYmbSfdjfT4jz
np7ojKiffcAJED9JA28Zo2Y+MSId+FyoO4lbt+deMNzIHboy2oVlHouoHHprr61x
dIO7Qt1IoMk5IBIfkPRYkReMwxxSVKuIJcWm8Qqtkcg2X0g5ayNUmHwpBMd50h2z
XPjrr0YdOixhxMHoBnqlkPlWU0Y/B+YJIQ+mjqp+vRNkk94NoXhsXnCod1ajkgWO
zjc/dztew7HvNStJaMM0rnEjanLhzFZKtlMO4WwZHQp2yZG2AINkPStswo2f3AmL
FI+2By/UhFKm3BEemf0wYWDPWrPHU+BOiu16KjSKeS0GA9t7GXBUDRxNYPhUheXJ
eJKIpNsGbseNxKrAbLyRhAB75Fa/ReZqqybmEcLyal/ball3R/cNF3gaMHeX0o1n
HTGIAF5JOAXNGApS5TilkXPZ7jHFOVPh/Fi6/16/08tcgxjVfro=
=TVTu
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/vsementsov/tags/pull-jobs-2021-10-07-v2' into staging
mirror: Handle errors after READY cancel
v2: add small fix by Stefano, Hanna's series fixed
# gpg: Signature made Thu 07 Oct 2021 08:25:07 AM PDT
# gpg: using RSA key 8B9C26CDB2FD147C880E86A1561F24C1F19F79FB
# gpg: Good signature from "Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8B9C 26CD B2FD 147C 880E 86A1 561F 24C1 F19F 79FB
* remotes/vsementsov/tags/pull-jobs-2021-10-07-v2:
iotests: Add mirror-ready-cancel-error test
mirror: Do not clear .cancelled
mirror: Stop active mirroring after force-cancel
mirror: Check job_is_cancelled() earlier
mirror: Use job_is_cancelled()
job: Add job_cancel_requested()
job: Do not soft-cancel after a job is done
jobs: Give Job.force_cancel more meaning
job: @force parameter for job_cancel_sync()
job: Force-cancel jobs in a failed transaction
mirror: Drop s->synced
mirror: Keep s->synced on error
job: Context changes in job_completed_txn_abort()
block/aio_task: assert `max_busy_tasks` is greater than 0
block/backup: avoid integer overflow of `max-workers`
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Most callers of job_is_cancelled() actually want to know whether the job
is on its way to immediate termination. For example, we refuse to pause
jobs that are cancelled; but this only makes sense for jobs that are
really actually cancelled.
A mirror job that is cancelled during READY with force=false should
absolutely be allowed to pause. This "cancellation" (which is actually
a kind of completion) may take an indefinite amount of time, and so
should behave like any job during normal operation. For example, with
on-target-error=stop, the job should stop on write errors. (In
contrast, force-cancelled jobs should not get write errors, as they
should just terminate and not do further I/O.)
Therefore, redefine job_is_cancelled() to only return true for jobs that
are force-cancelled (which as of HEAD^ means any job that interprets the
cancellation request as a request for immediate termination), and add
job_cancel_requested() as the general variant, which returns true for
any jobs which have been requested to be cancelled, whether it be
immediately or after an arbitrarily long completion phase.
Finally, here is a justification for how different job_is_cancelled()
invocations are treated by this patch:
- block/mirror.c (mirror_run()):
- The first invocation is a while loop that should loop until the job
has been cancelled or scheduled for completion. What kind of cancel
does not matter, only the fact that the job is supposed to end.
- The second invocation wants to know whether the job has been
soft-cancelled. Calling job_cancel_requested() is a bit too broad,
but if the job were force-cancelled, we should leave the main loop
as soon as possible anyway, so this should not matter here.
- The last two invocations already check force_cancel, so they should
continue to use job_is_cancelled().
- block/backup.c, block/commit.c, block/stream.c, anything in tests/:
These jobs know only force-cancel, so there is no difference between
job_is_cancelled() and job_cancel_requested(). We can continue using
job_is_cancelled().
- job.c:
- job_pause_point(), job_yield(), job_sleep_ns(): Only force-cancelled
jobs should be prevented from being paused. Continue using job_is_cancelled().
- job_update_rc(), job_finalize_single(), job_finish_sync(): These
functions are all called after the job has left its main loop. The
mirror job (the only job that can be soft-cancelled) will clear
.cancelled before leaving the main loop if it has been
soft-cancelled. Therefore, these functions will observe .cancelled
to be true only if the job has been force-cancelled. We can
continue to use job_is_cancelled().
(Furthermore, conceptually, a soft-cancelled mirror job should not
report to have been cancelled. It should report completion (see
also the block-job-cancel QAPI documentation). Therefore, it makes
sense for these functions not to distinguish between a
soft-cancelled mirror job and a job that has completed as normal.)
- job_completed_txn_abort(): All jobs other than @job have been
force-cancelled. job_is_cancelled() must be true for them.
Regarding @job itself: job_completed_txn_abort() is mostly called
when the job's return value is not 0. A soft-cancelled mirror has a
return value of 0, and so will not end up here then.
However, job_cancel() invokes job_completed_txn_abort() if the job
has been deferred to the main loop, which is mostly the case for
completed jobs (which skip the assertion), but not for sure.
To be safe, use job_cancel_requested() in this assertion.
- job_complete(): This is function eventually invoked by the user
(through qmp_block_job_complete() or qmp_job_complete(), or
job_complete_sync(), which comes from qemu-img). The intention here
is to prevent a user from invoking job-complete after the job has
been cancelled. This should also apply to soft cancelling: After a
mirror job has been soft-cancelled, the user should not be able to
decide otherwise and have it complete as normal (i.e. pivoting to
the target).
- job_cancel(): Both functions are equivalent (see comment there), but
we want to use job_is_cancelled(), because this shows that we call
job_completed_txn_abort() only for force-cancelled jobs. (As
explained for job_update_rc(), soft-cancelled jobs should be treated
as if they have completed as normal.)
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211006151940.214590-9-hreitz@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
We largely have two cancel modes for jobs:
First, there is actual cancelling. The job is terminated as soon as
possible, without trying to reach a consistent result.
Second, we have mirror in the READY state. Technically, the job is not
really cancelled, but it just is a different completion mode. The job
can still run for an indefinite amount of time while it tries to reach a
consistent result.
We want to be able to clearly distinguish which cancel mode a job is in
(when it has been cancelled). We can use Job.force_cancel for this, but
right now it only reflects cancel requests from the user with
force=true, but clearly, jobs that do not even distinguish between
force=false and force=true are effectively always force-cancelled.
So this patch has Job.force_cancel signify whether the job will
terminate as soon as possible (force_cancel=true) or whether it will
effectively remain running despite being "cancelled"
(force_cancel=false).
To this end, we let jobs that provide JobDriver.cancel() tell the
generic job code whether they will terminate as soon as possible or not,
and for jobs that do not provide that method we assume they will.
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211006151940.214590-7-hreitz@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Callers should be able to specify whether they want job_cancel_sync() to
force-cancel the job or not.
In fact, almost all invocations do not care about consistency of the
result and just want the job to terminate as soon as possible, so they
should pass force=true. The replication block driver is the exception,
specifically the active commit job it runs.
As for job_cancel_sync_all(), all callers want it to force-cancel all
jobs, because that is the point of it: To cancel all remaining jobs as
quickly as possible (generally on process termination). So make it
invoke job_cancel_sync() with force=true.
This changes some iotest outputs, because quitting qemu while a mirror
job is active will now lead to it being cancelled instead of completed,
which is what we want. (Cancelling a READY mirror job with force=false
may take an indefinite amount of time, which we do not want when
quitting. If users want consistent results, they must have all jobs be
done before they quit qemu.)
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211006151940.214590-6-hreitz@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
- Embed SerialMM in MchpPfSoCMMUartState and QOM-initialize it
- Alias SERIAL_MM 'chardev' property on MCHP_PFSOC_UART
- Forward SerialMM sysbus IRQ in mchp_pfsoc_mmuart_realize()
- Add DeviceReset() method
- Add vmstate structure for migration
- Register device in 'input' category
- Keep mchp_pfsoc_mmuart_create() behavior
Note, serial_mm_init() calls qdev_set_legacy_instance_id().
This call is only needed for backwards-compatibility of incoming
migration data with old versions of QEMU which implemented migration
of devices with hand-rolled code. Since this device didn't previously
handle migration at all, then it doesn't need to set the legacy
instance ID.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bin.meng@windriver.com>
Tested-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210925133407.1259392-4-f4bug@amsat.org
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Our device have 2 different I/O regions:
- a 16550 UART mapped for 32-bit accesses
- 13 extra registers
Instead of mapping each region on the main bus, introduce
a container, map the 2 devices regions on the container,
and map the container on the main bus.
Before:
(qemu) info mtree
...
0000000020100000-000000002010001f (prio 0, i/o): serial
0000000020100020-000000002010101f (prio 0, i/o): mchp.pfsoc.mmuart
0000000020102000-000000002010201f (prio 0, i/o): serial
0000000020102020-000000002010301f (prio 0, i/o): mchp.pfsoc.mmuart
0000000020104000-000000002010401f (prio 0, i/o): serial
0000000020104020-000000002010501f (prio 0, i/o): mchp.pfsoc.mmuart
0000000020106000-000000002010601f (prio 0, i/o): serial
0000000020106020-000000002010701f (prio 0, i/o): mchp.pfsoc.mmuart
After:
(qemu) info mtree
...
0000000020100000-0000000020100fff (prio 0, i/o): mchp.pfsoc.mmuart
0000000020100000-000000002010001f (prio 0, i/o): serial
0000000020100020-0000000020100fff (prio 0, i/o): mchp.pfsoc.mmuart.regs
0000000020102000-0000000020102fff (prio 0, i/o): mchp.pfsoc.mmuart
0000000020102000-000000002010201f (prio 0, i/o): serial
0000000020102020-0000000020102fff (prio 0, i/o): mchp.pfsoc.mmuart.regs
0000000020104000-0000000020104fff (prio 0, i/o): mchp.pfsoc.mmuart
0000000020104000-000000002010401f (prio 0, i/o): serial
0000000020104020-0000000020104fff (prio 0, i/o): mchp.pfsoc.mmuart.regs
0000000020106000-0000000020106fff (prio 0, i/o): mchp.pfsoc.mmuart
0000000020106000-000000002010601f (prio 0, i/o): serial
0000000020106020-0000000020106fff (prio 0, i/o): mchp.pfsoc.mmuart.regs
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Bin Meng <bin.meng@windriver.com>
Message-id: 20210925133407.1259392-3-f4bug@amsat.org
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
The current MCHP_PFSOC_MMUART_REG_SIZE definition represent the
size occupied by all the registers. However all registers are
32-bit wide, and the MemoryRegionOps handlers are restricted to
32-bit:
static const MemoryRegionOps mchp_pfsoc_mmuart_ops = {
.read = mchp_pfsoc_mmuart_read,
.write = mchp_pfsoc_mmuart_write,
.impl = {
.min_access_size = 4,
.max_access_size = 4,
},
Avoid being triskaidekaphobic, simplify by using the number of
registers.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bin.meng@windriver.com>
Tested-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210925133407.1259392-2-f4bug@amsat.org
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Linux limits the size of iovecs to 1024 (UIO_MAXIOV in the kernel
sources, IOV_MAX in POSIX). Because of this, on some host adapters
requests with many iovecs are rejected with -EINVAL by the
io_submit() or readv()/writev() system calls.
In fact, the same limit applies to SG_IO as well. To fix both the
EINVAL and the possible performance issues from using fewer iovecs
than allowed by Linux (some HBAs have max_segments as low as 128),
introduce a separate entry in BlockLimits to hold the max_segments
value from sysfs. This new limit is used only for SG_IO and clamped
to bs->bl.max_iov anyway, just like max_hw_transfer is clamped to
bs->bl.max_transfer.
Reported-by: Halil Pasic <pasic@linux.ibm.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-block@nongnu.org
Cc: qemu-stable@nongnu.org
Fixes: 18473467d5 ("file-posix: try BLKSECTGET on block devices too, do not round to power of 2", 2021-06-25)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210923130436.1187591-1-pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add version of bdrv_new_open_driver() that supports QDict options.
We'll use it in further commit.
Simply add one more argument to bdrv_new_open_driver() is worse, as
there are too many invocations of bdrv_new_open_driver() to update
then.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210920115538.264372-2-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There are a couple of errors in bdrv_drained_begin header comment:
- block_job_pause does not exist anymore, it has been replaced
with job_pause in b15de82867
- job_pause is automatically invoked as a .drained_begin callback
(child_job_drained_begin) by the child_job BdrvChildClass struct
in blockjob.c. So no additional pause should be required.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210903113800.59970-1-eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Despite the comment, the members were not kept at the end.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use the MemOpIdx directly, rather than the rearrangement
of the same bits currently done by the trace infrastructure.
Pass in enum qemu_plugin_mem_rw so that we are able to treat
read-modify-write operations as a single operation.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move this code from tcg/tcg.h to its own header.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We're about to move this out of tcg.h, so rename it
as we did when moving MemOp.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have lacked expressive support for memory sizes larger
than 64-bits for a while. Fixing that requires adjustment
to several points where we used this for array indexing,
and two places that develop -Wswitch warnings after the change.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
dup_const always generates a uint64_t, which may exceed the size of a
target_long (generating warnings with recent-enough compilers).
To ensure that we can use dup_const both for 64bit and 32bit targets,
this adds dup_const_tl, which either maps back to dup_const (for 64bit
targets) or provides a similar implementation using 32bit constants.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Message-Id: <20211003214243.3813425-1-philipp.tomsich@vrull.eu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Change caf108bc58 ("hw/i386/acpi-build: Add ACPI PCI hot-plug methods to Q35")
selects an IO address range for acpi based PCI hotplug for q35 arbitrarily. It
starts at address 0x0cc4 and ends at 0x0cdb. At the time when the patch was
written but the final version of the patch was not yet pushed upstream, this
address range was free and did not conflict with any other IO address ranges.
However, with the following change, this address range was no
longer conflict free as in this change, the IO address range
(value of ACPI_PCIHP_SIZE) was incremented by four bytes:
b32bd763a1 ("pci: introduce acpi-index property for PCI device")
This can be seen from the output of QMP command 'info mtree' :
0000000000000600-0000000000000603 (prio 0, i/o): acpi-evt
0000000000000604-0000000000000605 (prio 0, i/o): acpi-cnt
0000000000000608-000000000000060b (prio 0, i/o): acpi-tmr
0000000000000620-000000000000062f (prio 0, i/o): acpi-gpe0
0000000000000630-0000000000000637 (prio 0, i/o): acpi-smi
0000000000000cc4-0000000000000cdb (prio 0, i/o): acpi-pci-hotplug
0000000000000cd8-0000000000000ce3 (prio 0, i/o): acpi-cpu-hotplug
It shows that there is a region of conflict between IO regions of acpi
pci hotplug and acpi cpu hotplug.
Unfortunately, the change caf108bc58 did not update the IO address range
appropriately before it was pushed upstream to accommodate the increased
length of the IO address space introduced in change b32bd763a1.
Due to this bug, windows guests complain 'This device cannot find
enough free resources it can use' in the device manager panel for extended
IO buses. This issue also breaks the correct functioning of pci hotplug as the
following shows that the IO space for pci hotplug has been truncated:
(qemu) info mtree -f
FlatView #0
AS "I/O", root: io
Root memory region: io
0000000000000cc4-0000000000000cd7 (prio 0, i/o): acpi-pci-hotplug
0000000000000cd8-0000000000000cf7 (prio 0, i/o): acpi-cpu-hotplug
Therefore, in this fix, we adjust the IO address range for the acpi pci
hotplug so that it does not conflict with cpu hotplug and there is no
truncation of IO spaces. The starting IO address of PCI hotplug region
has been decremented by four bytes in order to accommodate four byte
increment in the IO address space introduced by change
b32bd763a1 ("pci: introduce acpi-index property for PCI device")
After fixing, the following are the corrected IO ranges:
0000000000000600-0000000000000603 (prio 0, i/o): acpi-evt
0000000000000604-0000000000000605 (prio 0, i/o): acpi-cnt
0000000000000608-000000000000060b (prio 0, i/o): acpi-tmr
0000000000000620-000000000000062f (prio 0, i/o): acpi-gpe0
0000000000000630-0000000000000637 (prio 0, i/o): acpi-smi
0000000000000cc0-0000000000000cd7 (prio 0, i/o): acpi-pci-hotplug
0000000000000cd8-0000000000000ce3 (prio 0, i/o): acpi-cpu-hotplug
This change has been tested using a Windows Server 2019 guest VM. Windows
no longer complains after this change.
Fixes: caf108bc58 ("hw/i386/acpi-build: Add ACPI PCI hot-plug methods to Q35")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/561
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Julia Suvorova <jusual@redhat.com>
Message-Id: <20210916132838.3469580-3-ani@anisinha.ca>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-36-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-35-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Drop usage of packed structures and explicit endian
conversions when building table and use endian agnostic
build_append_int_noprefix() API to build it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-34-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
while at it, replace packed structure with endian agnostic
build_append_FOO() API.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-33-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
while at it, replace packed structure with endian agnostic
build_append_FOO() API.
PS:
Spec is Microsoft hosted, however 1.02 is no where to be found
(MS lists only the current revision) and the current revision is 1.07,
so bring comments in line with 1.07 as this is the only available spec.
There is no content change between originally implemented 1.02
(using QEMU code as reference) and 1.07. The only change is renaming
'Reserved2' field to 'Language', with the same 0 value.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-32-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Drop usage of packed structures and explicit endian conversions
when building IORT table use endian agnostic build_append_int_noprefix()
API to build it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210924122802.1455362-30-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-29-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Drop usage of packed structures and explicit endian conversions
when building MADT table for arm/x86 and use endian agnostic
build_append_int_noprefix() API to build it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-26-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Drop usage of packed structures and explicit endian conversions
when building MADT table for arm/x86 and use endian agnostic
build_append_int_noprefix() API to build it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-25-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Instead of composing disabled _MAT entry and then later on
patching it to enabled for hotpluggbale CPUs in DSDT,
set it to enabled at the time _MAT entry is built.
It will allow to drop usage of packed structures in
following patches when build_madt() is switched to use
build_append_int_noprefix() API.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-24-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-23-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-22-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
While at it switch to build_append_int_noprefix() to build
table entries tables.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-19-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Drop usage of packed structures and explicit endian conversions
when building SRAT tables for arm/x86 and use endian agnostic
build_append_int_noprefix() API to build it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-18-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
While at it switch to build_append_int_noprefix() to build
table entries (which also removes some manual offset
calculations)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-17-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
While at it switch to build_append_int_noprefix() to build
table entries (which also removes some manual offset
calculations).
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-16-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
while at it convert build_hpet() to endian agnostic
build_append_FOO() API
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210924122802.1455362-15-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offsets magic from API user.
While at it switch to build_append_int_noprefix() to build
entries to other tables (which also removes some manual offset
calculations).
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-4-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it replaces error-prone pointer arithmetic for build_header() API,
with 2 calls to start and finish table creation,
which hides offests magic from API user.
While at it switch to build_append_int_noprefix() to build
entries to other tables (which also removes some manual offset
calculations).
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-3-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Patch introduces acpi_table_begin()/ acpi_table_end() API
that hides pointer/offset arithmetic from user as opposed
to build_header(), to prevent errors caused by it [1].
acpi_table_begin():
initializes table header and keeps track of
table data/offsets
acpi_table_end():
sets actual table length and tells bios loader
where table is for the later initialization on
guest side.
1) commits
bb9feea431 x86: acpi: use offset instead of pointer when using build_header()
4d027afeb3 Virt: ACPI: fix qemu assert due to re-assigned table data address
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20210924122802.1455362-2-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Tested-by: Yanan Wang <wangyanan55@huawei.com>
virtio-vsock features, like VIRTIO_VSOCK_F_SEQPACKET, can be handled
by vhost-vsock-common parent class. In this way, we can reuse the
same code for all virtio-vsock backends (i.e. vhost-vsock,
vhost-user-vsock).
Let's move `seqpacket` property to vhost-vsock-common class, add
vhost_vsock_common_get_features() used by children, and disable
`seqpacket` for vhost-user-vsock device for machine types < 6.2.
The behavior of vhost-vsock device doesn't change; vhost-user-vsock
device now supports `seqpacket` property.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210921161642.206461-3-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
enabled the SEQPACKET feature bit.
This commit is released with QEMU 6.1, so if we try to migrate a VM where
the host kernel supports SEQPACKET but machine type version is less than
6.1, we get the following errors:
Features 0x130000002 unsupported. Allowed features: 0x179000000
Failed to load virtio-vhost_vsock:virtio
error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
load of migration failed: Operation not permitted
Let's disable the feature bit for machine types < 6.1.
We add a new OnOffAuto property for this, called `seqpacket`.
When it is `auto` (default), QEMU behaves as before, trying to enable the
feature, when it is `on` QEMU will fail if the backend (vhost-vsock
kernel module) doesn't support it.
Fixes: 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
Cc: qemu-stable@nongnu.org
Reported-by: Jiang Wang <jiang.wang@bytedance.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210921161642.206461-2-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now we have a common structure SMPCompatProps used to store information
about SMP compatibility stuff, so we can also move smp_prefer_sockets
there for cleaner code.
No functional change intended.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-15-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now we have a generic smp parser for all arches, and there will
not be any other arch specific ones, so let's remove the callback
from MachineClass and call the parser directly.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-14-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently the only difference between smp_parse and pc_smp_parse
is the support of dies parameter and the related error reporting.
With some arch compat variables like "bool dies_supported", we can
make smp_parse generic enough for all arches and the PC specific
one can be removed.
Making smp_parse() generic enough can reduce code duplication and
ease the code maintenance, and also allows extending the topology
with more arch specific members (e.g., clusters) in the future.
Suggested-by: Andrew Jones <drjones@redhat.com>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-13-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that all the possible topology parameters are integrated in struct
CpuTopology, tweak the order of topology members to be "cpus/sockets/
dies/cores/threads/maxcpus" for readability and consistency. We also
tweak the comment by adding explanation of dies parameter.
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210929025816.21076-12-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the real SMP hardware topology world, it's much more likely that
we have high cores-per-socket counts and few sockets totally. While
the current preference of sockets over cores in smp parsing results
in a virtual cpu topology with low cores-per-sockets counts and a
large number of sockets, which is just contrary to the real world.
Given that it is better to make the virtual cpu topology be more
reflective of the real world and also for the sake of compatibility,
we start to prefer cores over sockets over threads in smp parsing
since machine type 6.2 for different arches.
In this patch, a boolean "smp_prefer_sockets" is added, and we only
enable the old preference on older machines and enable the new one
since type 6.2 for all arches by using the machine compat mechanism.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210929025816.21076-10-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Here's the next batch of ppc related patches for qemu-6.2. Highlights
are:
* Fixes for several TCG math instructions from the El Dorado Institute
* A number of improvements to the powernv machine type
* Support for a new DEVICE_UNPLUG_GUEST_ERROR QAPI event from Daniel
Barboza
* Support for the new FORM2 PAPR NUMA representation. This allows
more specific NUMA distances, as well as asymmetric configurations
* Fix for 64-bit decrementer (used on MicroWatt CPUs)
* Assorted fixes and cleanups
* A number of updates to MAINTAINERS
Note that the DEVICE_UNPLUG_GUEST_ERROR stuff includes changes to
files outside my normal area, but has suitable Acks.
The MAINTAINERS updates are mostly about marking minor platforms
unmaintained / orphaned, and moving some pieces away from myself and
Greg. As we move onto other projects, we're going to need to drop
more of the ppc maintainership, though we're hoping we can avoid too
abrupt a change.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAmFVTlEACgkQbDjKyiDZ
s5IATRAAi3qB5riIXQEO41SoG5EjwJ9+2VQmagMkgwyNTQp5SyrqiWpTJ7Cq8IMF
inEgV8vHv40GFJLdXXjTvdsRkAxh1siVZR7okmbd144ZdwL/ntJ4FFiGG5BjFgf2
DajbDW/RgNDsWkTFbnhKVDNTzFbxHgeGf/g+KOPmQ5MtPPJk/Og2jJVOVE+CCY+Q
FsF7xe9ehffrGP4ObkwKM8pNMWEaJ8KP+Z1ZV5WQwQSHPZD4x1qb6aaH8JQWv7Fl
pqep5nlUv0KNsRPFj4hGhzjB9JUMMnWtSEuw5GNL0bQAj1v6tPNESRMw6fInAR/A
JlvG6iYba2ZKZ7+SGipwYyVzBPnE7ur9i2E5MaANTSQF1zgsSwcYT6J5dYSCZmNH
XrEZYYm1MyCTem8PNKrP44QRQAQz+vfHb8hUkwzNsZC7DLLsZ3JQ2AWcAvWFOKRe
/s3WyODeXa3gZMCBFbvCWVAPiKNNFGQMsa2/bkF7FpvNGl99xL3o6MDskA2dSdTK
0oreMT9OKdDS2W2LpDCasq6HHVsoJ+hc+TkWCEshEEPb7jUAIOnKqhHh/aqlXVf7
J3r68FFbNbBoPR0SgdLc4teBmUKDJIGpZT8gaJ7Vbx6IiHXH21ZzOHKuTjbJZWWN
u1vcjBBqvbQmrwhCyi6Fe5gGM63GcB/ulizM1rNyqIpTUuzyx9I=
=shYQ
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.2-20210930' into staging
ppc patch queue for 2021-09-30
Here's the next batch of ppc related patches for qemu-6.2. Highlights
are:
* Fixes for several TCG math instructions from the El Dorado Institute
* A number of improvements to the powernv machine type
* Support for a new DEVICE_UNPLUG_GUEST_ERROR QAPI event from Daniel
Barboza
* Support for the new FORM2 PAPR NUMA representation. This allows
more specific NUMA distances, as well as asymmetric configurations
* Fix for 64-bit decrementer (used on MicroWatt CPUs)
* Assorted fixes and cleanups
* A number of updates to MAINTAINERS
Note that the DEVICE_UNPLUG_GUEST_ERROR stuff includes changes to
files outside my normal area, but has suitable Acks.
The MAINTAINERS updates are mostly about marking minor platforms
unmaintained / orphaned, and moving some pieces away from myself and
Greg. As we move onto other projects, we're going to need to drop
more of the ppc maintainership, though we're hoping we can avoid too
abrupt a change.
# gpg: Signature made Thu 30 Sep 2021 06:42:41 BST
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dg-gitlab/tags/ppc-for-6.2-20210930: (44 commits)
MAINTAINERS: Demote sPAPR from "Supported" to "Maintained"
MAINTAINERS: Add information for OpenPIC
MAINTAINERS: Remove David & Greg as reviewers/co-maintainers of powernv
MAINTAINERS: Orphan obscure ppc platforms
MAINTAINERS: Remove David & Greg as reviewers for a number of boards
MAINTAINERS: Remove machine specific files from ppc TCG CPUs entry
spapr/xive: Fix kvm_xive_source_reset trace event
spapr_numa.c: fixes in spapr_numa_FORM2_write_rtas_tables()
hw/intc: openpic: Clean up the styles
hw/intc: openpic: Drop Raven related codes
hw/intc: openpic: Correct the reset value of IPIDR for FSL chipset
target/ppc: Fix 64-bit decrementer
target/ppc: Convert debug to trace events (decrementer and IRQ)
spapr_numa.c: handle auto NUMA node with no distance info
spapr_numa.c: FORM2 NUMA affinity support
spapr: move FORM1 verifications to post CAS
spapr_numa.c: rename numa_assoc_array to FORM1_assoc_array
spapr_numa.c: parametrize FORM1 macros
spapr_numa.c: scrap 'legacy_numa' concept
spapr_numa.c: split FORM1 code into helpers
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Provide a name field for all the memory listeners. It can be used to identify
which memory listener is which.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210817013553.30584-2-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Libvirt can use query-sgx-capabilities to get the host
sgx capabilities to decide how to allocate SGX EPC size to VM.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210910102258.46648-3-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The QMP and HMP interfaces can be used by monitor or QMP tools to retrieve
the SGX information from VM side when SGX is enabled on Intel platform.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210910102258.46648-2-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add helpers to detect if SGX EPC exists above 4g, and if so, where SGX
EPC above 4g ends. Use the helpers to adjust the device memory range
if SGX EPC exists above 4g.
For multiple virtual EPC sections, we just put them together physically
contiguous for the simplicity because we don't support EPC NUMA affinity
now. Once the SGX EPC NUMA support in the kernel SGX driver, we will
support this in the future.
Note that SGX EPC is currently hardcoded to reside above 4g.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-18-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Expose SGX to the guest if and only if KVM is enabled and supports
virtualization of SGX. While the majority of ENCLS can be emulated to
some degree, because SGX uses a hardware-based root of trust, the
attestation aspects of SGX cannot be emulated in software, i.e.
ultimately emulation will fail as software cannot generate a valid
quote/report. The complexity of partially emulating SGX in Qemu far
outweighs the value added, e.g. an SGX specific simulator for userspace
applications can emulate SGX for development and testing purposes.
Note, access to the PROVISIONKEY is not yet advertised to the guest as
KVM blocks access to the PROVISIONKEY by default and requires userspace
to provide additional credentials (via ioctl()) to expose PROVISIONKEY.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-13-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because SGX EPC is enumerated through CPUID, EPC "devices" need to be
realized prior to realizing the vCPUs themselves, i.e. long before
generic devices are parsed and realized. From a virtualization
perspective, the CPUID aspect also means that EPC sections cannot be
hotplugged without paravirtualizing the guest kernel (hardware does
not support hotplugging as EPC sections must be locked down during
pre-boot to provide EPC's security properties).
So even though EPC sections could be realized through the generic
-devices command, they need to be created much earlier for them to
actually be usable by the guest. Place all EPC sections in a
contiguous block, somewhat arbitrarily starting after RAM above 4g.
Ensuring EPC is in a contiguous region simplifies calculations, e.g.
device memory base, PCI hole, etc..., allows dynamic calculation of the
total EPC size, e.g. exposing EPC to guests does not require -maxmem,
and last but not least allows all of EPC to be enumerated in a single
ACPI entry, which is expected by some kernels, e.g. Windows 7 and 8.
The new compound properties command for sgx like below:
......
-object memory-backend-epc,id=mem1,size=28M,prealloc=on \
-object memory-backend-epc,id=mem2,size=10M \
-M sgx-epc.0.memdev=mem1,sgx-epc.1.memdev=mem2
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-6-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SGX EPC is enumerated through CPUID, i.e. EPC "devices" need to be
realized prior to realizing the vCPUs themselves, which occurs long
before generic devices are parsed and realized. Because of this,
do not allow 'sgx-epc' devices to be instantiated after vCPUS have
been created.
The 'sgx-epc' device is essentially a placholder at this time, it will
be fully implemented in a future patch along with a dedicated command
to create 'sgx-epc' devices.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-5-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
EPC (Enclave Page Cahe) is a specialized type of memory used by Intel
SGX (Software Guard Extensions). The SDM desribes EPC as:
The Enclave Page Cache (EPC) is the secure storage used to store
enclave pages when they are a part of an executing enclave. For an
EPC page, hardware performs additional access control checks to
restrict access to the page. After the current page access checks
and translations are performed, the hardware checks that the EPC
page is accessible to the program currently executing. Generally an
EPC page is only accessed by the owner of the executing enclave or
an instruction which is setting up an EPC page.
Because of its unique requirements, Linux manages EPC separately from
normal memory. Similar to memfd, the device /dev/sgx_vepc can be
opened to obtain a file descriptor which can in turn be used to mmap()
EPC memory.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-3-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a new RAMBlock flag to denote "protected" memory, i.e. memory that
looks and acts like RAM but is inaccessible via normal mechanisms,
including DMA. Use the flag to skip protected memory regions when
mapping RAM for DMA in VFIO.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The function ide_bus_new() does an in-place initialization. Rename
it to ide_bus_init() to follow our _init vs _new convention.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Acked-by: John Snow <jsnow@redhat.com> (Feel free to merge.)
Message-id: 20210923121153.23754-7-peter.maydell@linaro.org
Rename the "allocate and return" qbus creation function to
qbus_new(), to bring it into line with our _init vs _new convention.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Corey Minyard <cminyard@mvista.com>
Message-id: 20210923121153.23754-6-peter.maydell@linaro.org
Rename qbus_create_inplace() to qbus_init(); this is more in line
with our usual naming convention for functions that in-place
initialize objects.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20210923121153.23754-5-peter.maydell@linaro.org
Rename the pci_root_bus_new_inplace() function to
pci_root_bus_init(); this brings the bus type in to line with a
"_init for in-place init, _new for allocate-and-return" convention.
To do this we need to rename the implementation-internal function
that was using the pci_root_bus_init() name to
pci_root_bus_internal_init().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20210923121153.23754-4-peter.maydell@linaro.org
Rename ipack_bus_new_inplace() to ipack_bus_init(), to bring it in to
line with a "_init for in-place init, _new for allocate-and-return"
convention. Drop the 'name' argument, because the only caller does
not pass in a name. If a future caller does need to specify the bus
name, we should create an ipack_bus_init_named() function at that
point.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20210923121153.23754-3-peter.maydell@linaro.org
The function scsi_bus_new() creates a new SCSI bus; callers can
either pass in a name argument to specify the name of the new bus, or
they can pass in NULL to allow the bus to be given an automatically
generated unique name. Almost all callers want to use the
autogenerated name; the only exception is the virtio-scsi device.
Taking a name argument that should almost always be NULL is an
easy-to-misuse API design -- it encourages callers to think perhaps
they should pass in some standard name like "scsi" or "scsi-bus". We
don't do this anywhere for SCSI, but we do (incorrectly) do it for
other bus types such as i2c.
The function name also implies that it will return a newly allocated
object, when it in fact does in-place allocation. We more commonly
name such functions foo_init(), with foo_new() being the
allocate-and-return variant.
Replace all the scsi_bus_new() callsites with either:
* scsi_bus_init() for the usual case where the caller wants
an autogenerated bus name
* scsi_bus_init_named() for the rare case where the caller
needs to specify the bus name
and document that for the _named() version it's then the caller's
responsibility to think about uniqueness of bus names.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210923121153.23754-2-peter.maydell@linaro.org
Connect the support for ZynqMP eFUSE one-time field-programmable
bit array.
The command argument:
-drive if=pflash,index=3,...
Can be used to optionally connect the bit array to a
backend storage, such that field-programmed values
in one invocation can be made available to next
invocation.
The backend storage must be a seekable binary file, and
its size must be 768 bytes or larger. A file with all
binary 0's is a 'blank'.
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-9-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Connect the support for Xilinx ZynqMP Battery-Backed RAM (BBRAM)
The command argument:
-drive if=pflash,index=2,...
Can be used to optionally connect the bbram to a backend
storage, such that field-programmed values in one
invocation can be made available to next invocation.
The backend storage must be a seekable binary file, and
its size must be 36 bytes or larger. A file with all
binary 0's is a 'blank'.
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-8-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Connect the support for Versal eFUSE one-time field-programmable
bit array.
The command argument:
-drive if=pflash,index=1,...
Can be used to optionally connect the bit array to a
backend storage, such that field-programmed values
in one invocation can be made available to next
invocation.
The backend storage must be a seekable binary file, and
its size must be 3072 bytes or larger. A file with all
binary 0's is a 'blank'.
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-7-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Connect the support for Versal Battery-Backed RAM (BBRAM)
The command argument:
-drive if=pflash,index=0,...
Can be used to optionally connect the bbram to a backend
storage, such that field-programmed values in one
invocation can be made available to next invocation.
The backend storage must be a seekable binary file, and
its size must be 36 bytes or larger. A file with all
binary 0's is a 'blank'.
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-6-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This device is present in Versal and ZynqMP product
families to store a 256-bit encryption key.
Co-authored-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Co-authored-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-5-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This implements the Xilinx ZynqMP eFuse, an one-time
field-programmable non-volatile storage device. There is
only one such device in the Xilinx ZynqMP product family.
Co-authored-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Co-authored-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-4-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This implements the Xilinx Versal eFuse, an one-time
field-programmable non-volatile storage device. There is
only one such device in the Xilinx Versal product family.
This device has two separate mmio interfaces, a controller
and a flatten readback.
The controller provides interfaces for field-programming,
configuration, control, and status.
The flatten readback is a cache to provide a byte-accessible
read-only interface to efficiently read efuse array.
Co-authored-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Co-authored-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-3-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This introduces the QOM for Xilinx eFuse, an one-time
field-programmable storage bit array.
The actual mmio interface to the array varies by device
families and will be provided in different change-sets.
Co-authored-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Co-authored-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Tong Ho <tong.ho@xilinx.com>
Message-id: 20210917052400.1249094-2-tong.ho@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
- Vladimir Sementsov-Ogievskiy: Rework coroutines of qemu NBD client
to improve reconnect support
- Eric Blake: Relax server in regards to NBD_OPT_LIST_META_CONTEXT
- Vladimir Sementsov-Ogievskiy: Plumb up 64-bit bulk-zeroing support
in block layer, in preparation for future NBD spec extensions
- Nir Soffer: Default to writeback cache in qemu-nbd
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAmFU1a4ACgkQp6FrSiUn
Q2obEggAq4KgnBILBip6TsFc9p7yzVpK9qWzri0vhJHOI7p+dE5Bxqt8sLBEdMQP
OiV5WE/ZRHogGULXXJChBmUTK2/z0U57AXVvbCpKWgr48kSj818dk8uhuUjBeWaM
5Qc8PsH+/Rij5WRlVmTePu7QMwXp1h3+gkw3fmLlHRkvzO4MmOBn1lrOqnCxSGbo
xnFvfdeplNiexmpImdO6QSaHfDsmnUqOFpBPZUsKXfdHOiRqg2I3eU1ibv8eQH7B
oTtBxWI0KHc6kWaJUWqZbEe4ChJTHvAsVdZxmB+il1diIl46lq+s/Zn7nT+0HXTD
pS/Fws9rrGKS/7PnGeFOfskVVCRnIw==
=VGqR
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2021-09-27-v2' into staging
nbd patches for 2021-09-27
- Vladimir Sementsov-Ogievskiy: Rework coroutines of qemu NBD client
to improve reconnect support
- Eric Blake: Relax server in regards to NBD_OPT_LIST_META_CONTEXT
- Vladimir Sementsov-Ogievskiy: Plumb up 64-bit bulk-zeroing support
in block layer, in preparation for future NBD spec extensions
- Nir Soffer: Default to writeback cache in qemu-nbd
# gpg: Signature made Wed 29 Sep 2021 22:07:58 BST
# gpg: using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full]
# gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full]
# gpg: aka "[jpeg image of size 6874]" [full]
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A
* remotes/ericb/tags/pull-nbd-2021-09-27-v2:
block/nbd: check that received handle is valid
block/nbd: drop connection_co
block/nbd: refactor nbd_recv_coroutines_wake_all()
block/nbd: move nbd_recv_coroutines_wake_all() up
block/nbd: nbd_channel_error() shutdown channel unconditionally
nbd/client-connection: nbd_co_establish_connection(): fix non set errp
nbd/server: Allow LIST_META_CONTEXT without STRUCTURED_REPLY
block/io: allow 64bit discard requests
block: use int64_t instead of int in driver discard handlers
block: make BlockLimits::max_pdiscard 64bit
block/io: allow 64bit write-zeroes requests
block: use int64_t instead of int in driver write_zeroes handlers
block: make BlockLimits::max_pwrite_zeroes 64bit
block: use int64_t instead of uint64_t in copy_range driver handlers
block: use int64_t instead of uint64_t in driver write handlers
block: use int64_t instead of uint64_t in driver read handlers
qcow2: check request on vmstate save/load path
block/io: bring request check to bdrv_co_(read,write)v_vmstate
qemu-nbd: Change default cache mode to writeback
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Correct the multi-line comment format. No functional changes.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <20210918032653.646370-3-bin.meng@windriver.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
There is no machine that uses Motorola MCP750 (aka Raven) model.
Drop the related codes.
While we are here, drop the mentioning of Intel GW80314 I/O
companion chip in the comments as it has been obsolete for years,
and correct a typo too.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <20210918032653.646370-2-bin.meng@windriver.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The main feature of FORM2 affinity support is the separation of NUMA
distances from ibm,associativity information. This allows for a more
flexible and straightforward NUMA distance assignment without relying on
complex associations between several levels of NUMA via
ibm,associativity matches. Another feature is its extensibility. This base
support contains the facilities for NUMA distance assignment, but in the
future more facilities will be added for latency, performance, bandwidth
and so on.
This patch implements the base FORM2 affinity support as follows:
- the use of FORM2 associativity is indicated by using bit 2 of byte 5
of ibm,architecture-vec-5. A FORM2 aware guest can choose to use FORM1
or FORM2 affinity. Setting both forms will default to FORM2. We're not
advertising FORM2 for pseries-6.1 and older machine versions to prevent
guest visible changes in those;
- ibm,associativity-reference-points has a new semantic. Instead of
being used to calculate distances via NUMA levels, it's now used to
indicate the primary domain index in the ibm,associativity domain of
each resource. In our case it's set to {0x4}, matching the position
where we already place logical_domain_id;
- two new RTAS DT artifacts are introduced: ibm,numa-lookup-index-table
and ibm,numa-distance-table. The index table is used to list all the
NUMA logical domains of the platform, in ascending order, and allows for
spartial NUMA configurations (although QEMU ATM doesn't support that).
ibm,numa-distance-table is an array that contains all the distances from
the first NUMA node to all other nodes, then the second NUMA node
distances to all other nodes and so on;
- get_max_dist_ref_points(), get_numa_assoc_size() and get_associativity()
now checks for OV5_FORM2_AFFINITY and returns FORM2 values if the guest
selected FORM2 affinity during CAS.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210920174947.556324-7-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
FORM2 NUMA affinity is prepared to deal with empty (memory/cpu less)
NUMA nodes. This is used by the DAX KMEM driver to locate a PAPR SCM
device that has a different latency than the original NUMA node from the
regular memory. FORM2 is also able to deal with asymmetric NUMA
distances gracefully, something that our FORM1 implementation doesn't
do.
Move these FORM1 verifications to a new function and wait until after
CAS, when we're sure that we're sticking with FORM1, to enforce them.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210920174947.556324-6-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introducing a new NUMA affinity, FORM2, requires a new mechanism to
switch between affinity modes after CAS. Also, we want FORM2 data
structures and functions to be completely separated from the existing
FORM1 code, allowing us to avoid adding new code that inherits the
existing complexity of FORM1.
The idea of switching values used by the write_dt() functions in
spapr_numa.c was already introduced in the previous patch, and
the same approach will be used when dealing with the FORM1 and FORM2
arrays.
We can accomplish that by that by renaming the existing numa_assoc_array
to FORM1_assoc_array, which now is used exclusively to handle FORM1 affinity
data. A new helper get_associativity() is then introduced to be used by the
write_dt() functions to retrieve the current ibm,associativity array of
a given node, after considering affinity selection that might have been
done during CAS. All code that was using numa_assoc_array now needs to
retrieve the array by calling this function.
This will allow for an easier plug of FORM2 data later on.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210920174947.556324-5-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The next preliminary step to introduce NUMA FORM2 affinity is to make
the existing code independent of FORM1 macros and values, i.e.
MAX_DISTANCE_REF_POINTS, NUMA_ASSOC_SIZE and VCPU_ASSOC_SIZE. This patch
accomplishes that by doing the following:
- move the NUMA related macros from spapr.h to spapr_numa.c where they
are used. spapr.h gets instead a 'NUMA_NODES_MAX_NUM' macro that is used
to refer to the maximum number of NUMA nodes, including GPU nodes, that
the machine can support;
- MAX_DISTANCE_REF_POINTS and NUMA_ASSOC_SIZE are renamed to
FORM1_DIST_REF_POINTS and FORM1_NUMA_ASSOC_SIZE. These FORM1 specific
macros are used in FORM1 init functions;
- code that uses MAX_DISTANCE_REF_POINTS now retrieves the
max_dist_ref_points value using get_max_dist_ref_points().
NUMA_ASSOC_SIZE is replaced by get_numa_assoc_size() and VCPU_ASSOC_SIZE
is replaced by get_vcpu_assoc_size(). These functions are used by the
generic device tree functions and h_home_node_associativity() and will
allow them to switch between FORM1 and FORM2 without changing their core
logic.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210920174947.556324-4-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.
Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.
We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).
So, convert driver discard handlers bytes parameter to int64_t.
The only caller of all updated function is bdrv_co_pdiscard in
block/io.c. It is already prepared to work with 64bit requests, but
pass at most max(bs->bl.max_pdiscard, INT_MAX) to the driver.
Let's look at all updated functions:
blkdebug: all calculations are still OK, thanks to
bdrv_check_qiov_request().
both rule_check and bdrv_co_pdiscard are 64bit
blklogwrites: pass to blk_loc_writes_co_log which is 64bit
blkreplay, copy-on-read, filter-compress: pass to bdrv_co_pdiscard, OK
copy-before-write: pass to bdrv_co_pdiscard which is 64bit and to
cbw_do_copy_before_write which is 64bit
file-posix: one handler calls raw_account_discard() is 64bit and both
handlers calls raw_do_pdiscard(). Update raw_do_pdiscard, which pass
to RawPosixAIOData::aio_nbytes, which is 64bit (and calls
raw_account_discard())
gluster: somehow, third argument of glfs_discard_async is size_t.
Let's set max_pdiscard accordingly.
iscsi: iscsi_allocmap_set_invalid is 64bit,
!is_byte_request_lun_aligned is 64bit.
list.num is uint32_t. Let's clarify max_pdiscard and
pdiscard_alignment.
mirror_top: pass to bdrv_mirror_top_do_write() which is
64bit
nbd: protocol limitation. max_pdiscard is alredy set strict enough,
keep it as is for now.
nvme: buf.nlb is uint32_t and we do shift. So, add corresponding limits
to nvme_refresh_limits().
preallocate: pass to bdrv_co_pdiscard() which is 64bit.
rbd: pass to qemu_rbd_start_co() which is 64bit.
qcow2: calculations are still OK, thanks to bdrv_check_qiov_request(),
qcow2_cluster_discard() is 64bit.
raw-format: raw_adjust_offset() is 64bit, bdrv_co_pdiscard too.
throttle: pass to bdrv_co_pdiscard() which is 64bit and to
throttle_group_co_io_limits_intercept() which is 64bit as well.
test-block-iothread: bytes argument is unused
Great! Now all drivers are prepared to handle 64bit discard requests,
or else have explicit max_pdiscard limits.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210903102807.27127-11-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
We are going to support 64 bit discard requests. Now update the
limit variable. It's absolutely safe. The variable is set in some
drivers, and used in bdrv_co_pdiscard().
Update also max_pdiscard variable in bdrv_co_pdiscard(), so that
bdrv_co_pdiscard() is now prepared for 64bit requests. The remaining
logic including num, offset and bytes variables is already
supporting 64bit requests.
So the only thing that prevents 64 bit requests is limiting
max_pdiscard variable to INT_MAX in bdrv_co_pdiscard().
We'll drop this limitation after updating all block drivers.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210903102807.27127-10-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.
Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.
We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).
So, convert driver write_zeroes handlers bytes parameter to int64_t.
The only caller of all updated function is bdrv_co_do_pwrite_zeroes().
bdrv_co_do_pwrite_zeroes() itself is of course OK with widening of
callee parameter type. Also, bdrv_co_do_pwrite_zeroes()'s
max_write_zeroes is limited to INT_MAX. So, updated functions all are
safe, they will not get "bytes" larger than before.
Still, let's look through all updated functions, and add assertions to
the ones which are actually unprepared to values larger than INT_MAX.
For these drivers also set explicit max_pwrite_zeroes limit.
Let's go:
blkdebug: calculations can't overflow, thanks to
bdrv_check_qiov_request() in generic layer. rule_check() and
bdrv_co_pwrite_zeroes() both have 64bit argument.
blklogwrites: pass to blk_log_writes_co_log() with 64bit argument.
blkreplay, copy-on-read, filter-compress: pass to
bdrv_co_pwrite_zeroes() which is OK
copy-before-write: Calls cbw_do_copy_before_write() and
bdrv_co_pwrite_zeroes, both have 64bit argument.
file-posix: both handler calls raw_do_pwrite_zeroes, which is updated.
In raw_do_pwrite_zeroes() calculations are OK due to
bdrv_check_qiov_request(), bytes go to RawPosixAIOData::aio_nbytes
which is uint64_t.
Check also where that uint64_t gets handed:
handle_aiocb_write_zeroes_block() passes a uint64_t[2] to
ioctl(BLKZEROOUT), handle_aiocb_write_zeroes() calls do_fallocate()
which takes off_t (and we compile to always have 64-bit off_t), as
does handle_aiocb_write_zeroes_unmap. All look safe.
gluster: bytes go to GlusterAIOCB::size which is int64_t and to
glfs_zerofill_async works with off_t.
iscsi: Aha, here we deal with iscsi_writesame16_task() that has
uint32_t num_blocks argument and iscsi_writesame16_task() has
uint16_t argument. Make comments, add assertions and clarify
max_pwrite_zeroes calculation.
iscsi_allocmap_() functions already has int64_t argument
is_byte_request_lun_aligned is simple to update, do it.
mirror_top: pass to bdrv_mirror_top_do_write which has uint64_t
argument
nbd: Aha, here we have protocol limitation, and NBDRequest::len is
uint32_t. max_pwrite_zeroes is cleanly set to 32bit value, so we are
OK for now.
nvme: Again, protocol limitation. And no inherent limit for
write-zeroes at all. But from code that calculates cdw12 it's obvious
that we do have limit and alignment. Let's clarify it. Also,
obviously the code is not prepared to handle bytes=0. Let's handle
this case too.
trace events already 64bit
preallocate: pass to handle_write() and bdrv_co_pwrite_zeroes(), both
64bit.
rbd: pass to qemu_rbd_start_co() which is 64bit.
qcow2: offset + bytes and alignment still works good (thanks to
bdrv_check_qiov_request()), so tail calculation is OK
qcow2_subcluster_zeroize() has 64bit argument, should be OK
trace events updated
qed: qed_co_request wants int nb_sectors. Also in code we have size_t
used for request length which may be 32bit. So, let's just keep
INT_MAX as a limit (aligning it down to pwrite_zeroes_alignment) and
don't care.
raw-format: Is OK. raw_adjust_offset and bdrv_co_pwrite_zeroes are both
64bit.
throttle: Both throttle_group_co_io_limits_intercept() and
bdrv_co_pwrite_zeroes() are 64bit.
vmdk: pass to vmdk_pwritev which is 64bit
quorum: pass to quorum_co_pwritev() which is 64bit
Hooray!
At this point all block drivers are prepared to support 64bit
write-zero requests, or have explicitly set max_pwrite_zeroes.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210903102807.27127-8-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: use <= rather than < in assertions relying on max_pwrite_zeroes]
Signed-off-by: Eric Blake <eblake@redhat.com>
We are going to support 64 bit write-zeroes requests. Now update the
limit variable. It's absolutely safe. The variable is set in some
drivers, and used in bdrv_co_do_pwrite_zeroes().
Update also max_write_zeroes variable in bdrv_co_do_pwrite_zeroes(), so
that bdrv_co_do_pwrite_zeroes() is now prepared to 64bit requests. The
remaining logic including num, offset and bytes variables is already
supporting 64bit requests.
So the only thing that prevents 64 bit requests is limiting
max_write_zeroes variable to INT_MAX in bdrv_co_do_pwrite_zeroes().
We'll drop this limitation after updating all block drivers.
Ah, we also have bdrv_check_request32() in bdrv_co_pwritev_part(). It
will be modified to do bdrv_check_request() for write-zeroes path.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210903102807.27127-7-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.
Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.
We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).
So, convert driver copy_range handlers parameters which are already
64bit to signed type.
Now let's consider all callers. Simple
git grep '\->bdrv_co_copy_range'
shows the only caller:
bdrv_co_copy_range_internal(), which does bdrv_check_request32(),
so everything is OK.
Still, the functions may be called directly, not only by drv->...
Let's check:
git grep '\.bdrv_co_copy_range_\(from\|to\)\s*=' | \
awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \
while read func; do git grep "$func(" | \
grep -v "$func(BlockDriverState"; done
shows no more callers. So, we are done.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210903102807.27127-6-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>