Coverity caught a malloc() call that was never freed. This patch ensures
that we free the memory but also updates the allocation to use
g_strdup_printf() instead of malloc().
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Add the spapr cap SPAPR_CAP_NESTED_KVM_HV to be used to control the
availability of nested kvm-hv to the level 1 (L1) guest.
Assuming a hypervisor with support enabled an L1 guest can be allowed to
use the kvm-hv module (and thus run it's own kvm-hv guests) by setting:
-machine pseries,cap-nested-hv=true
or disabled with:
-machine pseries,cap-nested-hv=false
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Coverity points out in CID 1390588 that the test for sh == 0
in sdram_size() can never fire, because we calculate sh with
sh = 1024 - ((bcr >> 6) & 0x3ff);
which must result in a value between 1 and 1024 inclusive.
Without the relevant manual for the SoC, we're not completely
sure of the correct behaviour here, but we can remove the
dead code without changing how QEMU currently behaves.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In ppc_core99_init(), we allocate an openpic_irqs array, which
we then use to collect up the various qemu_irqs which we're
going to connect to the interrupt controller. Once we've
called sysbus_connect_irq() to connect them all up, the
array is no longer required, but we forgot to free it.
Since board init is only run once at startup, the memory
leak is not a significant one.
Spotted by Coverity: CID 1192916.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Fix missing terminator in VMStateDescription
Fixes: d811d61fbc
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When allocating an array, it is a recommended coding practice to call
g_new(FooType, n) instead of g_malloc(n * sizeof(FooType)) because
it takes care to avoid overflow when calculating the size of the
allocated block and it returns FooType *, which allows the compiler
to perform type checking.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The spapr-rng device is suboptimal when compared to virtio-rng, so
users might want to disable it in their builds. Thus let's introduce
a proper CONFIG switch to allow us to compile QEMU without this device.
The function spapr_rng_populate_dt is required for linking, so move it
to a different location.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Pass other sense, such as UNIT_ATTENTION or BUSY, directly to the
guest.
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Emulation of the block limits VPD page called back into scsi-disk.c,
which however expected the request to be for a SCSIDiskState and
accessed a scsi-generic device outside the bounds of its struct
(namely to retrieve s->max_unmap_size and s->max_io_size).
To avoid this, move the emulation code to a separate function that
takes a new SCSIBlockLimits struct and marshals it into the VPD
response format.
Reported-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A device can report an excessive number of VPD pages when asked for a
list; this can cause an out-of-bounds access to buf in
scsi_generic_set_vpd_bl_emulation. It should not happen, but
it is technically not incorrect so handle it: do not check any byte
past the allocation length that was sent to the INQUIRY command.
Reported-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Block limits emulation is just placing 0xb0 as the final byte of the
VPD pages list. However, VPD page numbers must be sorted, so change
that to an in-place insert. Since I couldn't find any disk that triggered
the loop more than once, this was tested by adding manually 0xb1
at the end of the list and checking that 0xb0 was added before.
Reported-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While writing a message in 'lsi_do_msgin', message length value
in 'msg_len' could be invalid due to an invalid migration stream.
Add an assertion to avoid an out of bounds access, and reject
the incoming migration data if it contains an invalid message
length.
Discovered by Deja vu Security. Reported by Oracle.
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20181026194314.18663-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
object_new() returns a new backend with refcount == 1 and
then later object_property_add_child() increases refcount to 2
So when ivshmem is destroyed, the backend it has created isn't
destroyed along with it as children cleanup will bring
backend's refcount only to 1, which leaks backend including
resources it is using.
Drop the original reference from object_new() once backend
is attached to its parent.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1541069086-167036-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Fixes: 5503e28504
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 'q35' machine type implements an Intel Series 3 chipset,
of which there are several variants:
https://www.intel.com/Assets/PDF/datasheet/316966.pdf
The key difference between the 82P35 MCH ('p35', PCI device ID 0x29c0)
and 82Q35 GMCH ('q35', PCI device ID 0x29b0) variants is that the latter
has an integrated graphics adapter. QEMU does not implement integrated
graphics, so uses the PCI ID for the 82P35 chipset, despite calling the
machine type 'q35'. Thus we rename the PCI device ID constant to reflect
reality, to avoid confusing future developers. The new name more closely
matches what pci.ids reports it to be:
$ grep P35 /usr/share/hwdata/pci.ids | grep 29
29c0 82G33/G31/P35/P31 Express DRAM Controller
29c1 82G33/G31/P35/P31 Express PCI Express Root Port
29c4 82G33/G31/P35/P31 Express MEI Controller
29c5 82G33/G31/P35/P31 Express MEI Controller
29c6 82G33/G31/P35/P31 Express PT IDER Controller
29c7 82G33/G31/P35/P31 Express Serial KT Controller
$ grep Q35 /usr/share/hwdata/pci.ids | grep 29
29b0 82Q35 Express DRAM Controller
29b1 82Q35 Express PCI Express Root Port
29b2 82Q35 Express Integrated Graphics Controller
29b3 82Q35 Express Integrated Graphics Controller
29b4 82Q35 Express MEI Controller
29b5 82Q35 Express MEI Controller
29b6 82Q35 Express PT IDER Controller
29b7 82Q35 Express Serial KT Controller
Arguably the QEMU machine type should be named 'p35'. At this point in
time, however, it is not worth the churn for management applications &
documentation to worry about renaming it.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180830105757.10577-1-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Remove can't-happen if() from handle_vec_simd_shli()
* hw/arm/exynos4210: Zero memory allocated for Exynos4210State
* Set S and PTW in 64-bit PAR format
* Fix ATS1Hx instructions
* milkymist: Check for failure trying to load BIOS image
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJb4Xz6AAoJEDwlJe0UNgzecj4P/ApIjAZbPAYLPjhMoA9SZmY1
O2tlcPieYZQZu+GM28TW8URMEK3ter/6rBMuwDsX6qsvyaGE65MV2UDCTjDq2yIt
dE29xOYn8wrSz/xiuMnDU/V26kjyZYtteYo6lS/TWmFeftuhdSXyFX3Lq7KyQ7fE
dOkb9eI6/XNiRajw9kC3OubUIgMpkGz641i9wMa1RY2lq4TtJH1qR6WWl3hHR9o2
riY3BupD7COcFW8ZUjGC+0dl1AqlokVuLojjDGYmzCY6sBnAYyTnxheZmtgcK0RC
Bju5Gnrhw7N9YEiIdThni4hp9s45+u2VifxOXp8Bpa6bkriomkArJYBn+GqNvuox
WDp9ijWlesl3N3B3BlSN3cq2l8I+HRtX/yzWvSgjtEjjQ5IkOHgoPPYCL0N38jlI
2/nLKkTB1ltoGLkj15Vco3bmDhx1fonlr9PdtHoEoJWQZVzNk8JU72wQiKlqIzMt
GbiabB66IaWNgcoXIwu/6NiG7QSZ4zbo4HYqwY9lXS9pCLCAZQF23nfOJmNmqm6E
lrZOjnkO+Q8u6J6Km+zpd6V3Ml2aPJGuRoMsx5qLhSm3poyYgW29jojYNzSAGdDs
k4jhfT1ZfHyjK/hRJGIVr5fEbNpJMHAXdlVxveU04QysL/NPtGfCAtmqia/R7tye
gjcQ4+TzPQY0Ad03/GdW
=7I2F
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20181106' into staging
target-arm queue:
* Remove can't-happen if() from handle_vec_simd_shli()
* hw/arm/exynos4210: Zero memory allocated for Exynos4210State
* Set S and PTW in 64-bit PAR format
* Fix ATS1Hx instructions
* milkymist: Check for failure trying to load BIOS image
# gpg: Signature made Tue 06 Nov 2018 11:37:30 GMT
# gpg: using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20181106:
target/arm: Fix ATS1Hx instructions
target/arm: Set S and PTW in 64-bit PAR format
hw/arm/exynos4210: Zero memory allocated for Exynos4210State
milkymist: Check for failure trying to load BIOS image
target/arm: Remove can't-happen if() from handle_vec_simd_shli()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
AMD IOMMU VAPIC support + fixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJb4IrKAAoJECgfDbjSjVRp9xEIAIT25r0SeThU32cl8955dBu3
L2q2e+4du4KcwrC1a65mhBeATFtRthL/cWFHf1rvmwsp1t6ib+uVBH/3ezH1b48o
rhrPjysYGbX+M/gxHv8uBM01JnMnmsaZVJv2iAifkO1fjJ5VCWXqJt89y7VryeUz
LRzN1Zzq84umDXUuqptBKI8MF8ySwqnRHCE6YrbpTAppaJRY8zIyWkQzMd+Ls9m/
Rwuo6QiySD4z5WrnL2hpvUCQw2qDTct9xDNrlGpxL1JVvOgo5Y5VFkF2X9IP7qap
TIC7Y9cfUjGNf8ferYsydgzpyTjFrBMUqqcu65HjUlpACXwwwrLHPScfpT37VJI=
=WPCi
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc, virtio: fixes, features
AMD IOMMU VAPIC support + fixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 05 Nov 2018 18:24:10 GMT
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (33 commits)
vhost-scsi: prevent using uninitialized vqs
piix_pci: fix i440fx data sheet link
piix: use TYPE_FOO constants than string constats
i440fx: use ARRAY_SIZE for pam_regions
pci_bridge: fix typo in comment
hw/pci: Add missing include
hw/pci-bridge/ioh3420: Remove unuseful header
hw/pci-bridge/xio3130: Remove unused functions
tests/bios-tables-test: add 64-bit PCI MMIO aperture round-up test on Q35
bios-tables-test: prepare expected files for mmio64
hw/pci-host/x86: extend the 64-bit PCI hole relative to the fw-assigned base
hw/pci-host/x86: extract get_pci_hole64_start_value() helpers
pci-testdev: add optional memory bar
MAINTAINERS: list "tests/acpi-test-data" files in ACPI/SMBIOS section
x86_iommu/amd: Enable Guest virtual APIC support
x86_iommu/amd: Add interrupt remap support when VAPIC is enabled
i386: acpi: add IVHD device entry for IOAPIC
x86_iommu/amd: Add interrupt remap support when VAPIC is not enabled
x86_iommu/amd: Prepare for interrupt remap support
x86_iommu/amd: make the address space naming consistent with intel-iommu
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In exynos4210_init() we allocate memory for an Exynos4210State
struct. Generally devices can assume that the memory allocated
for their state struct is zero-initialized; we broke that
assumption here by using g_new(). Use g_new0() instead.
(In particular, some code assumes that the various irq arrays
in the Exynos4210Irq sub-struct are zero-initialized.)
In the longer term, this code should be QOMified, and then
the struct memory will be allocated elsewhere and by functions
which always zero-initalize it; but for 3.1 this is a
simple fix.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20181105151132.13884-1-peter.maydell@linaro.org
Check the return value from load_image_targphys(), which tells us
whether our attempt to load the BIOS image into RAM failed.
(Spotted by Coverity, CID 1190305.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Michael Walle <michael@walle.cc>
Message-id: 20181030170032.1844-1-peter.maydell@linaro.org
There are 3 virtqueues (ctrl, event and cmd) for virtio scsi device,
but seabios will only set the physical address for the 3rd one (cmd).
Then in vhost_virtqueue_start(), virtio_queue_get_desc_addr()
will be 0 for ctrl and event vq.
In this case, ctrl and event vq are not initialized.
vhost_verify_ring_mappings may use uninitialized vhost_virtqueue
such that vhost_verify_ring_part_mapping returns ENOMEM.
When encountered this problem, we got the following logs:
qemu-system-x86_64: Unable to map available ring for ring 0
qemu-system-x86_64: Verify ring failure on region 0
Signed-off-by: Forrest Liu <forrestl@synology.com>
Signed-off-by: yuchenlin <yuchenlin@synology.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It seems that the intel link is unavailable, change it to point to the
qemu site.
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Make them more QOMConventional.
Cc:qemu-trivial@nongnu.org
Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-trivial@nongnu.org
Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduced in 48ebf2f90f and faf1e708d5, these functions
were never used. Remove them.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In commit 9fa99d2519 ("hw/pci-host: Fix x86 Host Bridges 64bit PCI
hole", 2017-11-16), we meant to expose such a 64-bit PCI MMIO aperture in
the ACPI DSDT that would be at least as large as the new "pci-hole64-size"
property (2GB on i440fx, 32GB on q35). The goal was to offer "enough"
64-bit MMIO aperture to the guest OS for hotplug purposes.
In that commit, we added or modified five functions:
- pc_pci_hole64_start(): shared between i440fx and q35. Provides a default
64-bit base, which starts beyond the cold-plugged 64-bit RAM, and skips
the DIMM hotplug area too (if any).
- i440fx_pcihost_get_pci_hole64_start(), q35_host_get_pci_hole64_start():
board-specific 64-bit base property getters called abstractly by the
ACPI generator. Both of these fall back to pc_pci_hole64_start() if the
firmware didn't program any 64-bit hole (i.e. if the firmware didn't
assign a 64-bit GPA to any MMIO BAR on any device). Otherwise, they
honor the firmware's BAR assignments (i.e., they treat the lowest 64-bit
GPA programmed by the firmware as the base address for the aperture).
- i440fx_pcihost_get_pci_hole64_end(), q35_host_get_pci_hole64_end():
these intended to extend the aperture to our size recommendation,
calculated relative to the base of the aperture.
Despite the original intent, i440fx_pcihost_get_pci_hole64_end() and
q35_host_get_pci_hole64_end() currently only extend the aperture relative
to the default base (pc_pci_hole64_start()), ignoring any programming done
by the firmware. This means that our size recommendation may not be met.
Fix it by honoring the firmware's address assignments.
The strange extension sizes were spotted by Alex, in the log of a guest
kernel running on top of OVMF (which prefers to assign 64-bit GPAs to
64-bit BARs).
This change only affects DSDT generation, therefore no new compat property
is being introduced.
Using an i440fx OVMF guest with 5GB RAM, an example _CRS change is:
> @@ -881,9 +881,9 @@
> QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite,
> 0x0000000000000000, // Granularity
> 0x0000000800000000, // Range Minimum
> - 0x000000080001C0FF, // Range Maximum
> + 0x000000087FFFFFFF, // Range Maximum
> 0x0000000000000000, // Translation Offset
> - 0x000000000001C100, // Length
> + 0x0000000080000000, // Length
> ,, , AddressRangeMemory, TypeStatic)
> })
> Device (GPE0)
(On i440fx, the low RAM split is at 3GB, in this case. Therefore, with 5GB
guest RAM and no DIMM hotplug range, pc_pci_hole64_start() returns 4 +
(5-3) = 6 GB. Adding the 2GB extension to that yields 8GB, which is below
the firmware-programmed base of 32GB, before the patch. Therefore, before
the patch, the extension is ineffective. After the patch, we add the 2GB
extension to the firmware-programmed base, namely 32GB.)
Using a q35 OVMF guest with 5GB RAM, an example _CRS change is:
> @@ -3162,9 +3162,9 @@
> QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite,
> 0x0000000000000000, // Granularity
> 0x0000000800000000, // Range Minimum
> - 0x00000009BFFFFFFF, // Range Maximum
> + 0x0000000FFFFFFFFF, // Range Maximum
> 0x0000000000000000, // Translation Offset
> - 0x00000001C0000000, // Length
> + 0x0000000800000000, // Length
> ,, , AddressRangeMemory, TypeStatic)
> })
> Device (GPE0)
(On Q35, the low RAM split is at 2GB. Therefore, with 5GB guest RAM and no
DIMM hotplug range, pc_pci_hole64_start() returns 4 + (5-2) = 7 GB. Adding
the 32GB extension to that yields 39GB (0x0000_0009_BFFF_FFFF + 1), before
the patch. After the patch, we add the 32GB extension to the
firmware-programmed base, namely 32GB.)
The ACPI test data for the bios-tables-test case that we added earlier in
this series are corrected too, as follows:
> @@ -3339,9 +3339,9 @@
> QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite,
> 0x0000000000000000, // Granularity
> 0x0000000200000000, // Range Minimum
> - 0x00000009BFFFFFFF, // Range Maximum
> + 0x00000009FFFFFFFF, // Range Maximum
> 0x0000000000000000, // Translation Offset
> - 0x00000007C0000000, // Length
> + 0x0000000800000000, // Length
> ,, , AddressRangeMemory, TypeStatic)
> })
> Device (GPE0)
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Fixes: 9fa99d2519
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Expose the calculated "hole64 start" GPAs as plain uint64_t values,
extracting the internals of the current property getters.
This patch doesn't change behavior.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add memory bar to pci-testdev. Size is configurable using the membar
property. Setting the size to zero (default) turns it off. Can be used
to check whether guests handle large pci bars correctly.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that amd-iommu support interrupt remapping, enable the GASup in IVRS
table and GASup in extended feature register to indicate that IOMMU
support guest virtual APIC mode. GASup provides option to guest OS to
make use of 128-bit IRTE.
Note that the GAMSup is set to zero to indicate that amd-iommu does not
support guest virtual APIC mode (aka AVIC) which would be used for the
nested VMs.
See Table 21 from IOMMU spec for interrupt virtualization controls
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Emulate the interrupt remapping support when guest virtual APIC is
enabled.
For more information refer: IOMMU spec rev 3.0 (section 2.2.5.2)
When VAPIC is enabled, it uses interrupt remapping as defined in
Table 22 and Figure 17 from IOMMU spec.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When interrupt remapping is enabled, add a special IVHD device
(type IOAPIC).
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Emulate the interrupt remapping support when guest virtual APIC is
not enabled.
For more info Refer: AMD IOMMU spec Rev 3.0 - section 2.2.5.1
When VAPIC is not enabled, it uses interrupt remapping as defined in
Table 20 and Figure 15 from IOMMU spec.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Register the interrupt remapping callback and read/write ops for the
amd-iommu-ir memory region.
amd-iommu-ir is set to higher priority to ensure that this region won't
be masked out by other memory regions.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To be consistent with intel-iommu:
- rename the address space to use '_' instead of '-'
- update the memory region relationships
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, the amdvi_validate_dte() assumes that a valid DTE will
always have V=1. This is not true. The V=1 means that bit[127:1] are
valid. A valid DTE can have IV=1 and V=0 (i.e address translation
disabled and interrupt remapping enabled)
Remove the V=1 check from amdvi_validate_dte(), make the caller
responsible to check for V or IV bits.
This also fixes a bug in existing code that when error is
detected during the translation we'll fail the translation
instead of assuming a passthrough mode.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The vtd_generate_msi_message() in intel-iommu is used to construct a MSI
Message from IRQ. A similar function will be needed when we add interrupt
remapping support in amd-iommu. Moving the function in common file to
avoid the code duplication. Rename it to x86_iommu_irq_to_msi_message().
There is no logic changes in the code flow.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Suggested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Interrupt remapping needs kernel-irqchip={off|split} on both Intel and AMD
platforms. Move the check in common place.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some old guests (before commit 7a11370e5: "virtio_blk: enable VQs early")
kick virtqueue before setting VIRTIO_CONFIG_S_DRIVER_OK. This violates
the virtio spec. But virtio 1.0 transitional devices support this behaviour.
So we should start vhost when guest kicks in this case.
Signed-off-by: Yongji Xie <xieyongji@baidu.com>
Signed-off-by: Chai Wen <chaiwen@baidu.com>
Signed-off-by: Ni Xun <nixun@baidu.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We should handle VTD_FR_CONTEXT_ENTRY_P properly when synchronizing
shadow page tables. Having invalid context entry there is perfectly
valid when we move a device out of an existing domain. When that
happens, instead of posting an error we invalidate the whole region.
Without this patch, QEMU will crash if we do these steps:
(1) start QEMU with VT-d IOMMU and two 10G NICs (ixgbe)
(2) bind the NICs with vfio-pci in the guest
(3) start testpmd with the NICs applied
(4) stop testpmd
(5) rebind the NIC back to ixgbe kernel driver
The patch should fix it.
Reported-by: Pei Zhang <pezhang@redhat.com>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1627272
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There are two callers for vtd_sync_shadow_page_table_range(): one
provided a valid context entry and one not. Move that fetching
operation into the caller vtd_sync_shadow_page_table() where we need to
fetch the context entry.
Meanwhile, remove the error_report_once() directly since we're already
tracing all the error cases in the previous call. Instead, return error
number back to caller. This will not change anything functional since
callers are dropping it after all.
We do this move majorly because we want to do something more later in
vtd_sync_shadow_page_table().
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
QEMU is not handling the global DMAR switch well, especially when from
"on" to "off".
Let's first take the example of system reset.
Assuming that a guest has IOMMU enabled. When it reboots, we will drop
all the existing DMAR mappings to handle the system reset, however we'll
still keep the existing memory layouts which has the IOMMU memory region
enabled. So after the reboot and before the kernel reloads again, there
will be no mapping at all for the host device. That's problematic since
any software (for example, SeaBIOS) that runs earlier than the kernel
after the reboot will assume the IOMMU is disabled, so any DMA from the
software will fail.
For example, a guest that boots on an assigned NVMe device might fail to
find the boot device after a system reboot/reset and we'll be able to
observe SeaBIOS errors if we capture the debugging log:
WARNING - Timeout at nvme_wait:144!
Meanwhile, we should see DMAR errors on the host of that NVMe device.
It's the DMA fault that caused a NVMe driver timeout.
The correct fix should be that we do proper switching of device DMA
address spaces when system resets, which will setup correct memory
regions and notify the backend of the devices. This might not affect
much on non-assigned devices since QEMU VT-d emulation will assume a
default passthrough mapping if DMAR is not enabled in the GCMD
register (please refer to vtd_iommu_translate). However that's required
for an assigned devices, since that'll rebuild the correct GPA to HPA
mapping that is needed for any DMA operation during guest bootstrap.
Besides the system reset, we have some other places that might change
the global DMAR status and we'd better do the same thing there. For
example, when we change the state of GCMD register, or the DMAR root
pointer. Do the same refresh for all these places. For these two
places we'll also need to explicitly invalidate the context entry cache
and iotlb cache.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1625173
CC: QEMU Stable <qemu-stable@nongnu.org>
Reported-by: Cong Li <coli@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
--
v2:
- do the same for GCMD write, or root pointer update [Alex]
- test is carried out by me this time, by observing the
vtd_switch_address_space tracepoint after system reboot
v3:
- rewrite commit message as suggested by Alex
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Provide the function and use it in vtd_init(). Used to reset both
context entry cache and iotlb cache for the whole IOMMU unit.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Here should be submit_requests, there is no submit_merged_requests
function.
Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>