kvm_has_pit_state2() is only defined for x86 targets (in
target/i386/kvm/kvm.c). Its declaration is pointless on
all other targets. Have it return a boolean.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-13-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Call kvm_enabled() before kvm_hv_vpindex_settable()
to let the compiler elide its call.
kvm-stub.c is now empty, remove it.
Suggested-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-9-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Call kvm_enabled() before kvm_enable_x2apic() to let the compiler elide
its call. Cleanup the code by simplifying "!xen_enabled() &&
kvm_enabled()" to just "kvm_enabled()".
Suggested-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-8-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
fw_cfg_build_feature_control() uses CPUID_EXT_VMX which is
defined in "target/i386/cpu.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-4-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Both pc_piix.c and pc_q35.c files use CPU_VERSION_LEGACY
which is defined in "target/i386/cpu.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-3-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since commit 6f529b7534 ("target/i386: move FERR handling
to target/i386") pc_q35_init() calls tcg_enabled() which
is declared in "sysemu/tcg.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-2-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvmclock_create() is only implemented in hw/i386/kvm/clock.h.
Restrict the "hw/kvm/clock.h" header to i386 by moving it to
hw/i386/.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230620083228.88796-3-philmd@linaro.org>
We shouldn't call kvmclock_create() when KVM is not available
or disabled:
- check for kvm_enabled() before calling it
- assert KVM is enabled once called
Since the call is elided when KVM is not available, we can
remove the stub (it is never compiled).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230620083228.88796-2-philmd@linaro.org>
The buttons value use macros instead of direct numbers.
If request relative mode, have to add this for
guest vmmouse driver to judge this is a relative packet.
otherwise,vmmouse driver will not match
the condition 'status & VMMOUSE_RELATIVE_PACKET',
and can't report events on the correct(relative) input device,
result to relative mode unuseful.
Signed-off-by: Zongmin Zhou<zhouzongmin@kylinos.cn>
Message-ID: <20230413081526.2229916-1-zhouzongmin@kylinos.cn>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The values in "msg" are assembled in host endian byte order (the other
field are also not swapped), so we must not swap the __addr_head here.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-6-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
The values in "addr" are populated locally in this function in host
endian byte order, so we must not swap the index_l field here.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-5-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
On big endian hosts, we need to reverse the bitfield order in the
struct VTDInvDescIEC, just like it is already done for the other
bitfields in the various structs of the intel-iommu device.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-4-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
The code already tries to do some endianness handling here, but
currently fails badly:
- While it already swaps the data when logging errors / tracing, it fails
to byteswap the value before e.g. accessing entry->irte.present
- entry->irte.source_id is swapped with le32_to_cpu(), though this is
a 16-bit value
- The whole union is apparently supposed to be swapped via the 64-bit
data[2] array, but the struct is a mixture between 32 bit values
(the first 8 bytes) and 64 bit values (the second 8 bytes), so this
cannot work as expected.
Fix it by converting the struct to two proper 64-bit bitfields, and
by swapping the values only once for everybody right after reading
the data from memory.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-3-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
After reading the guest memory with dma_memory_read(), we have
to make sure that we byteswap the little endian data to the host's
byte order.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-2-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
ACPI spec (since 2.0a) says
"
A device object must contain either an _HID object or
an _ADR object, but can contain both.
"
_ADR is used when device is attached to an ennumerable bus,
however hostbridge is not and uses dedicated _HID for
discovery, drop _ADR field.
It doesn't seem that having _ADR has a negative effects
OSes manage to tolerate that, but there is no point of
having it there. (only pc/q35 has it hostbridge description,
while others (microvm/arm) don't)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230720133858.1974024-6-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it seems that Windows is unable to handle variable references
making it choke up when accessing ASUN during _DSM call
when device is hotplugged (it lists package elements as DataAlias
but despite that later on it misbehaves) with following error
shown up in AMLI debugger (WS2012r2):
Store(ShiftLeft(One,Arg1="ASUN",) AMLI_ERROR(c0140008): Unexpected argument type
ValidateArgTypes: expected Arg1 to be type Integer (Type=String)
Similar outcome with WS2022.
Issue is not fatal but as result acpi-index/"PCI Label ID" property
is either not shown in device details page or shows incorrect value.
Fix it by doing assignment of BSEL/ASUN values to package
elements manually after package declaration.
Fix was tested with: WS2012r2, WS2022, RHEL9
Fixes: 467d099a29 (x86: acpi: _DSM: use Package to pass parameters)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230720133858.1974024-3-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fuzzing showed that a guest could bind an interdomain port to itself, by
guessing the next port to be allocated and putting that as the 'remote'
port number. By chance, that works because the newly-allocated port has
type EVTCHNSTAT_unbound. It shouldn't.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20230801175747.145906-4-dwmw2@infradead.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Coverity points out (CID 1508128) a bounds checking error. We need to check
for gsi >= IOAPIC_NUM_PINS, not just greater-than.
Also fix up an assert() that has the same problem, that Coverity didn't see.
Fixes: 4f81baa33e ("hw/xen: Support GSI mapping to PIRQ")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230801175747.145906-2-dwmw2@infradead.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The IDE unplug function needs to reset the entire PCI device, to make
sure all state is initialized to defaults. This is done by calling
pci_device_reset, which resets not only the chip specific registers, but
also all PCI state. This fixes "unplug" in a Xen HVM domU with the
modular legacy xenlinux PV drivers.
Commit ee358e919e ("hw/ide/piix: Convert reset handler to
DeviceReset") changed the way how the the disks are unplugged. Prior
this commit the PCI device remained unchanged. After this change,
piix_ide_reset is exercised after the "unplug" command, which was not
the case prior that commit. This function resets the command register.
As a result the ata_piix driver inside the domU will see a disabled PCI
device. The generic PCI code will reenable the PCI device. On the qemu
side, this runs pci_default_write_config/pci_update_mappings. Here a
changed address is returned by pci_bar_address, this is the address
which was truncated in piix_ide_reset. In case of a Xen HVM domU, the
address changes from 0xc120 to 0xc100. This truncation was a bug in
piix_ide_reset, which was fixed in commit 230dfd9257 ("hw/ide/piix:
properly initialize the BMIBA register"). If pci_xen_ide_unplug had used
pci_device_reset, the PCI registers would have been properly reset, and
commit ee358e919e would have not introduced a regression for this
specific domU environment.
While the unplug is supposed to hide the IDE disks, the changed BMIBA
address broke the UHCI device. In case the domU has an USB tablet
configured, to recive absolute pointer coordinates for the GUI, it will
cause a hang during device discovery of the partly discovered USB hid
device. Reading the USBSTS word size register will fail. The access ends
up in the QEMU piix-bmdma device, instead of the expected uhci device.
Here a byte size request is expected, and a value of ~0 is returned. As
a result the UCHI driver sees an error state in the register, and turns
off the UHCI controller.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20230720072950.20198-1-olaf@aepfle.de>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Coverity points out (CID 1513106, 1513107) that MemoryListener is a
192 byte struct which we are passing around by value. Switch to
passing a const pointer into xen_register_ioreq() and then to
xen_do_ioreq_register(). We can also make the file-scope
MemoryListener variables const, since nothing changes them.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230718101057.1110979-1-peter.maydell@linaro.org>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Coverity was unhappy (CID 1508359) because we didn't check the return of
init_walk_op() in transaction_commit(), despite doing so at every other
call site.
Strictly speaking, this is a false positive since it can never fail. It
only fails for invalid user input (transaction ID or path), and both of
those are hard-coded to known sane values in this invocation.
But Coverity doesn't know that, and neither does the casual reader of the
code.
Returning an error here would be weird, since the transaction *is*
committed by this point; all the walk_op is doing is firing watches on
the newly-committed changed nodes. So make it a g_assert(!ret), since
it really should never happen.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20076888f6bdf06a65aafc5cf954260965d45b97.camel@infradead.org>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Let's factor out (un)plug handling, to be reused from arm/virt code.
Provide stubs for the case that CONFIG_VIRTIO_MD is not selected because
neither virtio-mem nor virtio-pmem is enabled. While this cannot
currently happen for x86, it will be possible for arm/virt.
Message-ID: <20230711153445.514112-3-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
There are no remaining users in the tree. Libvirt never used that
property and a quick internet search revealed no other users.
Further, we renamed that property already in commit f2ffbe2b7d
("pc: rename "hotplug memory" terminology to "device memory"") without
anybody complaining.
So let's just get rid of it.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <eduardo@habkost.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230623124553.400585-9-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
We're already looking at machine->device_memory when calling
build_srat_memory(), so let's simply avoid going via
PC_MACHINE_DEVMEM_REGION_SIZE to get the size and rely on
machine->device_memory directly.
Once machine->device_memory is set, we know that the size > 0. The code now
looks much more similar the hw/arm/virt-acpi-build.c variant.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <eduardo@habkost.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230623124553.400585-8-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's use our new helper and stop always allocating ms->device_memory.
Once allcoated, we're sure that the size > 0 and that the base was
initialized.
Adjust the code in pc_memory_init() to check for machine->device_memory
instead of pcmc->has_reserved_memory and machine->device_memory->base.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <eduardo@habkost.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230623124553.400585-7-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
There is also pci_new() which creates non-multifunction PCI devices.
Accordingly the parameter is always set to true when a multi function PCI
device is to be created.
The reason for the parameter's existence seems to be that it is used in the
internal PCI code as well which is the only location where it gets set to
false. This one usage can be resolved by factoring out an internal helper
function.
Remove this redundant, error-prone parameter.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230304114043.121024-6-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There is also pci_create_simple() which creates non-multifunction PCI
devices. Accordingly the parameter is always set to true when a multi
function PCI device is to be created.
The reason for the parameter's existence seems to be that it is used in the
internal PCI code as well which is the only location where it gets set to
false. This one usage can be replaced by trivial code.
Remove this redundant, error-prone parameter.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230304114043.121024-5-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
I440FX realization is currently mixed with PIIX3 creation. Furthermore, it is
common practice to only set properties between a device's qdev_new() and
qdev_realize(). Clean up to resolve both issues.
Since I440FX spawns a PCI bus let's also move the pci_bus initialization there.
Note that when running `qemu-system-x86_64 -M pc -S` before and after this
patch, `info mtree` in the QEMU console doesn't show any differences except that
the ordering is different.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-18-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
i440fx_init() is a legacy init function. The previous patches worked towards
TYPE_I440FX_PCI_HOST_BRIDGE to be instantiated the QOM way. Do this now by
transforming the parameters passed to i440fx_init() into property assignments.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-17-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Introduce the properties in anticipation of QOM'ification; Q35 has the same
properties.
Note that we want to avoid a "ram size" property in the QOM interface since it
seems redundant to both properties introduced in this change. Thus the removal
of the ram_size parameter. We assume the invariant of both properties to sum up
to "ram size" which is already asserted in pc_memory_init(). Under Xen the
invariant seems to hold as well, so we now also check it there.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-15-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The parent-child relation is usually established near a child's qdev_new(). For
i440fx this allows for reusing the machine parameter, thus avoiding
qdev_get_machine() which relies on a global variable.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-9-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eliminates an else branch.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-8-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230630073720.21297-7-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The Q35 PCI host already has a PCI_HOST_BYPASS_IOMMU property. However, the
host initializes this property itself by accessing global machine state,
thereby assuming it to be a PC machine. Avoid this by having board code
set this property.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-6-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The Q35 PCI host currently sets the PC machine's PCI bus attribute
through global state, thereby assuming the machine to be a PC machine.
The Q35 machine code already holds on to Q35's pci bus attribute, so can
easily set its own property while preserving encapsulation.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-4-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The variable is redundant to "phb" and is never used by its real type.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630073720.21297-2-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add MEMORY_LISTNER_PRIORITY_ACCEL for the symbolic value for the memory
listener to replace the hard-coded value 10 for accel.
No functional change intended.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <feebe423becc6e2aa375f59f6abce9a85bc15abb.1687279702.git.isaku.yamahata@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
During address space unmap, corresponding IOVA tree entries are
also removed. But DMAMap is set beyond notifier's scope by 1, so
in theory there is possibility to remove a continuous entry above
the notifier's scope but falling in adjacent notifier's scope.
There is no issue currently as no use cases allocate notifiers
continuously, but let's be robust.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230615032626.314476-4-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Replay doesn't notify registered notifiers but the one passed
to it. So it's meaningless to check the registered notifier's
synthetic flag.
There is no issue currently as all replay use cases have MAP
flag set, but let's be robust.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230615032626.314476-3-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Peter Xu found a potential issue:
"The other thing is when I am looking at the new code I found that we
actually extended the replay() to be used also in dirty tracking of vfio,
in vfio_sync_dirty_bitmap(). For that maybe it's already broken if
unmap_all() because afaiu log_sync() can be called in migration thread
anytime during DMA so I think it means the device is prone to DMA with the
IOMMU pgtable quickly erased and rebuilt here, which means the DMA could
fail unexpectedly. Copy Alex, Kirti and Neo."
Fix it by replacing the unmap_all() to only evacuate the iova tree
(keeping all host mappings untouched, IOW, don't notify UNMAP), and
do a full resync in page walk which will notify all existing mappings
as MAP. This way we don't interrupt with any existing mapping if there
is (e.g. for the dirty sync case), meanwhile we keep sync too to latest
(for moving a vfio device into an existing iommu group).
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230615032626.314476-2-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To use the newly introduced PC machine class local variable.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230609164107.23404-1-suravee.suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since KVM_MAX_VCPUS is currently defined to 1024 for x86 as shown in
arch/x86/include/asm/kvm_host.h, update QEMU limits to the same number.
In case KVM could not support the specified number of vcpus, QEMU would
return the following error message:
qemu-system-x86_64: kvm_init_vcpu: kvm_get_vcpu failed (xxx): Invalid argument
Also, keep max_cpus at 288 for machine version 8.0 and older.
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230607205717.737749-3-suravee.suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Currently, pc-q35 and pc-i44fx machine models are default to use SMBIOS 2.8
(32-bit entry point). Since SMBIOS 3.0 (64-bit entry point) is now fully
supported since QEMU 7.0, default to use SMBIOS 3.0 for newer machine
models. This is necessary to avoid the following message when launching
a VM with large number of vcpus.
"SMBIOS 2.1 table length 66822 exceeds 65535"
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230607205717.737749-2-suravee.suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Since we *might* have user emulation with softmmu,
use the clearer 'CONFIG_SYSTEM_ONLY' key to check
for system emulation.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-9-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This patch does following:
1. creates arch_handle_ioreq() and arch_xen_set_memory(). This is done in
preparation for moving most of xen-hvm code to an arch-neutral location,
move the x86-specific portion of xen_set_memory to arch_xen_set_memory.
Also, move handle_vmport_ioreq to arch_handle_ioreq.
2. Pure code movement: move common functions to hw/xen/xen-hvm-common.c
Extract common functionalities from hw/i386/xen/xen-hvm.c and move them to
hw/xen/xen-hvm-common.c. These common functions are useful for creating
an IOREQ server.
xen_hvm_init_pc() contains the architecture independent code for creating
and mapping a IOREQ server, connecting memory and IO listeners, initializing
a xen bus and registering backends. Moved this common xen code to a new
function xen_register_ioreq() which can be used by both x86 and ARM machines.
Following functions are moved to hw/xen/xen-hvm-common.c:
xen_vcpu_eport(), xen_vcpu_ioreq(), xen_ram_alloc(), xen_set_memory(),
xen_region_add(), xen_region_del(), xen_io_add(), xen_io_del(),
xen_device_realize(), xen_device_unrealize(),
cpu_get_ioreq_from_shared_memory(), cpu_get_ioreq(), do_inp(),
do_outp(), rw_phys_req_item(), read_phys_req_item(),
write_phys_req_item(), cpu_ioreq_pio(), cpu_ioreq_move(),
cpu_ioreq_config(), handle_ioreq(), handle_buffered_iopage(),
handle_buffered_io(), cpu_handle_ioreq(), xen_main_loop_prepare(),
xen_hvm_change_state_handler(), xen_exit_notifier(),
xen_map_ioreq_server(), destroy_hvm_domain() and
xen_shutdown_fatal_error()
3. Removed static type from below functions:
1. xen_region_add()
2. xen_region_del()
3. xen_io_add()
4. xen_io_del()
5. xen_device_realize()
6. xen_device_unrealize()
7. xen_hvm_change_state_handler()
8. cpu_ioreq_pio()
9. xen_exit_notifier()
4. Replace TARGET_PAGE_SIZE with XC_PAGE_SIZE to match the page side with Xen.
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
In preparation to moving most of xen-hvm code to an arch-neutral location, move:
- shared_vmport_page
- log_for_dirtybit
- dirty_bitmap
- suspend
- wakeup
out of XenIOState struct as these are only used on x86, especially the ones
related to dirty logging.
Updated XenIOState can be used for both aarch64 and x86.
Also, remove free_phys_offset as it was unused.
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
In preparation to moving most of xen-hvm code to an arch-neutral location,
move non IOREQ references to:
- xen_get_vmport_regs_pfn
- xen_suspend_notifier
- xen_wakeup_notifier
- xen_ram_init
towards the end of the xen_hvm_init_pc() function.
This is done to keep the common ioreq functions in one place which will be
moved to new function in next patch in order to make it common to both x86 and
aarch64 machines.
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Paul Durrant <paul@xen.org>
xen-mapcache.c contains common functions which can be used for enabling Xen on
aarch64 with IOREQ handling. Moving it out from hw/i386/xen to hw/xen to make it
accessible for both aarch64 and x86.
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Allows the struct to be embedded directly into device models without additional
allocation.
Suggested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230612081238.1742-3-shentey@gmail.com>
[PMD: Update MAINTAINERS entry and use SPDX license identifier]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Use object_dynamic_cast() to determine if 'dev' is a TYPE_VIRTIO_MMIO.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
During the last patches, TYPE_PIIX3_XEN_DEVICE turned into a clone of
TYPE_PIIX3_DEVICE. Remove this redundancy.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Tested-by: Chuck Zmudzinski <brchuckz@aol.com>
Message-Id: <20230312120221.99183-7-shentey@gmail.com>
Message-Id: <20230403074124.3925-8-shentey@gmail.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Subscribe to pci_bus_fire_intx_routing_notifier() instead which allows for
having a common piix3_write_config() for the PIIX3 device models.
While at it, move the subscription into machine code to facilitate resolving
TYPE_PIIX3_XEN_DEVICE.
In a possible future followup, pci_bus_fire_intx_routing_notifier() could
be adjusted in such a way that subscribing to it doesn't require
knowledge of the device firing it.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Tested-by: Chuck Zmudzinski <brchuckz@aol.com>
Message-Id: <20230312120221.99183-5-shentey@gmail.com>
Message-Id: <20230403074124.3925-6-shentey@gmail.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
xen_intx_set_irq() doesn't depend on PIIX3State. In order to resolve
TYPE_PIIX3_XEN_DEVICE and in order to make Xen agnostic about the
precise south bridge being used, set up Xen's PCI IRQ handling of PIIX3
in the board.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Tested-by: Chuck Zmudzinski <brchuckz@aol.com>
Message-Id: <20230312120221.99183-4-shentey@gmail.com>
Message-Id: <20230403074124.3925-5-shentey@gmail.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
xen_piix3_set_irq() isn't PIIX specific: PIIX is a single PCI device
while xen_piix3_set_irq() maps multiple PCI devices to their respective
IRQs, which is board-specific. Rename xen_piix3_set_irq() to communicate
this.
Also rename XEN_PIIX_NUM_PIRQS to XEN_IOAPIC_NUM_PIRQS since the Xen's
IOAPIC rather than PIIX has this many interrupt routes.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Tested-by: Chuck Zmudzinski <brchuckz@aol.com>
Message-Id: <20230312120221.99183-2-shentey@gmail.com>
Message-Id: <20230403074124.3925-2-shentey@gmail.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Coverity points out that if (!s && !s->impl) isn't really what we intended
to do here. CID 1508131.
Fixes: 0324751272 ("hw/xen: Add emulated implementation of XenStore operations")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230412185102.441523-6-dwmw2@infradead.org>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
I initially put the basic platform init (overlay pages, grant tables,
event channels) into mc->kvm_type because that was the earliest place
that could sensibly test for xen_mode==XEN_EMULATE.
The intent was to do this early enough that we could then initialise the
XenBus and other parts which would have depended on them, from a generic
location for both Xen and KVM/Xen in the PC-specific code, as seen in
https://lore.kernel.org/qemu-devel/20230116221919.1124201-16-dwmw2@infradead.org/
However, then the Xen on Arm patches came along, and *they* wanted to
do the XenBus init from a 'generic' Xen-specific location instead:
https://lore.kernel.org/qemu-devel/20230210222729.957168-4-sstabellini@kernel.org/
Since there's no generic location that covers all three, I conceded to
do it for XEN_EMULATE mode in pc_basic_devices_init().
And now there's absolutely no point in having some of the platform init
done from pc_machine_kvm_type(); we can move it all up to live in a
single place in pc_basic_devices_init(). This has the added benefit that
we can drop the separate xen_evtchn_connect_gsis() function completely,
and pass just the system GSIs in directly to xen_evtchn_create().
While I'm at it, it does no harm to explicitly pass in the *number* of
said GSIs, because it does make me twitch a bit to pass an array of
impicit size. During the lifetime of the KVM/Xen patchset, that had
already changed (albeit just cosmetically) from GSI_NUM_PINS to
IOAPIC_NUM_PINS.
And document a bit better that this is for the *output* GSI for raising
CPU0's events when the per-CPU vector isn't available. The fact that
we create a whole set of them and then only waggle the one we're told
to, instead of having a single output and only *connecting* it to the
GSI that it should be connected to, is still non-intuitive for me.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20230412185102.441523-2-dwmw2@infradead.org>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Both TYPE_KVM_I8254 and TYPE_I8254 have their own but same implementation of
the "iobase" property. The storage for the property already resides in
PITCommonState, so also move the property definition there.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230523195608.125820-2-shentey@gmail.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
All callers now pass is_external=false to aio_set_fd_handler() and
aio_set_event_notifier(). The aio_disable_external() API that
temporarily disables fd handlers that were registered is_external=true
is therefore dead code.
Remove aio_disable_external(), aio_enable_external(), and the
is_external arguments to aio_set_fd_handler() and
aio_set_event_notifier().
The entire test-fdmon-epoll test is removed because its sole purpose was
testing aio_disable_external().
Parts of this patch were generated using the following coccinelle
(https://coccinelle.lip6.fr/) semantic patch:
@@
expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque;
@@
- aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque)
+ aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque)
@@
expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready;
@@
- aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready)
+ aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready)
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-21-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is no need to suspend activity between aio_disable_external() and
aio_enable_external(), which is mainly used for the block layer's drain
operation.
This is part of ongoing work to remove the aio_disable_external() API.
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-9-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Don't try to instantiate the parallel port if it has not been
enabled in the build configuration.
Message-Id: <20230512124033.502654-10-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We are going to re-use this setting for other targets, so let's
move this to the main MachineClass.
Message-Id: <20230512124033.502654-4-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The "isapc" machine can also be run without VGA card, so there
is no need for a hard requirement with a "select" here - "imply"
is enough.
Message-Id: <20230512124033.502654-2-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Now that the RTC is created as part of the southbridges it doesn't need
to be an out-parameter any longer.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230519084734.220480-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Just like in the real hardware (and in PIIX4), create the RTC
controllers in the south bridges.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230519084734.220480-2-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Going through pc_memory_init() seems quite complicated for a simple
assignment.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213162004.2797-7-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230213162004.2797-6-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
No need to repeat the descriptions.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230213162004.2797-5-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230213162004.2797-4-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Allowing guests to read unplugged memory simplified the bring-up of
virtio-mem in Linux guests -- which was limited to x86-64 only. On arm64
(which was added later), we never had legacy guests and don't even allow
to configure it, essentially always having "unplugged-inaccessible=on".
At this point, all guests we care about
should be supporting VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, so let's
change the default for the 8.1 machine.
This change implies that also memory that supports the shared zeropage
(private anonymous memory) will now require
VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE in the driver in order to be usable by
the guest -- as default, one can still manually set the
unplugged-inaccessible property.
Disallowing the guest to read unplugged memory will be important for
some future features, such as memslot optimizations or protection of
unplugged memory, whereby we'll actually no longer allow the guest to
even read from unplugged memory.
At some point, we might want to deprecate and remove that property.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <eduardo@habkost.net>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230503182352.792458-1-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently i386 QEMU generates MADT revision 3, and reports
MADT revision 1. Set .revision to 3 to match reality.
Link: https://lore.kernel.org/linux-acpi/20230327191026.3454-1-eric.devolder@ora
cle.com/T/#t
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230517162545.2191-3-eric.devolder@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Commit 1b2b12376c ("intel-iommu: PASID support") takes PASID into
account when calculating iotlb hash like:
static guint vtd_iotlb_hash(gconstpointer v)
{
const struct vtd_iotlb_key *key = v;
return key->gfn | ((key->sid) << VTD_IOTLB_SID_SHIFT) |
(key->level) << VTD_IOTLB_LVL_SHIFT |
(key->pasid) << VTD_IOTLB_PASID_SHIFT;
}
This turns out to be problematic since:
- the shift will lose bits if not converting to uint64_t
- level should be off by one in order to fit into 2 bits
- VTD_IOTLB_PASID_SHIFT is 30 but PASID is 20 bits which will waste
some bits
- the hash result is uint64_t so we will lose bits when converting to
guint
So this patch fixes them by
- converting the keys into uint64_t before doing the shift
- off level by one to make it fit into two bits
- change the sid, lvl and pasid shift to 26, 42 and 44 in order to
take the full width of uint64_t
- perform an XOR to the top 32bit with the bottom 32bit for the final
result to fit guint
Fixes: Coverity CID 1508100
Fixes: 1b2b12376c ("intel-iommu: PASID support")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230412073510.7158-1-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
i440fx machine versions 2.3 and newer supports dynamic ram
resizing. See commit a1666142db ("acpi-build: make ROMs RAM blocks resizeable") .
Currently supported all q35 machine types (versions 2.4 and newer) supports
resizable RAM/ROM blocks.Therefore the warning generated when the ACPI table
size exceeds a pre-defined value does not apply to those machine versions.
Add a check limiting the warning message to only those machines that does not
support expandable ram blocks (that is, i440fx machines with version 2.2
and older).
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230329045726.14028-1-anisinha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add 8.1 machine types for arm/i440fx/m68k/q35/s390x/spapr.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20230314173009.152667-1-cohuck@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Aside the Frankenstein model of a SysBusDevice realizing a PCIDevice,
QOM parents shouldn't access children internals. In this particular
case, amdvi_sysbus_realize() is just open-coding TYPE_AMD_IOMMU_PCI's
DeviceRealize() handler. Factor it out.
Declare QOM-cast macros with OBJECT_DECLARE_SIMPLE_TYPE() so we can
cast the AMDVIPCIState in amdvi_pci_realize().
Note this commit removes the single use in the repository of
pci_add_capability() and msi_init() on a *realized* QDev instance.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230313153031.86107-7-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Set PCI static/const fields once in amdvi_pci_class_init.
They will be propagated via DeviceClassRealize handler via
pci_qdev_realize() -> do_pci_register_device() -> pci_config_set*().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230313153031.86107-6-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The 'PCI capability offset' is a *PCI* notion. Since AMDVIPCIState
inherits PCIDevice and hold PCI-related fields, move capab_offset
from AMDVIState to AMDVIPCIState.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230313153031.86107-5-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
AMDVIState::devid is only accessed by build_amd_iommu() which
has access to the PCIDevice state. Directly get the property
calling object_property_get_int() there.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230313153031.86107-4-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
By accessing MemoryRegion internals, amdvi_init() gives the false
idea that the PCI BAR can be modified. However this isn't true
(at least the model isn't ready for that): the device is explicitly
maps at the BAR at the fixed AMDVI_BASE_ADDR address in
amdvi_sysbus_realize(). Since the SysBus API isn't designed to
remap regions, directly use the fixed address in amdvi_init().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230313153031.86107-3-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit 6da2434186
("memory: Optimize replay of guest mapping").
This change breaks the mps3-an547 board under TCG (and
probably other TCG boards using an IOMMU), which now
assert:
$ ./build/x86/qemu-system-arm --machine mps3-an547 -serial stdio
-kernel /tmp/an547-mwe/build/test.elf
qemu-system-arm: ../../softmmu/memory.c:1903:
memory_region_register_iommu_notifier: Assertion `n->end <=
memory_region_size(mr)' failed.
This is because tcg_register_iommu_notifier() registers
an IOMMU notifier which covers the entire address space,
so the assertion added in this commit is not correct.
For the 8.0 release, just revert this commit as it is
only an optimization.
Fixes: 6da2434186 ("memory: Optimize replay of guest mapping")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 917c1c552b2d1b732f9a86c6a90684c3a5e4cada.1680640587.git.mst@redhat.com
This had been pulled in via qemu/plugin.h from hw/core/cpu.h,
but that will be removed.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230310195252.210956-5-richard.henderson@linaro.org>
[AJB: add various additional cases shown by CI]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230315174331.2959-15-alex.bennee@linaro.org>
Reviewed-by: Emilio Cota <cota@braap.org>
Several features that landed at the last possible moment:
Passthrough HDM decoder emulation
Refactor cryptodev
RAS error emulation and injection
acpi-index support on non-hotpluggable slots
Dynamically switch to vhost shadow virtqueues at vdpa net migration
Plus a couple of bugfixes that look important to have in the release.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmQJ8TYPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRp37YIAMpQA5/ddmKKz/ABtBMHB5JX/SVYcG+1xkBR
j9IFYusOfmmDfmgAhv0Qxi9+Wik95lszVZUnphvocSGd0PXH47pK7yv9RZ1ttaYX
oAbXrGqXo8rUhl1ksQsJ8Iasj2di1BLP0byPuozbRkg1Kkz5TqRd9+hBqSBGEx21
tsP5708UVCDAriwYYO78Cx0ZasmB9bqqeom5FdEsg9sYJ5aElOOvitp9YO1p2xhU
gRvhD+k/aqNi+mfOUF7qGDBanxKgx75VV/KU1cjjS9R1vNtwRhfc/26PBrROY00a
wkZWnAxmzDFKRS6cEfeb+eDGEVjC3IqLAjcFeuAIT/78CwdvIiY=
=e1yv
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, fixes
Several features that landed at the last possible moment:
Passthrough HDM decoder emulation
Refactor cryptodev
RAS error emulation and injection
acpi-index support on non-hotpluggable slots
Dynamically switch to vhost shadow virtqueues at vdpa net migration
Plus a couple of bugfixes that look important to have in the release.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmQJ8TYPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp37YIAMpQA5/ddmKKz/ABtBMHB5JX/SVYcG+1xkBR
# j9IFYusOfmmDfmgAhv0Qxi9+Wik95lszVZUnphvocSGd0PXH47pK7yv9RZ1ttaYX
# oAbXrGqXo8rUhl1ksQsJ8Iasj2di1BLP0byPuozbRkg1Kkz5TqRd9+hBqSBGEx21
# tsP5708UVCDAriwYYO78Cx0ZasmB9bqqeom5FdEsg9sYJ5aElOOvitp9YO1p2xhU
# gRvhD+k/aqNi+mfOUF7qGDBanxKgx75VV/KU1cjjS9R1vNtwRhfc/26PBrROY00a
# wkZWnAxmzDFKRS6cEfeb+eDGEVjC3IqLAjcFeuAIT/78CwdvIiY=
# =e1yv
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 09 Mar 2023 14:46:14 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (72 commits)
virtio: fix reachable assertion due to stale value of cached region size
hw/virtio/vhost-user: avoid using unitialized errp
hw/pxb-cxl: Support passthrough HDM Decoders unless overridden
hw/pci: Add pcie_count_ds_port() and pcie_find_port_first() helpers
hw/mem/cxl_type3: Add CXL RAS Error Injection Support.
hw/pci/aer: Make PCIE AER error injection facility available for other emulation to use.
hw/cxl: Fix endian issues in CXL RAS capability defaults / masks
hw/mem/cxl-type3: Add AER extended capability
hw/pci-bridge/cxl_root_port: Wire up MSI
hw/pci-bridge/cxl_root_port: Wire up AER
hw/pci/aer: Add missing routing for AER errors
hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register
pcihp: add ACPI PCI hotplug specific is_hotpluggable_bus() callback
pcihp: move fields enabling hotplug into AcpiPciHpState
acpi: pci: move out ACPI PCI hotplug generator from generic slot generator build_append_pci_bus_devices()
acpi: pci: move BSEL into build_append_pcihp_slots()
acpi: pci: drop BSEL usage when deciding that device isn't hotpluggable
pci: move acpi-index uniqueness check to generic PCI device code
tests: acpi: update expected blobs
tests: acpi: add non zero function device with acpi-index on non-hotpluggble bus
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-33-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Generic PCI enumeration code doesn't really need access to
BSEL value, it is only used as means to decide if hotplug
enumerator should be called.
Use stateless object_property_find() to do that, and move
the rest of BSEL handling into build_append_pcihp_slots()
where it belongs.
This cleans up generic code a bit from hotplug stuff
and follow up patch will remove remaining call to
build_append_pcihp_slots() from generic code, making
it possible to use without ACPI PCI hotplug dependencies.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-32-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
previous commit ("pci: fix 'hotplugglable' property behavior") fixed
pcie root port's 'hotpluggable' property to behave consistently.
So we don't need a BSEL crutch anymore to see of device is not
hotpluggable, drop it from 'generic' PCI slots description handling.
BSEL is still used to decide if hotplug part should be called
but that will be moved out of generic code to hotplug one by
followup patches.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-31-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
describing all present devices on functions other than
0 was complicated when non hotplug and hotplug code
was intermixed. So QEMU has been excluding non zero
functions since they are not supported by hotplug code,
then a condition to whitelist coldplugged bridges was
added and later whitelisting of devices that advertise
presence of their own AML description.
With non hotplug and hotplug code separated, it is
possible to relax rules and allow describing all
non-hotpluggble functions and hence simplify
conditions whether PCI device should be enumerated by
generic (non-hotplug) code.
Price of that simplification is an extra few Device()
descriptors in DSDT exposing built-in chipset functions,
which has no functional effect on guest side.
Apart from that, the enumeration of non zero functions,
allows to attach more NICs with acpi-index enabled
directly on hostbridge (if hotplug is not required).
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-25-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Inject static _DSM (EDSM) if non-hotpluggable device has
acpi-index configured on it.
It lets use acpi-index non-hotpluggable devices / devices
attached to non-hotpluggable bus.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-22-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it's a helper method for acpi-index support on PCI buses
that do no support or have disabled ACPI PCI hotplug
or for non-hotpluggble endpoint devices.
(like non-hotpluggble NICs, integrated endpoints and
later for machines that do not support ACPI PCI hotplug)
no functional change, commit adds only EDSM method in DSDT
without any users. (the follow up patches will use it)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-18-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it will be reused by follow up patches that will implement
static _DSM for non-hotpluggable devices.
no functional AML change, only context one, where 'cap' (Local1)
initialization is moved after UUID/revision checks.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-15-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Beside BSEL numbers change (due to 2 extra root-ports in q35/miltibridge test),
following change is expected:
Scope (\_SB.PCI0)
{
...
+ Scope (S50)
+ {
+ Scope (S00)
+ {
+ Method (PCNT, 0, NotSerialized)
+ {
+ BNUM = Zero
+ DVNT (PCIU, One)
+ DVNT (PCID, 0x03)
+ }
+ }
+
+ Method (PCNT, 0, NotSerialized)
+ {
+ ^S00.PCNT
+ }
+ }
...
Method (PCNT, 0, NotSerialized)
{
+ ^S50.PCNT ()
^S13.PCNT ()
^S12.PCNT ()
^S11.PCNT ()
I practice [1] hasn't broke anything since on hardware side we unset
hotplug_handler on such intermediate port => hotplug behind it has
not been properly wired and as result not worked.
1)
Fixes: ddab4d3fae ("pcihp: compose PCNT callchain right before its user _GPE._E01")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230302161543.286002-8-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that all the work is done to enable the PV backends to work without
actual Xen, instantiate the bus from pc_basic_device_init() for emulated
mode.
This allows us finally to launch an emulated Xen guest with PV disk.
qemu-system-x86_64 -serial mon:stdio -M q35 -cpu host -display none \
-m 1G -smp 2 -accel kvm,xen-version=0x4000a,kernel-irqchip=split \
-kernel bzImage -append "console=ttyS0 root=/dev/xvda1" \
-drive file=/var/lib/libvirt/images/fedora28.qcow2,if=none,id=disk \
-device xen-disk,drive=disk,vdev=xvda
If we use -M pc instead of q35, we can even add an IDE disk and boot a
guest image normally through grub. But q35 gives us AHCI and that isn't
unplugged by the Xen magic, so the guests ends up seeing "both" disks.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This is only part of it; we will also need to get the PV back end drivers
to tear down their own mappings (or do it for them, but they kind of need
to stop using the pointers too).
Some more work on the actual PV back ends and xen-bus code is going to be
needed to really make soft reset and migration fully functional, and this
part is the basis for that.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
We don't actually access the guest's page through the grant, because
this isn't real Xen, and we can just use the page we gave it in the
first place. Map the grant anyway, mostly for cosmetic purposes so it
*looks* like it's in use in the guest-visible grant table.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Now that we have an internal implementation of XenStore, we can populate
the xenstore_backend_ops to allow PV backends to talk to it.
Watches can't be processed with immediate callbacks because that would
call back into XenBus code recursively. Defer them to a QEMUBH to be run
as appropriate from the main loop. We use a QEMUBH per XS handle, and it
walks all the watches (there shouldn't be many per handle) to fire any
which have pending events. We *could* have done it differently but this
allows us to use the same struct watch_event as we have for the guest
side, and keeps things relatively simple.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This is limited to mapping a single grant at a time, because under Xen the
pages are mapped *contiguously* into qemu's address space, and that's very
hard to do when those pages actually come from anonymous mappings in qemu
in the first place.
Eventually perhaps we can look at using shared mappings of actual objects
for system RAM, and then we can make new mappings of the same backing
store (be it deleted files, shmem, whatever). But for now let's stick to
a page at a time.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
We provided the backend-facing evtchn functions very early on as part of
the core Xen platform support, since things like timers and xenstore need
to use them.
By what may or may not be an astonishing coincidence, those functions
just *happen* all to have exactly the right function prototypes to slot
into the evtchn_backend_ops table and be called by the PV backends.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This header is now only for native Xen code, not PV backends that may be
used in Xen emulation. Since the toolstack libraries may depend on the
specific version of Xen headers that they pull in (and will set the
__XEN_TOOLS__ macro to enable internal definitions that they depend on),
the rule is that xen_native.h (and thus the toolstack library headers)
must be included *before* any of the headers in include/hw/xen/interface.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The existing implementation calling into the real libxenevtchn moves to
a new file hw/xen/xen-operations.c, and is called via a function table
which in a subsequent commit will also be able to invoke the emulated
event channel support.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>