This is in preparation for the next commit where the nitro-enclave
machine type will need to instead use a memfd backend, for the built-in
vhost-user-vsock device to work.
Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20241008211727.49088-5-dorjoychy111@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
An EIF (Enclave Image Format)[1] file is used to boot an AWS nitro
enclave[2] virtual machine. The EIF file contains the necessary kernel,
cmdline, ramdisk(s) sections to boot.
Some helper functions have been introduced for extracting the necessary
sections from an EIF file and then writing them to temporary files as
well as computing SHA384 hashes from the section data. These will be
used in the following commit to add support for nitro-enclave machine
type in QEMU.
The files added in this commit are not compiled yet but will be added
to the hw/core/meson.build file in the following commit where
CONFIG_NITRO_ENCLAVE will be introduced.
[1] https://github.com/aws/aws-nitro-enclaves-image-format
[2] https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave.html
Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20241008211727.49088-4-dorjoychy111@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
object_initialize() can use modules (it was added there because
virtio-gpu-device is a child device of virtio-gpu-pci; commit
64f7aece8e, "object_initialize: try module load", 2020-09-15).
object_new() cannot; make things consistent.
qdev_new() is now just a simple wrapper that returns DeviceState.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A small optimization/code simplification, that also makes it clear that
we won't look for a type in a not-loaded-yet module---the module will
have been loaded by a call to module_object_class_by_name(), if present.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Parameter @id is no longer used, drop. Return a bool to indicate
success / failure, as recommended by qapi/error.h.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20241010150144.986655-4-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Currently the QemuLockCnt data structure and associated functions are
in the include/qemu/thread.h header. Move them to their own
qemu/lockcnt.h. The main reason for doing this is that it means we
can autogenerate the documentation comments into the docs/devel
documentation.
The copyright/author in the new header is drawn from lockcnt.c,
since the header changes were added in the same commit as
lockcnt.c; since neither thread.h nor lockcnt.c state an explicit
license, the standard default of GPL-2-or-later applies.
We include the new header (and the .c file, which was accidentally
omitted previously) in the "RCU" part of MAINTAINERS, since that
is where the lockcnt.rst documentation is categorized.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20240816132212.3602106-7-peter.maydell@linaro.org
Expose the clock period via the QOM 'qtest-clock-period' property so it
can be used in QTests. This property is only accessible in QTests (not
via HMP).
Signed-off-by: Inès Varhol <ines.varhol@telecom-paris.fr>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20241003081105.40836-3-ines.varhol@telecom-paris.fr
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
At e72a7f65c1 (hw: Move declaration of IRQState to header and add init
function, 2024-06-29), we've changed qemu_allocate_irq() to use a
combination of g_new() + object_initialize() instead of
IRQ(object_new()). The latter sets obj->free, so that that the memory is
properly cleaned when the object is finalized, but the former doesn't.
Fixes: e72a7f65c1 (hw: Move declaration of IRQState to header and add init function)
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Brian Cain <bcain@quicinc.com>
Message-id: 1723deb603afec3fa69a75970cef9aac62d57d62.1726674185.git.quic_mathbern@quicinc.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently, both qemu_devices_reset() and MachineClass::reset() use
ShutdownCause for the reason of the reset. However, the Resettable
interface uses ResetState, so ShutdownCause needs to be translated to
ResetType somewhere. Translating it qemu_devices_reset() makes adding
new reset types harder, as they cannot always be matched to a single
ShutdownCause here, and devices may need to check the ResetType to
determine what to reset and if to reset at all.
This patch moves this translation up in the call stack to
qemu_system_reset() and updates all MachineClass children to use the
ResetType instead.
Message-ID: <20240904103722.946194-2-jmarcin@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
The 'GPL-2.0+' license identifier has been deprecated since license
list version 2.0rc2 [1] and replaced by the 'GPL-2.0-or-later' [2]
tag.
[1] https://spdx.org/licenses/GPL-2.0+.html
[2] https://spdx.org/licenses/GPL-2.0-or-later.html
Mechanical patch running:
$ sed -i -e s/GPL-2.0+/GPL-2.0-or-later/ \
$(git grep -lP 'SPDX-License-Identifier: \W+GPL-2.0\+[ $]' \
| egrep -v '^linux-headers|^include/standard-headers')
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The last use of sysbus_mmio_unmap was removed by
981b1c6266 ("spapr/xive: rework the mapping the KVM memory regions")
Remove it.
Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Use of assert(false) can trip spurious control flow warnings from
some versions of GCC (i.e. using -fsanitize=thread with gcc-12):
error: control reaches end of non-void function [-Werror=return-type]
default:
assert(0);
| }
| ^
Solve that by unifying the code base on g_assert_not_reached() instead.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240910221606.1817478-6-pierrick.bouvier@linaro.org>
[PMD: Add description suggested by Eric Blake]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
We used to need the transitional_function machinery to handle bus
classes and device classes which still used their legacy reset
handling. We have now converted all bus classes to three phase
reset, and simplified the device class legacy reset so it is just an
adapting wrapper function around registration of a hold phase method.
There are therefore no more users of the transitional_function
machinery and we can remove it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240830145812.1967042-12-peter.maydell@linaro.org
Now that all devices which still implement a the legacy reset method
register it via device_class_legacy_reset(), we can simplify the
handling of these devices. Instead of using the complex
Resettable::get_transitional_function machinery, we register a hold
phase method which invokes the DeviceClass::legacy_reset method.
This will allow us to remove all the get_transitional_function
handling from resettable.c.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240830145812.1967042-11-peter.maydell@linaro.org
Currently we have transitional machinery between legacy reset
and three phase reset that works in two directions:
* if you invoke three phase reset on a device which has set
the DeviceClass::legacy_reset method, we detect this in
device_get_transitional_reset() and arrange that we call
the legacy_reset method during the hold phase of reset
* if you invoke legacy reset on a device which implements
three phase reset, the default legacy_reset method is
device_phases_reset(), which does a three-phase reset
of the device
However, we have now eliminated all the places which could invoke
legacy reset on a device, which means that the function
device_phases_reset() is never called -- it serves only as the value
of DeviceClass::legacy_reset that indicates that the subclass never
overrode the legacy reset method. So we can delete it, and instead
check for legacy_reset != NULL.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240830145812.1967042-10-peter.maydell@linaro.org
Rename the DeviceClass::reset field to legacy_reset; this is helpful
both in flagging up that it's best not used in new code and in
making it easy to search for where it's being used still.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240830145812.1967042-9-peter.maydell@linaro.org
Use device_class_set_legacy_reset() instead of opencoding an
assignment to DeviceClass::reset. This change was produced
with:
spatch --macro-file scripts/cocci-macro-file.h \
--sp-file scripts/coccinelle/device-reset.cocci \
--keep-comments --smpl-spacing --in-place --dir hw
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240830145812.1967042-8-peter.maydell@linaro.org
Define a device_class_set_legacy_reset() function which
sets the DeviceClass::reset field. This serves two purposes:
* it makes it clearer to the person writing code that
DeviceClass::reset is now legacy and they should look for
the new alternative (which is Resettable)
* it makes it easier to rename the reset field (which in turn
makes it easier to find places that call it)
The Coccinelle script can be used to automatically convert code that
was doing an open-coded assignment to DeviceClass::reset to call
device_class_set_legacy_reset() instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240830145812.1967042-7-peter.maydell@linaro.org
There are no callers of device_class_set_parent_reset() left in the tree,
as they've all been converted to use three-phase reset and the
corresponding resettable_class_set_parent_phases() function.
Remove device_class_set_parent_reset().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240830145812.1967042-5-peter.maydell@linaro.org
i286 acpi speedup by precomputing _PRT by Ricardo Ribalda
vhost_net speedup by using MR transactions by Zuo Boqun
ich9 gained support for periodic and swsmi timer by Dominic Prinz
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmbhoCUPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRptpUH/iR5AmJFpvAItqlPOvJiYDEch46C73tyrSws
Kk/1EbGSL7mFFD5sfdSSV4Rw8CQBsmM/Dt5VDkJKsWnOLjkBQ2CYH0MYHktnrKcJ
LlSk32HnY5p1DsXnJhgm5M7St8T3mV/oFdJCJAFgCmpx5uT8IRLrKETN8+30OaiY
xo35xAKOAS296+xsWeVubKkMq7H4y2tdZLE/22gb8rlA8d96BJIeVLQ3y3IjeUPR
24q6c7zpObzGhYNZ/PzAKOn+YcVsV/lLAzKRZJTzTUPyG24BcjJTyyr/zNSYAgfk
lLXzIZID3GThBmrCAiDZ1z6sfo3MRg2wNS/FBXtK6fPIuFxed+8=
=ySRy
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, fixes, cleanups
i286 acpi speedup by precomputing _PRT by Ricardo Ribalda
vhost_net speedup by using MR transactions by Zuo Boqun
ich9 gained support for periodic and swsmi timer by Dominic Prinz
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmbhoCUPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRptpUH/iR5AmJFpvAItqlPOvJiYDEch46C73tyrSws
# Kk/1EbGSL7mFFD5sfdSSV4Rw8CQBsmM/Dt5VDkJKsWnOLjkBQ2CYH0MYHktnrKcJ
# LlSk32HnY5p1DsXnJhgm5M7St8T3mV/oFdJCJAFgCmpx5uT8IRLrKETN8+30OaiY
# xo35xAKOAS296+xsWeVubKkMq7H4y2tdZLE/22gb8rlA8d96BJIeVLQ3y3IjeUPR
# 24q6c7zpObzGhYNZ/PzAKOn+YcVsV/lLAzKRZJTzTUPyG24BcjJTyyr/zNSYAgfk
# lLXzIZID3GThBmrCAiDZ1z6sfo3MRg2wNS/FBXtK6fPIuFxed+8=
# =ySRy
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 11 Sep 2024 14:50:29 BST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu:
hw/acpi/ich9: Add periodic and swsmi timer
virtio-mem: don't warn about THP sizes on a kernel without THP support
hw/audio/virtio-sound: fix heap buffer overflow
hw/cxl: fix physical address field in get scan media results output
virtio-pci: Add lookup subregion of VirtIOPCIRegion MR
vhost_net: configure all host notifiers in a single MR transaction
tests/acpi: pc: update golden masters for DSDT
hw/i386/acpi-build: Return a pre-computed _PRT table
tests/acpi: pc: allow DSDT acpi table changes
intel_iommu: Make PASID-cache and PIOTLB type invalid in legacy mode
intel_iommu: Fix invalidation descriptor type field
virtio: rename virtio_split_packed_update_used_idx
hw/pci/pci-hmp-cmds: Avoid displaying bogus size in 'info pci'
pci: don't skip function 0 occupancy verification for devfn auto assign
hw/isa/vt82c686.c: Embed i8259 irq in device state instead of allocating
hw: Move declaration of IRQState to header and add init function
virtio: Always reset vhost devices
virtio: Allow .get_vhost() without vhost_started
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
To allow embedding a qemu_irq in a struct move its definition to the
header and add a function to init it in place without allocating it.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <e3ffd0f6ef8845d0f7247c9b6ff33f7ee8b432cf.1719690591.git.balaton@eik.bme.hu>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
QAPI's 'prefix' feature can make the connection between enumeration
type and its constants less than obvious. It's best used with
restraint.
CpuS390Entitlement has a 'prefix' to change the generated enumeration
constants' prefix from CPU_S390_ENTITLEMENT to S390_CPU_ENTITLEMENT.
Rename the type to S390CpuEntitlement, so that 'prefix' is not needed.
Likewise change CpuS390Polarization to S390CpuPolarization, and
CpuS390State to S390CpuState.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240904111836.3273842-10-armbru@redhat.com>
Recent commit "qapi: Smarter camel_to_upper() to reduce need for
'prefix'" added a temporary 'prefix' to delay changing the generated
code.
Revert it. This improves HmatLBDataType's generated enumeration
constant prefix from HMATLB_DATA_TYPE to HMAT_LB_DATA_TYPE, and
HmatLBMemoryHierarchy's from HMATLB_MEMORY_HIERARCHY to
HMAT_LB_MEMORY_HIERARCHY.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240904111836.3273842-8-armbru@redhat.com>
Adds support for 'qatzip' as an option for the multifd compression
method parameter, and implements using QAT for 'qatzip' compression and
decompression.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com>
Signed-off-by: Hao Xiang <hao.xiang@linux.dev>
Signed-off-by: Yichen Wang <yichen.wang@bytedance.com>
Link: https://lore.kernel.org/r/20240830232722.58272-5-yichen.wang@bytedance.com
Signed-off-by: Peter Xu <peterx@redhat.com>
memory_region_find() returns an MR which it is the caller's
responsibility to unref, but platform_bus_map_mmio() was
forgetting to do so, thus leaking the MR.
Signed-off-by: Gao Shiyuan <gaoshiyuan@baidu.com>
Message-id: 20240829131005.9196-1-gaoshiyuan@baidu.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweaked commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The real period is zero when both period and period_frac are zero.
Check the method ptimer_set_freq, if freq is larger than 1000 MHz,
the period is zero, but the period_frac is not, in this case, the
ptimer will work but the current code incorrectly recognizes that
the ptimer is disabled.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2306
Signed-off-by: JianZhou Yue <JianZhou.Yue@verisilicon.com>
Message-id: 3DA024AEA8B57545AF1B3CAA37077D0FB75E82C8@SHASXM03.verisilicon.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
pci: Initial support for SPDM Responders
cxl: Add support for scan media, feature commands, device patrol scrub
control, DDR5 ECS control, firmware updates
virtio: in-order support
virtio-net: support for SR-IOV emulation (note: known issues on s390,
might get reverted if not fixed)
smbios: memory device size is now configurable per Machine
cpu: architecture agnostic code to support vCPU Hotplug
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmae9l8PHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRp8fYH/impBH9nViO/WK48io4mLSkl0EUL8Y/xrMvH
zKFCKaXq8D96VTt1Z4EGKYgwG0voBKZaCEKYU/0ARGnSlSwxINQ8ROCnBWMfn2sx
yQt08EXVMznNLtXjc6U5zCoCi6SaV85GH40No3MUFXBQt29ZSlFqO/fuHGZHYBwS
wuVKvTjjNF4EsGt3rS4Qsv6BwZWMM+dE6yXpKWk68kR8IGp+6QGxkMbWt9uEX2Md
VuemKVnFYw0XGCGy5K+ZkvoA2DGpEw0QxVSOMs8CI55Oc9SkTKz5fUSzXXGo1if+
M1CTjOPJu6pMym6gy6XpFa8/QioDA/jE2vBQvfJ64TwhJDV159s=
=k8e9
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pci,pc: features,fixes
pci: Initial support for SPDM Responders
cxl: Add support for scan media, feature commands, device patrol scrub
control, DDR5 ECS control, firmware updates
virtio: in-order support
virtio-net: support for SR-IOV emulation (note: known issues on s390,
might get reverted if not fixed)
smbios: memory device size is now configurable per Machine
cpu: architecture agnostic code to support vCPU Hotplug
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmae9l8PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp8fYH/impBH9nViO/WK48io4mLSkl0EUL8Y/xrMvH
# zKFCKaXq8D96VTt1Z4EGKYgwG0voBKZaCEKYU/0ARGnSlSwxINQ8ROCnBWMfn2sx
# yQt08EXVMznNLtXjc6U5zCoCi6SaV85GH40No3MUFXBQt29ZSlFqO/fuHGZHYBwS
# wuVKvTjjNF4EsGt3rS4Qsv6BwZWMM+dE6yXpKWk68kR8IGp+6QGxkMbWt9uEX2Md
# VuemKVnFYw0XGCGy5K+ZkvoA2DGpEw0QxVSOMs8CI55Oc9SkTKz5fUSzXXGo1if+
# M1CTjOPJu6pMym6gy6XpFa8/QioDA/jE2vBQvfJ64TwhJDV159s=
# =k8e9
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 10:16:31 AM AEST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (61 commits)
hw/nvme: Add SPDM over DOE support
backends: Initial support for SPDM socket support
hw/pci: Add all Data Object Types defined in PCIe r6.0
tests/acpi: Add expected ACPI AML files for RISC-V
tests/qtest/bios-tables-test.c: Enable basic testing for RISC-V
tests/acpi: Add empty ACPI data files for RISC-V
tests/qtest/bios-tables-test.c: Remove the fall back path
tests/acpi: update expected DSDT blob for aarch64 and microvm
acpi/gpex: Create PCI link devices outside PCI root bridge
tests/acpi: Allow DSDT acpi table changes for aarch64
hw/riscv/virt-acpi-build.c: Update the HID of RISC-V UART
hw/riscv/virt-acpi-build.c: Add namespace devices for PLIC and APLIC
virtio-iommu: Add trace point on virtio_iommu_detach_endpoint_from_domain
hw/vfio/common: Add vfio_listener_region_del_iommu trace event
virtio-iommu: Remove the end point on detach
virtio-iommu: Free [host_]resv_ranges on unset_iommu_devices
virtio-iommu: Remove probe_done
Revert "virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged"
gdbstub: Add helper function to unregister GDB register space
physmem: Add helper function to destroy CPU AddressSpace
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add common function to help unregister the GDB register space. This shall be
done in context to the CPU unrealization.
Note: These are common functions exported to arch specific code. For example,
for ARM this code is being referred in associated arch specific patch-set:
Link: https://lore.kernel.org/qemu-devel/20230926103654.34424-1-salil.mehta@huawei.com/
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240716111502.202344-8-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently QEMU describes initial[1] RAM* in SMBIOS as a series of
virtual DIMMs (capped at 16Gb max) using type 17 structure entries.
Which is fine for the most cases. However when starting guest
with terabytes of RAM this leads to too many memory device
structures, which eventually upsets linux kernel as it reserves
only 64K for these entries and when that border is crossed out
it runs out of reserved memory.
Instead of partitioning initial RAM on 16Gb DIMMs, use maximum
possible chunk size that SMBIOS spec allows[2]. Which lets
encode RAM in lower 31 bits of 32bit field (which amounts upto
2047Tb per DIMM).
As result initial RAM will generate only one type 17 structure
until host/guest reach ability to use more RAM in the future.
Compat changes:
We can't unconditionally change chunk size as it will break
QEMU<->guest ABI (and migration). Thus introduce a new machine
class field that would let older versioned machines to use
legacy 16Gb chunks, while new(er) machine type[s] use maximum
possible chunk size.
PS:
While it might seem to be risky to rise max entry size this large
(much beyond of what current physical RAM modules support),
I'd not expect it causing much issues, modulo uncovering bugs
in software running within guest. And those should be fixed
on guest side to handle SMBIOS spec properly, especially if
guest is expected to support so huge RAM configs.
In worst case, QEMU can reduce chunk size later if we would
care enough about introducing a workaround for some 'unfixable'
guest OS, either by fixing up the next machine type or
giving users a CLI option to customize it.
1) Initial RAM - is RAM configured with help '-m SIZE' CLI option/
implicitly defined by machine. It doesn't include memory
configured with help of '-device' option[s] (pcdimm,nvdimm,...)
2) SMBIOS 3.1.0 7.18.5 Memory Device — Extended Size
PS:
* tested on 8Tb host with RHEL6 guest, which seems to parse
type 17 SMBIOS table entries correctly (according to 'dmidecode').
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240715122417.4059293-1-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
"make check SPEED=slow" is currently failing the device-introspect-test on
older machine types since introspecting "scsi-block" is causing an abort:
$ ./qemu-system-x86_64 -M pc-q35-8.0 -monitor stdio
QEMU 9.0.50 monitor - type 'help' for more information
(qemu) device_add scsi-block,help
Unexpected error in object_property_find_err() at
../../devel/qemu/qom/object.c:1357:
can't apply global scsi-disk-base.migrate-emulated-scsi-request=false:
Property 'scsi-block.migrate-emulated-scsi-request' not found
Aborted (core dumped)
The problem is that the compat code tries to change the
"migrate-emulated-scsi-request" property for all devices that are
derived from "scsi-block", but the property has only been added
to "scsi-hd" and "scsi-cd" via the DEFINE_SCSI_DISK_PROPERTIES macro.
Thus let's fix the problem by only changing the property on the devices
that really have this property.
Fixes: b4912afa5f ("scsi-disk: Fix crash for VM configured with USB CDROM after live migration")
Message-ID: <20240703090904.909720-1-thuth@redhat.com>
Acked-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
* SEV: Don't allow automatic fallback to legacy KVM_SEV_INIT,
but also don't use it by default
* scsi: honor bootindex again for legacy drives
* hpet, utils, scsi, build, cpu: miscellaneous bugfixes
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaWoP0UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOqfggAg3jxUp6B8dFTEid5aV6qvT4M6nwD
TAYcAl5kRqTOklEmXiPCoA5PeS0rbr+5xzWLAKgkumjCVXbxMoYSr0xJHVuDwQWv
XunUm4kpxJBLKK3uTGAIW9A21thOaA5eAoLIcqu2smBMU953TBevMqA7T67h22rp
y8NnZWWdyQRH0RAaWsCBaHVkkf+DuHSG5LHMYhkdyxzno+UWkTADFppVhaDO78Ba
Egk49oMO+G6of4+dY//p1OtAkAf4bEHePKgxnbZePInJrkgHzr0TJWf9gERWFzdK
JiM0q6DeqopZm+vENxS+WOx7AyDzdN0qOrf6t9bziXMg0Rr2Z8bu01yBCQ==
=cZhV
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* target/i386/tcg: fixes for seg_helper.c
* SEV: Don't allow automatic fallback to legacy KVM_SEV_INIT,
but also don't use it by default
* scsi: honor bootindex again for legacy drives
* hpet, utils, scsi, build, cpu: miscellaneous bugfixes
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaWoP0UHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOqfggAg3jxUp6B8dFTEid5aV6qvT4M6nwD
# TAYcAl5kRqTOklEmXiPCoA5PeS0rbr+5xzWLAKgkumjCVXbxMoYSr0xJHVuDwQWv
# XunUm4kpxJBLKK3uTGAIW9A21thOaA5eAoLIcqu2smBMU953TBevMqA7T67h22rp
# y8NnZWWdyQRH0RAaWsCBaHVkkf+DuHSG5LHMYhkdyxzno+UWkTADFppVhaDO78Ba
# Egk49oMO+G6of4+dY//p1OtAkAf4bEHePKgxnbZePInJrkgHzr0TJWf9gERWFzdK
# JiM0q6DeqopZm+vENxS+WOx7AyDzdN0qOrf6t9bziXMg0Rr2Z8bu01yBCQ==
# =cZhV
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 17 Jul 2024 02:34:05 AM AEST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
target/i386/tcg: save current task state before loading new one
target/i386/tcg: use X86Access for TSS access
target/i386/tcg: check for correct busy state before switching to a new task
target/i386/tcg: Compute MMU index once
target/i386/tcg: Introduce x86_mmu_index_{kernel_,}pl
target/i386/tcg: Reorg push/pop within seg_helper.c
target/i386/tcg: use PUSHL/PUSHW for error code
target/i386/tcg: Allow IRET from user mode to user mode with SMAP
target/i386/tcg: Remove SEG_ADDL
target/i386/tcg: fix POP to memory in long mode
hpet: fix HPET_TN_SETVAL for high 32-bits of the comparator
hpet: fix clamping of period
docs: Update description of 'user=username' for '-run-with'
qemu/timer: Add host ticks function for LoongArch
scsi: fix regression and honor bootindex again for legacy drives
hw/scsi/lsi53c895a: bump instruction limit in scripts processing to fix regression
disas: Fix build against Capstone v6
cpu: Free queued CPU work
Revert "qemu-char: do not operate on sources from finalize callbacks"
i386/sev: Don't allow automatic fallback to legacy KVM_SEV*_INIT
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
load_image_gzipped() does not seem to be used anywhere. Remove it.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240711072448.32673-1-anisinha@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The read() syscall is not guaranteed to return all data from a file. The
default ROM loader implementation currently does not take this into account,
instead failing if all bytes are not read at once. This change loads the ROM
using g_file_get_contents() instead, which correctly reads all data using
multiple calls to read() while also returning the loaded ROM size.
Signed-off-by: Gregor Haas <gregorhaas1997@gmail.com>
Reviewed-by: Xingtao Yao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240628182706.99525-1-gregorhaas1997@gmail.com>
[PMD: Use gsize with g_file_get_contents()]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Running qemu-system-aarch64 -M virt -nographic and terminating it will
result in a LeakSanitizer error due to remaining queued CPU work so
free it.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Link: https://lore.kernel.org/r/20240714-cpu-v1-1-19c2f8de2055@daynix.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Recent SDHCI expect cards to support the v3.01 spec
to negociate lower I/O voltage. Select it by default.
Versioned machine types with a version of 9.0 or
earlier retain the old default (spec v2.00).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240703134356.85972-2-philmd@linaro.org>
Calling qemu_plugin_vcpu_init__async() on the vCPU thread
is a detail of plugins, not relevant to TCG vCPU management.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240606124010.2460-4-philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240705084047.857176-30-alex.bennee@linaro.org>
cpu::plugin_state is allocated in cpu_common_initfn() when
the vCPU state is created. Release it in cpu_common_finalize()
when we are done.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240606124010.2460-3-philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240705084047.857176-29-alex.bennee@linaro.org>
Really the problem here is the return values of fit_load_[kernel|fdt]() are a
little all over the place. However we don't want to somehow get
through not having set kernel_end and having it just be random unused
data.
The compiler complained on an --enable-gcov build:
In file included from ../../hw/core/loader-fit.c:20:
/home/alex/lsrc/qemu.git/include/qemu/osdep.h: In function ‘load_fit’:
/home/alex/lsrc/qemu.git/include/qemu/osdep.h:486:45: error: ‘kernel_end’ may be used uninitialized [-Werror=maybe-uninitialized]
486 | #define ROUND_UP(n, d) ROUND_DOWN((n) + (d) - 1, (d))
| ^
../../hw/core/loader-fit.c:270:12: note: ‘kernel_end’ was declared here
270 | hwaddr kernel_end;
| ^~~~~~~~~~
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Aleksandar Rikalo <arikalo@gmail.com>
Message-Id: <20240705084047.857176-5-alex.bennee@linaro.org>
In current code, when guest does S3, virtio-gpu are reset due to the
bit No_Soft_Reset is not set. After resetting, the display resources
of virtio-gpu are destroyed, then the display can't come back and only
show blank after resuming.
Implement No_Soft_Reset bit of PCI_PM_CTRL register, then guest can check
this bit, if this bit is set, the devices resetting will not be done, and
then the display can work after resuming.
No_Soft_Reset bit is implemented for all virtio devices, and was tested
only on virtio-gpu device. Set it false by default for safety.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Message-Id: <20240606102205.114671-3-Jiqian.Chen@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We should call inflateEnd() like on success path to cleanup state in s
variable.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Since a4c2735f35 (cpu: move Qemu[Thread|Cond] setup into common code,
2024-05-30) these fields are now allocated at cpu_common_initfn(). So
let's make sure we also free them at cpu_common_finalize().
Furthermore, the code also frees these on round robin, but we missed
'halt_cond'.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is a counterpart to the HMP "info pic" command. It is being
added with an "x-" prefix because this QMP command is intended as an
adhoc debugging tool and will thus not be modelled in QAPI as fully
structured data, nor will it have long term guaranteed stability.
The existing HMP command is rewritten to call the QMP command.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20240610063518.50680-3-philmd@linaro.org>
* scsi-disk: migrate emulated requests
* i386/sev: fix Coverity issues
* i386/tcg: more conversions to new decoder
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZv6kMUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOn4Af/evnpsae1fm8may1NQmmezKiks/4X
cR0GaQ7w75Oas05jKsG7Xnrq3Vn6p5wllf3Wf00p7F1iJX18azY9rQgIsUVUgVem
/EIZk1eM6+mDxuIG0taPxc5Aw3cfIBWAjUmzsXrSr55e/wyiIxZCeUo2zk8Il+iL
Z4ceNzY5PZzc2Fl10D3cGs/+ynfiDM53ucwe3ve2T6NrxEVfKQPp5jkIUkBUba6z
zM5O4Q5KTEZYVth1gbDTB/uUJLUFjQ12kCQfRCNX+bEPDHwARr0UWr/Oxtz0jZSd
FvXohz7tI+v+ph0xHyE4tEFqryvLCII1td2ohTAYZZXNGkjK6XZildngBw==
=m4BE
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* i386: fix issue with cache topology passthrough
* scsi-disk: migrate emulated requests
* i386/sev: fix Coverity issues
* i386/tcg: more conversions to new decoder
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZv6kMUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOn4Af/evnpsae1fm8may1NQmmezKiks/4X
# cR0GaQ7w75Oas05jKsG7Xnrq3Vn6p5wllf3Wf00p7F1iJX18azY9rQgIsUVUgVem
# /EIZk1eM6+mDxuIG0taPxc5Aw3cfIBWAjUmzsXrSr55e/wyiIxZCeUo2zk8Il+iL
# Z4ceNzY5PZzc2Fl10D3cGs/+ynfiDM53ucwe3ve2T6NrxEVfKQPp5jkIUkBUba6z
# zM5O4Q5KTEZYVth1gbDTB/uUJLUFjQ12kCQfRCNX+bEPDHwARr0UWr/Oxtz0jZSd
# FvXohz7tI+v+ph0xHyE4tEFqryvLCII1td2ohTAYZZXNGkjK6XZildngBw==
# =m4BE
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 17 Jun 2024 12:48:19 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (25 commits)
target/i386: SEV: do not assume machine->cgs is SEV
target/i386: convert CMPXCHG to new decoder
target/i386: convert XADD to new decoder
target/i386: convert LZCNT/TZCNT/BSF/BSR/POPCNT to new decoder
target/i386: convert SHLD/SHRD to new decoder
target/i386: adapt gen_shift_count for SHLD/SHRD
target/i386: pull load/writeback out of gen_shiftd_rm_T1
target/i386: convert non-grouped, helper-based 2-byte opcodes
target/i386: split X86_CHECK_prot into PE and VM86 checks
target/i386: finish converting 0F AE to the new decoder
target/i386: fix bad sorting of entries in the 0F table
target/i386: replace read_crN helper with read_cr8
target/i386: convert MOV from/to CR and DR to new decoder
target/i386: fix processing of intercept 0 (read CR0)
target/i386: replace NoSeg special with NoLoadEA
target/i386: change X86_ENTRYwr to use T0, use it for moves
target/i386: change X86_ENTRYr to use T0
target/i386: put BLS* input in T1, use generic flag writeback
target/i386: rewrite flags writeback for ADCX/ADOX
target/i386: remove CPUX86State argument from generator functions
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
add the Query Processing Library (QPL) compression method
Introduce the qpl as a new multifd migration compression method, it can
use In-Memory Analytics Accelerator(IAA) to accelerate compression and
decompression, which can not only reduce network bandwidth requirement
but also reduce host compression and decompression CPU overhead.
How to enable qpl compression during migration:
migrate_set_parameter multifd-compression qpl
There is no qpl compression level parameter added since it only supports
level one, users do not need to specify the qpl compression level.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
[fixed docs spacing in migration.json]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
For VMs configured with the USB CDROM device:
-drive file=/path/to/local/file,id=drive-usb-disk0,media=cdrom,readonly=on...
-device usb-storage,drive=drive-usb-disk0,id=usb-disk0...
QEMU process may crash after live migration, to reproduce the issue,
configure VM (Guest OS ubuntu 20.04 or 21.10) with the following XML:
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<source file='/path/to/share_fs/cdrom.iso'/>
<target dev='sda' bus='usb'/>
<readonly/>
<address type='usb' bus='0' port='2'/>
</disk>
<controller type='usb' index='0' model='piix3-uhci'/>
Do the live migration repeatedly, crash may happen after live migratoin,
trace log at the source before live migration is as follows:
324808@1711972823.521945:usb_uhci_frame_start nr 319
324808@1711972823.521978:usb_uhci_qh_load qh 0x35cb5400
324808@1711972823.521989:usb_uhci_qh_load qh 0x35cb5480
324808@1711972823.521997:usb_uhci_td_load qh 0x35cb5480, td 0x35cbe000, ctrl 0x0, token 0xffe07f69
324808@1711972823.522010:usb_uhci_td_nextqh qh 0x35cb5480, td 0x35cbe000
324808@1711972823.522022:usb_uhci_qh_load qh 0x35cb5680
324808@1711972823.522030:usb_uhci_td_load qh 0x35cb5680, td 0x75ac5180, ctrl 0x19800000, token 0x3c903e1
324808@1711972823.522045:usb_uhci_packet_add token 0x103e1, td 0x75ac5180
324808@1711972823.522056:usb_packet_state_change bus 0, port 2, ep 2, packet 0x559f9ba14b00, state undef -> setup
324808@1711972823.522079:usb_msd_cmd_submit lun 0, tag 0x472, flags 0x00000080, len 10, data-len 8
324808@1711972823.522107:scsi_req_parsed target 0 lun 0 tag 1138 command 74 dir 1 length 8
324808@1711972823.522124:scsi_req_parsed_lba target 0 lun 0 tag 1138 command 74 lba 4096
324808@1711972823.522139:scsi_req_alloc target 0 lun 0 tag 1138
324808@1711972823.522169:scsi_req_continue target 0 lun 0 tag 1138
324808@1711972823.522181:scsi_req_data target 0 lun 0 tag 1138 len 8
324808@1711972823.522194:usb_packet_state_change bus 0, port 2, ep 2, packet 0x559f9ba14b00, state setup -> complete
324808@1711972823.522209:usb_uhci_packet_complete_success token 0x103e1, td 0x75ac5180
324808@1711972823.522219:usb_uhci_packet_del token 0x103e1, td 0x75ac5180
324808@1711972823.522232:usb_uhci_td_complete qh 0x35cb5680, td 0x75ac5180
trace log at the destination after live migration is as follows:
3286206@1711972823.951646:usb_uhci_frame_start nr 320
3286206@1711972823.951663:usb_uhci_qh_load qh 0x35cb5100
3286206@1711972823.951671:usb_uhci_qh_load qh 0x35cb5480
3286206@1711972823.951680:usb_uhci_td_load qh 0x35cb5480, td 0x35cbe000, ctrl 0x1000000, token 0xffe07f69
3286206@1711972823.951693:usb_uhci_td_nextqh qh 0x35cb5480, td 0x35cbe000
3286206@1711972823.951702:usb_uhci_qh_load qh 0x35cb5700
3286206@1711972823.951709:usb_uhci_td_load qh 0x35cb5700, td 0x75ac5240, ctrl 0x39800000, token 0xe08369
3286206@1711972823.951727:usb_uhci_queue_add token 0x8369
3286206@1711972823.951735:usb_uhci_packet_add token 0x8369, td 0x75ac5240
3286206@1711972823.951746:usb_packet_state_change bus 0, port 2, ep 1, packet 0x56066b2fb5a0, state undef -> setup
3286206@1711972823.951766:usb_msd_data_in 8/8 (scsi 8)
2024-04-01 12:00:24.665+0000: shutting down, reason=crashed
The backtrace reveals the following:
Program terminated with signal SIGSEGV, Segmentation fault.
0 __memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
312 movq -8(%rsi,%rdx), %rcx
[Current thread is 1 (Thread 0x7f0a9025fc00 (LWP 3286206))]
(gdb) bt
0 __memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
1 memcpy (__len=8, __src=<optimized out>, __dest=<optimized out>) at /usr/include/bits/string_fortified.h:34
2 iov_from_buf_full (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, buf=0x0, bytes=bytes@entry=8) at ../util/iov.c:33
3 iov_from_buf (bytes=8, buf=<optimized out>, offset=<optimized out>, iov_cnt=<optimized out>, iov=<optimized out>)
at /usr/src/debug/qemu-6-6.2.0-75.7.oe1.smartx.git.40.x86_64/include/qemu/iov.h:49
4 usb_packet_copy (p=p@entry=0x56066b2fb5a0, ptr=<optimized out>, bytes=bytes@entry=8) at ../hw/usb/core.c:636
5 usb_msd_copy_data (s=s@entry=0x56066c62c770, p=p@entry=0x56066b2fb5a0) at ../hw/usb/dev-storage.c:186
6 usb_msd_handle_data (dev=0x56066c62c770, p=0x56066b2fb5a0) at ../hw/usb/dev-storage.c:496
7 usb_handle_packet (dev=0x56066c62c770, p=p@entry=0x56066b2fb5a0) at ../hw/usb/core.c:455
8 uhci_handle_td (s=s@entry=0x56066bd5f210, q=0x56066bb7fbd0, q@entry=0x0, qh_addr=qh_addr@entry=902518530, td=td@entry=0x7fffe6e788f0, td_addr=<optimized out>,
int_mask=int_mask@entry=0x7fffe6e788e4) at ../hw/usb/hcd-uhci.c:885
9 uhci_process_frame (s=s@entry=0x56066bd5f210) at ../hw/usb/hcd-uhci.c:1061
10 uhci_frame_timer (opaque=opaque@entry=0x56066bd5f210) at ../hw/usb/hcd-uhci.c:1159
11 timerlist_run_timers (timer_list=0x56066af26bd0) at ../util/qemu-timer.c:642
12 qemu_clock_run_timers (type=QEMU_CLOCK_VIRTUAL) at ../util/qemu-timer.c:656
13 qemu_clock_run_all_timers () at ../util/qemu-timer.c:738
14 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:542
15 qemu_main_loop () at ../softmmu/runstate.c:739
16 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:52
(gdb) frame 5
(gdb) p ((SCSIDiskReq *)s->req)->iov
$1 = {iov_base = 0x0, iov_len = 0}
(gdb) p/x s->req->tag
$2 = 0x472
When designing the USB mass storage device model, QEMU places SCSI disk
device as the backend of USB mass storage device. In addition, USB mass
device driver in Guest OS conforms to the "Universal Serial Bus Mass
Storage Class Bulk-Only Transport" specification in order to simulate
the transform behavior between a USB controller and a USB mass device.
The following shows the protocol hierarchy:
+----------------+
CDROM driver | scsi command | CDROM
+----------------+
+-----------------------+
USB mass | USB Mass Storage Class| USB mass
storage driver | Bulk-Only Transport | storage device
+-----------------------+
+----------------+
USB Controller | USB Protocol | USB device
+----------------+
In the USB protocol layer, between the USB controller and USB device, at
least two USB packets will be transformed when guest OS send a
read operation to USB mass storage device:
1. The CBW packet, which will be delivered to the USB device's Bulk-Out
endpoint. In order to simulate a read operation, the USB mass storage
device parses the CBW and converts it to a SCSI command, which would be
executed by CDROM(represented as SCSI disk in QEMU internally), and store
the result data of the SCSI command in a buffer.
2. The DATA-IN packet, which will be delivered from the USB device's
Bulk-In endpoint(fetched directly from the preceding buffer) to the USB
controller.
We consider UHCI to be the controller. The two packets mentioned above may
have been processed by UHCI in two separate frame entries of the Frame List
, and also described by two different TDs. Unlike the physical environment,
a virtualized environment requires the QEMU to make sure that the result
data of CBW is not lost and is delivered to the UHCI controller.
Currently, these types of SCSI requests are not migrated, so QEMU cannot
ensure the result data of the IO operation is not lost if there are
inflight emulated SCSI requests during the live migration.
Assume for the moment that the USB mass storage device is processing the
CBW and storing the result data of the read operation to a buffre, live
migration happens and moves the VM to the destination while not migrating
the result data of the read operation.
After migration, when UHCI at the destination issues a DATA-IN request to
the USB mass storage device, a crash happens because USB mass storage device
fetches the result data and get nothing.
The scenario this patch addresses is this one.
Theoretically, any device that uses the SCSI disk as a back-end would be
affected by this issue. In this case, it is the USB CDROM.
To fix it, inflight emulated SCSI request be migrated during live migration,
similar to the DMA SCSI request.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Message-ID: <878c8f093f3fc2f584b5c31cb2490d9f6a12131a.1716531409.git.yong.huang@smartx.com>
[Do not bump migration version, introduce compat property instead. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Detect early unsupported MADV_MERGEABLE and MADV_DONTDUMP, and print a clearer
error message that points to the deficiency of the host.
Cc: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>