add qpl to compression method test for multifd migration
the qpl compression supports software path and hardware
path(IAA device), and the hardware path is used first by
default. If the hardware path is unavailable, it will
automatically fallback to the software path for testing.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
QPL compression and decompression will use IAA hardware path if the IAA
hardware is available. Otherwise the QPL library software path is used.
The hardware path will automatically fall back to QPL software path if
the IAA queues are busy. In some scenarios, this may happen frequently,
such as configuring 4 channels but only one IAA device is available. In
the case of insufficient IAA hardware resources, retry and fallback can
help optimize performance:
1. Retry + SW fallback:
total time: 14649 ms
downtime: 25 ms
throughput: 17666.57 mbps
pages-per-second: 1509647
2. No fallback, always wait for work queues to become available
total time: 18381 ms
downtime: 25 ms
throughput: 13698.65 mbps
pages-per-second: 859607
If both the hardware and software paths fail, the uncompressed page is
sent directly.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
during initialization, a software job is allocated to each channel
for software path fallabck when the IAA hardware is unavailable or
the hardware job submission fails. If the IAA hardware is available,
multiple hardware jobs are allocated for batch processing.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
add the Query Processing Library (QPL) compression method
Introduce the qpl as a new multifd migration compression method, it can
use In-Memory Analytics Accelerator(IAA) to accelerate compression and
decompression, which can not only reduce network bandwidth requirement
but also reduce host compression and decompression CPU overhead.
How to enable qpl compression during migration:
migrate_set_parameter multifd-compression qpl
There is no qpl compression level parameter added since it only supports
level one, users do not need to specify the qpl compression level.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
[fixed docs spacing in migration.json]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
add --enable-qpl and --disable-qpl options to enable and disable
the QPL compression method for multifd migration.
The Query Processing Library (QPL) is an open-source library
that supports data compression and decompression features. It
is based on the deflate compression algorithm and use Intel
In-Memory Analytics Accelerator(IAA) hardware for compression
and decompression acceleration.
For more live migration with IAA, please refer to the document
docs/devel/migration/qpl-compression.rst
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Different compression methods may require different numbers of IOVs.
Based on streaming compression of zlib and zstd, all pages will be
compressed to a data block, so two IOVs are needed for packet header
and compressed data block.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Similar to other archs, build a custom bios memory updater. Running the
test with OF code is a cool trick, but SLOF takes a long time to boot.
This reduces test time by around 3x (150s to 50s).
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
ppc64 with TCG seems to no longer be failing this test, perhaps since
commit 03bfc2188f ("physmem: Fix migration dirty bitmap coherency
with TCG memory access") which is not ppc specific but was seen to hit
ppc64 quite easily.
Let's enable it again.
The s390x problem has been identified so mention it while we are
adjusting the comment.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The spapr QEMU machine defaults is useful outside libqos, so create a
new header for ppc specific qtests and move it there.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
* Make qtests more flexible with regards to non-available CPU models
* Improvements for the test-smp-parse unit test
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmZpoEoRHHRodXRoQHJl
ZGhhdC5jb20ACgkQLtnXdP5wLbVF6g/+JYTRKmaIduQIP9g2+NkM+qMTbjI9Ow47
8Vdj/ePMXNWOZsgMPkCUdisYeMZEC+XMcDN1xvwZXwLTMTJRacZCSFRpeN4P0m0W
6aQ28+tPNgx+B9Eh2kc4TpxbiqSH8u5u4GEN4Y07rcX/3YbYyjFgZD8orRu/nJ+H
0wV7Riq9csi1BkLxrgKaHocFSOl4eOga4OFi+u4wIn/xoW3MN0laxe4iuoQRMZPf
gJLPRhEija4lto8iIKNxJbTABB0wEcWRWtgqcbHxdatqh1lPTPBpWxmdD/v1LJn+
H/eO+oh05NQdlhw7+xfWF9PD+MpIePbZ28oNb3X3uURROTdcxpBAgpPipv07FsT4
LmU2nIBQ4FcpDOkhLnLmBmFBNO6uDCzuGzxFRhX1SIiGMABqTDOKynBQSgQI2iB0
5J47XUwHtnOoCvf4SRA/MZG8zNSQZdJbnuOBLgZ+vsCG14mWM2NbfSUwRkH6pd/J
fEbODuzHZoYgUTxjR9+WMbINAbNjMy+SP2sGZIBzcAIIkybKynOy58LoCyNT684U
ean9bnc65908PJxEfsQ6k9kNwkK4GwOqZi+X383nVgMJ9+3dDw8M76IVU59hsq1n
wnz4VgFcRdXMYhj9zghaCgH2Ezw8gZHILXH+RlX0Bav4LQ5vSZQ6tRNwM4+rfXBe
okF1Sxmz31U=
=s7+V
-----END PGP SIGNATURE-----
Merge tag 'pull-request-2024-06-12' of https://gitlab.com/thuth/qemu into staging
* Fix loongarch64 avocado test
* Make qtests more flexible with regards to non-available CPU models
* Improvements for the test-smp-parse unit test
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmZpoEoRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbVF6g/+JYTRKmaIduQIP9g2+NkM+qMTbjI9Ow47
# 8Vdj/ePMXNWOZsgMPkCUdisYeMZEC+XMcDN1xvwZXwLTMTJRacZCSFRpeN4P0m0W
# 6aQ28+tPNgx+B9Eh2kc4TpxbiqSH8u5u4GEN4Y07rcX/3YbYyjFgZD8orRu/nJ+H
# 0wV7Riq9csi1BkLxrgKaHocFSOl4eOga4OFi+u4wIn/xoW3MN0laxe4iuoQRMZPf
# gJLPRhEija4lto8iIKNxJbTABB0wEcWRWtgqcbHxdatqh1lPTPBpWxmdD/v1LJn+
# H/eO+oh05NQdlhw7+xfWF9PD+MpIePbZ28oNb3X3uURROTdcxpBAgpPipv07FsT4
# LmU2nIBQ4FcpDOkhLnLmBmFBNO6uDCzuGzxFRhX1SIiGMABqTDOKynBQSgQI2iB0
# 5J47XUwHtnOoCvf4SRA/MZG8zNSQZdJbnuOBLgZ+vsCG14mWM2NbfSUwRkH6pd/J
# fEbODuzHZoYgUTxjR9+WMbINAbNjMy+SP2sGZIBzcAIIkybKynOy58LoCyNT684U
# ean9bnc65908PJxEfsQ6k9kNwkK4GwOqZi+X383nVgMJ9+3dDw8M76IVU59hsq1n
# wnz4VgFcRdXMYhj9zghaCgH2Ezw8gZHILXH+RlX0Bav4LQ5vSZQ6tRNwM4+rfXBe
# okF1Sxmz31U=
# =s7+V
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 12 Jun 2024 06:19:06 AM PDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
* tag 'pull-request-2024-06-12' of https://gitlab.com/thuth/qemu:
tests/tcg/s390x: Allow specifying extra QEMU options on the command line
tests/unit/test-smp-parse: Test the full 8-levels topology hierarchy
tests/unit/test-smp-parse: Test "modules" and "dies" combination case
tests/unit/test-smp-parse: Test "modules" parameter in -smp
tests/unit/test-smp-parse: Make test cases aware of module level
tests/unit/test-smp-parse: Use default parameters=0 when not set in -smp
tests/unit/test-smp-parse: Fix an invalid topology case
tests/unit/test-smp-parse: Fix comment of parameters=1 case
tests/unit/test-smp-parse: Fix comments of drawers and books case
test: Remove libibumad dependence
meson: Remove libibumad dependence
tests/qtest/x86: check for availability of older cpu models before running tests
tests/qtest/libqtest: add qtest_has_cpu_model() api
qtest/x86/numa-test: do not use the obsolete 'pentium' cpu
tests/avocado: Update LoongArch bios file
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The use case for this is `make check-tcg EXTFLAGS="-accel kvm"`,
which allows validating the system TCG testcases on real hardware.
EXTFLAGS name is borrowed from tests/tcg/xtensa/Makefile.softmmu-target.
While at it, use += instead of = in order to be consistent with the
other architectures.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240522184116.35975-1-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
With module level, QEMU now support 8-levels topology hierarchy.
Cover "modules" in SMP_CONFIG_WITH_FULL_TOPO related cases.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-9-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since i386 PC machine supports both "modules" and "dies" in -smp, add the
"modules" and "dies" combination test case to match the actual topology
usage scenario.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-8-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Cover the module cases in test-smp-parse.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-7-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Currently, -smp supports module level.
It is necessary to consider the effects of module in the test cases to
ensure that the calculations are correct. This is also the preparation
to add module test cases.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-6-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Since -smp allows parameters=1 whether the level is supported by
machine, to avoid the test scenarios where the parameter defaults to 1
cause some errors to be masked, explicitly set undesired parameters to
0.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-5-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Adjust the "cpus" parameter to match the comment configuration.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-4-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
SMP_CONFIG_WITH_FULL_TOPO hasn't support module level, so the parameter
should indicate the "clusters".
Additionally, reorder the parameters of -smp to match the topology
hierarchy order.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-3-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Fix the comments to match the actual configurations.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Message-ID: <20240529061925.350323-2-zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Remove libibumad dependence from the test environment.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240611105427.61395-3-pizhenwei@bytedance.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
RDMA based migration has no dependence on libumad. libibverbs and
librdmacm are enough.
libumad was used by rdmacm-mux which has been already removed. It's
remained mistakenly.
Fixes: 1dfd42c426 ("hw/rdma: Remove deprecated pvrdma device and rdmacm-mux helper")
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240611105427.61395-2-pizhenwei@bytedance.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
It is better to check if some older cpu models like 486, athlon, pentium,
penryn, phenom, core2duo etc are available before running their corresponding
tests. Some downstream distributions may no longer support these older cpu
models.
Signature of add_feature_test() has been modified to return void as
FeatureTestArgs* was not used by the caller.
One minor correction. Replaced 'phenom' with '486' in the test
'x86/cpuid/auto-level/phenom/arat' matching the cpu used.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240610155303.7933-4-anisinha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Added a new test api qtest_has_cpu_model() in order to check availability of
some cpu models in the current QEMU binary. The specific architecture of the
QEMU binary is selected using the QTEST_QEMU_BINARY environment variable.
This api would be useful to run tests against some older cpu models after
checking if QEMU actually supported these models.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240610155303.7933-3-anisinha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
'pentium' cpu is old and obsolete and should be avoided for running tests if
its not strictly needed. Use 'max' cpu instead for generic non-cpu specific
numa test.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-ID: <20240610155303.7933-2-anisinha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The VM uses old bios to boot up only 1 cpu, causing the test case to fail.
Update the bios to solve this problem.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Message-ID: <20240604030058.2327145-1-gaosong@loongson.cn>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This is a bit more generic, as it can be applied to MPX as well.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Just like X86_ENTRYr, X86_ENTRYwr is easily changed to use only T0.
In this case, the motivation is to use it for the MOV instruction
family. The case when you need to preserve the input value is the
odd one, as it is used basically only for BLS* instructions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
I am not sure why I made it use T1. It is a bit more symmetric with
respect to X86_ENTRYwr (which uses T0 for the "w"ritten operand
and T1 for the "r"ead operand), but it is also less flexible because it
does not let you apply zextT0/sextT0.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This makes for easier cpu_cc_* setup, and not using set_cc_op()
should come in handy if QEMU ever implements APX.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoid using set_cc_op() in preparation for implementing APX; treat
CC_OP_EFLAGS similar to the case where we have the "opposite" cc_op
(CC_OP_ADOX for ADCX and CC_OP_ADCX for ADOX), except the resulting
cc_op is not CC_OP_ADCOX. This is written easily as two "if"s, whose
conditions are both false for CC_OP_EFLAGS, both true for CC_OP_ADCOX,
and one each true for CC_OP_ADCX/ADOX.
The new logic also makes it easy to drop usage of tmp0.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUX86State argument would only be used to fetch bytes, but that has to be
done before the generator function is called. So remove it, and all
temptation together with it.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fixes Coverity CID 1546885.
Fixes: 16dcf200dc ("i386/sev: Introduce "sev-common" type to encapsulate common SEV state")
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240607183611.1111100-4-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Set 'finish->id_block_en' early, so that it is properly reset.
Fixes coverity CID 1546887.
Fixes: 7b34df4426 ("i386/sev: Introduce 'sev-snp-guest' object")
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240607183611.1111100-2-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When QEMU is started with:
-cpu host,host-cache-info=on,l3-cache=off \
-smp 2,sockets=1,dies=1,cores=1,threads=2
Guest can't acquire maximum number of addressable IDs for processor cores in
the physical package from CPUID[04H].
When creating a CPU topology of 1 core per package, host-cache-info only
uses the Host's addressable core IDs field (CPUID.04H.EAX[bits 31-26]),
resulting in a conflict (on the multicore Host) between the Guest core
topology information in this field and the Guest's actual cores number.
Fix it by removing the unnecessary condition to cover 1 core per package
case. This is safe because cores_per_pkg will not be 0 and will be at
least 1.
Fixes: d7caf13b5f ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240611032314.64076-1-xuchuangxclwt@bytedance.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Events aren't designed to be multi-lines. Multiple events
can be used instead. Prevent that format using multi-lines
by forbidding the newline character.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Mads Ynddal <mads@ynddal.dk>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240606103943.79116-6-philmd@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Trace events aren't designed to be multi-lines.
Remove the newline characters.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Mads Ynddal <mads@ynddal.dk>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240606103943.79116-5-philmd@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Trace events aren't designed to be multi-lines.
Remove the newline characters.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Mads Ynddal <mads@ynddal.dk>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240606103943.79116-4-philmd@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Trace events aren't designed to be multi-lines. Remove
the newline character which doesn't bring much value.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Mads Ynddal <mads@ynddal.dk>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240606103943.79116-3-philmd@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Split the 'tpm_util_show_buffer' event in two to avoid
using a newline character.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Mads Ynddal <mads@ynddal.dk>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-id: 20240606103943.79116-2-philmd@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
For VMs configured with the USB CDROM device:
-drive file=/path/to/local/file,id=drive-usb-disk0,media=cdrom,readonly=on...
-device usb-storage,drive=drive-usb-disk0,id=usb-disk0...
QEMU process may crash after live migration, to reproduce the issue,
configure VM (Guest OS ubuntu 20.04 or 21.10) with the following XML:
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<source file='/path/to/share_fs/cdrom.iso'/>
<target dev='sda' bus='usb'/>
<readonly/>
<address type='usb' bus='0' port='2'/>
</disk>
<controller type='usb' index='0' model='piix3-uhci'/>
Do the live migration repeatedly, crash may happen after live migratoin,
trace log at the source before live migration is as follows:
324808@1711972823.521945:usb_uhci_frame_start nr 319
324808@1711972823.521978:usb_uhci_qh_load qh 0x35cb5400
324808@1711972823.521989:usb_uhci_qh_load qh 0x35cb5480
324808@1711972823.521997:usb_uhci_td_load qh 0x35cb5480, td 0x35cbe000, ctrl 0x0, token 0xffe07f69
324808@1711972823.522010:usb_uhci_td_nextqh qh 0x35cb5480, td 0x35cbe000
324808@1711972823.522022:usb_uhci_qh_load qh 0x35cb5680
324808@1711972823.522030:usb_uhci_td_load qh 0x35cb5680, td 0x75ac5180, ctrl 0x19800000, token 0x3c903e1
324808@1711972823.522045:usb_uhci_packet_add token 0x103e1, td 0x75ac5180
324808@1711972823.522056:usb_packet_state_change bus 0, port 2, ep 2, packet 0x559f9ba14b00, state undef -> setup
324808@1711972823.522079:usb_msd_cmd_submit lun 0, tag 0x472, flags 0x00000080, len 10, data-len 8
324808@1711972823.522107:scsi_req_parsed target 0 lun 0 tag 1138 command 74 dir 1 length 8
324808@1711972823.522124:scsi_req_parsed_lba target 0 lun 0 tag 1138 command 74 lba 4096
324808@1711972823.522139:scsi_req_alloc target 0 lun 0 tag 1138
324808@1711972823.522169:scsi_req_continue target 0 lun 0 tag 1138
324808@1711972823.522181:scsi_req_data target 0 lun 0 tag 1138 len 8
324808@1711972823.522194:usb_packet_state_change bus 0, port 2, ep 2, packet 0x559f9ba14b00, state setup -> complete
324808@1711972823.522209:usb_uhci_packet_complete_success token 0x103e1, td 0x75ac5180
324808@1711972823.522219:usb_uhci_packet_del token 0x103e1, td 0x75ac5180
324808@1711972823.522232:usb_uhci_td_complete qh 0x35cb5680, td 0x75ac5180
trace log at the destination after live migration is as follows:
3286206@1711972823.951646:usb_uhci_frame_start nr 320
3286206@1711972823.951663:usb_uhci_qh_load qh 0x35cb5100
3286206@1711972823.951671:usb_uhci_qh_load qh 0x35cb5480
3286206@1711972823.951680:usb_uhci_td_load qh 0x35cb5480, td 0x35cbe000, ctrl 0x1000000, token 0xffe07f69
3286206@1711972823.951693:usb_uhci_td_nextqh qh 0x35cb5480, td 0x35cbe000
3286206@1711972823.951702:usb_uhci_qh_load qh 0x35cb5700
3286206@1711972823.951709:usb_uhci_td_load qh 0x35cb5700, td 0x75ac5240, ctrl 0x39800000, token 0xe08369
3286206@1711972823.951727:usb_uhci_queue_add token 0x8369
3286206@1711972823.951735:usb_uhci_packet_add token 0x8369, td 0x75ac5240
3286206@1711972823.951746:usb_packet_state_change bus 0, port 2, ep 1, packet 0x56066b2fb5a0, state undef -> setup
3286206@1711972823.951766:usb_msd_data_in 8/8 (scsi 8)
2024-04-01 12:00:24.665+0000: shutting down, reason=crashed
The backtrace reveals the following:
Program terminated with signal SIGSEGV, Segmentation fault.
0 __memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
312 movq -8(%rsi,%rdx), %rcx
[Current thread is 1 (Thread 0x7f0a9025fc00 (LWP 3286206))]
(gdb) bt
0 __memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
1 memcpy (__len=8, __src=<optimized out>, __dest=<optimized out>) at /usr/include/bits/string_fortified.h:34
2 iov_from_buf_full (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, buf=0x0, bytes=bytes@entry=8) at ../util/iov.c:33
3 iov_from_buf (bytes=8, buf=<optimized out>, offset=<optimized out>, iov_cnt=<optimized out>, iov=<optimized out>)
at /usr/src/debug/qemu-6-6.2.0-75.7.oe1.smartx.git.40.x86_64/include/qemu/iov.h:49
4 usb_packet_copy (p=p@entry=0x56066b2fb5a0, ptr=<optimized out>, bytes=bytes@entry=8) at ../hw/usb/core.c:636
5 usb_msd_copy_data (s=s@entry=0x56066c62c770, p=p@entry=0x56066b2fb5a0) at ../hw/usb/dev-storage.c:186
6 usb_msd_handle_data (dev=0x56066c62c770, p=0x56066b2fb5a0) at ../hw/usb/dev-storage.c:496
7 usb_handle_packet (dev=0x56066c62c770, p=p@entry=0x56066b2fb5a0) at ../hw/usb/core.c:455
8 uhci_handle_td (s=s@entry=0x56066bd5f210, q=0x56066bb7fbd0, q@entry=0x0, qh_addr=qh_addr@entry=902518530, td=td@entry=0x7fffe6e788f0, td_addr=<optimized out>,
int_mask=int_mask@entry=0x7fffe6e788e4) at ../hw/usb/hcd-uhci.c:885
9 uhci_process_frame (s=s@entry=0x56066bd5f210) at ../hw/usb/hcd-uhci.c:1061
10 uhci_frame_timer (opaque=opaque@entry=0x56066bd5f210) at ../hw/usb/hcd-uhci.c:1159
11 timerlist_run_timers (timer_list=0x56066af26bd0) at ../util/qemu-timer.c:642
12 qemu_clock_run_timers (type=QEMU_CLOCK_VIRTUAL) at ../util/qemu-timer.c:656
13 qemu_clock_run_all_timers () at ../util/qemu-timer.c:738
14 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:542
15 qemu_main_loop () at ../softmmu/runstate.c:739
16 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:52
(gdb) frame 5
(gdb) p ((SCSIDiskReq *)s->req)->iov
$1 = {iov_base = 0x0, iov_len = 0}
(gdb) p/x s->req->tag
$2 = 0x472
When designing the USB mass storage device model, QEMU places SCSI disk
device as the backend of USB mass storage device. In addition, USB mass
device driver in Guest OS conforms to the "Universal Serial Bus Mass
Storage Class Bulk-Only Transport" specification in order to simulate
the transform behavior between a USB controller and a USB mass device.
The following shows the protocol hierarchy:
+----------------+
CDROM driver | scsi command | CDROM
+----------------+
+-----------------------+
USB mass | USB Mass Storage Class| USB mass
storage driver | Bulk-Only Transport | storage device
+-----------------------+
+----------------+
USB Controller | USB Protocol | USB device
+----------------+
In the USB protocol layer, between the USB controller and USB device, at
least two USB packets will be transformed when guest OS send a
read operation to USB mass storage device:
1. The CBW packet, which will be delivered to the USB device's Bulk-Out
endpoint. In order to simulate a read operation, the USB mass storage
device parses the CBW and converts it to a SCSI command, which would be
executed by CDROM(represented as SCSI disk in QEMU internally), and store
the result data of the SCSI command in a buffer.
2. The DATA-IN packet, which will be delivered from the USB device's
Bulk-In endpoint(fetched directly from the preceding buffer) to the USB
controller.
We consider UHCI to be the controller. The two packets mentioned above may
have been processed by UHCI in two separate frame entries of the Frame List
, and also described by two different TDs. Unlike the physical environment,
a virtualized environment requires the QEMU to make sure that the result
data of CBW is not lost and is delivered to the UHCI controller.
Currently, these types of SCSI requests are not migrated, so QEMU cannot
ensure the result data of the IO operation is not lost if there are
inflight emulated SCSI requests during the live migration.
Assume for the moment that the USB mass storage device is processing the
CBW and storing the result data of the read operation to a buffre, live
migration happens and moves the VM to the destination while not migrating
the result data of the read operation.
After migration, when UHCI at the destination issues a DATA-IN request to
the USB mass storage device, a crash happens because USB mass storage device
fetches the result data and get nothing.
The scenario this patch addresses is this one.
Theoretically, any device that uses the SCSI disk as a back-end would be
affected by this issue. In this case, it is the USB CDROM.
To fix it, inflight emulated SCSI request be migrated during live migration,
similar to the DMA SCSI request.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Message-ID: <878c8f093f3fc2f584b5c31cb2490d9f6a12131a.1716531409.git.yong.huang@smartx.com>
[Do not bump migration version, introduce compat property instead. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
vcpu.py is pointless since commit 89aafcf2a7 ("trace:
remove code that depends on setting vcpu"), remote it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-id: 20240606102631.78152-1-philmd@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The n_threads argument is no longer used since the previous commit.
Remove it.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240527155851.892885-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Ciphers are pre-allocated by qcrypto_block_init_cipher() depending on
the given number of threads. The -device
virtio-blk-pci,iothread-vq-mapping= feature allows users to assign
multiple IOThreads to a virtio-blk device, but the association between
the virtio-blk device and the block driver happens after the block
driver is already open.
When the number of threads given to qcrypto_block_init_cipher() is
smaller than the actual number of threads at runtime, the
block->n_free_ciphers > 0 assertion in qcrypto_block_pop_cipher() can
fail.
Get rid of qcrypto_block_init_cipher() n_thread's argument and allocate
ciphers on demand.
Reported-by: Qing Wang <qinwang@redhat.com>
Buglink: https://issues.redhat.com/browse/RHEL-36159
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240527155851.892885-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Libaio defines IO_CMD_FDSYNC command to sync all outstanding
asynchronous I/O operations, by flushing out file data to the
disk storage. Enable linux-aio to submit such aio request.
When using aio=native without fdsync() support, QEMU creates
pthreads, and destroying these pthreads results in TLB flushes.
In a real-time guest environment, TLB flushes cause a latency
spike. This patch helps to avoid such spikes.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Message-ID: <20240425070412.37248-1-ppandit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>