Commit Graph

115447 Commits

Author SHA1 Message Date
Thomas Huth
624fb343df tests/functional: Convert the microblaze avocado tests into standalone tests
The machine_microblaze.py file contained two tests, one for each
endianness. Since we only support one QEMU target binary per file
in the new functional test environment, we have to split this file
up into two files now.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-23-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 11:14:33 +02:00
Thomas Huth
be849ef715 tests/functional: Convert the x86_cpu_model_versions test
Nothing thrilling in here, it's just a straight forward conversion.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-22-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 11:14:33 +02:00
Thomas Huth
e2e9fd256e tests/functional: Convert the s390x avocado tests into standalone tests
These tests use archive.lzma_uncompress() from the Avocado utils,
so provide a small helper function for this, based on the
standard lzma module from Python instead.

And while we're at it, replace the MD5 hashes in the topology test
with proper SHA256 hashes, since MD5 should not be used anymore
nowadays.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-21-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 11:14:33 +02:00
Thomas Huth
e3fc99b164 tests/functional: Convert some avocado tests that needed avocado.utils.archive
Instead of using the "archive" module from avocado.utils, switch
these tests to use the new wrapper function that is based on the
"tarfile" module instead.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-20-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 11:14:29 +02:00
Thomas Huth
850a1951b7 tests/functional: Add a function for extracting files from an archive
Some Avocado-based tests use the "archive" module from avocado.utils
to extract files from an archive. To be able to use these tests
without Avocado, we have to provide our own function for extracting
files. Fortunately, there is already the tarfile module that will
provide us with this functionality, so let's just add a nice wrapper
function around that.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240830133841.142644-19-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 11:13:48 +02:00
Thomas Huth
4c0a2df81c tests/functional: Convert some tests that download files via fetch_asset()
Now that we've got the Asset class with pre-caching, we can convert
some Avocado tests that use fetch_asset() for downloading their
required files.

Message-ID: <20240830133841.142644-18-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 11:11:31 +02:00
Thomas Huth
34b17c0a65 tests/functional: Allow asset downloading with concurrent threads
When running "make -j$(nproc) check-functional", tests that use the
same asset might be running in parallel. Improve the downloading to
detect this situation and wait for the other thread to finish the
download.

Message-ID: <20240830133841.142644-17-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Daniel P. Berrangé
f57213f85b tests/functional: enable pre-emptive caching of assets
Many tests need to access assets stored on remote sites. We don't want
to download these during test execution when run by meson, since this
risks hitting test timeouts when data transfers are slow.

Add support for pre-emptive caching of assets by setting the env var
QEMU_TEST_PRECACHE to point to a timestamp file. When this is set,
instead of running the test, the assets will be downloaded and saved
to the cache, then the timestamp file created.

A meson custom target is created as a dependency of each test suite
to trigger the pre-emptive caching logic before the test runs.

When run in caching mode, it will locate assets by looking for class
level variables with a name prefix "ASSET_", and type "Asset".

At the ninja level

   ninja test --suite functional

will speculatively download any assets that are not already cached,
so it is advisable to set a timeout multiplier.

   QEMU_TEST_NO_DOWNLOAD=1 ninja test --suite functional

will fail the test if a required asset is not already cached

   ninja precache-functional

will download and cache all assets required by the functional
tests

At the make level, precaching is always done by

   make check-functional

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Richard Henderson <richard.henderson@linaro.org>
[thuth: Remove the duplicated "path = os.path.basename(...)" line]
Message-ID: <20240830133841.142644-16-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Daniel P. Berrangé
9903217a4e tests/functional: add a module for handling asset download & caching
The 'Asset' class is a simple module that declares a downloadable
asset that can be cached locally. Downloads are stored in the user's
home dir at ~/.cache/qemu/download, using a sha256 sum of the URL.

[thuth: Drop sha1 support, use hash on file content for naming instead of URL,
        add the possibility to specify the cache dir via environment variable]

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-15-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
eeba3d7365 tests/functional: Convert avocado tests that just need a small adjustment
These simple tests can be converted to stand-alone tests quite easily,
e.g. by just setting the machine to 'none' now manually or by adding
"-cpu" command line parameters, since we don't support the corresponding
avocado tags in the new python test framework.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-14-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
cce85725f1 tests/functional: Convert simple avocado tests into standalone python tests
These test are rather simple and don't need any modifications apart
from adjusting the "from avocado_qemu" line. To ease debugging, make
the files executable and add a shebang line and Python '__main__'
handling, too, so that these tests can now be run by executing them
directly.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-13-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
1497377857 tests/functional: Prepare the meson build system for the functional tests
Provide a meson.build file for the upcoming python-based functional
tests, and add some wrapper glue targets to the tests/Makefile.include
file. We are going to use two "speed" modes for the functional tests:
The "quick" tests can be run at any time (i.e. also during "make check"),
while the "thorough" tests should only be run when running a
"make check-functional" test run (since these tests might download
additional assets from the internet).

The changes to the meson.build files are partly based on an earlier
patch by Ani Sinha.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-12-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
cf1f31089c tests/Makefile.include: Increase the level of indentation in the help text
The next patch is going to add some entries that need more space between
the command and the help text, so let's increase the indentation here
first.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-ID: <20240830133841.142644-11-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
84e4a27fed tests/functional: Set up logging
Create log files for each test separately, one file that contains
the basic logging and one that contains the console output.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240830133841.142644-10-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
fa32a63432 tests/functional: Add base classes for the upcoming pytest-based tests
The files are mostly a copy of the tests/avocado/avocado_qemu/__init__.py
file with some adjustments to get rid of the Avocado dependencies (i.e.
we also have to drop the LinuxSSHMixIn and LinuxTest for now).

The emulator binary and build directory are now passed via
environment variables that will be set via meson.build later.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240830133841.142644-9-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
5ec1eec110 python: Install pycotap in our venv if necessary
The upcoming functional tests will require pycotap for providing
TAP output from the python-based tests. Since we want to be able
to run some of the tests offline by default, too, let's install
it along with meson in our venv if necessary (it's size is only
5 kB, so adding the wheel here should not really be a problem).

The wheel file has been obtained with:

 pip download --only-binary :all: --dest . --no-cache pycotap

Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240830133841.142644-8-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
b5347978a9 tests/avocado/boot_linux_console: Remove the s390x subtest
We've got a much more sophisticated, Fedora-based test for s390x
("test_s390x_fedora" in another file) already, so the test in
boot_linux_console.py seems to be rather a waste of precious test
cycles. Thus move the command line check and delete the s390x
test in boot_linux_console.py.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240830133841.142644-7-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
c67cb553f1 tests/avocado/avocado_qemu: Fix the "from" statements in linuxtest.py
Without this change, the new Avocado v103 fails to find the tests
that are based on the LinuxTest class.

Suggested-by: Cleber Rosa <crosa@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-6-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Cleber Rosa
657136c653 Bump avocado to 103.0
This bumps Avocado to latest the LTS release.

An LTS release is one that can receive bugfixes and guarantees
stability for a much longer period and has incremental minor releases
made.

Even though the 103.0 LTS release is pretty a rewrite of Avocado when
compared to 88.1, the behavior of all existing tests under
tests/avocado has been extensively tested no regression in behavior
was found.

To keep behavior of jobs as close as possible with previous version,
this version bump keeps the execution serial (maximum of one task at a
time being run).

Reference: https://avocado-framework.readthedocs.io/en/103.0/releases/lts/103_0.html
Signed-off-by: Cleber Rosa <crosa@redhat.com>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-ID: <20240806173119.582857-2-crosa@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-5-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:14 +02:00
Cleber Rosa
a14841264e tests/avocado/machine_aarch64_sbsaref.py: allow for rw usage of image
When the OpenBSD based tests are run in parallel, the previously
single instance of the image would become corrupt.  Let's give each
test its own snapshot.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Message-ID: <20240806173119.582857-9-crosa@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-ID: <20240830133841.142644-4-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:51:46 +02:00
Cleber Rosa
8dceb48e23 tests/avocado/boot_xen.py: fetch kernel during test setUp()
The kernel is a common blob used in all tests.  By moving it to the
setUp() method, the "fetch asset" plugin will recognize the kernel and
attempt to fetch it and cache it before the tests are started.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Message-ID: <20240806173119.582857-7-crosa@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-ID: <20240830133841.142644-3-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:51:27 +02:00
Cleber Rosa
7e3dca5bca tests/avocado: machine aarch64: standardize location and RO access
The tests under machine_aarch64_virt.py and machine_aarch64_sbsaref.py
should not be writing to the ISO files.  By adding "media=cdrom" the
"ro" is automatically set.

While at it, let's use a single code style and hash for the ISO url.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-ID: <20240806173119.582857-5-crosa@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-2-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:51:19 +02:00
Helge Deller
d33d3adb57 target/hppa: Fix random 32-bit linux-user crashes
The linux-user hppa target crashes randomly for me since commit
081a0ed188 ("target/hppa: Do not mask in copy_iaoq_entry").

That commit dropped the masking of the IAOQ addresses while copying them
from other registers and instead keeps them with all 64 bits up until
the full gva is formed with the help of hppa_form_gva_psw().

So, when running in linux-user mode on an emulated 64-bit CPU, we need
to mask to a 32-bit address space at the very end in hppa_form_gva_psw()
if the PSW-W flag isn't set (which is the case for linux-user on hppa).

Fixes: 081a0ed188 ("target/hppa: Do not mask in copy_iaoq_entry")
Cc: qemu-stable@nongnu.org # v9.1+
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2024-09-03 22:08:22 +02:00
Helge Deller
ead5078cf1 target/hppa: Fix PSW V-bit packaging in cpu_hppa_get for hppa64
While adding hppa64 support, the psw_v variable got extended from 32 to 64
bits.  So, when packaging the PSW-V bit from the psw_v variable for interrupt
processing, check bit 31 instead the 63th (sign) bit.

This fixes a hard to find Linux kernel boot issue where the loss of the PSW-V
bit due to an ITLB interruption in the middle of a series of ds/addc
instructions (from the divU milicode library) generated the wrong division
result and thus triggered a Linux kernel crash.

Link: https://lore.kernel.org/lkml/718b8afe-222f-4b3a-96d3-93af0e4ceff1@roeck-us.net/
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Fixes: 931adff314 ("target/hppa: Update cpu_hppa_get/put_psw for hppa64")
Cc: qemu-stable@nongnu.org # v8.2+
2024-09-03 22:08:22 +02:00
Thomas Huth
d41c9896f4 tests/qtest/migration: Add a check for the availability of the "pc" machine
The test_vcpu_dirty_limit is the only test that does not check for the
availability of the machine before starting the test, so it fails when
QEMU has been configured with --without-default-devices. Add a check for
the "pc" machine type to fix it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:37 -03:00
Arman Nabiev
203beb6f04 target/ppc: Fix migration of CPUs with TLB_EMB TLB type
In vmstate_tlbemb a cut-and-paste error meant we gave
this vmstate subsection the same "cpu/tlb6xx" name as
the vmstate_tlb6xx subsection. This breaks migration load
for any CPU using the TLB_EMB CPU type, because when we
see the "tlb6xx" name in the incoming data we try to
interpret it as a vmstate_tlb6xx subsection, which it
isn't the right format for:

 $ qemu-system-ppc -drive
 if=none,format=qcow2,file=/home/petmay01/test-images/virt/dummy.qcow2
 -monitor stdio -M bamboo
 QEMU 9.0.92 monitor - type 'help' for more information
 (qemu) savevm foo
 (qemu) loadvm foo
 Missing section footer for cpu
 Error: Error -22 while loading VM state

Correct the incorrect vmstate section name. Since migration
for these CPU types was completely broken before, we don't
need to care that this is a migration compatibility break.

This affects the PPC 405, 440, 460 and e200 CPU families.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2522
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Arman Nabiev <nabiev.arman13@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:37 -03:00
Fabiano Rosas
62e1af13bb migration/multifd: Add documentation for multifd methods
Add documentation clarifying the usage of the multifd methods. The
general idea is that the client code calls into multifd to trigger
send/recv of data and multifd then calls these hooks back from the
worker threads at opportune moments so the client can process a
portion of the data.

Suggested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:37 -03:00
Fabiano Rosas
90e0eeb99b migration/multifd: Add a couple of asserts for p->iov
Check that p->iov is indeed always allocated and freed by the
MultiFDMethods hooks.

Suggested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:37 -03:00
Fabiano Rosas
405e352d28 migration/multifd: Fix p->iov leak in multifd-uadk.c
The send_cleanup() hook should free the p->iov that was allocated at
send_setup(). This was missed because the UADK code is conditional on
the presence of the accelerator, so it's not tested by default.

Fixes: 819dd20636 ("migration/multifd: Add UADK initialization")
Reported-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:36 -03:00
Fabiano Rosas
81b0ed8ad8 migration/multifd: Stop changing the packet on recv side
As observed by Philippe, the multifd_ram_unfill_packet() function
currently leaves the MultiFDPacket structure with mixed
endianness. This is harmless, but ultimately not very clean.

Stop touching the received packet and do the necessary work using
stack variables instead.

While here tweak the error strings and fix the space before
semicolons. Also remove the "100 times bigger" comment because it's
just one possible explanation for a size mismatch and it doesn't even
match the code.

CC: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:36 -03:00
Fabiano Rosas
308d165c77 migration/multifd: Make MultiFDMethods const
The methods are defined at module_init time and don't ever
change. Make them const.

Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:36 -03:00
Fabiano Rosas
40c9471e40 migration/multifd: Move nocomp code into multifd-nocomp.c
In preparation for adding new payload types to multifd, move most of
the no-compression code into multifd-nocomp.c. Let's try to keep a
semblance of layering by not mixing general multifd control flow with
the details of transmitting pages of ram.

There are still some pieces leftover, namely the p->normal, p->zero,
etc variables that we use for zero page tracking and the packet
allocation which is heavily dependent on the ram code.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:36 -03:00
Fabiano Rosas
dc6327d99c migration/multifd: Register nocomp ops dynamically
Prior to moving the ram code into multifd-nocomp.c, change the code to
register the nocomp ops dynamically so we don't need to have the ops
structure defined in multifd.c.

While here, move the ops struct initialization to the end of the file
to make the next diff cleaner.

Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:36 -03:00
Fabiano Rosas
6f848dac4a migration/multifd: Standardize on multifd ops names
Add the multifd_ prefix to all functions and remove the useless
docstrings.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:36 -03:00
Fabiano Rosas
a0c78d815c migration/multifd: Allow multifd sync without flush
Separate the multifd sync from flushing the client data to the
channels. These two operations are closely related but not strictly
necessary to be executed together.

The multifd sync is intrinsic to how multifd works. The multiple
channels operate independently and may finish IO out of order in
relation to each other. This applies also between the source and
destination QEMU.

Flushing the data that is left in the client-owned data structures
(e.g. MultiFDPages_t) prior to sync is usually the right thing to do,
but that is particular to how the ram migration is implemented with
several passes over dirty data.

Make these two routines separate, allowing future code to call the
sync by itself if needed. This also allows the usage of
multifd_ram_send to be isolated to ram code.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:36 -03:00
Fabiano Rosas
a71ef5c7f3 migration/multifd: Replace multifd_send_state->pages with client data
Multifd currently has a simple scheduling mechanism that distributes
work to the various channels by keeping storage space within each
channel and an extra space that is given to the client. Each time the
client fills the space with data and calls into multifd, that space is
given to the next idle channel and a free storage space is taken from
the channel and given to client for the next iteration.

This means we always need (#multifd_channels + 1) memory slots to
operate multifd.

This is fine, except that the presence of this one extra memory slot
doesn't allow different types of payloads to be processed at the same
time in different channels, i.e. the data type of
multifd_send_state->pages needs to be the same as p->pages.

For each new data type different from MultiFDPage_t that is to be
handled, this logic would need to be duplicated by adding new fields
to multifd_send_state, to the channels and to multifd_send_pages().

Fix this situation by moving the extra slot into the client and using
only the generic type MultiFDSendData in the multifd core.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
d7e58f412c migration/multifd: Don't send ram data during SYNC
Skip saving and loading any ram data in the packet in the case of a
SYNC. This fixes a shortcoming of the current code which requires a
reset of the MultiFDPages_t fields right after the previous
pending_job finishes, otherwise the very next job might be a SYNC and
multifd_send_fill_packet() will put the stale values in the packet.

By not calling multifd_ram_fill_packet(), we can stop resetting
MultiFDPages_t in the multifd core and leave that to the client code.

Actually moving the reset function is not yet done because
pages->num==0 is used by the client code to determine whether the
MultiFDPages_t needs to be flushed. The subsequent patches will
replace that with a generic flag that is not dependent on
MultiFDPages_t.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
87bb9e953e migration/multifd: Isolate ram pages packet data
While we cannot yet disentangle the multifd packet from page data, we
can make the code a bit cleaner by setting the page-related fields in
a separate function.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
96d396bf50 migration/multifd: Remove total pages tracing
The total_normal_pages and total_zero_pages elements are used only for
the end tracepoints of the multifd threads. These are not super useful
since they record per-channel numbers and are just the sum of all the
pages that are transmitted per-packet, for which we already have
tracepoints. Remove the totals from the tracing.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
5aff71767c migration/multifd: Move pages accounting into multifd_send_zero_page_detect()
All references to pages are being removed from the multifd worker
threads in order to allow multifd to deal with different payload
types.

multifd_send_zero_page_detect() is called by all multifd migration
paths that deal with pages and is the last spot where zero pages and
normal page amounts are adjusted. Move the pages accounting into that
function.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
9f0e108901 migration/multifd: Replace p->pages with an union pointer
We want multifd to be able to handle more types of data than just ram
pages. To start decoupling multifd from pages, replace p->pages
(MultiFDPages_t) with the new type MultiFDSendData that hides the
client payload inside an union.

The general idea here is to isolate functions that *need* to handle
MultiFDPages_t and move them in the future to multifd-ram.c, while
multifd.c will stay with only the core functions that handle
MultiFDSendData/MultiFDRecvData.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
0e427da096 migration/multifd: Make MultiFDPages_t:offset a flexible array member
We're about to use MultiFDPages_t from inside the MultiFDSendData
payload union, which means we cannot have pointers to allocated data
inside the pages structure, otherwise we'd lose the reference to that
memory once another payload type touches the union. Move the offset
array into the end of the structure and turn it into a flexible array
member, so it is allocated along with the rest of MultiFDSendData in
the next patches.

Note that other pointers, such as the ramblock pointer are still fine
as long as the storage for them is not owned by the migration code and
can be correctly released at some point.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
addd7d1581 migration/multifd: Introduce MultiFDSendData
Add a new data structure to replace p->pages in the multifd
channel. This new structure will hide the multifd payload type behind
an union, so we don't need to add a new field to the channel each time
we want to handle a different data type.

This also allow us to keep multifd_send_pages() as is, without needing
to complicate the pointer switching.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
112f7d1b75 migration/multifd: Pass in MultiFDPages_t to file_write_ramblock_iov
We want to stop dereferencing 'pages' so it can be replaced by an
opaque pointer in the next patches.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
171056ec91 migration/multifd: Remove pages->allocated
This value never changes and is always the same as page_count. We
don't need a copy of it per-channel plus one in the extra slot. Remove
it.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:34 -03:00
Fabiano Rosas
90fa121c6c migration/multifd: Inline page_size and page_count
The MultiFD*Params structures are for per-channel data. Constant
values should not be there because that needlessly wastes cycles and
storage. The page_size and page_count fall into this category so move
them inline in multifd.h.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:34 -03:00
Fabiano Rosas
bc112a6c90 migration/multifd: Reduce access to p->pages
I'm about to replace the p->pages pointer with an opaque pointer, so
do a cleanup now to reduce direct accesses to p->page, which makes the
next diffs cleaner.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:34 -03:00
Peter Maydell
854f67fa38 tests/qtest/migration-test: Don't leak QTestState in test_multifd_tcp_cancel()
In test_multifd_tcp_cancel() we create three QEMU processes: 'from',
'to' and 'to2'.  We clean up (via qtest_quit()) 'from' and 'to2' when
we call test_migrate_end(), but never clean up 'to', which results in
this leak:

Direct leak of 336 byte(s) in 1 object(s) allocated from:
    #0 0x55e984fcd328 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f328) (BuildId: 710d409b68bb04427009e9ca6e1b63ff8af785d3)
    #1 0x7f0878b39c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
    #2 0x55e98503a172 in qtest_spawn_qemu tests/qtest/libqtest.c:397:21
    #3 0x55e98502bc4a in qtest_init_internal tests/qtest/libqtest.c:471:9
    #4 0x55e98502c5b7 in qtest_init_with_env tests/qtest/libqtest.c:533:21
    #5 0x55e9850eef0f in test_migrate_start tests/qtest/migration-test.c:857:11
    #6 0x55e9850eb01d in test_multifd_tcp_cancel tests/qtest/migration-test.c:3297:9
    #7 0x55e985103407 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5

Call qtest_quit() on 'to' to clean it up once it has exited.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:34 -03:00
Peter Maydell
78a053bc1b tests/qtest/migration-test: Don't strdup in get_dirty_rate()
We g_strdup() the "status" string we get out of the qdict in
get_dirty_rate(), but we never free it.  Since we only use this
string while the dictionary is still valid, we don't need to strdup
at all; drop the unnecessary call to avoid this leak:

Direct leak of 18 byte(s) in 2 object(s) allocated from:
    #0 0x564b3e01913e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f13e) (BuildId: d6403a811332fcc846f93c45e23abfd06d1e67c4)
    #1 0x7f2f278ff738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
    #2 0x7f2f27914583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
    #3 0x564b3e14bb5b in get_dirty_rate tests/qtest/migration-test.c:3447:14
    #4 0x564b3e138e00 in test_vcpu_dirty_limit tests/qtest/migration-test.c:3565:16
    #5 0x564b3e14f417 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:34 -03:00
Peter Maydell
6ed8c950b4 tests/qtest/migration-helpers: Don't dup argument to qdict_put_str()
In migrate_set_ports() we call qdict_put_str() with a value string
which we g_strdup(). However qdict_put_str() takes a copy of the
value string, it doesn't take ownership of it, so the g_strdup()
only results in a leak:

Direct leak of 6 byte(s) in 1 object(s) allocated from:
    #0 0x56298023713e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f13e) (BuildId: b2b9174a5a54707a7f76bca51cdc95d2aa08bac1)
    #1 0x7fba0ad39738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
    #2 0x7fba0ad4e583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
    #3 0x56298036b16e in migrate_set_ports tests/qtest/migration-helpers.c:145:49
    #4 0x56298036ad1c in migrate_qmp tests/qtest/migration-helpers.c:228:9
    #5 0x56298035b3dd in test_precopy_common tests/qtest/migration-test.c:1820:5
    #6 0x5629803549dc in test_multifd_tcp_channels_none tests/qtest/migration-test.c:3077:5
    #7 0x56298036d427 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5

Drop the unnecessary g_strdup() call.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:34 -03:00