Commit Graph

115438 Commits

Author SHA1 Message Date
Thomas Huth
eeba3d7365 tests/functional: Convert avocado tests that just need a small adjustment
These simple tests can be converted to stand-alone tests quite easily,
e.g. by just setting the machine to 'none' now manually or by adding
"-cpu" command line parameters, since we don't support the corresponding
avocado tags in the new python test framework.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-14-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
cce85725f1 tests/functional: Convert simple avocado tests into standalone python tests
These test are rather simple and don't need any modifications apart
from adjusting the "from avocado_qemu" line. To ease debugging, make
the files executable and add a shebang line and Python '__main__'
handling, too, so that these tests can now be run by executing them
directly.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-13-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
1497377857 tests/functional: Prepare the meson build system for the functional tests
Provide a meson.build file for the upcoming python-based functional
tests, and add some wrapper glue targets to the tests/Makefile.include
file. We are going to use two "speed" modes for the functional tests:
The "quick" tests can be run at any time (i.e. also during "make check"),
while the "thorough" tests should only be run when running a
"make check-functional" test run (since these tests might download
additional assets from the internet).

The changes to the meson.build files are partly based on an earlier
patch by Ani Sinha.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-12-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
cf1f31089c tests/Makefile.include: Increase the level of indentation in the help text
The next patch is going to add some entries that need more space between
the command and the help text, so let's increase the indentation here
first.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-ID: <20240830133841.142644-11-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
84e4a27fed tests/functional: Set up logging
Create log files for each test separately, one file that contains
the basic logging and one that contains the console output.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240830133841.142644-10-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
fa32a63432 tests/functional: Add base classes for the upcoming pytest-based tests
The files are mostly a copy of the tests/avocado/avocado_qemu/__init__.py
file with some adjustments to get rid of the Avocado dependencies (i.e.
we also have to drop the LinuxSSHMixIn and LinuxTest for now).

The emulator binary and build directory are now passed via
environment variables that will be set via meson.build later.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240830133841.142644-9-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
5ec1eec110 python: Install pycotap in our venv if necessary
The upcoming functional tests will require pycotap for providing
TAP output from the python-based tests. Since we want to be able
to run some of the tests offline by default, too, let's install
it along with meson in our venv if necessary (it's size is only
5 kB, so adding the wheel here should not really be a problem).

The wheel file has been obtained with:

 pip download --only-binary :all: --dest . --no-cache pycotap

Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240830133841.142644-8-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
b5347978a9 tests/avocado/boot_linux_console: Remove the s390x subtest
We've got a much more sophisticated, Fedora-based test for s390x
("test_s390x_fedora" in another file) already, so the test in
boot_linux_console.py seems to be rather a waste of precious test
cycles. Thus move the command line check and delete the s390x
test in boot_linux_console.py.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240830133841.142644-7-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Thomas Huth
c67cb553f1 tests/avocado/avocado_qemu: Fix the "from" statements in linuxtest.py
Without this change, the new Avocado v103 fails to find the tests
that are based on the LinuxTest class.

Suggested-by: Cleber Rosa <crosa@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-6-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:29 +02:00
Cleber Rosa
657136c653 Bump avocado to 103.0
This bumps Avocado to latest the LTS release.

An LTS release is one that can receive bugfixes and guarantees
stability for a much longer period and has incremental minor releases
made.

Even though the 103.0 LTS release is pretty a rewrite of Avocado when
compared to 88.1, the behavior of all existing tests under
tests/avocado has been extensively tested no regression in behavior
was found.

To keep behavior of jobs as close as possible with previous version,
this version bump keeps the execution serial (maximum of one task at a
time being run).

Reference: https://avocado-framework.readthedocs.io/en/103.0/releases/lts/103_0.html
Signed-off-by: Cleber Rosa <crosa@redhat.com>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-ID: <20240806173119.582857-2-crosa@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-5-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:52:14 +02:00
Cleber Rosa
a14841264e tests/avocado/machine_aarch64_sbsaref.py: allow for rw usage of image
When the OpenBSD based tests are run in parallel, the previously
single instance of the image would become corrupt.  Let's give each
test its own snapshot.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Message-ID: <20240806173119.582857-9-crosa@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-ID: <20240830133841.142644-4-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:51:46 +02:00
Cleber Rosa
8dceb48e23 tests/avocado/boot_xen.py: fetch kernel during test setUp()
The kernel is a common blob used in all tests.  By moving it to the
setUp() method, the "fetch asset" plugin will recognize the kernel and
attempt to fetch it and cache it before the tests are started.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Message-ID: <20240806173119.582857-7-crosa@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-ID: <20240830133841.142644-3-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:51:27 +02:00
Cleber Rosa
7e3dca5bca tests/avocado: machine aarch64: standardize location and RO access
The tests under machine_aarch64_virt.py and machine_aarch64_sbsaref.py
should not be writing to the ISO files.  By adding "media=cdrom" the
"ro" is automatically set.

While at it, let's use a single code style and hash for the ISO url.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
Tested-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-ID: <20240806173119.582857-5-crosa@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240830133841.142644-2-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-04 10:51:19 +02:00
Helge Deller
d33d3adb57 target/hppa: Fix random 32-bit linux-user crashes
The linux-user hppa target crashes randomly for me since commit
081a0ed188 ("target/hppa: Do not mask in copy_iaoq_entry").

That commit dropped the masking of the IAOQ addresses while copying them
from other registers and instead keeps them with all 64 bits up until
the full gva is formed with the help of hppa_form_gva_psw().

So, when running in linux-user mode on an emulated 64-bit CPU, we need
to mask to a 32-bit address space at the very end in hppa_form_gva_psw()
if the PSW-W flag isn't set (which is the case for linux-user on hppa).

Fixes: 081a0ed188 ("target/hppa: Do not mask in copy_iaoq_entry")
Cc: qemu-stable@nongnu.org # v9.1+
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2024-09-03 22:08:22 +02:00
Helge Deller
ead5078cf1 target/hppa: Fix PSW V-bit packaging in cpu_hppa_get for hppa64
While adding hppa64 support, the psw_v variable got extended from 32 to 64
bits.  So, when packaging the PSW-V bit from the psw_v variable for interrupt
processing, check bit 31 instead the 63th (sign) bit.

This fixes a hard to find Linux kernel boot issue where the loss of the PSW-V
bit due to an ITLB interruption in the middle of a series of ds/addc
instructions (from the divU milicode library) generated the wrong division
result and thus triggered a Linux kernel crash.

Link: https://lore.kernel.org/lkml/718b8afe-222f-4b3a-96d3-93af0e4ceff1@roeck-us.net/
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Fixes: 931adff314 ("target/hppa: Update cpu_hppa_get/put_psw for hppa64")
Cc: qemu-stable@nongnu.org # v8.2+
2024-09-03 22:08:22 +02:00
Thomas Huth
d41c9896f4 tests/qtest/migration: Add a check for the availability of the "pc" machine
The test_vcpu_dirty_limit is the only test that does not check for the
availability of the machine before starting the test, so it fails when
QEMU has been configured with --without-default-devices. Add a check for
the "pc" machine type to fix it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:37 -03:00
Arman Nabiev
203beb6f04 target/ppc: Fix migration of CPUs with TLB_EMB TLB type
In vmstate_tlbemb a cut-and-paste error meant we gave
this vmstate subsection the same "cpu/tlb6xx" name as
the vmstate_tlb6xx subsection. This breaks migration load
for any CPU using the TLB_EMB CPU type, because when we
see the "tlb6xx" name in the incoming data we try to
interpret it as a vmstate_tlb6xx subsection, which it
isn't the right format for:

 $ qemu-system-ppc -drive
 if=none,format=qcow2,file=/home/petmay01/test-images/virt/dummy.qcow2
 -monitor stdio -M bamboo
 QEMU 9.0.92 monitor - type 'help' for more information
 (qemu) savevm foo
 (qemu) loadvm foo
 Missing section footer for cpu
 Error: Error -22 while loading VM state

Correct the incorrect vmstate section name. Since migration
for these CPU types was completely broken before, we don't
need to care that this is a migration compatibility break.

This affects the PPC 405, 440, 460 and e200 CPU families.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2522
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Arman Nabiev <nabiev.arman13@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:37 -03:00
Fabiano Rosas
62e1af13bb migration/multifd: Add documentation for multifd methods
Add documentation clarifying the usage of the multifd methods. The
general idea is that the client code calls into multifd to trigger
send/recv of data and multifd then calls these hooks back from the
worker threads at opportune moments so the client can process a
portion of the data.

Suggested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:37 -03:00
Fabiano Rosas
90e0eeb99b migration/multifd: Add a couple of asserts for p->iov
Check that p->iov is indeed always allocated and freed by the
MultiFDMethods hooks.

Suggested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:37 -03:00
Fabiano Rosas
405e352d28 migration/multifd: Fix p->iov leak in multifd-uadk.c
The send_cleanup() hook should free the p->iov that was allocated at
send_setup(). This was missed because the UADK code is conditional on
the presence of the accelerator, so it's not tested by default.

Fixes: 819dd20636 ("migration/multifd: Add UADK initialization")
Reported-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:36 -03:00
Fabiano Rosas
81b0ed8ad8 migration/multifd: Stop changing the packet on recv side
As observed by Philippe, the multifd_ram_unfill_packet() function
currently leaves the MultiFDPacket structure with mixed
endianness. This is harmless, but ultimately not very clean.

Stop touching the received packet and do the necessary work using
stack variables instead.

While here tweak the error strings and fix the space before
semicolons. Also remove the "100 times bigger" comment because it's
just one possible explanation for a size mismatch and it doesn't even
match the code.

CC: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:36 -03:00
Fabiano Rosas
308d165c77 migration/multifd: Make MultiFDMethods const
The methods are defined at module_init time and don't ever
change. Make them const.

Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:36 -03:00
Fabiano Rosas
40c9471e40 migration/multifd: Move nocomp code into multifd-nocomp.c
In preparation for adding new payload types to multifd, move most of
the no-compression code into multifd-nocomp.c. Let's try to keep a
semblance of layering by not mixing general multifd control flow with
the details of transmitting pages of ram.

There are still some pieces leftover, namely the p->normal, p->zero,
etc variables that we use for zero page tracking and the packet
allocation which is heavily dependent on the ram code.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:36 -03:00
Fabiano Rosas
dc6327d99c migration/multifd: Register nocomp ops dynamically
Prior to moving the ram code into multifd-nocomp.c, change the code to
register the nocomp ops dynamically so we don't need to have the ops
structure defined in multifd.c.

While here, move the ops struct initialization to the end of the file
to make the next diff cleaner.

Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:36 -03:00
Fabiano Rosas
6f848dac4a migration/multifd: Standardize on multifd ops names
Add the multifd_ prefix to all functions and remove the useless
docstrings.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:36 -03:00
Fabiano Rosas
a0c78d815c migration/multifd: Allow multifd sync without flush
Separate the multifd sync from flushing the client data to the
channels. These two operations are closely related but not strictly
necessary to be executed together.

The multifd sync is intrinsic to how multifd works. The multiple
channels operate independently and may finish IO out of order in
relation to each other. This applies also between the source and
destination QEMU.

Flushing the data that is left in the client-owned data structures
(e.g. MultiFDPages_t) prior to sync is usually the right thing to do,
but that is particular to how the ram migration is implemented with
several passes over dirty data.

Make these two routines separate, allowing future code to call the
sync by itself if needed. This also allows the usage of
multifd_ram_send to be isolated to ram code.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:36 -03:00
Fabiano Rosas
a71ef5c7f3 migration/multifd: Replace multifd_send_state->pages with client data
Multifd currently has a simple scheduling mechanism that distributes
work to the various channels by keeping storage space within each
channel and an extra space that is given to the client. Each time the
client fills the space with data and calls into multifd, that space is
given to the next idle channel and a free storage space is taken from
the channel and given to client for the next iteration.

This means we always need (#multifd_channels + 1) memory slots to
operate multifd.

This is fine, except that the presence of this one extra memory slot
doesn't allow different types of payloads to be processed at the same
time in different channels, i.e. the data type of
multifd_send_state->pages needs to be the same as p->pages.

For each new data type different from MultiFDPage_t that is to be
handled, this logic would need to be duplicated by adding new fields
to multifd_send_state, to the channels and to multifd_send_pages().

Fix this situation by moving the extra slot into the client and using
only the generic type MultiFDSendData in the multifd core.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
d7e58f412c migration/multifd: Don't send ram data during SYNC
Skip saving and loading any ram data in the packet in the case of a
SYNC. This fixes a shortcoming of the current code which requires a
reset of the MultiFDPages_t fields right after the previous
pending_job finishes, otherwise the very next job might be a SYNC and
multifd_send_fill_packet() will put the stale values in the packet.

By not calling multifd_ram_fill_packet(), we can stop resetting
MultiFDPages_t in the multifd core and leave that to the client code.

Actually moving the reset function is not yet done because
pages->num==0 is used by the client code to determine whether the
MultiFDPages_t needs to be flushed. The subsequent patches will
replace that with a generic flag that is not dependent on
MultiFDPages_t.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
87bb9e953e migration/multifd: Isolate ram pages packet data
While we cannot yet disentangle the multifd packet from page data, we
can make the code a bit cleaner by setting the page-related fields in
a separate function.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
96d396bf50 migration/multifd: Remove total pages tracing
The total_normal_pages and total_zero_pages elements are used only for
the end tracepoints of the multifd threads. These are not super useful
since they record per-channel numbers and are just the sum of all the
pages that are transmitted per-packet, for which we already have
tracepoints. Remove the totals from the tracing.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
5aff71767c migration/multifd: Move pages accounting into multifd_send_zero_page_detect()
All references to pages are being removed from the multifd worker
threads in order to allow multifd to deal with different payload
types.

multifd_send_zero_page_detect() is called by all multifd migration
paths that deal with pages and is the last spot where zero pages and
normal page amounts are adjusted. Move the pages accounting into that
function.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
9f0e108901 migration/multifd: Replace p->pages with an union pointer
We want multifd to be able to handle more types of data than just ram
pages. To start decoupling multifd from pages, replace p->pages
(MultiFDPages_t) with the new type MultiFDSendData that hides the
client payload inside an union.

The general idea here is to isolate functions that *need* to handle
MultiFDPages_t and move them in the future to multifd-ram.c, while
multifd.c will stay with only the core functions that handle
MultiFDSendData/MultiFDRecvData.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
0e427da096 migration/multifd: Make MultiFDPages_t:offset a flexible array member
We're about to use MultiFDPages_t from inside the MultiFDSendData
payload union, which means we cannot have pointers to allocated data
inside the pages structure, otherwise we'd lose the reference to that
memory once another payload type touches the union. Move the offset
array into the end of the structure and turn it into a flexible array
member, so it is allocated along with the rest of MultiFDSendData in
the next patches.

Note that other pointers, such as the ramblock pointer are still fine
as long as the storage for them is not owned by the migration code and
can be correctly released at some point.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
addd7d1581 migration/multifd: Introduce MultiFDSendData
Add a new data structure to replace p->pages in the multifd
channel. This new structure will hide the multifd payload type behind
an union, so we don't need to add a new field to the channel each time
we want to handle a different data type.

This also allow us to keep multifd_send_pages() as is, without needing
to complicate the pointer switching.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
112f7d1b75 migration/multifd: Pass in MultiFDPages_t to file_write_ramblock_iov
We want to stop dereferencing 'pages' so it can be replaced by an
opaque pointer in the next patches.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:35 -03:00
Fabiano Rosas
171056ec91 migration/multifd: Remove pages->allocated
This value never changes and is always the same as page_count. We
don't need a copy of it per-channel plus one in the extra slot. Remove
it.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:34 -03:00
Fabiano Rosas
90fa121c6c migration/multifd: Inline page_size and page_count
The MultiFD*Params structures are for per-channel data. Constant
values should not be there because that needlessly wastes cycles and
storage. The page_size and page_count fall into this category so move
them inline in multifd.h.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:34 -03:00
Fabiano Rosas
bc112a6c90 migration/multifd: Reduce access to p->pages
I'm about to replace the p->pages pointer with an opaque pointer, so
do a cleanup now to reduce direct accesses to p->page, which makes the
next diffs cleaner.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:34 -03:00
Peter Maydell
854f67fa38 tests/qtest/migration-test: Don't leak QTestState in test_multifd_tcp_cancel()
In test_multifd_tcp_cancel() we create three QEMU processes: 'from',
'to' and 'to2'.  We clean up (via qtest_quit()) 'from' and 'to2' when
we call test_migrate_end(), but never clean up 'to', which results in
this leak:

Direct leak of 336 byte(s) in 1 object(s) allocated from:
    #0 0x55e984fcd328 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f328) (BuildId: 710d409b68bb04427009e9ca6e1b63ff8af785d3)
    #1 0x7f0878b39c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
    #2 0x55e98503a172 in qtest_spawn_qemu tests/qtest/libqtest.c:397:21
    #3 0x55e98502bc4a in qtest_init_internal tests/qtest/libqtest.c:471:9
    #4 0x55e98502c5b7 in qtest_init_with_env tests/qtest/libqtest.c:533:21
    #5 0x55e9850eef0f in test_migrate_start tests/qtest/migration-test.c:857:11
    #6 0x55e9850eb01d in test_multifd_tcp_cancel tests/qtest/migration-test.c:3297:9
    #7 0x55e985103407 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5

Call qtest_quit() on 'to' to clean it up once it has exited.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:34 -03:00
Peter Maydell
78a053bc1b tests/qtest/migration-test: Don't strdup in get_dirty_rate()
We g_strdup() the "status" string we get out of the qdict in
get_dirty_rate(), but we never free it.  Since we only use this
string while the dictionary is still valid, we don't need to strdup
at all; drop the unnecessary call to avoid this leak:

Direct leak of 18 byte(s) in 2 object(s) allocated from:
    #0 0x564b3e01913e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f13e) (BuildId: d6403a811332fcc846f93c45e23abfd06d1e67c4)
    #1 0x7f2f278ff738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
    #2 0x7f2f27914583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
    #3 0x564b3e14bb5b in get_dirty_rate tests/qtest/migration-test.c:3447:14
    #4 0x564b3e138e00 in test_vcpu_dirty_limit tests/qtest/migration-test.c:3565:16
    #5 0x564b3e14f417 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:34 -03:00
Peter Maydell
6ed8c950b4 tests/qtest/migration-helpers: Don't dup argument to qdict_put_str()
In migrate_set_ports() we call qdict_put_str() with a value string
which we g_strdup(). However qdict_put_str() takes a copy of the
value string, it doesn't take ownership of it, so the g_strdup()
only results in a leak:

Direct leak of 6 byte(s) in 1 object(s) allocated from:
    #0 0x56298023713e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f13e) (BuildId: b2b9174a5a54707a7f76bca51cdc95d2aa08bac1)
    #1 0x7fba0ad39738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
    #2 0x7fba0ad4e583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
    #3 0x56298036b16e in migrate_set_ports tests/qtest/migration-helpers.c:145:49
    #4 0x56298036ad1c in migrate_qmp tests/qtest/migration-helpers.c:228:9
    #5 0x56298035b3dd in test_precopy_common tests/qtest/migration-test.c:1820:5
    #6 0x5629803549dc in test_multifd_tcp_channels_none tests/qtest/migration-test.c:3077:5
    #7 0x56298036d427 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5

Drop the unnecessary g_strdup() call.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:34 -03:00
Peter Maydell
ba3859207d tests/unit/crypto-tls-x509-helpers: deinit privkey in test_tls_cleanup
We create a gnutls_x509_privkey_t in test_tls_init(), but forget
to deinit it in test_tls_cleanup(), resulting in leaks
reported in hte migration test such as:

Indirect leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x55fa6d11c12e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f12e) (BuildId: 852a267993587f557f50e5715f352f43720077ba)
    #1 0x7f073982685d in __gmp_default_allocate (/lib/x86_64-linux-gnu/libgmp.so.10+0xa85d) (BuildId: f110719303ddbea25a5e89ff730fec520eed67b0)
    #2 0x7f0739836193 in __gmpz_realloc (/lib/x86_64-linux-gnu/libgmp.so.10+0x1a193) (BuildId: f110719303ddbea25a5e89ff730fec520eed67b0)
    #3 0x7f0739836594 in __gmpz_import (/lib/x86_64-linux-gnu/libgmp.so.10+0x1a594) (BuildId: f110719303ddbea25a5e89ff730fec520eed67b0)
    #4 0x7f07398a91ed in nettle_mpz_set_str_256_u (/lib/x86_64-linux-gnu/libhogweed.so.6+0xb1ed) (BuildId: 3cc4a3474de72db89e9dcc93bfb95fe377f48c37)
    #5 0x7f073a146a5a  (/lib/x86_64-linux-gnu/libgnutls.so.30+0x131a5a) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
    #6 0x7f073a07192c  (/lib/x86_64-linux-gnu/libgnutls.so.30+0x5c92c) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
    #7 0x7f073a078333  (/lib/x86_64-linux-gnu/libgnutls.so.30+0x63333) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
    #8 0x7f073a0e8353  (/lib/x86_64-linux-gnu/libgnutls.so.30+0xd3353) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
    #9 0x7f073a0ef0ac in gnutls_x509_privkey_import (/lib/x86_64-linux-gnu/libgnutls.so.30+0xda0ac) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
    #10 0x55fa6d2547e3 in test_tls_load_key tests/unit/crypto-tls-x509-helpers.c:99:11
    #11 0x55fa6d25460c in test_tls_init tests/unit/crypto-tls-x509-helpers.c:128:15
    #12 0x55fa6d2495c4 in test_migrate_tls_x509_start_common tests/qtest/migration-test.c:1044:5
    #13 0x55fa6d24c23a in test_migrate_tls_x509_start_reject_anon_client tests/qtest/migration-test.c:1216:12
    #14 0x55fa6d23fb40 in test_precopy_common tests/qtest/migration-test.c:1789:21
    #15 0x55fa6d236b7c in test_precopy_tcp_tls_x509_reject_anon_client tests/qtest/migration-test.c:2614:5

(Oddly, there is no reported leak in the x509 unit tests, even though
those also use test_tls_init() and test_tls_cleanup().)

Deinit the privkey in test_tls_cleanup().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:34 -03:00
Peter Maydell
2cf6dc4101 tests/qtest/migration-test: Free QCRyptoTLSTestCertReq objects
In the migration test we create several TLS certificates with
the TLS_* macros from crypto-tls-x509-helpers.h. These macros
create both a QCryptoTLSCertReq object which must be deinitialized
and also an on-disk certificate file. The migration test currently
removes the on-disk file in test_migrate_tls_x509_finish() but
never deinitializes the QCryptoTLSCertReq, which means that memory
allocated as part of it is leaked:

Indirect leak of 2 byte(s) in 1 object(s) allocated from:
    #0 0x5558ba33712e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f12e) (BuildId: 4c8618f663e538538cad19d35233124cea161491)
    #1 0x7f64afc131f4  (/lib/x86_64-linux-gnu/libtasn1.so.6+0x81f4) (BuildId: 2fde6ecb43c586fe4077118f771077aa1298e7ea)
    #2 0x7f64afc18d58 in asn1_write_value (/lib/x86_64-linux-gnu/libtasn1.so.6+0xdd58) (BuildId: 2fde6ecb43c586fe4077118f771077aa1298e7ea)
    #3 0x7f64af8fc678 in gnutls_x509_crt_set_version (/lib/x86_64-linux-gnu/libgnutls.so.30+0xe7678) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
    #4 0x5558ba470035 in test_tls_generate_cert tests/unit/crypto-tls-x509-helpers.c:234:5
    #5 0x5558ba464e4a in test_migrate_tls_x509_start_common tests/qtest/migration-test.c:1058:5
    #6 0x5558ba462c8a in test_migrate_tls_x509_start_default_host tests/qtest/migration-test.c:1123:12
    #7 0x5558ba45ab40 in test_precopy_common tests/qtest/migration-test.c:1786:21
    #8 0x5558ba450015 in test_precopy_unix_tls_x509_default_host tests/qtest/migration-test.c:2077:5
    #9 0x5558ba46d3c7 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5

(and similar reports).

The only function currently provided to deinit a QCryptoTLSCertReq is
test_tls_discard_cert(), which also removes the on-disk certificate
file.  For the migration tests we need to retain the on-disk files
until we've finished running the test, so the simplest fix is to
provide a new function test_tls_deinit_cert() which does only the
cleanup of the QCryptoTLSCertReq, and call it in the right places.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:34 -03:00
Peter Maydell
c94170ae02 tests/qtest/migration-helpers: Fix migrate_get_socket_address() leak
In migrate_get_socket_address() we leak the SocketAddressList:
 (cd build/asan && \
  ASAN_OPTIONS="fast_unwind_on_malloc=0:strip_path_prefix=/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../"
  QTEST_QEMU_BINARY=./qemu-system-x86_64 \
  ./tests/qtest/migration-test --tap -k -p /x86_64/migration/multifd/tcp/tls/psk/match )

[...]
Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x563d7f22f318 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f318) (BuildId: 2ad6282fb5d076c863ab87f41a345d46dc965ded)
    #1 0x7f9de3b39c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
    #2 0x563d7f3a119c in qobject_input_start_list qapi/qobject-input-visitor.c:336:17
    #3 0x563d7f390fbf in visit_start_list qapi/qapi-visit-core.c:80:10
    #4 0x563d7f3882ef in visit_type_SocketAddressList /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qapi/qapi-visit-sockets.c:519:10
    #5 0x563d7f3658c9 in migrate_get_socket_address tests/qtest/migration-helpers.c:97:5
    #6 0x563d7f362e24 in migrate_get_connect_uri tests/qtest/migration-helpers.c:111:13
    #7 0x563d7f362bb2 in migrate_qmp tests/qtest/migration-helpers.c:222:23
    #8 0x563d7f3533cd in test_precopy_common tests/qtest/migration-test.c:1817:5
    #9 0x563d7f34dc1c in test_multifd_tcp_tls_psk_match tests/qtest/migration-test.c:3185:5
    #10 0x563d7f365337 in migration_test_wrapper tests/qtest/migration-helpers.c:458:5

The code fishes out the SocketAddress from the list to return it, and the
callers are freeing that, but nothing frees the list.

Since this function is called in only two places, the simple fix is to
make it return the SocketAddressList rather than just a SocketAddress,
and then the callers can easily access the SocketAddress, and free
the whole SocketAddressList when they're done.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:33 -03:00
Peter Maydell
f0d74774b0 tests/qtest/migration-test: Fix leaks in calc_dirtyrate_ready()
In calc_dirtyrate_ready() we g_strdup() a string but then never free it:

Direct leak of 19 byte(s) in 2 object(s) allocated from:
    #0 0x55ead613413e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f13e) (BuildId: e7cd5c37b2987a1af682b43ee5240b98bb316737)
    #1 0x7f7a13d39738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
    #2 0x7f7a13d4e583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
    #3 0x55ead6266f48 in calc_dirtyrate_ready tests/qtest/migration-test.c:3409:14
    #4 0x55ead62669fe in wait_for_calc_dirtyrate_complete tests/qtest/migration-test.c:3422:13
    #5 0x55ead6253df7 in test_vcpu_dirty_limit tests/qtest/migration-test.c:3562:9
    #6 0x55ead626a407 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5

We also fail to unref the QMP rsp_return, so we leak that also.

Rather than duplicating the string, use the in-place value from
the qdict, and then unref the qdict.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:33 -03:00
Peter Maydell
0fa2cf819d tests/qtest/migration-test: Don't leak resp in multifd_mapped_ram_fdset_end()
In multifd_mapped_ram_fdset_end() we call qtest_qmp() but forgot
to unref the response QDict we get back, which means it is leaked:

Indirect leak of 4120 byte(s) in 1 object(s) allocated from:
    #0 0x55c0c095d318 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f318) (BuildI
d: 07f667506452d6c467dbc06fd95191966d3e91b4)
    #1 0x7f186f939c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
    #2 0x55c0c0ae9b01 in qdict_new qobject/qdict.c:30:13
    #3 0x55c0c0afc16c in parse_object qobject/json-parser.c:317:12
    #4 0x55c0c0afb90f in parse_value qobject/json-parser.c:545:16
    #5 0x55c0c0afb579 in json_parser_parse qobject/json-parser.c:579:14
    #6 0x55c0c0afa21d in json_message_process_token qobject/json-streamer.c:92:12
    #7 0x55c0c0bca2e5 in json_lexer_feed_char qobject/json-lexer.c:313:13
    #8 0x55c0c0bc97ce in json_lexer_feed qobject/json-lexer.c:350:9
    #9 0x55c0c0afabbc in json_message_parser_feed qobject/json-streamer.c:121:5
    #10 0x55c0c09cbd52 in qmp_fd_receive tests/qtest/libqmp.c:86:9
    #11 0x55c0c09be69b in qtest_qmp_receive_dict tests/qtest/libqtest.c:760:12
    #12 0x55c0c09bca77 in qtest_qmp_receive tests/qtest/libqtest.c:741:27
    #13 0x55c0c09bee9d in qtest_vqmp tests/qtest/libqtest.c:812:12
    #14 0x55c0c09bd257 in qtest_qmp tests/qtest/libqtest.c:835:16
    #15 0x55c0c0a87747 in multifd_mapped_ram_fdset_end tests/qtest/migration-test.c:2393:12
    #16 0x55c0c0a85eb3 in test_file_common tests/qtest/migration-test.c:1978:9
    #17 0x55c0c0a746a3 in test_multifd_file_mapped_ram_fdset tests/qtest/migration-test.c:2437:5
    #18 0x55c0c0a93237 in migration_test_wrapper tests/qtest/migration-helpers.c:458:5
    #19 0x7f186f958aed in test_case_run debian/build/deb/../../../glib/gtestutils.c:2930:15
    #20 0x7f186f958aed in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3018:16
    #21 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
    #22 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
    #23 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
    #24 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
    #25 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
    #26 0x7f186f958faa in g_test_run_suite debian/build/deb/../../../glib/gtestutils.c:3109:18
    #27 0x7f186f959055 in g_test_run debian/build/deb/../../../glib/gtestutils.c:2231:7
    #28 0x7f186f959055 in g_test_run debian/build/deb/../../../glib/gtestutils.c:2218:1
    #29 0x55c0c0a6e427 in main tests/qtest/migration-test.c:4033:11

Unref the object after we've confirmed that it is what we expect.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:33 -03:00
Peter Maydell
d278455eb1 tests/qtest/migration-test: Fix bootfile cleanup handling
If you invoke the migration-test binary in such a way that it doesn't run
any tests, then we never call bootfile_create(), and at the end of
main() bootfile_delete() will try to unlink(NULL), which is not valid.
This can happen if for instance you tell the test binary to run a
subset of tests that turns out to be empty, like this:

 (cd build/asan && QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test --tap -k -p bang)
 # random seed: R02S6501b289ff8ced4231ba452c3a87bc6f
 # Skipping test: userfaultfd not available
 1..0
 ../../tests/qtest/migration-test.c:182:12: runtime error: null pointer passed as argument 1, which is declared to never be null
 /usr/include/unistd.h:858:48: note: nonnull attribute specified here

Handle this by making bootfile_delete() not needing to do anything
because bootfile_create() was never called.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
[fixed conflict with aee07f2563]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:33 -03:00
Fabiano Rosas
ceb1ab1af4 tests/qtest/migration: Remove vmstate-static-checker test
I fumbled one of my last pull requests when fixing in-tree an issue
with commit 87d67fadb9 ("monitor: Stop removing non-duplicated
fds"). Basically mixed-up my `git add -p` and `git checkout -p` and
committed a piece of test infra that has not been reviewed yet.

This has not caused any bad symptoms because the test is not enabled
by default anywhere: make check doesn't use two qemu binaries and the
CI doesn't have PYTHON set for the compat tests. Besides, the test
works fine anyway, it would not break anything.

Remove this because it was never intended to be merged.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:33 -03:00
Steve Sistare
c83b77f4ad migration: delete unused parameter mis
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-09-03 16:24:33 -03:00
Richard Henderson
e638d685ec Open 9.2 development tree
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-09-03 09:18:43 -07:00