The linux-user hppa target crashes randomly for me since commit
081a0ed188 ("target/hppa: Do not mask in copy_iaoq_entry").
That commit dropped the masking of the IAOQ addresses while copying them
from other registers and instead keeps them with all 64 bits up until
the full gva is formed with the help of hppa_form_gva_psw().
So, when running in linux-user mode on an emulated 64-bit CPU, we need
to mask to a 32-bit address space at the very end in hppa_form_gva_psw()
if the PSW-W flag isn't set (which is the case for linux-user on hppa).
Fixes: 081a0ed188 ("target/hppa: Do not mask in copy_iaoq_entry")
Cc: qemu-stable@nongnu.org # v9.1+
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
While adding hppa64 support, the psw_v variable got extended from 32 to 64
bits. So, when packaging the PSW-V bit from the psw_v variable for interrupt
processing, check bit 31 instead the 63th (sign) bit.
This fixes a hard to find Linux kernel boot issue where the loss of the PSW-V
bit due to an ITLB interruption in the middle of a series of ds/addc
instructions (from the divU milicode library) generated the wrong division
result and thus triggered a Linux kernel crash.
Link: https://lore.kernel.org/lkml/718b8afe-222f-4b3a-96d3-93af0e4ceff1@roeck-us.net/
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Fixes: 931adff314 ("target/hppa: Update cpu_hppa_get/put_psw for hppa64")
Cc: qemu-stable@nongnu.org # v8.2+
The test_vcpu_dirty_limit is the only test that does not check for the
availability of the machine before starting the test, so it fails when
QEMU has been configured with --without-default-devices. Add a check for
the "pc" machine type to fix it.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In vmstate_tlbemb a cut-and-paste error meant we gave
this vmstate subsection the same "cpu/tlb6xx" name as
the vmstate_tlb6xx subsection. This breaks migration load
for any CPU using the TLB_EMB CPU type, because when we
see the "tlb6xx" name in the incoming data we try to
interpret it as a vmstate_tlb6xx subsection, which it
isn't the right format for:
$ qemu-system-ppc -drive
if=none,format=qcow2,file=/home/petmay01/test-images/virt/dummy.qcow2
-monitor stdio -M bamboo
QEMU 9.0.92 monitor - type 'help' for more information
(qemu) savevm foo
(qemu) loadvm foo
Missing section footer for cpu
Error: Error -22 while loading VM state
Correct the incorrect vmstate section name. Since migration
for these CPU types was completely broken before, we don't
need to care that this is a migration compatibility break.
This affects the PPC 405, 440, 460 and e200 CPU families.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2522
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Arman Nabiev <nabiev.arman13@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Add documentation clarifying the usage of the multifd methods. The
general idea is that the client code calls into multifd to trigger
send/recv of data and multifd then calls these hooks back from the
worker threads at opportune moments so the client can process a
portion of the data.
Suggested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Check that p->iov is indeed always allocated and freed by the
MultiFDMethods hooks.
Suggested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The send_cleanup() hook should free the p->iov that was allocated at
send_setup(). This was missed because the UADK code is conditional on
the presence of the accelerator, so it's not tested by default.
Fixes: 819dd20636 ("migration/multifd: Add UADK initialization")
Reported-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
As observed by Philippe, the multifd_ram_unfill_packet() function
currently leaves the MultiFDPacket structure with mixed
endianness. This is harmless, but ultimately not very clean.
Stop touching the received packet and do the necessary work using
stack variables instead.
While here tweak the error strings and fix the space before
semicolons. Also remove the "100 times bigger" comment because it's
just one possible explanation for a size mismatch and it doesn't even
match the code.
CC: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The methods are defined at module_init time and don't ever
change. Make them const.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In preparation for adding new payload types to multifd, move most of
the no-compression code into multifd-nocomp.c. Let's try to keep a
semblance of layering by not mixing general multifd control flow with
the details of transmitting pages of ram.
There are still some pieces leftover, namely the p->normal, p->zero,
etc variables that we use for zero page tracking and the packet
allocation which is heavily dependent on the ram code.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Prior to moving the ram code into multifd-nocomp.c, change the code to
register the nocomp ops dynamically so we don't need to have the ops
structure defined in multifd.c.
While here, move the ops struct initialization to the end of the file
to make the next diff cleaner.
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Add the multifd_ prefix to all functions and remove the useless
docstrings.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Separate the multifd sync from flushing the client data to the
channels. These two operations are closely related but not strictly
necessary to be executed together.
The multifd sync is intrinsic to how multifd works. The multiple
channels operate independently and may finish IO out of order in
relation to each other. This applies also between the source and
destination QEMU.
Flushing the data that is left in the client-owned data structures
(e.g. MultiFDPages_t) prior to sync is usually the right thing to do,
but that is particular to how the ram migration is implemented with
several passes over dirty data.
Make these two routines separate, allowing future code to call the
sync by itself if needed. This also allows the usage of
multifd_ram_send to be isolated to ram code.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Multifd currently has a simple scheduling mechanism that distributes
work to the various channels by keeping storage space within each
channel and an extra space that is given to the client. Each time the
client fills the space with data and calls into multifd, that space is
given to the next idle channel and a free storage space is taken from
the channel and given to client for the next iteration.
This means we always need (#multifd_channels + 1) memory slots to
operate multifd.
This is fine, except that the presence of this one extra memory slot
doesn't allow different types of payloads to be processed at the same
time in different channels, i.e. the data type of
multifd_send_state->pages needs to be the same as p->pages.
For each new data type different from MultiFDPage_t that is to be
handled, this logic would need to be duplicated by adding new fields
to multifd_send_state, to the channels and to multifd_send_pages().
Fix this situation by moving the extra slot into the client and using
only the generic type MultiFDSendData in the multifd core.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Skip saving and loading any ram data in the packet in the case of a
SYNC. This fixes a shortcoming of the current code which requires a
reset of the MultiFDPages_t fields right after the previous
pending_job finishes, otherwise the very next job might be a SYNC and
multifd_send_fill_packet() will put the stale values in the packet.
By not calling multifd_ram_fill_packet(), we can stop resetting
MultiFDPages_t in the multifd core and leave that to the client code.
Actually moving the reset function is not yet done because
pages->num==0 is used by the client code to determine whether the
MultiFDPages_t needs to be flushed. The subsequent patches will
replace that with a generic flag that is not dependent on
MultiFDPages_t.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
While we cannot yet disentangle the multifd packet from page data, we
can make the code a bit cleaner by setting the page-related fields in
a separate function.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The total_normal_pages and total_zero_pages elements are used only for
the end tracepoints of the multifd threads. These are not super useful
since they record per-channel numbers and are just the sum of all the
pages that are transmitted per-packet, for which we already have
tracepoints. Remove the totals from the tracing.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
All references to pages are being removed from the multifd worker
threads in order to allow multifd to deal with different payload
types.
multifd_send_zero_page_detect() is called by all multifd migration
paths that deal with pages and is the last spot where zero pages and
normal page amounts are adjusted. Move the pages accounting into that
function.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We want multifd to be able to handle more types of data than just ram
pages. To start decoupling multifd from pages, replace p->pages
(MultiFDPages_t) with the new type MultiFDSendData that hides the
client payload inside an union.
The general idea here is to isolate functions that *need* to handle
MultiFDPages_t and move them in the future to multifd-ram.c, while
multifd.c will stay with only the core functions that handle
MultiFDSendData/MultiFDRecvData.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We're about to use MultiFDPages_t from inside the MultiFDSendData
payload union, which means we cannot have pointers to allocated data
inside the pages structure, otherwise we'd lose the reference to that
memory once another payload type touches the union. Move the offset
array into the end of the structure and turn it into a flexible array
member, so it is allocated along with the rest of MultiFDSendData in
the next patches.
Note that other pointers, such as the ramblock pointer are still fine
as long as the storage for them is not owned by the migration code and
can be correctly released at some point.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Add a new data structure to replace p->pages in the multifd
channel. This new structure will hide the multifd payload type behind
an union, so we don't need to add a new field to the channel each time
we want to handle a different data type.
This also allow us to keep multifd_send_pages() as is, without needing
to complicate the pointer switching.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We want to stop dereferencing 'pages' so it can be replaced by an
opaque pointer in the next patches.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
This value never changes and is always the same as page_count. We
don't need a copy of it per-channel plus one in the extra slot. Remove
it.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The MultiFD*Params structures are for per-channel data. Constant
values should not be there because that needlessly wastes cycles and
storage. The page_size and page_count fall into this category so move
them inline in multifd.h.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
I'm about to replace the p->pages pointer with an opaque pointer, so
do a cleanup now to reduce direct accesses to p->page, which makes the
next diffs cleaner.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In test_multifd_tcp_cancel() we create three QEMU processes: 'from',
'to' and 'to2'. We clean up (via qtest_quit()) 'from' and 'to2' when
we call test_migrate_end(), but never clean up 'to', which results in
this leak:
Direct leak of 336 byte(s) in 1 object(s) allocated from:
#0 0x55e984fcd328 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f328) (BuildId: 710d409b68bb04427009e9ca6e1b63ff8af785d3)
#1 0x7f0878b39c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x55e98503a172 in qtest_spawn_qemu tests/qtest/libqtest.c:397:21
#3 0x55e98502bc4a in qtest_init_internal tests/qtest/libqtest.c:471:9
#4 0x55e98502c5b7 in qtest_init_with_env tests/qtest/libqtest.c:533:21
#5 0x55e9850eef0f in test_migrate_start tests/qtest/migration-test.c:857:11
#6 0x55e9850eb01d in test_multifd_tcp_cancel tests/qtest/migration-test.c:3297:9
#7 0x55e985103407 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5
Call qtest_quit() on 'to' to clean it up once it has exited.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We g_strdup() the "status" string we get out of the qdict in
get_dirty_rate(), but we never free it. Since we only use this
string while the dictionary is still valid, we don't need to strdup
at all; drop the unnecessary call to avoid this leak:
Direct leak of 18 byte(s) in 2 object(s) allocated from:
#0 0x564b3e01913e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f13e) (BuildId: d6403a811332fcc846f93c45e23abfd06d1e67c4)
#1 0x7f2f278ff738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
#2 0x7f2f27914583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
#3 0x564b3e14bb5b in get_dirty_rate tests/qtest/migration-test.c:3447:14
#4 0x564b3e138e00 in test_vcpu_dirty_limit tests/qtest/migration-test.c:3565:16
#5 0x564b3e14f417 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In migrate_set_ports() we call qdict_put_str() with a value string
which we g_strdup(). However qdict_put_str() takes a copy of the
value string, it doesn't take ownership of it, so the g_strdup()
only results in a leak:
Direct leak of 6 byte(s) in 1 object(s) allocated from:
#0 0x56298023713e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f13e) (BuildId: b2b9174a5a54707a7f76bca51cdc95d2aa08bac1)
#1 0x7fba0ad39738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
#2 0x7fba0ad4e583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
#3 0x56298036b16e in migrate_set_ports tests/qtest/migration-helpers.c:145:49
#4 0x56298036ad1c in migrate_qmp tests/qtest/migration-helpers.c:228:9
#5 0x56298035b3dd in test_precopy_common tests/qtest/migration-test.c:1820:5
#6 0x5629803549dc in test_multifd_tcp_channels_none tests/qtest/migration-test.c:3077:5
#7 0x56298036d427 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5
Drop the unnecessary g_strdup() call.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We create a gnutls_x509_privkey_t in test_tls_init(), but forget
to deinit it in test_tls_cleanup(), resulting in leaks
reported in hte migration test such as:
Indirect leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x55fa6d11c12e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f12e) (BuildId: 852a267993587f557f50e5715f352f43720077ba)
#1 0x7f073982685d in __gmp_default_allocate (/lib/x86_64-linux-gnu/libgmp.so.10+0xa85d) (BuildId: f110719303ddbea25a5e89ff730fec520eed67b0)
#2 0x7f0739836193 in __gmpz_realloc (/lib/x86_64-linux-gnu/libgmp.so.10+0x1a193) (BuildId: f110719303ddbea25a5e89ff730fec520eed67b0)
#3 0x7f0739836594 in __gmpz_import (/lib/x86_64-linux-gnu/libgmp.so.10+0x1a594) (BuildId: f110719303ddbea25a5e89ff730fec520eed67b0)
#4 0x7f07398a91ed in nettle_mpz_set_str_256_u (/lib/x86_64-linux-gnu/libhogweed.so.6+0xb1ed) (BuildId: 3cc4a3474de72db89e9dcc93bfb95fe377f48c37)
#5 0x7f073a146a5a (/lib/x86_64-linux-gnu/libgnutls.so.30+0x131a5a) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
#6 0x7f073a07192c (/lib/x86_64-linux-gnu/libgnutls.so.30+0x5c92c) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
#7 0x7f073a078333 (/lib/x86_64-linux-gnu/libgnutls.so.30+0x63333) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
#8 0x7f073a0e8353 (/lib/x86_64-linux-gnu/libgnutls.so.30+0xd3353) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
#9 0x7f073a0ef0ac in gnutls_x509_privkey_import (/lib/x86_64-linux-gnu/libgnutls.so.30+0xda0ac) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
#10 0x55fa6d2547e3 in test_tls_load_key tests/unit/crypto-tls-x509-helpers.c:99:11
#11 0x55fa6d25460c in test_tls_init tests/unit/crypto-tls-x509-helpers.c:128:15
#12 0x55fa6d2495c4 in test_migrate_tls_x509_start_common tests/qtest/migration-test.c:1044:5
#13 0x55fa6d24c23a in test_migrate_tls_x509_start_reject_anon_client tests/qtest/migration-test.c:1216:12
#14 0x55fa6d23fb40 in test_precopy_common tests/qtest/migration-test.c:1789:21
#15 0x55fa6d236b7c in test_precopy_tcp_tls_x509_reject_anon_client tests/qtest/migration-test.c:2614:5
(Oddly, there is no reported leak in the x509 unit tests, even though
those also use test_tls_init() and test_tls_cleanup().)
Deinit the privkey in test_tls_cleanup().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In the migration test we create several TLS certificates with
the TLS_* macros from crypto-tls-x509-helpers.h. These macros
create both a QCryptoTLSCertReq object which must be deinitialized
and also an on-disk certificate file. The migration test currently
removes the on-disk file in test_migrate_tls_x509_finish() but
never deinitializes the QCryptoTLSCertReq, which means that memory
allocated as part of it is leaked:
Indirect leak of 2 byte(s) in 1 object(s) allocated from:
#0 0x5558ba33712e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f12e) (BuildId: 4c8618f663e538538cad19d35233124cea161491)
#1 0x7f64afc131f4 (/lib/x86_64-linux-gnu/libtasn1.so.6+0x81f4) (BuildId: 2fde6ecb43c586fe4077118f771077aa1298e7ea)
#2 0x7f64afc18d58 in asn1_write_value (/lib/x86_64-linux-gnu/libtasn1.so.6+0xdd58) (BuildId: 2fde6ecb43c586fe4077118f771077aa1298e7ea)
#3 0x7f64af8fc678 in gnutls_x509_crt_set_version (/lib/x86_64-linux-gnu/libgnutls.so.30+0xe7678) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
#4 0x5558ba470035 in test_tls_generate_cert tests/unit/crypto-tls-x509-helpers.c:234:5
#5 0x5558ba464e4a in test_migrate_tls_x509_start_common tests/qtest/migration-test.c:1058:5
#6 0x5558ba462c8a in test_migrate_tls_x509_start_default_host tests/qtest/migration-test.c:1123:12
#7 0x5558ba45ab40 in test_precopy_common tests/qtest/migration-test.c:1786:21
#8 0x5558ba450015 in test_precopy_unix_tls_x509_default_host tests/qtest/migration-test.c:2077:5
#9 0x5558ba46d3c7 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5
(and similar reports).
The only function currently provided to deinit a QCryptoTLSCertReq is
test_tls_discard_cert(), which also removes the on-disk certificate
file. For the migration tests we need to retain the on-disk files
until we've finished running the test, so the simplest fix is to
provide a new function test_tls_deinit_cert() which does only the
cleanup of the QCryptoTLSCertReq, and call it in the right places.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In migrate_get_socket_address() we leak the SocketAddressList:
(cd build/asan && \
ASAN_OPTIONS="fast_unwind_on_malloc=0:strip_path_prefix=/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../"
QTEST_QEMU_BINARY=./qemu-system-x86_64 \
./tests/qtest/migration-test --tap -k -p /x86_64/migration/multifd/tcp/tls/psk/match )
[...]
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x563d7f22f318 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f318) (BuildId: 2ad6282fb5d076c863ab87f41a345d46dc965ded)
#1 0x7f9de3b39c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x563d7f3a119c in qobject_input_start_list qapi/qobject-input-visitor.c:336:17
#3 0x563d7f390fbf in visit_start_list qapi/qapi-visit-core.c:80:10
#4 0x563d7f3882ef in visit_type_SocketAddressList /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qapi/qapi-visit-sockets.c:519:10
#5 0x563d7f3658c9 in migrate_get_socket_address tests/qtest/migration-helpers.c:97:5
#6 0x563d7f362e24 in migrate_get_connect_uri tests/qtest/migration-helpers.c:111:13
#7 0x563d7f362bb2 in migrate_qmp tests/qtest/migration-helpers.c:222:23
#8 0x563d7f3533cd in test_precopy_common tests/qtest/migration-test.c:1817:5
#9 0x563d7f34dc1c in test_multifd_tcp_tls_psk_match tests/qtest/migration-test.c:3185:5
#10 0x563d7f365337 in migration_test_wrapper tests/qtest/migration-helpers.c:458:5
The code fishes out the SocketAddress from the list to return it, and the
callers are freeing that, but nothing frees the list.
Since this function is called in only two places, the simple fix is to
make it return the SocketAddressList rather than just a SocketAddress,
and then the callers can easily access the SocketAddress, and free
the whole SocketAddressList when they're done.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In calc_dirtyrate_ready() we g_strdup() a string but then never free it:
Direct leak of 19 byte(s) in 2 object(s) allocated from:
#0 0x55ead613413e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f13e) (BuildId: e7cd5c37b2987a1af682b43ee5240b98bb316737)
#1 0x7f7a13d39738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
#2 0x7f7a13d4e583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
#3 0x55ead6266f48 in calc_dirtyrate_ready tests/qtest/migration-test.c:3409:14
#4 0x55ead62669fe in wait_for_calc_dirtyrate_complete tests/qtest/migration-test.c:3422:13
#5 0x55ead6253df7 in test_vcpu_dirty_limit tests/qtest/migration-test.c:3562:9
#6 0x55ead626a407 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5
We also fail to unref the QMP rsp_return, so we leak that also.
Rather than duplicating the string, use the in-place value from
the qdict, and then unref the qdict.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In multifd_mapped_ram_fdset_end() we call qtest_qmp() but forgot
to unref the response QDict we get back, which means it is leaked:
Indirect leak of 4120 byte(s) in 1 object(s) allocated from:
#0 0x55c0c095d318 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f318) (BuildI
d: 07f667506452d6c467dbc06fd95191966d3e91b4)
#1 0x7f186f939c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x55c0c0ae9b01 in qdict_new qobject/qdict.c:30:13
#3 0x55c0c0afc16c in parse_object qobject/json-parser.c:317:12
#4 0x55c0c0afb90f in parse_value qobject/json-parser.c:545:16
#5 0x55c0c0afb579 in json_parser_parse qobject/json-parser.c:579:14
#6 0x55c0c0afa21d in json_message_process_token qobject/json-streamer.c:92:12
#7 0x55c0c0bca2e5 in json_lexer_feed_char qobject/json-lexer.c:313:13
#8 0x55c0c0bc97ce in json_lexer_feed qobject/json-lexer.c:350:9
#9 0x55c0c0afabbc in json_message_parser_feed qobject/json-streamer.c:121:5
#10 0x55c0c09cbd52 in qmp_fd_receive tests/qtest/libqmp.c:86:9
#11 0x55c0c09be69b in qtest_qmp_receive_dict tests/qtest/libqtest.c:760:12
#12 0x55c0c09bca77 in qtest_qmp_receive tests/qtest/libqtest.c:741:27
#13 0x55c0c09bee9d in qtest_vqmp tests/qtest/libqtest.c:812:12
#14 0x55c0c09bd257 in qtest_qmp tests/qtest/libqtest.c:835:16
#15 0x55c0c0a87747 in multifd_mapped_ram_fdset_end tests/qtest/migration-test.c:2393:12
#16 0x55c0c0a85eb3 in test_file_common tests/qtest/migration-test.c:1978:9
#17 0x55c0c0a746a3 in test_multifd_file_mapped_ram_fdset tests/qtest/migration-test.c:2437:5
#18 0x55c0c0a93237 in migration_test_wrapper tests/qtest/migration-helpers.c:458:5
#19 0x7f186f958aed in test_case_run debian/build/deb/../../../glib/gtestutils.c:2930:15
#20 0x7f186f958aed in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3018:16
#21 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#22 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#23 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#24 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#25 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#26 0x7f186f958faa in g_test_run_suite debian/build/deb/../../../glib/gtestutils.c:3109:18
#27 0x7f186f959055 in g_test_run debian/build/deb/../../../glib/gtestutils.c:2231:7
#28 0x7f186f959055 in g_test_run debian/build/deb/../../../glib/gtestutils.c:2218:1
#29 0x55c0c0a6e427 in main tests/qtest/migration-test.c:4033:11
Unref the object after we've confirmed that it is what we expect.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
If you invoke the migration-test binary in such a way that it doesn't run
any tests, then we never call bootfile_create(), and at the end of
main() bootfile_delete() will try to unlink(NULL), which is not valid.
This can happen if for instance you tell the test binary to run a
subset of tests that turns out to be empty, like this:
(cd build/asan && QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test --tap -k -p bang)
# random seed: R02S6501b289ff8ced4231ba452c3a87bc6f
# Skipping test: userfaultfd not available
1..0
../../tests/qtest/migration-test.c:182:12: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/include/unistd.h:858:48: note: nonnull attribute specified here
Handle this by making bootfile_delete() not needing to do anything
because bootfile_create() was never called.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
[fixed conflict with aee07f2563]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
I fumbled one of my last pull requests when fixing in-tree an issue
with commit 87d67fadb9 ("monitor: Stop removing non-duplicated
fds"). Basically mixed-up my `git add -p` and `git checkout -p` and
committed a piece of test infra that has not been reviewed yet.
This has not caused any bad symptoms because the test is not enabled
by default anywhere: make check doesn't use two qemu binaries and the
CI doesn't have PYTHON set for the compat tests. Besides, the test
works fine anyway, it would not break anything.
Remove this because it was never intended to be merged.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Freeform sections with titles are currently generating a TOC entry for
the first paragraph in the section after the header, which is not what
we want.
(Easiest to observe directly in the QMP reference manual's
"Introduction" section.)
When freeform sections are parsed, we create both a section header *and*
an empty, title-less section. This causes some problems with sphinx's
post-parse tree transforms, see also 2664f317 - this is a similar issue:
Sphinx doesn't like section-less titles and it also doesn't like
title-less sections.
Modify qapidoc.py to parse text directly into the preceding section
title as child nodes, eliminating the section duplication. This removes
the extra text from the TOC.
Only very, very lightly tested: "it looks right at a glance" ™️. I am
still in the process of rewriting qapidoc, so I didn't give it much
deeper thought.
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-ID: <20240822204803.1649762-1-jsnow@redhat.com>
Commit 3e7ef738 plugged the use-after-free of the global nbd_server
object, but overlooked a use-after-free of nbd_server->listener.
Although this race is harder to hit, notice that our shutdown path
first drops the reference count of nbd_server->listener, then triggers
actions that can result in a pending client reaching the
nbd_blockdev_client_closed() callback, which in turn calls
qio_net_listener_set_client_func on a potentially stale object.
If we know we don't want any more clients to connect, and have already
told the listener socket to shut down, then we should not be trying to
update the listener socket's associated function.
Reproducer:
> #!/usr/bin/python3
>
> import os
> from threading import Thread
>
> def start_stop():
> while 1:
> os.system('virsh qemu-monitor-command VM \'{"execute": "nbd-server-start",
+"arguments":{"addr":{"type":"unix","data":{"path":"/tmp/nbd-sock"}}}}\'')
> os.system('virsh qemu-monitor-command VM \'{"execute": "nbd-server-stop"}\'')
>
> def nbd_list():
> while 1:
> os.system('/path/to/build/qemu-nbd -L -k /tmp/nbd-sock')
>
> def test():
> sst = Thread(target=start_stop)
> sst.start()
> nlt = Thread(target=nbd_list)
> nlt.start()
>
> sst.join()
> nlt.join()
>
> test()
Fixes: CVE-2024-7409
Fixes: 3e7ef738c8 ("nbd/server: CVE-2024-7409: Close stray clients at server-stop")
CC: qemu-stable@nongnu.org
Reported-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20240822143617.800419-2-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
A test run may create boot files several times. Delete the previous boot
file before creating a new one.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20240823-san-v4-7-a24c6dfa4ceb@daynix.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The qtests are broken since a while in the MSYS2 job in the gitlab-CI,
likely due to some changes in the MSYS2 environment. So far nobody has
neither a clue what's going wrong here, nor an idea how to fix this
(in fact most QEMU developers even don't have a Windows environment
available for properly analyzing this problem), so we should disable the
qtests here for the time being to get at least test coverage again
for the remaining tests that are run here.
Since we already get compile-test coverage for the system emulation
in the cross-win64-system job, and since the MSYS2 job is one of the
longest running jobs in our CI (it takes more than 1 hour to complete),
let's seize the opportunity and also cut the run time by disabling
the system emulation completely here, including the libraries that
are only useful for system emulation. In case somebody ever figures
out the failure of the qtests on MSYS2, we can revert this patch
to get everything back.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240820170142.55324-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
In commit 412d294ffd we tried to improve the error message printed when
the machine type is unknown, but we used the wrong variable, resulting in:
$ ./build/x86/qemu-system-aarch64 -M bang
qemu-system-aarch64: unsupported machine type: "(null)"
Use -machine help to list supported machines
Use the right variable, so we produce more helpful output:
$ ./build/x86/qemu-system-aarch64 -M bang
qemu-system-aarch64: unsupported machine type: "bang"
Use -machine help to list supported machines
Note that we must move the qdict_del() to below the error_setg(),
because machine_type points into the value of that qdict entry,
and deleting it will make the pointer invalid.
Cc: qemu-stable@nongnu.org
Fixes: 412d294ffd ("vl.c: select_machine(): add selected machine type to error message")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>