mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Fabiano Rosas	90e0eeb99b	migration/multifd: Add a couple of asserts for p->iov Check that p->iov is indeed always allocated and freed by the MultiFDMethods hooks. Suggested-by: Peter Xu <peterx@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:37 -03:00
Fabiano Rosas	81b0ed8ad8	migration/multifd: Stop changing the packet on recv side As observed by Philippe, the multifd_ram_unfill_packet() function currently leaves the MultiFDPacket structure with mixed endianness. This is harmless, but ultimately not very clean. Stop touching the received packet and do the necessary work using stack variables instead. While here tweak the error strings and fix the space before semicolons. Also remove the "100 times bigger" comment because it's just one possible explanation for a size mismatch and it doesn't even match the code. CC: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:36 -03:00
Fabiano Rosas	308d165c77	migration/multifd: Make MultiFDMethods const The methods are defined at module_init time and don't ever change. Make them const. Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:36 -03:00
Fabiano Rosas	40c9471e40	migration/multifd: Move nocomp code into multifd-nocomp.c In preparation for adding new payload types to multifd, move most of the no-compression code into multifd-nocomp.c. Let's try to keep a semblance of layering by not mixing general multifd control flow with the details of transmitting pages of ram. There are still some pieces leftover, namely the p->normal, p->zero, etc variables that we use for zero page tracking and the packet allocation which is heavily dependent on the ram code. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:36 -03:00
Fabiano Rosas	dc6327d99c	migration/multifd: Register nocomp ops dynamically Prior to moving the ram code into multifd-nocomp.c, change the code to register the nocomp ops dynamically so we don't need to have the ops structure defined in multifd.c. While here, move the ops struct initialization to the end of the file to make the next diff cleaner. Reviewed-by: Prasad Pandit <pjp@fedoraproject.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:36 -03:00
Fabiano Rosas	6f848dac4a	migration/multifd: Standardize on multifd ops names Add the multifd_ prefix to all functions and remove the useless docstrings. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:36 -03:00
Fabiano Rosas	a0c78d815c	migration/multifd: Allow multifd sync without flush Separate the multifd sync from flushing the client data to the channels. These two operations are closely related but not strictly necessary to be executed together. The multifd sync is intrinsic to how multifd works. The multiple channels operate independently and may finish IO out of order in relation to each other. This applies also between the source and destination QEMU. Flushing the data that is left in the client-owned data structures (e.g. MultiFDPages_t) prior to sync is usually the right thing to do, but that is particular to how the ram migration is implemented with several passes over dirty data. Make these two routines separate, allowing future code to call the sync by itself if needed. This also allows the usage of multifd_ram_send to be isolated to ram code. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:36 -03:00
Fabiano Rosas	a71ef5c7f3	migration/multifd: Replace multifd_send_state->pages with client data Multifd currently has a simple scheduling mechanism that distributes work to the various channels by keeping storage space within each channel and an extra space that is given to the client. Each time the client fills the space with data and calls into multifd, that space is given to the next idle channel and a free storage space is taken from the channel and given to client for the next iteration. This means we always need (#multifd_channels + 1) memory slots to operate multifd. This is fine, except that the presence of this one extra memory slot doesn't allow different types of payloads to be processed at the same time in different channels, i.e. the data type of multifd_send_state->pages needs to be the same as p->pages. For each new data type different from MultiFDPage_t that is to be handled, this logic would need to be duplicated by adding new fields to multifd_send_state, to the channels and to multifd_send_pages(). Fix this situation by moving the extra slot into the client and using only the generic type MultiFDSendData in the multifd core. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:35 -03:00
Fabiano Rosas	d7e58f412c	migration/multifd: Don't send ram data during SYNC Skip saving and loading any ram data in the packet in the case of a SYNC. This fixes a shortcoming of the current code which requires a reset of the MultiFDPages_t fields right after the previous pending_job finishes, otherwise the very next job might be a SYNC and multifd_send_fill_packet() will put the stale values in the packet. By not calling multifd_ram_fill_packet(), we can stop resetting MultiFDPages_t in the multifd core and leave that to the client code. Actually moving the reset function is not yet done because pages->num==0 is used by the client code to determine whether the MultiFDPages_t needs to be flushed. The subsequent patches will replace that with a generic flag that is not dependent on MultiFDPages_t. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:35 -03:00
Fabiano Rosas	87bb9e953e	migration/multifd: Isolate ram pages packet data While we cannot yet disentangle the multifd packet from page data, we can make the code a bit cleaner by setting the page-related fields in a separate function. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:35 -03:00
Fabiano Rosas	96d396bf50	migration/multifd: Remove total pages tracing The total_normal_pages and total_zero_pages elements are used only for the end tracepoints of the multifd threads. These are not super useful since they record per-channel numbers and are just the sum of all the pages that are transmitted per-packet, for which we already have tracepoints. Remove the totals from the tracing. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:35 -03:00
Fabiano Rosas	5aff71767c	migration/multifd: Move pages accounting into multifd_send_zero_page_detect() All references to pages are being removed from the multifd worker threads in order to allow multifd to deal with different payload types. multifd_send_zero_page_detect() is called by all multifd migration paths that deal with pages and is the last spot where zero pages and normal page amounts are adjusted. Move the pages accounting into that function. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:35 -03:00
Fabiano Rosas	9f0e108901	migration/multifd: Replace p->pages with an union pointer We want multifd to be able to handle more types of data than just ram pages. To start decoupling multifd from pages, replace p->pages (MultiFDPages_t) with the new type MultiFDSendData that hides the client payload inside an union. The general idea here is to isolate functions that need to handle MultiFDPages_t and move them in the future to multifd-ram.c, while multifd.c will stay with only the core functions that handle MultiFDSendData/MultiFDRecvData. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:35 -03:00
Fabiano Rosas	0e427da096	migration/multifd: Make MultiFDPages_t:offset a flexible array member We're about to use MultiFDPages_t from inside the MultiFDSendData payload union, which means we cannot have pointers to allocated data inside the pages structure, otherwise we'd lose the reference to that memory once another payload type touches the union. Move the offset array into the end of the structure and turn it into a flexible array member, so it is allocated along with the rest of MultiFDSendData in the next patches. Note that other pointers, such as the ramblock pointer are still fine as long as the storage for them is not owned by the migration code and can be correctly released at some point. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:35 -03:00
Fabiano Rosas	112f7d1b75	migration/multifd: Pass in MultiFDPages_t to file_write_ramblock_iov We want to stop dereferencing 'pages' so it can be replaced by an opaque pointer in the next patches. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:35 -03:00
Fabiano Rosas	171056ec91	migration/multifd: Remove pages->allocated This value never changes and is always the same as page_count. We don't need a copy of it per-channel plus one in the extra slot. Remove it. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:34 -03:00
Fabiano Rosas	90fa121c6c	migration/multifd: Inline page_size and page_count The MultiFD*Params structures are for per-channel data. Constant values should not be there because that needlessly wastes cycles and storage. The page_size and page_count fall into this category so move them inline in multifd.h. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:34 -03:00
Fabiano Rosas	bc112a6c90	migration/multifd: Reduce access to p->pages I'm about to replace the p->pages pointer with an opaque pointer, so do a cleanup now to reduce direct accesses to p->page, which makes the next diffs cleaner. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:34 -03:00
Peter Maydell	4c107870e8	migration/multifd: Free MultiFDRecvParams::data In multifd_recv_setup() we allocate (among other things) * a MultiFDRecvData struct to multifd_recv_state::data * a MultiFDRecvData struct to each multfd_recv_state->params[i].data (Then during execution we might swap these pointers around.) But in multifd_recv_cleanup() we free multifd_recv_state->data in multifd_recv_cleanup_state() but we don't ever free the multifd_recv_state->params[i].data. This results in a memory leak reported by LeakSanitizer: (cd build/asan && \ ASAN_OPTIONS="fast_unwind_on_malloc=0:strip_path_prefix=/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../" \ QTEST_QEMU_BINARY=./qemu-system-x86_64 \ ./tests/qtest/migration-test --tap -k -p /x86_64/migration/multifd/file/mapped-ram ) [...] Direct leak of 72 byte(s) in 3 object(s) allocated from: #0 0x561cc0afcfd8 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218efd8) (BuildId: be72e086d4e47b172b0a72779972213fd9916466) #1 0x7f89d37acc50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13 #2 0x561cc1e9c83c in multifd_recv_setup migration/multifd.c:1606:19 #3 0x561cc1e68618 in migration_ioc_process_incoming migration/migration.c:972:9 #4 0x561cc1e3ac59 in migration_channel_process_incoming migration/channel.c:45:9 #5 0x561cc1e4fa0b in file_accept_incoming_migration migration/file.c:132:5 #6 0x561cc30f2c0c in qio_channel_fd_source_dispatch io/channel-watch.c:84:12 #7 0x7f89d37a3c43 in g_main_dispatch debian/build/deb/../../../glib/gmain.c:3419:28 #8 0x7f89d37a3c43 in g_main_context_dispatch debian/build/deb/../../../glib/gmain.c:4137:7 #9 0x561cc3b21659 in glib_pollfds_poll util/main-loop.c:287:9 #10 0x561cc3b1ff93 in os_host_main_loop_wait util/main-loop.c:310:5 #11 0x561cc3b1fb5c in main_loop_wait util/main-loop.c:589:11 #12 0x561cc1da2917 in qemu_main_loop system/runstate.c:801:9 #13 0x561cc3796c1c in qemu_default_main system/main.c:37:14 #14 0x561cc3796c67 in main system/main.c:48:12 #15 0x7f89d163bd8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16 #16 0x7f89d163be3f in __libc_start_main csu/../csu/libc-start.c:392:3 #17 0x561cc0a79fa4 in _start (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x210bfa4) (BuildId: be72e086d4e47b172b0a72779972213fd9916466) Direct leak of 24 byte(s) in 1 object(s) allocated from: #0 0x561cc0afcfd8 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218efd8) (BuildId: be72e086d4e47b172b0a72779972213fd9916466) #1 0x7f89d37acc50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13 #2 0x561cc1e9bed9 in multifd_recv_setup migration/multifd.c:1588:32 #3 0x561cc1e68618 in migration_ioc_process_incoming migration/migration.c:972:9 #4 0x561cc1e3ac59 in migration_channel_process_incoming migration/channel.c:45:9 #5 0x561cc1e4fa0b in file_accept_incoming_migration migration/file.c:132:5 #6 0x561cc30f2c0c in qio_channel_fd_source_dispatch io/channel-watch.c:84:12 #7 0x7f89d37a3c43 in g_main_dispatch debian/build/deb/../../../glib/gmain.c:3419:28 #8 0x7f89d37a3c43 in g_main_context_dispatch debian/build/deb/../../../glib/gmain.c:4137:7 #9 0x561cc3b21659 in glib_pollfds_poll util/main-loop.c:287:9 #10 0x561cc3b1ff93 in os_host_main_loop_wait util/main-loop.c:310:5 #11 0x561cc3b1fb5c in main_loop_wait util/main-loop.c:589:11 #12 0x561cc1da2917 in qemu_main_loop system/runstate.c:801:9 #13 0x561cc3796c1c in qemu_default_main system/main.c:37:14 #14 0x561cc3796c67 in main system/main.c:48:12 #15 0x7f89d163bd8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16 #16 0x7f89d163be3f in __libc_start_main csu/../csu/libc-start.c:392:3 #17 0x561cc0a79fa4 in _start (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x210bfa4) (BuildId: be72e086d4e47b172b0a72779972213fd9916466) SUMMARY: AddressSanitizer: 96 byte(s) leaked in 4 allocation(s). Free the params[i].data too. Cc: qemu-stable@nongnu.org Fixes: `d117ed0699` ("migration/multifd: Allow receiving pages without packets") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-08-20 12:44:13 -03:00
Fabiano Rosas	0bd5b9284f	migration/multifd: Fix multifd_send_setup cleanup when channel creation fails When a channel fails to create, the code currently just returns. This is wrong for two reasons: 1) Channel n+1 will not get to initialize it's semaphores, leading to an assert when terminate_threads tries to post to it: qemu-system-x86_64: ../util/qemu-thread-posix.c:92: qemu_mutex_lock_impl: Assertion `mutex->initialized' failed. 2) (theoretical) If channel n-1 already started creation it will defeat the purpose of the channels_created logic which is in place to avoid migrate_fd_cleanup() to run while channels are still being created. This cannot really happen today because the current failure cases for multifd_new_send_channel_create() are all synchronous, resulting from qio_channel_file_new_path() getting a bad filename. This would hit all channels equally. But I don't want to set a trap for future people, so have all channels try to create (even if failing), and only fail after the channels_created semaphore has been posted. While here, remove the error_report_err call. There's one already at migrate_fd_cleanup later on. Cc: qemu-stable@nongnu.org Reported-by: Jim Fehlig <jfehlig@suse.com> Fixes: `b7b03eb614` ("migration/multifd: Add outgoing QIOChannelFile support") Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-08-02 09:47:40 -03:00
Peter Xu	60ce47675d	migration: Rename thread debug names The postcopy thread names on dest QEMU are slightly confusing, partly I'll need to blame myself on `36f62f11e4` ("migration: Postcopy preemption preparation on channel creation"). E.g., "fault-fast" reads like a fast version of "fault-default", but it's actually the fast version of "postcopy/listen". Taking this chance, rename all the migration threads with proper rules. Considering we only have 15 chars usable, prefix all threads with "mig/", meanwhile identify src/dst threads properly this time. So now most thread names will look like "mig/DIR/xxx", where DIR will be "src"/"dst", except the bg-snapshot thread which doesn't have a direction. For multifd threads, making them "mig/{src\|dst}/{send\|recv}_%d". We used to have "live_migration" thread for a very long time, now it's called "mig/src/main". We may hope to have "mig/dst/main" soon but not yet. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Zhijian Li (Fujitsu) <lizhijian@fujitsu.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-06-21 09:47:59 -03:00
Yuan Liu	d9d3e4f243	migration/multifd: put IOV initialization into compression method Different compression methods may require different numbers of IOVs. Based on streaming compression of zlib and zstd, all pages will be compressed to a data block, so two IOVs are needed for packet header and compressed data block. Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-06-14 14:01:28 -03:00
Yuan Liu	5ef7e26bdb	migration/multifd: solve zero page causing multiple page faults Implemented recvbitmap tracking of received pages in multifd. If the zero page appears for the first time in the recvbitmap, this page is not checked and set. If the zero page has already appeared in the recvbitmap, there is no need to check the data but directly set the data to 0, because it is unlikely that the zero page will be migrated multiple times. Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240401154110.2028453-2-yuan1.liu@intel.com [peterx: touch up the comment, as the bitmap is used outside postcopy now] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-04-23 18:36:01 -04:00
Fabiano Rosas	8fa1a21c6e	migration/multifd: Fix clearing of mapped-ram zero pages When the zero page detection is done in the multifd threads, we need to iterate the second part of the pages->offset array and clear the file bitmap for each zero page. The piece of code we merged to do that is wrong. The reason this has passed all the tests is because the bitmap is initialized with zeroes already, so clearing the bits only really has an effect during live migration and when a data page goes from having data to no data. Fixes: `303e6f54f9` ("migration/multifd: Implement zero page transmission on the multifd thread.") Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240321201242.6009-1-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-22 12:12:08 -04:00
Fabiano Rosas	bd4480b0d0	migration: Revert mapped-ram multifd support to fd: URI This reverts commit `decdc76772` in full and also the relevant migration-tests from `7a09f09283`. After the addition of the new QAPI-based migration address API in 8.2 we've been converting an "fd:" URI into a SocketAddress, missing the fact that the "fd:" syntax could also be used for a plain file instead of a socket. This is a problem because the SocketAddress is part of the API, so we're effectively asking users to create a "socket" channel to pass in a plain file. The easiest way to fix this situation is to deprecate the usage of both SocketAddress and "fd:" when used with a plain file for migration. Since this has been possible since 8.2, we can wait until 9.1 to deprecate it. For 9.0, however, we should avoid adding further support to migration to a plain file using the old "fd:" syntax or the new SocketAddress API, and instead require the usage of either the old-style "file:" URI or the FileMigrationArgs::filename field of the new API with the "/dev/fdset/NN" syntax, both of which are already supported. Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240319210941.1907-1-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-22 12:12:08 -04:00
Hao Xiang	303e6f54f9	migration/multifd: Implement zero page transmission on the multifd thread. 1. Add zero_pages field in MultiFDPacket_t. 2. Implements the zero page detection and handling on the multifd threads for non-compression, zlib and zstd compression backends. 3. Added a new value 'multifd' in ZeroPageDetection enumeration. 4. Adds zero page counters and updates multifd send/receive tracing format to track the newly added counters. Signed-off-by: Hao Xiang <hao.xiang@bytedance.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240311180015.3359271-5-hao.xiang@linux.dev Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-11 16:57:09 -04:00
Fabiano Rosas	c3cdf3fb18	migration/multifd: Allow clearing of the file_bmap from multifd We currently only need to clear the mapped-ram file bitmap from the migration thread during save_zero_page. We're about to add support for zero page detection on the multifd thread, so allow ramblock_set_file_bmap_atomic() to also clear the bits. Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240311180015.3359271-3-hao.xiang@linux.dev Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-11 16:56:52 -04:00
Fabiano Rosas	61dec06082	migration/multifd: Don't fsync when closing QIOChannelFile Commit bc38feddeb ("io: fsync before closing a file channel") added a fsync/fdatasync at the closing point of the QIOChannelFile to ensure integrity of the migration stream in case of QEMU crash. The decision to do the sync at qio_channel_close() was not the best since that function runs in the main thread and the fsync can cause QEMU to hang for several minutes, depending on the migration size and disk speed. To fix the hang, remove the fsync from qio_channel_file_close(). At this moment, the migration code is the only user of the fsync and we're taking the tradeoff of not having a sync at all, leaving the responsibility to the upper layers. Fixes: bc38feddeb ("io: fsync before closing a file channel") Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240305195629.9922-1-farosas@suse.de Link: https://lore.kernel.org/r/20240305174332.2553-1-farosas@suse.de [peterx: add more comment to the qio_channel_close()] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-11 14:41:40 -04:00
Peter Xu	1a6e217c35	migration/multifd: Document two places for mapped-ram Add two documentations for mapped-ram migration on two spots that may not be extremely clear. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240301091524.39900-1-peterx@redhat.com Cc: Prasad Pandit <ppandit@redhat.com> [peterx: fix two English errors per Prasad] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-04 08:31:11 +08:00
Fabiano Rosas	decdc76772	migration/multifd: Add mapped-ram support to fd: URI If we receive a file descriptor that points to a regular file, there's nothing stopping us from doing multifd migration with mapped-ram to that file. Enable the fd: URI to work with multifd + mapped-ram. Note that the fds passed into multifd are duplicated because we want to avoid cross-thread effects when doing cleanup (i.e. close(fd)). The original fd doesn't need to be duplicated because monitor_get_fd() transfers ownership to the caller. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240229153017.2221-23-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	a49d15a38d	migration/multifd: Support incoming mapped-ram stream format For the incoming mapped-ram migration we need to read the ramblock headers, get the pages bitmap and send the host address of each non-zero page to the multifd channel thread for writing. Usage on HMP is: (qemu) migrate_set_capability multifd on (qemu) migrate_set_capability mapped-ram on (qemu) migrate_incoming file:migfile (the ram.h include needs to move because we've been previously relying on it being included from migration.c. Now file.h will start including multifd.h before migration.o is processed) Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-22-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	f427d90b98	migration/multifd: Support outgoing mapped-ram stream format The new mapped-ram stream format uses a file transport and puts ram pages in the migration file at their respective offsets and can be done in parallel by using the pwritev system call which takes iovecs and an offset. Add support to enabling the new format along with multifd to make use of the threading and page handling already in place. This requires multifd to stop sending headers and leaving the stream format to the mapped-ram code. When it comes time to write the data, we need to call a version of qio_channel_write that can take an offset. Usage on HMP is: (qemu) stop (qemu) migrate_set_capability multifd on (qemu) migrate_set_capability mapped-ram on (qemu) migrate_set_parameter max-bandwidth 0 (qemu) migrate_set_parameter multifd-channels 8 (qemu) migrate file:migfile Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-21-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	2dd7ee7a51	migration/multifd: Add incoming QIOChannelFile support On the receiving side we don't need to differentiate between main channel and threads, so whichever channel is defined first gets to be the main one. And since there are no packets, use the atomic channel count to index into the params array. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-19-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	b7b03eb614	migration/multifd: Add outgoing QIOChannelFile support Allow multifd to open file-backed channels. This will be used when enabling the mapped-ram migration stream format which expects a seekable transport. The QIOChannel read and write methods will use the preadv/pwritev versions which don't update the file offset at each call so we can reuse the fd without re-opening for every channel. Contrary to the socket migration, the file migration doesn't need an asynchronous channel creation process, so expose multifd_channel_connect() and call it directly. Note that this is just setup code and multifd cannot yet make use of the file channels. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240229153017.2221-18-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	a8a3e7102c	migration/multifd: Add a wrapper for channels_created We'll need to access multifd_send_state->channels_created from outside multifd.c, so introduce a helper for that. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-17-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	d117ed0699	migration/multifd: Allow receiving pages without packets Currently multifd does not need to have knowledge of pages on the receiving side because all the information needed is within the packets that come in the stream. We're about to add support to mapped-ram migration, which cannot use packets because it expects the ramblock section in the migration file to contain only the guest pages data. Add a data structure to transfer pages between the ram migration code and the multifd receiving threads. We don't want to reuse MultiFDPages_t for two reasons: a) multifd threads don't really need to know about the data they're receiving. b) the receiving side has to be stopped to load the pages, which means we can experiment with larger granularities than page size when transferring data. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-16-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	06833d83f8	migration/multifd: Allow multifd without packets For the upcoming support to the new 'mapped-ram' migration stream format, we cannot use multifd packets because each write into the ramblock section in the migration file is expected to contain only the guest pages. They are written at their respective offsets relative to the ramblock section header. There is no space for the packet information and the expected gains from the new approach come partly from being able to write the pages sequentially without extraneous data in between. The new format also simply doesn't need the packets and all necessary information can be taken from the standard migration headers with some (future) changes to multifd code. Use the presence of the mapped-ram capability to decide whether to send packets. This only moves code under multifd_use_packets(), it has no effect for now as mapped-ram cannot yet be enabled with multifd. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-15-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	9db1912513	migration/multifd: Decouple recv method from pages Next patches will abstract the type of data being received by the channels, so do some cleanup now to remove references to pages and dependency on 'normal_num'. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-14-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	4aac6b1e9b	migration/multifd: Cleanup multifd_recv_sync_main Some minor cleanups and documentation for multifd_recv_sync_main. Use thread_count as done in other parts of the code. Remove p->id from the multifd_recv_state sync, since that is global and not tied to a channel. Add documentation for the sync steps. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-2-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Peter Xu	c9a7e83c9d	migration/multifd: Drop unnecessary helper to destroy IOC Both socket_send_channel_destroy() and multifd_send_channel_destroy() are unnecessary wrappers to destroy an IOC, as the only thing to do is to release the final IOC reference. We have plenty of code that destroys an IOC using direct unref() already; keep that style. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240222095301.171137-6-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Peter Xu	72b90b9687	migration/multifd: Cleanup outgoing_args in state destroy outgoing_args is a global cache of socket address to be reused in multifd. Freeing the cache in per-channel destructor is more or less a hack. Move it to multifd_send_cleanup_state() so it only get checked once. Use a small helper to do so because it's internal of socket.c. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240222095301.171137-5-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Peter Xu	770de49c00	migration/multifd: Make multifd_channel_connect() return void It never fails, drop the retval and also the Error**. Suggested-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240222095301.171137-4-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Peter Xu	0518b5d8d3	migration/multifd: Drop registered_yank With a clear definition of p->c protocol, where we only set it up if the channel is fully established (TLS or non-TLS), registered_yank boolean will have equal meaning of "p->c != NULL". Drop registered_yank by checking p->c instead. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240222095301.171137-3-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Peter Xu	9221e3c6a2	migration/multifd: Cleanup TLS iochannel referencing Commit `a1af605bd5` ("migration/multifd: fix hangup with TLS-Multifd due to blocking handshake") introduced a thread for TLS channels, which will resolve the issue on blocking the main thread. However in the same commit p->c is slightly abused just to be able to pass over the pointer "p" into the thread. That's the major reason we'll need to conditionally free the io channel in the fault paths. To clean it up, using a separate structure to pass over both "p" and "tioc" in the tls handshake thread. Then we can make it a rule that p->c will never be set until the channel is completely setup. With that, we can drop the tricky conditional unref of the io channel in the error path. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240222095301.171137-2-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Fabiano Rosas	d13f0026c7	migration/multifd: Release recv sem_sync earlier Now that multifd_recv_terminate_threads() is called only once, release the recv side sem_sync earlier like we do for the send side. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240220224138.24759-6-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Fabiano Rosas	11dd7be575	migration/multifd: Remove p->quit from recv side Like we did on the sending side, replace the p->quit per-channel flag with a global atomic 'exiting' flag. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240220224138.24759-5-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Fabiano Rosas	93fa9dc2e0	migration/multifd: Add a synchronization point for channel creation It is possible that one of the multifd channels fails to be created at multifd_new_send_channel_async() while the rest of the channel creation tasks are still in flight. This could lead to multifd_save_cleanup() executing the qemu_thread_join() loop too early and not waiting for the threads which haven't been created yet, leading to the freeing of resources that the newly created threads will try to access and crash. Add a synchronization point after which there will be no attempts at thread creation and therefore calling multifd_save_cleanup() past that point will ensure it properly waits for the threads. A note about performance: Prior to this patch, if a channel took too long to be established, other channels could finish connecting first and already start taking load. Now we're bounded by the slowest-connecting channel. Reported-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240206215118.6171-7-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-07 09:53:18 +08:00
Fabiano Rosas	2576ae488e	migration/multifd: Unify multifd and TLS connection paths During multifd channel creation (multifd_send_new_channel_async) when TLS is enabled, the multifd_channel_connect function is called twice, once to create the TLS handshake thread and another time after the asynchrounous TLS handshake has finished. This creates a slightly confusing call stack where multifd_channel_connect() is called more times than the number of channels. It also splits error handling between the two callers of multifd_channel_connect() causing some code duplication. Lastly, it gets in the way of having a single point to determine whether all channel creation tasks have been initiated. Refactor the code to move the reentrancy one level up at the multifd_new_send_channel_async() level, de-duplicating the error handling and allowing for the next patch to introduce a synchronization point common to all the multifd channel creation, regardless of TLS. Note that the previous code would never fail once p->c had been set. This patch changes this assumption, which affects refcounting, so add comments around object_unref to explain the situation. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240206215118.6171-6-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-07 09:53:18 +08:00
Fabiano Rosas	bd8b0a8f82	migration/multifd: Move multifd_send_setup error handling in to the function Hide the error handling inside multifd_send_setup to make it cleaner for the next patch to move the function around. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240206215118.6171-4-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-07 09:53:18 +08:00
Fabiano Rosas	a2a63c4abd	migration/multifd: Remove p->running We currently only need p->running to avoid calling qemu_thread_join() on a non existent thread if the thread has never been created. However, there are at least two bugs in this logic: 1) On the sending side, p->running is set too early and qemu_thread_create() can be skipped due to an error during TLS handshake, leaving the flag set and leading to a crash when multifd_send_cleanup() calls qemu_thread_join(). 2) During exit, the multifd thread clears the flag while holding the channel lock. The counterpart at multifd_send_cleanup() reads the flag outside of the lock and might free the mutex while the multifd thread still has it locked. Fix the first issue by setting the flag right before creating the thread. Rename it from p->running to p->thread_created to clarify its usage. Fix the second issue by not clearing the flag at the multifd thread exit. We don't have any use for that. Note that these bugs are straight-forward logic issues and not race conditions. There is still a gap for races to affect this code due to multifd_send_cleanup() being allowed to run concurrently with the thread creation loop. This issue is solved in the next patches. Cc: qemu-stable <qemu-stable@nongnu.org> Fixes: `2964714015` ("migration/tls: add support for multifd tls-handshake") Reported-by: Avihai Horon <avihaih@nvidia.com> Reported-by: chenyuhui5@huawei.com Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240206215118.6171-3-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-07 09:53:18 +08:00

1 2 3 4

178 Commits