mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Peter Maydell	7d4e29ef80	QAPI patches patches for 2024-03-04 -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmXlaSISHGFybWJydUBy ZWRoYXQuY29tAAoJEDhwtADrkYZTdZ8P/iMgqLoAFkCCjwfkUc/rqZUezK52Ynr7 LYwOPI/xcYD7EnVogdRgFgjWFNoivQLP5yKsU/eRTk29pwdDzTscFm/0ztTQX/Gb ypWV+GBcu5J8mKbp1KF5w68aDD8Bat4WRfEgDQ1DV7v6CoMiUzTiF3CGXkYzqK5Y kYNq97vdEkBFvFdOl/7scs/XXN2jG27egDhMp68RTxnPHlXZiAO9/2Bul3uVe3x0 fzQ2ViYv0qLnjE/PwENDqqE3Thv3Sxp5iEeQQ6GWi07EVh07UtHpOM3RYyrTU0Sb VrTApSrg0oxlkOuR0CBd9Fi+timtbokBL0DWyUpXNTfIEZfLtA9H+8riUg3EOcDp r7a4SI/27VdPxX6Kc6zA3bi+/j1o7CLTW2LGEwuZs52nmixoo1HTWPIFdyh13g/V QjNbun0fViHb0FVLiyDlXF/7Y+EWUWIyqwwGqbvve1DyUHQmo3CUQAKGOpkeKSBe 4eGciVDgpBoKhtw9Kv6LCDj2cwZKC8DxBMibf7GHkOnAsX2mnyuHcey7HvYNCoF+ yYz7oIEXdlL2eWqg7CfBZK7lniCDln50RI4Ll1v+J4r1v1kRZGMLesTYXCdNc4ku yb4kpU4t22/RODffLE7K+fc3Onwze3fcfxlZMN66F+wFtk4KdPR2aQBE66bB8J99 vuSKlTbT4cGL =s9AR -----END PGP SIGNATURE----- Merge tag 'pull-qapi-2024-03-04' of https://repo.or.cz/qemu/armbru into staging QAPI patches patches for 2024-03-04 # -----BEGIN PGP SIGNATURE----- # # iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmXlaSISHGFybWJydUBy # ZWRoYXQuY29tAAoJEDhwtADrkYZTdZ8P/iMgqLoAFkCCjwfkUc/rqZUezK52Ynr7 # LYwOPI/xcYD7EnVogdRgFgjWFNoivQLP5yKsU/eRTk29pwdDzTscFm/0ztTQX/Gb # ypWV+GBcu5J8mKbp1KF5w68aDD8Bat4WRfEgDQ1DV7v6CoMiUzTiF3CGXkYzqK5Y # kYNq97vdEkBFvFdOl/7scs/XXN2jG27egDhMp68RTxnPHlXZiAO9/2Bul3uVe3x0 # fzQ2ViYv0qLnjE/PwENDqqE3Thv3Sxp5iEeQQ6GWi07EVh07UtHpOM3RYyrTU0Sb # VrTApSrg0oxlkOuR0CBd9Fi+timtbokBL0DWyUpXNTfIEZfLtA9H+8riUg3EOcDp # r7a4SI/27VdPxX6Kc6zA3bi+/j1o7CLTW2LGEwuZs52nmixoo1HTWPIFdyh13g/V # QjNbun0fViHb0FVLiyDlXF/7Y+EWUWIyqwwGqbvve1DyUHQmo3CUQAKGOpkeKSBe # 4eGciVDgpBoKhtw9Kv6LCDj2cwZKC8DxBMibf7GHkOnAsX2mnyuHcey7HvYNCoF+ # yYz7oIEXdlL2eWqg7CfBZK7lniCDln50RI4Ll1v+J4r1v1kRZGMLesTYXCdNc4ku # yb4kpU4t22/RODffLE7K+fc3Onwze3fcfxlZMN66F+wFtk4KdPR2aQBE66bB8J99 # vuSKlTbT4cGL # =s9AR # -----END PGP SIGNATURE----- # gpg: Signature made Mon 04 Mar 2024 06:24:34 GMT # gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653 # gpg: issuer "armbru@redhat.com" # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full] # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full] # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * tag 'pull-qapi-2024-03-04' of https://repo.or.cz/qemu/armbru: migration: simplify exec migration functions qapi: New strv_from_str_list() qapi: New QAPI_LIST_LENGTH() docs/devel/writing-monitor-commands: Minor improvements docs/devel/writing-monitor-commands: Repair a decade of rot qapi: Reject "Returns" section when command doesn't return anything qga/qapi-schema: Fix guest-set-memory-blocks documentation qga/qapi-schema: Tweak documentation of fsfreeze commands qga/qapi-schema: Clean up "Returns" sections qga/qapi-schema: Delete useless "Returns" sections qga/qapi-schema: Move error documentation to new "Errors" sections qapi/yank: Tweak @yank's error description for consistency qapi: Clean up "Returns" sections qapi: Delete useless "Returns" sections qapi: Move error documentation to new "Errors" sections qapi: New documentation section tag "Errors" qapi: Slightly clearer error message for invalid "Returns" section qapi: Memorize since & returns sections Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-03-05 11:20:15 +00:00
Peter Maydell	c90cfb5294	Migartion pull request for 20240304 - Bryan's fix on multifd compression level API - Fabiano's mapped-ram series (base + multifd only) - Steve's amend on cpr document in qapi/ -----BEGIN PGP SIGNATURE----- iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZeUjKhIccGV0ZXJ4QHJl ZGhhdC5jb20ACgkQO1/MzfOr1wbv5QD/ZexBUsmZA5qyxgGvZ2yvlUBEGNOvtmKY kRdiYPU7khMA/0N43rn4LcqKCoq4+T+EAnYizGjIyhH/7BRUyn4DUxgO =AeEn -----END PGP SIGNATURE----- Merge tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu into staging Migartion pull request for 20240304 - Bryan's fix on multifd compression level API - Fabiano's mapped-ram series (base + multifd only) - Steve's amend on cpr document in qapi/ # -----BEGIN PGP SIGNATURE----- # # iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZeUjKhIccGV0ZXJ4QHJl # ZGhhdC5jb20ACgkQO1/MzfOr1wbv5QD/ZexBUsmZA5qyxgGvZ2yvlUBEGNOvtmKY # kRdiYPU7khMA/0N43rn4LcqKCoq4+T+EAnYizGjIyhH/7BRUyn4DUxgO # =AeEn # -----END PGP SIGNATURE----- # gpg: Signature made Mon 04 Mar 2024 01:26:02 GMT # gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706 # gpg: issuer "peterx@redhat.com" # gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal] # gpg: aka "Peter Xu <peterx@redhat.com>" [marginal] # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706 * tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu: (27 commits) migration/multifd: Document two places for mapped-ram tests/qtest/migration: Add a multifd + mapped-ram migration test migration/multifd: Add mapped-ram support to fd: URI migration/multifd: Support incoming mapped-ram stream format migration/multifd: Support outgoing mapped-ram stream format migration/multifd: Prepare multifd sync for mapped-ram migration migration/multifd: Add incoming QIOChannelFile support migration/multifd: Add outgoing QIOChannelFile support migration/multifd: Add a wrapper for channels_created migration/multifd: Allow receiving pages without packets migration/multifd: Allow multifd without packets migration/multifd: Decouple recv method from pages migration/multifd: Rename MultiFDSend\|RecvParams::data to compress_data tests/qtest/migration: Add tests for mapped-ram file-based migration migration/ram: Add incoming 'mapped-ram' migration migration/ram: Add outgoing 'mapped-ram' migration migration: Add mapped-ram URI compatibility check migration/ram: Introduce 'mapped-ram' migration capability migration/qemu-file: add utility methods for working with seekable channels io: fsync before closing a file channel ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # migration/ram.c	2024-03-05 11:19:58 +00:00
Steve Sistare	018d5fb1f9	migration: simplify exec migration functions Simplify the exec migration code by using list utility functions. As a side effect, this also fixes a minor memory leak. On function return, "g_auto(GStrv) argv" frees argv and each element, which is wrong, because the function does not own the individual elements. To compensate, the code uses g_steal_pointer which NULLs argv and prevents the destructor from running, but argv is leaked. Fixes: `cbab4face5` ("migration: convert exec backend ...") Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Message-ID: <20240227153321.467343-4-armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2024-03-04 07:12:40 +01:00
Peter Xu	1a6e217c35	migration/multifd: Document two places for mapped-ram Add two documentations for mapped-ram migration on two spots that may not be extremely clear. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240301091524.39900-1-peterx@redhat.com Cc: Prasad Pandit <ppandit@redhat.com> [peterx: fix two English errors per Prasad] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-04 08:31:11 +08:00
Fabiano Rosas	decdc76772	migration/multifd: Add mapped-ram support to fd: URI If we receive a file descriptor that points to a regular file, there's nothing stopping us from doing multifd migration with mapped-ram to that file. Enable the fd: URI to work with multifd + mapped-ram. Note that the fds passed into multifd are duplicated because we want to avoid cross-thread effects when doing cleanup (i.e. close(fd)). The original fd doesn't need to be duplicated because monitor_get_fd() transfers ownership to the caller. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240229153017.2221-23-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	a49d15a38d	migration/multifd: Support incoming mapped-ram stream format For the incoming mapped-ram migration we need to read the ramblock headers, get the pages bitmap and send the host address of each non-zero page to the multifd channel thread for writing. Usage on HMP is: (qemu) migrate_set_capability multifd on (qemu) migrate_set_capability mapped-ram on (qemu) migrate_incoming file:migfile (the ram.h include needs to move because we've been previously relying on it being included from migration.c. Now file.h will start including multifd.h before migration.o is processed) Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-22-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	f427d90b98	migration/multifd: Support outgoing mapped-ram stream format The new mapped-ram stream format uses a file transport and puts ram pages in the migration file at their respective offsets and can be done in parallel by using the pwritev system call which takes iovecs and an offset. Add support to enabling the new format along with multifd to make use of the threading and page handling already in place. This requires multifd to stop sending headers and leaving the stream format to the mapped-ram code. When it comes time to write the data, we need to call a version of qio_channel_write that can take an offset. Usage on HMP is: (qemu) stop (qemu) migrate_set_capability multifd on (qemu) migrate_set_capability mapped-ram on (qemu) migrate_set_parameter max-bandwidth 0 (qemu) migrate_set_parameter multifd-channels 8 (qemu) migrate file:migfile Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-21-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	9d01778af8	migration/multifd: Prepare multifd sync for mapped-ram migration The mapped-ram migration can be performed live or non-live, but it is always asynchronous, i.e. the source machine and the destination machine are not migrating at the same time. We only need some pieces of the multifd sync operations. multifd_send_sync_main() ------------------------ Issued by the ram migration code on the migration thread, causes the multifd send channels to synchronize with the migration thread and makes the sending side emit a packet with the MULTIFD_FLUSH flag. With mapped-ram we want to maintain the sync on the sending side because that provides ordering between the rounds of dirty pages when migrating live. MULTIFD_FLUSH ------------- On the receiving side, the presence of the MULTIFD_FLUSH flag on a packet causes the receiving channels to start synchronizing with the main thread. We're not using packets with mapped-ram, so there's no MULTIFD_FLUSH flag and therefore no channel sync on the receiving side. multifd_recv_sync_main() ------------------------ Issued by the migration thread when the ram migration flag RAM_SAVE_FLAG_MULTIFD_FLUSH is received, causes the migration thread on the receiving side to start synchronizing with the recv channels. Due to compatibility, this is also issued when RAM_SAVE_FLAG_EOS is received. For mapped-ram we only need to synchronize the channels at the end of migration to avoid doing cleanup before the channels have finished their IO. Make sure the multifd syncs are only issued at the appropriate times. Note that due to pre-existing backward compatibility issues, we have the multifd_flush_after_each_section property that can cause a sync to happen at EOS. Since the EOS flag is needed on the stream, allow mapped-ram to just ignore it. Also emit an error if any other unexpected flags are found on the stream. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240229153017.2221-20-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	2dd7ee7a51	migration/multifd: Add incoming QIOChannelFile support On the receiving side we don't need to differentiate between main channel and threads, so whichever channel is defined first gets to be the main one. And since there are no packets, use the atomic channel count to index into the params array. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-19-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	b7b03eb614	migration/multifd: Add outgoing QIOChannelFile support Allow multifd to open file-backed channels. This will be used when enabling the mapped-ram migration stream format which expects a seekable transport. The QIOChannel read and write methods will use the preadv/pwritev versions which don't update the file offset at each call so we can reuse the fd without re-opening for every channel. Contrary to the socket migration, the file migration doesn't need an asynchronous channel creation process, so expose multifd_channel_connect() and call it directly. Note that this is just setup code and multifd cannot yet make use of the file channels. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240229153017.2221-18-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	a8a3e7102c	migration/multifd: Add a wrapper for channels_created We'll need to access multifd_send_state->channels_created from outside multifd.c, so introduce a helper for that. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-17-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	d117ed0699	migration/multifd: Allow receiving pages without packets Currently multifd does not need to have knowledge of pages on the receiving side because all the information needed is within the packets that come in the stream. We're about to add support to mapped-ram migration, which cannot use packets because it expects the ramblock section in the migration file to contain only the guest pages data. Add a data structure to transfer pages between the ram migration code and the multifd receiving threads. We don't want to reuse MultiFDPages_t for two reasons: a) multifd threads don't really need to know about the data they're receiving. b) the receiving side has to be stopped to load the pages, which means we can experiment with larger granularities than page size when transferring data. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-16-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	06833d83f8	migration/multifd: Allow multifd without packets For the upcoming support to the new 'mapped-ram' migration stream format, we cannot use multifd packets because each write into the ramblock section in the migration file is expected to contain only the guest pages. They are written at their respective offsets relative to the ramblock section header. There is no space for the packet information and the expected gains from the new approach come partly from being able to write the pages sequentially without extraneous data in between. The new format also simply doesn't need the packets and all necessary information can be taken from the standard migration headers with some (future) changes to multifd code. Use the presence of the mapped-ram capability to decide whether to send packets. This only moves code under multifd_use_packets(), it has no effect for now as mapped-ram cannot yet be enabled with multifd. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-15-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	9db1912513	migration/multifd: Decouple recv method from pages Next patches will abstract the type of data being received by the channels, so do some cleanup now to remove references to pages and dependency on 'normal_num'. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-14-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	402dd7ac1c	migration/multifd: Rename MultiFDSend\|RecvParams::data to compress_data Use a more specific name for the compression data so we can use the generic for the multifd core code. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-13-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	2f6b8826a5	migration/ram: Add incoming 'mapped-ram' migration Add the necessary code to parse the format changes for the 'mapped-ram' capability. One of the more notable changes in behavior is that in the 'mapped-ram' case ram pages are restored in one go rather than constantly looping through the migration stream. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-11-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	c2d5c4a7cb	migration/ram: Add outgoing 'mapped-ram' migration Implement the outgoing migration side for the 'mapped-ram' capability. A bitmap is introduced to track which pages have been written in the migration file. Pages are written at a fixed location for every ramblock. Zero pages are ignored as they'd be zero in the destination migration as well. The migration stream is altered to put the dirty pages for a ramblock after its header instead of having a sequential stream of pages that follow the ramblock headers. Without mapped-ram (current): With mapped-ram (new): --------------------- -------------------------------- \| ramblock 1 header \| \| ramblock 1 header \| --------------------- -------------------------------- \| ramblock 2 header \| \| ramblock 1 mapped-ram header \| --------------------- -------------------------------- \| ... \| \| padding to next 1MB boundary \| --------------------- \| ... \| \| ramblock n header \| -------------------------------- --------------------- \| ramblock 1 pages \| \| RAM_SAVE_FLAG_EOS \| \| ... \| --------------------- -------------------------------- \| stream of pages \| \| ramblock 2 header \| \| (iter 1) \| -------------------------------- \| ... \| \| ramblock 2 mapped-ram header \| --------------------- -------------------------------- \| RAM_SAVE_FLAG_EOS \| \| padding to next 1MB boundary \| --------------------- \| ... \| \| stream of pages \| -------------------------------- \| (iter 2) \| \| ramblock 2 pages \| \| ... \| \| ... \| --------------------- -------------------------------- \| ... \| \| ... \| --------------------- -------------------------------- \| RAM_SAVE_FLAG_EOS \| -------------------------------- \| ... \| -------------------------------- where: - ramblock header: the generic information for a ramblock, such as idstr, used_len, etc. - ramblock mapped-ram header: the new information added by this feature: bitmap of pages written, bitmap size and offset of pages in the migration file. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-10-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	8d9e0d4100	migration: Add mapped-ram URI compatibility check The mapped-ram migration format needs a channel that supports seeking to be able to write each page to an arbitrary offset in the migration stream. Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-9-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	4ed49feb44	migration/ram: Introduce 'mapped-ram' migration capability Add a new migration capability 'mapped-ram'. The core of the feature is to ensure that RAM pages are mapped directly to offsets in the resulting migration file instead of being streamed at arbitrary points. The reasons why we'd want such behavior are: - The resulting file will have a bounded size, since pages which are dirtied multiple times will always go to a fixed location in the file, rather than constantly being added to a sequential stream. This eliminates cases where a VM with, say, 1G of RAM can result in a migration file that's 10s of GBs, provided that the workload constantly redirties memory. - It paves the way to implement O_DIRECT-enabled save/restore of the migration stream as the pages are ensured to be written at aligned offsets. - It allows the usage of multifd so we can write RAM pages to the migration file in parallel. For now, enabling the capability has no effect. The next couple of patches implement the core functionality. Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-8-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	7f5b50a401	migration/qemu-file: add utility methods for working with seekable channels Add utility methods that will be needed when implementing 'mapped-ram' migration capability. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com> Link: https://lore.kernel.org/r/20240229153017.2221-7-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Fabiano Rosas	4aac6b1e9b	migration/multifd: Cleanup multifd_recv_sync_main Some minor cleanups and documentation for multifd_recv_sync_main. Use thread_count as done in other parts of the code. Remove p->id from the multifd_recv_state sync, since that is global and not tied to a channel. Add documentation for the sync steps. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240229153017.2221-2-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 15:42:04 +08:00
Bryan Zhang	b4014a2bf5	migration: Properly apply migration compression level parameters Some glue code was missing, so that using `qmp_migrate_set_parameters` to set `multifd-zstd-level` or `multifd-zlib-level` did not work. This commit adds the glue code to fix that. Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com> Link: https://lore.kernel.org/r/20240301035901.4006936-2-bryan.zhang@bytedance.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-03-01 14:14:55 +08:00
Richard Henderson	5d2203691e	migration: Remove qemu_host_page_size Replace with the maximum of the real host page size and the target page size. This is an exact replacement. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Acked-by: Helge Deller <deller@gmx.de> Message-Id: <20240102015808.132373-12-richard.henderson@linaro.org>	2024-02-29 11:35:36 -10:00
Cédric Le Goater	9425ef3f99	migration: Use migrate_has_error() in close_return_path_on_source() close_return_path_on_source() retrieves the migration error from the the QEMUFile '->to_dst_file' to know if a shutdown is required. This shutdown is required to exit the return-path thread. Avoid relying on '->to_dst_file' and use migrate_has_error() instead. (using to_dst_file is a heuristic to infer whether rp_state.from_dst_file might be stuck on a recvmsg(). Using a generic method for detecting errors is more reliable. We also want to reduce dependency on QEMUFile::last_error) Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> [added some words about the motivation for this patch] Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240226203122.22894-3-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Fabiano Rosas	22b04245f0	migration: Join the return path thread before releasing to_dst_file The return path thread might hang at a blocking system call. Before joining the thread we might need to issue a shutdown() on the socket file descriptor to release it. To determine whether the shutdown() is necessary we look at the QEMUFile error. Make sure we only clean up the QEMUFile after the return path has been waited for. This fixes a hang when qemu_savevm_state_setup() produced an error that was detected by migration_detect_error(). That skips migration_completion() so close_return_path_on_source() would get stuck waiting for the RP thread to terminate. Reported-by: Cédric Le Goater <clg@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240226203122.22894-2-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Fabiano Rosas	63f64d77f0	migration: Fix qmp_query_migrate mbps value The QMP command query_migrate might see incorrect throughput numbers if it runs after we've set the migration completion status but before migration_calculate_complete() has updated s->total_time and s->mbps. The migration status would show COMPLETED, but the throughput value would be the one from the last iteration and not the one from the whole migration. This will usually be a larger value due to the time period being smaller (one iteration). Move migration_calculate_complete() earlier so that the status MIGRATION_STATUS_COMPLETED is only emitted after the final counters update. Keep everything under the BQL so the QMP thread sees the updates as atomic. Rename migration_calculate_complete to migration_completion_end to reflect its new purpose of also updating s->state. Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240226143335.14282-1-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Steve Sistare	cbdafc1b34	migration: options incompatible with cpr Fail the migration request if options are set that are incompatible with cpr. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-15-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Steve Sistare	9867d4ddd0	migration: stop vm for cpr When migration for cpr is initiated, stop the vm and set state RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the possibility of ram and device state being out of sync, and guarantees that a guest in the suspended state remains suspended, because qmp_cont rejects a cont command in the RUN_STATE_FINISH_MIGRATE state. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-11-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Steve Sistare	4af667f87c	migration: notifier error checking Check the status returned by migration notifiers for event type MIG_EVENT_PRECOPY_SETUP, and report errors. None of the notifiers return an error status at this time. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-10-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Steve Sistare	bf78a046b9	migration: refactor migrate_fd_connect failures Move common code for the error path in migrate_fd_connect to a shared fail label. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-9-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Steve Sistare	6835f5a1bc	migration: per-mode notifiers Keep a separate list of migration notifiers for each migration mode. Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-8-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Steve Sistare	5663dd3f1a	migration: MigrationNotifyFunc Define MigrationNotifyFunc to improve type safety and simplify migration notifiers. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-7-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Steve Sistare	c763a23e41	migration: remove postcopy_after_devices postcopy_after_devices and migration_in_postcopy_after_devices are no longer used, so delete them. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-6-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Steve Sistare	9d9babf78d	migration: MigrationEvent for notifiers Passing MigrationState to notifiers is unsound because they could access unstable migration state internals or even modify the state. Instead, pass the minimal info needed in a new MigrationEvent struct, which could be extended in the future if needed. Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-5-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Steve Sistare	3e7757301c	migration: convert to NotifierWithReturn Change all migration notifiers to type NotifierWithReturn, so notifiers can return an error status in a future patch. For now, pass NULL for the notifier error parameter, and do not check the return value. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-4-git-send-email-steven.sistare@oracle.com [peterx: dropped unexpected update to roms/seabios-hppa] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Steve Sistare	d91f33c72e	migration: remove error from notifier data Remove the error object from opaque data passed to notifiers. Use the new error parameter passed to the notifier instead. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-3-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Steve Sistare	be19d836cd	notify: pass error to notifier with return Pass an error object as the third parameter to "notifier with return" notifiers, so clients no longer need to bundle an error object in the opaque data. The new parameter is used in a later patch. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-2-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Peter Xu	c9a7e83c9d	migration/multifd: Drop unnecessary helper to destroy IOC Both socket_send_channel_destroy() and multifd_send_channel_destroy() are unnecessary wrappers to destroy an IOC, as the only thing to do is to release the final IOC reference. We have plenty of code that destroys an IOC using direct unref() already; keep that style. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240222095301.171137-6-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Peter Xu	72b90b9687	migration/multifd: Cleanup outgoing_args in state destroy outgoing_args is a global cache of socket address to be reused in multifd. Freeing the cache in per-channel destructor is more or less a hack. Move it to multifd_send_cleanup_state() so it only get checked once. Use a small helper to do so because it's internal of socket.c. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240222095301.171137-5-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Peter Xu	770de49c00	migration/multifd: Make multifd_channel_connect() return void It never fails, drop the retval and also the Error**. Suggested-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240222095301.171137-4-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Peter Xu	0518b5d8d3	migration/multifd: Drop registered_yank With a clear definition of p->c protocol, where we only set it up if the channel is fully established (TLS or non-TLS), registered_yank boolean will have equal meaning of "p->c != NULL". Drop registered_yank by checking p->c instead. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240222095301.171137-3-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Peter Xu	9221e3c6a2	migration/multifd: Cleanup TLS iochannel referencing Commit `a1af605bd5` ("migration/multifd: fix hangup with TLS-Multifd due to blocking handshake") introduced a thread for TLS channels, which will resolve the issue on blocking the main thread. However in the same commit p->c is slightly abused just to be able to pass over the pointer "p" into the thread. That's the major reason we'll need to conditionally free the io channel in the fault paths. To clean it up, using a separate structure to pass over both "p" and "tioc" in the tls handshake thread. Then we can make it a rule that p->c will never be set until the channel is completely setup. With that, we can drop the tricky conditional unref of the io channel in the error path. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240222095301.171137-2-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Fabiano Rosas	d13f0026c7	migration/multifd: Release recv sem_sync earlier Now that multifd_recv_terminate_threads() is called only once, release the recv side sem_sync earlier like we do for the send side. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240220224138.24759-6-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Fabiano Rosas	11dd7be575	migration/multifd: Remove p->quit from recv side Like we did on the sending side, replace the p->quit per-channel flag with a global atomic 'exiting' flag. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240220224138.24759-5-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-28 11:31:28 +08:00
Fabiano Rosas	93fa9dc2e0	migration/multifd: Add a synchronization point for channel creation It is possible that one of the multifd channels fails to be created at multifd_new_send_channel_async() while the rest of the channel creation tasks are still in flight. This could lead to multifd_save_cleanup() executing the qemu_thread_join() loop too early and not waiting for the threads which haven't been created yet, leading to the freeing of resources that the newly created threads will try to access and crash. Add a synchronization point after which there will be no attempts at thread creation and therefore calling multifd_save_cleanup() past that point will ensure it properly waits for the threads. A note about performance: Prior to this patch, if a channel took too long to be established, other channels could finish connecting first and already start taking load. Now we're bounded by the slowest-connecting channel. Reported-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240206215118.6171-7-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-07 09:53:18 +08:00
Fabiano Rosas	2576ae488e	migration/multifd: Unify multifd and TLS connection paths During multifd channel creation (multifd_send_new_channel_async) when TLS is enabled, the multifd_channel_connect function is called twice, once to create the TLS handshake thread and another time after the asynchrounous TLS handshake has finished. This creates a slightly confusing call stack where multifd_channel_connect() is called more times than the number of channels. It also splits error handling between the two callers of multifd_channel_connect() causing some code duplication. Lastly, it gets in the way of having a single point to determine whether all channel creation tasks have been initiated. Refactor the code to move the reentrancy one level up at the multifd_new_send_channel_async() level, de-duplicating the error handling and allowing for the next patch to introduce a synchronization point common to all the multifd channel creation, regardless of TLS. Note that the previous code would never fail once p->c had been set. This patch changes this assumption, which affects refcounting, so add comments around object_unref to explain the situation. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240206215118.6171-6-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-07 09:53:18 +08:00
Fabiano Rosas	dd904bc13f	migration/multifd: Move multifd_send_setup into migration thread We currently have an unfavorable situation around multifd channels creation and the migration thread execution. We create the multifd channels with qio_channel_socket_connect_async -> qio_task_run_in_thread, but only connect them at the multifd_new_send_channel_async callback, called from qio_task_complete, which is registered as a glib event. So at multifd_send_setup() we create the channels, but they will only be actually usable after the whole multifd_send_setup() calling stack returns back to the main loop. Which means that the migration thread is already up and running without any possibility for the multifd channels to be ready on time. We currently rely on the channels-ready semaphore blocking multifd_send_sync_main() until channels start to come up and release it. However there have been bugs recently found when a channel's creation fails and multifd_send_cleanup() is allowed to run while other channels are still being created. Let's start to organize this situation by moving the multifd_send_setup() call into the migration thread. That way we unblock the main-loop to dispatch the completion callbacks and actually have a chance of getting the multifd channels ready for when the migration thread needs them. The next patches will deal with the synchronization aspects. Note that this takes multifd_send_setup() out of the BQL. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240206215118.6171-5-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-07 09:53:18 +08:00
Fabiano Rosas	bd8b0a8f82	migration/multifd: Move multifd_send_setup error handling in to the function Hide the error handling inside multifd_send_setup to make it cleaner for the next patch to move the function around. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240206215118.6171-4-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-07 09:53:18 +08:00
Fabiano Rosas	a2a63c4abd	migration/multifd: Remove p->running We currently only need p->running to avoid calling qemu_thread_join() on a non existent thread if the thread has never been created. However, there are at least two bugs in this logic: 1) On the sending side, p->running is set too early and qemu_thread_create() can be skipped due to an error during TLS handshake, leaving the flag set and leading to a crash when multifd_send_cleanup() calls qemu_thread_join(). 2) During exit, the multifd thread clears the flag while holding the channel lock. The counterpart at multifd_send_cleanup() reads the flag outside of the lock and might free the mutex while the multifd thread still has it locked. Fix the first issue by setting the flag right before creating the thread. Rename it from p->running to p->thread_created to clarify its usage. Fix the second issue by not clearing the flag at the multifd thread exit. We don't have any use for that. Note that these bugs are straight-forward logic issues and not race conditions. There is still a gap for races to affect this code due to multifd_send_cleanup() being allowed to run concurrently with the thread creation loop. This issue is solved in the next patches. Cc: qemu-stable <qemu-stable@nongnu.org> Fixes: `2964714015` ("migration/tls: add support for multifd tls-handshake") Reported-by: Avihai Horon <avihaih@nvidia.com> Reported-by: chenyuhui5@huawei.com Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240206215118.6171-3-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-07 09:53:18 +08:00
Fabiano Rosas	e1921f10d9	migration/multifd: Join the TLS thread We're currently leaking the resources of the TLS thread by not joining it and also overwriting the p->thread pointer altogether. Fixes: `a1af605bd5` ("migration/multifd: fix hangup with TLS-Multifd due to blocking handshake") Cc: qemu-stable <qemu-stable@nongnu.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240206215118.6171-2-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-07 09:53:18 +08:00
Avihai Horon	3205bebd4f	migration: Fix logic of channels and transport compatibility check The commit in the fixes line mistakenly modified the channels and transport compatibility check logic so it now checks multi-channel support only for socket transport type. Thus, running multifd migration using a transport other than socket that is incompatible with multi-channels (such as "exec") would lead to a segmentation fault instead of an error message. For example: (qemu) migrate_set_capability multifd on (qemu) migrate -d "exec:cat > /tmp/vm_state" Segmentation fault (core dumped) Fix it by checking multi-channel compatibility for all transport types. Cc: qemu-stable <qemu-stable@nongnu.org> Fixes: `d95533e1cd` ("migration: modify migration_channels_and_uri_compatible() for new QAPI syntax") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240125162528.7552-2-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-07 09:53:00 +08:00
Peter Xu	488c84acb4	migration/multifd: Optimize sender side to be lockless When reviewing my attempt to refactor send_prepare(), Fabiano suggested we try out with dropping the mutex in multifd code [1]. I thought about that before but I never tried to change the code. Now maybe it's time to give it a stab. This only optimizes the sender side. The trick here is multifd has a clear provider/consumer model, that the migration main thread publishes requests (either pending_job/pending_sync), while the multifd sender threads are consumers. Here we don't have a lot of complicated data sharing, and the jobs can logically be submitted lockless. Arm the code with atomic weapons. Two things worth mentioning: - For multifd_send_pages(): we can use qatomic_load_acquire() when trying to find a free channel, but that's expensive if we attach one ACQUIRE per channel. Instead, keep the qatomic_read() on reading the pending_job flag as we do already, meanwhile use one smp_mb_acquire() after the loop to guarantee the memory ordering. - For pending_sync: it doesn't have any extra data required since now p->flags are never touched, it should be safe to not use memory barrier. That's different from pending_job. Provide rich comments for all the lockless operations to state how they are paired. With that, we can remove the mutex. [1] https://lore.kernel.org/r/87o7d1jlu5.fsf@suse.de Suggested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-24-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-06 11:05:49 +08:00
Peter Xu	98ea497d8b	migration/multifd: Fix MultiFDSendParams.packet_num race As reported correctly by Fabiano [1] (while per Fabiano, it sourced back to Elena's initial report in Oct 2023), MultiFDSendParams.packet_num is buggy to be assigned and stored. Consider two consequent operations of: (1) queue a job into multifd send thread X, then (2) queue another sync request to the same send thread X. Then the MultiFDSendParams.packet_num will be assigned twice, and the first assignment can get lost already. To avoid that, we move the packet_num assignment from p->packet_num into where the thread will fill in the packet. Use atomic operations to protect the field, making sure there's no race. Note that atomic fetch_add() may not be good for scaling purposes, however multifd should be fine as number of threads should normally not go beyond 16 threads. Let's leave that concern for later but fix the issue first. There's also a trick on how to make it always work even on 32 bit hosts for uint64_t packet number. Switching to uintptr_t as of now to simply the case. It will cause packet number to overflow easier on 32 bit, but that shouldn't be a major concern for now as 32 bit systems is not the major audience for any performance concerns like what multifd wants to address. We also need to move multifd_send_state definition upper, so that multifd_send_fill_packet() can reference it. [1] https://lore.kernel.org/r/87o7d1jlu5.fsf@suse.de Reported-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-23-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:11 +08:00
Peter Xu	cde85c37ca	migration/multifd: Stick with send/recv on function names Most of the multifd code uses send/recv to represent the two sides, but some rare cases use save/load. Since send/recv is the majority, replacing the save/load use cases to use send/recv globally. Now we reach a consensus on the naming. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-22-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:11 +08:00
Peter Xu	5e6ea8a1d6	migration/multifd: Cleanup multifd_load_cleanup() Use similar logic to cleanup the recv side. Note that multifd_recv_terminate_threads() may need some similar rework like the sender side, but let's leave that for later. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-21-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	12808db3b8	migration/multifd: Cleanup multifd_save_cleanup() Shrink the function by moving relevant works into helpers: move the thread join()s into multifd_send_terminate_threads(), then create two more helpers to cover channel/state cleanups. Add a TODO entry for the thread terminate process because p->running is still buggy. We need to fix it at some point but not yet covered. Suggested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-20-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	f88f86c4ee	migration/multifd: Rewrite multifd_queue_page() The current multifd_queue_page() is not easy to read and follow. It is not good with a few reasons: - No helper at all to show what exactly does a condition mean; in short, readability is low. - Rely on pages->ramblock being cleared to detect an empty queue. It's slightly an overload of the ramblock pointer, per Fabiano [1], which I also agree. - Contains a self recursion, even if not necessary.. Rewrite this function. We add some comments to make it even clearer on what it does. [1] https://lore.kernel.org/r/87wmrpjzew.fsf@suse.de Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-19-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	3b40964a86	migration/multifd: Change retval of multifd_send_pages() Using int is an overkill when there're only two options. Change it to a boolean. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-18-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	d6556d174a	migration/multifd: Change retval of multifd_queue_page() Using int is an overkill when there're only two options. Change it to a boolean. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-17-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	3ab4441d97	migration/multifd: Split multifd_send_terminate_threads() Split multifd_send_terminate_threads() into two functions: - multifd_send_set_error(): used when an error happened on the sender side, set error and quit state only - multifd_send_terminate_threads(): used only by the main thread to kick all multifd send threads out of sleep, for the last recycling. Use multifd_send_set_error() in the three old call sites where only the error will be set. Use multifd_send_terminate_threads() in the last one where the main thread will kick the multifd threads at last in multifd_save_cleanup(). Both helpers will need to set quitting=1. Suggested-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-16-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	859ebaf346	migration/multifd: Forbid spurious wakeups Now multifd's logic is designed to have no spurious wakeup. I still remember a talk to Juan and he seems to agree we should drop it now, and if my memory was right it was there because multifd used to hit that when still debugging. Let's drop it and see what can explode; as long as it's not reaching soft-freeze. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-15-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	25a1f87875	migration/multifd: Move header prepare/fill into send_prepare() This patch redefines the interfacing of ->send_prepare(). It further simplifies multifd_send_thread() especially on zero copy. Now with the new interface, we require the hook to do all the work for preparing the IOVs to send. After it's completed, the IOVs should be ready to be dumped into the specific multifd QIOChannel later. So now the API looks like: p->pages -----------> send_prepare() -------------> IOVs This also prepares for the case where the input can be extended to even not any p->pages. But that's for later. This patch will achieve similar goal of what Fabiano used to propose here: https://lore.kernel.org/r/20240126221943.26628-1-farosas@suse.de However the send() interface may not be necessary. I'm boldly attaching a "Co-developed-by" for Fabiano. Co-developed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-14-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	452b205702	migration/multifd: multifd_send_prepare_header() Introduce a helper multifd_send_prepare_header() to setup the header packet for multifd sender. It's fine to setup the IOV[0] _before_ send_prepare() because the packet buffer is already ready, even if the content is to be filled in. With this helper, we can already slightly clean up the zero copy path. Note that I explicitly put it into multifd.h, because I want it inlined directly into multifd*.c where necessary later. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-13-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	8a9ef17380	migration/multifd: Move trace_multifd_send\|recv() Move them into fill/unfill of packets. With that, we can further cleanup the send/recv thread procedure, and remove one more temp var. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-12-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	db7e1cc510	migration/multifd: Move total_normal_pages accounting Just like the previous patch, move the accounting for total_normal_pages on both src/dst sides into the packet fill/unfill procedures. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-11-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	05b7ec1890	migration/multifd: Rename p->num_packets and clean it up This field, no matter whether on src or dest, is only used for debugging purpose. They can even be removed already, unless it still more or less provide some accounting on "how many packets are sent/recved for this thread". The other more important one is called packet_num, which is embeded in the multifd packet headers (MultiFDPacket_t). So let's keep them for now, but make them much easier to understand, by doing below: - Rename both of them to packets_sent / packets_recved, the old name (num_packets) are waaay too confusing when we already have MultiFDPacket_t.packets_num. - Avoid worrying on the "initial packet": we know we will send it, that's good enough. The accounting won't matter a great deal to start with 0 or with 1. - Move them to where we send/recv the packets. They're: - multifd_send_fill_packet() for senders. - multifd_recv_unfill_packet() for receivers. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-10-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	83c560fb42	migration/multifd: Drop pages->num check in sender thread Now with a split SYNC handler, we always have pages->num set for pending_job==true. Assert it instead. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-9-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	e3cce9af10	migration/multifd: Simplify locking in sender thread The sender thread will yield the p->mutex before IO starts, trying to not block the requester thread. This may be unnecessary lock optimizations, because the requester can already read pending_job safely even without the lock, because the requester is currently the only one who can assign a task. Drop that lock complication on both sides: (1) in the sender thread, always take the mutex until job done (2) in the requester thread, check pending_job clear lockless Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-8-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	f5f48a7891	migration/multifd: Separate SYNC request with normal jobs Multifd provide a threaded model for processing jobs. On sender side, there can be two kinds of job: (1) a list of pages to send, or (2) a sync request. The sync request is a very special kind of job. It never contains a page array, but only a multifd packet telling the dest side to synchronize with sent pages. Before this patch, both requests use the pending_job field, no matter what the request is, it will boost pending_job, while multifd sender thread will decrement it after it finishes one job. However this should be racy, because SYNC is special in that it needs to set p->flags with MULTIFD_FLAG_SYNC, showing that this is a sync request. Consider a sequence of operations where: - migration thread enqueue a job to send some pages, pending_job++ (0->1) - [...before the selected multifd sender thread wakes up...] - migration thread enqueue another job to sync, pending_job++ (1->2), setup p->flags=MULTIFD_FLAG_SYNC - multifd sender thread wakes up, found pending_job==2 - send the 1st packet with MULTIFD_FLAG_SYNC and list of pages - send the 2nd packet with flags==0 and no pages This is not expected, because MULTIFD_FLAG_SYNC should hopefully be done after all the pages are received. Meanwhile, the 2nd packet will be completely useless, which contains zero information. I didn't verify above, but I think this issue is still benign in that at least on the recv side we always receive pages before handling MULTIFD_FLAG_SYNC. However that's not always guaranteed and just tricky. One other reason I want to separate it is using p->flags to communicate between the two threads is also not clearly defined, it's very hard to read and understand why accessing p->flags is always safe; see the current impl of multifd_send_thread() where we tried to cache only p->flags. It doesn't need to be that complicated. This patch introduces pending_sync, a separate flag just to show that the requester needs a sync. Alongside, we remove the tricky caching of p->flags now because after this patch p->flags should only be used by multifd sender thread now, which will be crystal clear. So it is always thread safe to access p->flags. With that, we can also safely convert the pending_job into a boolean, because we don't support >1 pending jobs anyway. Always use atomic ops to access both flags to make sure no cache effect. When at it, drop the initial setting of "pending_job = 0" because it's always allocated using g_new0(). Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-7-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	efd8c5439d	migration/multifd: Drop MultiFDSendParams.normal[] array This array is redundant when p->pages exists. Now we extended the life of p->pages to the whole period where pending_job is set, it should be safe to always use p->pages->offset[] rather than p->normal[]. Drop the array. Alongside, the normal_num is also redundant, which is the same to p->pages->num. This doesn't apply to recv side, because there's no extra buffering on recv side, so p->normal[] array is still needed. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-6-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	836eca47f6	migration/multifd: Postpone reset of MultiFDPages_t Now we reset MultiFDPages_t object in the multifd sender thread in the middle of the sending job. That's not necessary, because the "pages" struct will not be reused anyway until pending_job is cleared. Move that to the end after the job is completed, provide a helper to reset a "pages" object. Use that same helper when free the object too. This prepares us to keep using p->pages in the follow up patches, where we may drop p->normal[]. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-5-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	15f3f21d59	migration/multifd: Drop MultiFDSendParams.quit, cleanup error paths Multifd send side has two fields to indicate error quits: - MultiFDSendParams.quit - &multifd_send_state->exiting Merge them into the global one. The replacement is done by changing all p->quit checks into the global var check. The global check doesn't need any lock. A few more things done on top of this altogether: - multifd_send_terminate_threads() Moving the xchg() of &multifd_send_state->exiting upper, so as to cover the tracepoint, migrate_set_error() and migrate_set_state(). - multifd_send_sync_main() In the 2nd loop, add one more check over the global var to make sure we don't keep the looping if QEMU already decided to quit. - multifd_tls_outgoing_handshake() Use multifd_send_terminate_threads() to set the error state. That has a benefit of updating MigrationState.error to that error too, so we can persist that 1st error we hit in that specific channel. - multifd_new_send_channel_async() Take similar approach like above, drop the migrate_set_error() because multifd_send_terminate_threads() already covers that. Unwrap the helper multifd_new_send_channel_cleanup() along the way; not really needed. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-4-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	48c0f5d56f	migration/multifd: multifd_send_kick_main() When a multifd sender thread hit errors, it always needs to kick the main thread by kicking all the semaphores that it can be waiting upon. Provide a helper for it and deduplicate the code. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-3-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	8888a552bf	migration/multifd: Drop stale comment for multifd zero copy We've already done that with multifd_flush_after_each_section, for multifd in general. Drop the stale "TODO-like" comment. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-2-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
William Roche	06152b89db	migration: prevent migration when VM has poisoned memory A memory page poisoned from the hypervisor level is no longer readable. The migration of a VM will crash Qemu when it tries to read the memory address space and stumbles on the poisoned page with a similar stack trace: Program terminated with signal SIGBUS, Bus error. #0 _mm256_loadu_si256 #1 buffer_zero_avx2 #2 select_accel_fn #3 buffer_is_zero #4 save_zero_page #5 ram_save_target_page_legacy #6 ram_save_host_page #7 ram_find_and_save_block #8 ram_save_iterate #9 qemu_savevm_state_iterate #10 migration_iteration_run #11 migration_thread #12 qemu_thread_start To avoid this VM crash during the migration, prevent the migration when a known hardware poison exists on the VM. Signed-off-by: William Roche <william.roche@oracle.com> Link: https://lore.kernel.org/r/20240130190640.139364-2-william.roche@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:41:58 +08:00
Fabiano Rosas	44d0d456d7	migration: Centralize BH creation and dispatch Now that the migration state reference counting is correct, further wrap the bottom half dispatch process to avoid future issues. Move BH creation and scheduling together and wrap the dispatch with an intermediary function that will ensure we always keep the ref/unref balanced. Also move the responsibility of deleting the BH into the wrapper and remove the now unnecessary pointers. Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240119233922.32588-6-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-29 11:02:12 +08:00
Fabiano Rosas	699d9476a0	migration: Add a wrapper to qemu_bh_schedule Wrap qemu_bh_schedule() to ensure we always hold a reference to the current_migration object. Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240119233922.32588-5-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-29 11:02:12 +08:00
Fabiano Rosas	9cf268965d	migration: Reference migration state around loadvm_postcopy_handle_run_bh We need to hold a reference to the current_migration object around async calls to avoid it been freed while still in use. Even on this load-side function, we might still use the MigrationState, e.g to check for capabilities. Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240119233922.32588-4-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-29 11:02:12 +08:00
Fabiano Rosas	59094cfa7a	migration: Take reference to migration state around bg_migration_vm_start_bh We need to hold a reference to the current_migration object around async calls to avoid it been freed while still in use. Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240119233922.32588-3-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-29 11:02:12 +08:00
Fabiano Rosas	27eb8499ed	migration: Fix use-after-free of migration state object We're currently allowing the process_incoming_migration_bh bottom-half to run without holding a reference to the 'current_migration' object, which leads to a segmentation fault if the BH is still live after migration_shutdown() has dropped the last reference to current_migration. In my system the bug manifests as migrate_multifd() returning true when it shouldn't and multifd_load_shutdown() calling multifd_recv_terminate_threads() which crashes due to an uninitialized multifd_recv_state. Fix the issue by holding a reference to the object when scheduling the BH and dropping it before returning from the BH. The same is already done for the cleanup_bh at migrate_fd_cleanup_schedule(). Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1969 Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240119233922.32588-2-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-29 11:02:12 +08:00
Fabiano Rosas	0a5d1108ab	migration/yank: Use channel features Stop using outside knowledge about the io channels when registering yank functions. Query for features instead. The yank method for all channels used with migration code currently is to call the qio_channel_shutdown() function, so query for QIO_CHANNEL_FEATURE_SHUTDOWN. We could add a separate feature in the future for indicating whether a channel supports yanking, but that seems overkill at the moment. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20230911171320.24372-9-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-29 11:02:12 +08:00
Peter Xu	b0504edd40	migration: Drop unnecessary check in ram's pending_exact() When the migration frameworks fetches the exact pending sizes, it means this check: remaining_size < s->threshold_size Must have been done already, actually at migration_iteration_run(): if (must_precopy <= s->threshold_size) { qemu_savevm_state_pending_exact(&must_precopy, &can_postcopy); That should be after one round of ram_state_pending_estimate(). It makes the 2nd check meaningless and can be dropped. To say it in another way, when reaching ->state_pending_exact(), we unconditionally sync dirty bits for precopy. Then we can drop migrate_get_current() there too. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240117075848.139045-3-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-29 11:02:12 +08:00
Peter Xu	a8629e0c2f	migration: Make threshold_size an uint64_t It's always used to compare against another uint64_t. Make it always clear that it's never a negative. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240117075848.139045-2-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-29 11:02:12 +08:00
Markus Armbruster	918f620d30	migration: Plug memory leak on HMP migrate error path hmp_migrate() leaks @caps when qmp_migrate() fails. Plug the leak with g_autoptr(). Fixes: `967f2de5c9` (migration: Implement MigrateChannelList to hmp migration flow.) v8.2.0-rc0 Fixes: CID 1533125 Signed-off-by: Markus Armbruster <armbru@redhat.com> Link: https://lore.kernel.org/r/20240117140722.3979657-1-armbru@redhat.com [peterx: fix CID number as reported by Peter Maydell] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-29 11:02:12 +08:00
Paolo Bonzini	73b4987858	userfaultfd: use 1ULL to build ioctl masks There is no need to use the Linux-internal __u64 type, 1ULL is guaranteed to be wide enough. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/r/20240117160313.175609-1-pbonzini@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-29 11:02:12 +08:00
Nick Briggs	44ce1b5d2f	migration/rdma: define htonll/ntohll only if not predefined Solaris has #defines for htonll and ntohll which cause syntax errors when compiling code that attempts to (re)define these functions.. Signed-off-by: Nick Briggs <nicholas.h.briggs@gmail.com> Link: https://lore.kernel.org/r/65a04a7d.497ab3.3e7bef1f@gateway.sonic.net Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-16 11:16:10 +08:00
Fabiano Rosas	e3b8ad5c13	migration: Report error in incoming migration We're not currently reporting the errors set with migrate_set_error() when incoming migration fails. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240104142144.9680-5-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-16 11:16:09 +08:00
Fabiano Rosas	6074f81625	migration/multifd: Change multifd_pages_init argument The 'size' argument is actually the number of pages that fit in a multifd packet. Change it to uint32_t and rename. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240104142144.9680-4-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-16 11:16:09 +08:00
Fabiano Rosas	9346fa1870	migration/multifd: Remove QEMUFile from where it is not needed Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240104142144.9680-3-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-16 11:16:09 +08:00
Fabiano Rosas	dca1bc7f24	migration/multifd: Remove MultiFDPages_t::packet_num This was introduced by commit `34c55a94b1` ("migration: Create multipage support") and never used. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240104142144.9680-2-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-16 11:16:09 +08:00
Het Gala	0770ad438d	migration: Simplify initial conditionals in migration for better readability The inital conditional statements in qmp migration functions is harder to understand than necessary. It is better to get all errors out of the way in the beginning itself to have better readability and error handling. Signed-off-by: Het Gala <het.gala@nutanix.com> Suggested-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20231205080039.197615-1-het.gala@nutanix.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-16 11:16:09 +08:00
Stefan Hajnoczi	a4a411fbaf	Replace "iothread lock" with "BQL" in comments The term "iothread lock" is obsolete. The APIs use Big QEMU Lock (BQL) in their names. Update the code comments to use "BQL" instead of "iothread lock". Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Message-id: 20240102153529.486531-5-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2024-01-08 10:45:43 -05:00
Stefan Hajnoczi	195801d700	system/cpus: rename qemu_mutex_lock_iothread() to bql_lock() The Big QEMU Lock (BQL) has many names and they are confusing. The actual QemuMutex variable is called qemu_global_mutex but it's commonly referred to as the BQL in discussions and some code comments. The locking APIs, however, are called qemu_mutex_lock_iothread() and qemu_mutex_unlock_iothread(). The "iothread" name is historic and comes from when the main thread was split into into KVM vcpu threads and the "iothread" (now called the main loop thread). I have contributed to the confusion myself by introducing a separate --object iothread, a separate concept unrelated to the BQL. The "iothread" name is no longer appropriate for the BQL. Rename the locking APIs to: - void bql_lock(void) - void bql_unlock(void) - bool bql_locked(void) There are more APIs with "iothread" in their names. Subsequent patches will rename them. There are also comments and documentation that will be updated in later patches. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paul Durrant <paul@xen.org> Acked-by: Fabiano Rosas <farosas@suse.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Cédric Le Goater <clg@kaod.org> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Acked-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-id: 20240102153529.486531-2-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2024-01-08 10:45:43 -05:00
Peter Maydell	c8193acc07	migration 1st pull for 9.0 - We lost Juan and Leo in the maintainers file - Steven's suspend state fix - Steven's fix for coverity on migrate_mode - Avihai's migration cleanup series -----BEGIN PGP SIGNATURE----- iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZZY0TxIccGV0ZXJ4QHJl ZGhhdC5jb20ACgkQO1/MzfOr1wbSxgEAoM5g3wkc22lpAlRpU+hJUqT9NVOVQSK+ Fk7XJYTdSgABAKzykA6hAmU5Kj+yVI6jI874SVZbs2FWpFs4osvsKk4D =sfuM -----END PGP SIGNATURE----- Merge tag 'migration-20240104-pull-request' of https://gitlab.com/peterx/qemu into staging migration 1st pull for 9.0 - We lost Juan and Leo in the maintainers file - Steven's suspend state fix - Steven's fix for coverity on migrate_mode - Avihai's migration cleanup series # -----BEGIN PGP SIGNATURE----- # # iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZZY0TxIccGV0ZXJ4QHJl # ZGhhdC5jb20ACgkQO1/MzfOr1wbSxgEAoM5g3wkc22lpAlRpU+hJUqT9NVOVQSK+ # Fk7XJYTdSgABAKzykA6hAmU5Kj+yVI6jI874SVZbs2FWpFs4osvsKk4D # =sfuM # -----END PGP SIGNATURE----- # gpg: Signature made Thu 04 Jan 2024 04:30:07 GMT # gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706 # gpg: issuer "peterx@redhat.com" # gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [unknown] # gpg: aka "Peter Xu <peterx@redhat.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706 * tag 'migration-20240104-pull-request' of https://gitlab.com/peterx/qemu: (26 commits) migration: fix coverity migrate_mode finding migration/multifd: Remove unnecessary usage of local Error migration: Remove unnecessary usage of local Error migration: Fix migration_channel_read_peek() error path migration/multifd: Remove error_setg() in migration_ioc_process_incoming() migration/multifd: Fix leaking of Error in TLS error flow migration/multifd: Simplify multifd_channel_connect() if else statement migration/multifd: Fix error message in multifd_recv_initial_packet() migration: Remove errp parameter in migration_fd_process_incoming() migration: Refactor migration_incoming_setup() migration: Remove nulling of hostname in migrate_init() migration: Remove migrate_max_downtime() declaration tests/qtest: postcopy migration with suspend tests/qtest: precopy migration with suspend tests/qtest: option to suspend during migration tests/qtest: migration events migration: preserve suspended for bg_migration migration: preserve suspended for snapshot migration: preserve suspended runstate migration: propagate suspended runstate ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-01-05 13:35:25 +00:00
Steve Sistare	b12635ff08	migration: fix coverity migrate_mode finding Coverity diagnoses a possible out-of-range array index here ... static GSList migration_blockers[MIG_MODE__MAX]; fill_source_migration_info() { GSList cur_blocker = migration_blockers[migrate_mode()]; ... because it does not know that MIG_MODE__MAX will never be returned as a migration mode. To fix, assert so in migrate_mode(). Fixes: `fa3673e497` ("migration: per-mode blockers") Reported-by: Peter Maydell <peter.maydell@linaro.org> Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/1699907025-215450-1-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Avihai Horon	3fc58efa93	migration/multifd: Remove unnecessary usage of local Error According to Error API, usage of ERRP_GUARD() or a local Error instead of errp is needed if errp is passed to void functions, where it is later dereferenced to see if an error occurred. There are several places in multifd.c that use local Error although it is not needed. Change these places to use errp directly. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/r/20231231093016.14204-12-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Avihai Horon	b6f4c0c788	migration: Remove unnecessary usage of local Error According to Error API, usage of ERRP_GUARD() or a local Error instead of errp is needed if errp is passed to void functions, where it is later dereferenced to see if an error occurred. There are several places in migration.c that use local Error although it is not needed. Change these places to use errp directly. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20231231093016.14204-11-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Avihai Horon	4f8cf323e8	migration: Fix migration_channel_read_peek() error path migration_channel_read_peek() calls qio_channel_readv_full() and handles both cases of return value == 0 and return value < 0 the same way, by calling error_setg() with errp. However, if return value < 0, errp is already set, so calling error_setg() with errp will lead to an assert. Fix it by handling these cases separately, calling error_setg() with errp only in return value == 0 case. Fixes: `6720c2b327` ("migration: check magic value for deciding the mapping of channels") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/r/20231231093016.14204-10-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Avihai Horon	1d3886f837	migration/multifd: Remove error_setg() in migration_ioc_process_incoming() If multifd_load_setup() fails in migration_ioc_process_incoming(), error_setg() is called with errp. This will lead to an assert because in that case errp already contains an error. Fix it by removing the redundant error_setg(). Fixes: `6720c2b327` ("migration: check magic value for deciding the mapping of channels") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20231231093016.14204-9-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Avihai Horon	6ae208ce96	migration/multifd: Fix leaking of Error in TLS error flow If there is an error in multifd TLS handshake task, multifd_tls_outgoing_handshake() retrieves the error with qio_task_propagate_error() but never frees it. Fix it by freeing the obtained Error. In addition, the error is not reported at all, so report it with migrate_set_error(). Fixes: `2964714015` ("migration/tls: add support for multifd tls-handshake") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20231231093016.14204-8-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Avihai Horon	a4395f5d3c	migration/multifd: Simplify multifd_channel_connect() if else statement The else branch in multifd_channel_connect() is redundant because when the if branch is taken the function returns. Simplify the code by removing the else branch. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/r/20231231093016.14204-7-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Avihai Horon	c77b40859a	migration/multifd: Fix error message in multifd_recv_initial_packet() In multifd_recv_initial_packet(), if MultiFDInit_t->id is greater than the configured number of multifd channels, an irrelevant error message about multifd version is printed. Change the error message to a relevant one about the channel id. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20231231093016.14204-6-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Avihai Horon	b0cf3bfc69	migration: Remove errp parameter in migration_fd_process_incoming() Errp parameter in migration_fd_process_incoming() is unused. Remove it. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20231231093016.14204-5-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Avihai Horon	c606f3baa2	migration: Refactor migration_incoming_setup() Commit `6720c2b327` ("migration: check magic value for deciding the mapping of channels") extracted the only code that could fail in migration_incoming_setup(). Now migration_incoming_setup() can't fail, so refactor it to return void and remove errp parameter. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/r/20231231093016.14204-4-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Avihai Horon	80caba6955	migration: Remove nulling of hostname in migrate_init() MigrationState->hostname is set to NULL in migrate_init(). This is redundant because it is already freed and set to NULL in migrade_fd_cleanup(). Remove it. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20231231093016.14204-3-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Avihai Horon	17b9483baa	migration: Remove migrate_max_downtime() declaration migrate_max_downtime() has been removed long ago, but its declaration was mistakenly left. Remove it. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20231231093016.14204-2-avihaih@nvidia.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Steve Sistare	49a5020697	migration: preserve suspended for bg_migration Do not wake a suspended guest during bg_migration, and restore the prior state at finish rather than unconditionally running. Allow the additional state transitions that occur. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1704312341-66640-9-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Steve Sistare	58b105703e	migration: preserve suspended for snapshot Restoring a snapshot can break a suspended guest. Snapshots suffer from the same suspended-state issues that affect live migration, plus they must handle an additional problematic scenario, which is that a running vm must remain running if it loads a suspended snapshot. To save, the existing vm_stop call now completely stops the suspended state. Finish with vm_resume to leave the vm in the state it had prior to the save, correctly restoring the suspended state. To load, if the snapshot is not suspended, then vm_stop + vm_resume correctly handles all states, and leaves the vm in the state it had prior to the load. However, if the snapshot is suspended, restoration is trickier. First, call vm_resume to restore the state to suspended so the current state matches the saved state. Then, if the pre-load state is running, call wakeup to resume running. Prior to these changes, the vm_stop to RUN_STATE_SAVE_VM and RUN_STATE_RESTORE_VM did not change runstate if the current state was suspended, but now it does, so allow these transitions. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1704312341-66640-8-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Steve Sistare	b4e9ddccd1	migration: preserve suspended runstate A guest that is migrated in the suspended state automaticaly wakes and continues execution. This is wrong; the guest should end migration in the same state it started. The root cause is that the outgoing migration code automatically wakes the guest, then saves the RUNNING runstate in global_state_store(), hence the incoming migration code thinks the guest is running and continues the guest if autostart is true. On the outgoing side, delete the call to qemu_system_wakeup_request(). Now that vm_stop completely stops a vm in the suspended state (from the preceding patches), the existing call to vm_stop_force_state is sufficient to correctly migrate all vmstate. On the incoming side, call vm_start if the pre-migration state was running or suspended. For the latter, vm_start correctly restores the suspended state, and a future system_wakeup monitor request will cause the vm to resume running. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1704312341-66640-7-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Steve Sistare	d3c86c99f3	migration: propagate suspended runstate If the outgoing machine was previously suspended, propagate that to the incoming side via global_state, so a subsequent vm_start restores the suspended state. To maintain backward and forward compatibility, reclaim some space from the runstate member. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/1704312341-66640-6-git-send-email-steven.sistare@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-04 09:52:42 +08:00
Richard Henderson	a77ffe9595	migration: Constify VMState Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20231221031652.119827-67-richard.henderson@linaro.org>	2023-12-30 07:38:06 +11:00
Richard Henderson	2027001919	migration: Make VMStateDescription.subsections const Allow the array of pointers to itself be const. Propagate this through the copies of this field. Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20231221031652.119827-2-richard.henderson@linaro.org>	2023-12-29 11:17:30 +11:00
Stefan Hajnoczi	b0839e87af	dirtylimit dirtyrate pull request 20231225 Nitpick about an unused parameter Please apply, thanks, Yong -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEaF0CINwmSCgVLlfC3/Ij1rP+y5wFAmWJVtkWHHlvbmcuaHVh bmdAc21hcnR4LmNvbQAKCRDf8iPWs/7LnLexEADjuQ7dtbv2vq1ksqDq6fct++FK sGPe6P9UAyJrB0qxW3jKRn2P4E/xiPHpmi+BbOOu8PppeZt86NFlzun681vMpkKu d3oAlwRtX8eklk3oub+yBXzkAfsGr+EHEWms/12ATsd6apItxi47zUUw/2aznujz TlZR7qOD1kjvPHOVnP07fQ3NYiPIITpzSqVgo3fxE+upmb+dnELIkyGTAvJY2Pjz Uh4CtG/sQq8OJJwUn1TFmhB1Ujw9oo8v/lOtIpsxfcKtQcAwyvgmyfndOMZ/h6O3 ri3axypg8PyGyidYEh0JSoR3OVUIt5eoqPNp+sQTNeZsjTIwKjpBGUOLXbIA56nB Z/ZRbxSKj+JRD9qQZ2NODE1QurgEuT51aWkEeBEsx0OtUINcjZkwWeQmiasJQw1T 4z4Hez0ipuET9MmeTbWkVh9rO3Bsxi9QxiXEiN6YPNqTqLhwomXU+MIlrCsZQzjB l4Jij9byeWL9ntx3qwtSWJGHV6Qql16UkPo+c2Co/q2rdx/FqiYrMgCtLee1yo5n lSCKmrVMLaLyQv7jnLY8txhU7/CXNnwij4YSgfi7nujOgM6L3uwYc9Tp5kcQHE/b o/Mr6oOi79Z+rdByk4pPUKaJV7FnbWMoUgjsqWPJgedGt0K9StRsXCGR1wYMUVdU R8Dv0IZyyGuSf6MhNw== =Y00D -----END PGP SIGNATURE----- Merge tag 'dirtylimit-dirtyrate-pull-request-20231225' of https://github.com/newfriday/qemu into staging dirtylimit dirtyrate pull request 20231225 Nitpick about an unused parameter Please apply, thanks, Yong # -----BEGIN PGP SIGNATURE----- # # iQJKBAABCAA0FiEEaF0CINwmSCgVLlfC3/Ij1rP+y5wFAmWJVtkWHHlvbmcuaHVh # bmdAc21hcnR4LmNvbQAKCRDf8iPWs/7LnLexEADjuQ7dtbv2vq1ksqDq6fct++FK # sGPe6P9UAyJrB0qxW3jKRn2P4E/xiPHpmi+BbOOu8PppeZt86NFlzun681vMpkKu # d3oAlwRtX8eklk3oub+yBXzkAfsGr+EHEWms/12ATsd6apItxi47zUUw/2aznujz # TlZR7qOD1kjvPHOVnP07fQ3NYiPIITpzSqVgo3fxE+upmb+dnELIkyGTAvJY2Pjz # Uh4CtG/sQq8OJJwUn1TFmhB1Ujw9oo8v/lOtIpsxfcKtQcAwyvgmyfndOMZ/h6O3 # ri3axypg8PyGyidYEh0JSoR3OVUIt5eoqPNp+sQTNeZsjTIwKjpBGUOLXbIA56nB # Z/ZRbxSKj+JRD9qQZ2NODE1QurgEuT51aWkEeBEsx0OtUINcjZkwWeQmiasJQw1T # 4z4Hez0ipuET9MmeTbWkVh9rO3Bsxi9QxiXEiN6YPNqTqLhwomXU+MIlrCsZQzjB # l4Jij9byeWL9ntx3qwtSWJGHV6Qql16UkPo+c2Co/q2rdx/FqiYrMgCtLee1yo5n # lSCKmrVMLaLyQv7jnLY8txhU7/CXNnwij4YSgfi7nujOgM6L3uwYc9Tp5kcQHE/b # o/Mr6oOi79Z+rdByk4pPUKaJV7FnbWMoUgjsqWPJgedGt0K9StRsXCGR1wYMUVdU # R8Dv0IZyyGuSf6MhNw== # =Y00D # -----END PGP SIGNATURE----- # gpg: Signature made Mon 25 Dec 2023 05:18:01 EST # gpg: using RSA key 685D0220DC264828152E57C2DFF223D6B3FECB9C # gpg: issuer "yong.huang@smartx.com" # gpg: Good signature from "Yong Huang <yong.huang@smartx.com>" [unknown] # gpg: WARNING: The key's User ID is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 685D 0220 DC26 4828 152E 57C2 DFF2 23D6 B3FE CB9C * tag 'dirtylimit-dirtyrate-pull-request-20231225' of https://github.com/newfriday/qemu: migration/dirtyrate: Remove an extra parameter Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-12-26 06:06:50 -05:00
Wafer	4918712fb1	migration/dirtyrate: Remove an extra parameter vcpu_dirty_stat_collect() has an unused parameter so remove it. Signed-off-by: Wafer <wafer@jaguarmicro.com> Reviewed-by: Hyman Huang <yong.huang@smartx.com> Message-Id: <20231204012230.4123-1-wafer@jaguarmicro.com>	2023-12-25 18:05:47 +08:00
Stefan Hajnoczi	b49f4755c7	block: remove AioContext locking This is the big patch that removes aio_context_acquire()/aio_context_release() from the block layer and affected block layer users. There isn't a clean way to split this patch and the reviewers are likely the same group of people, so I decided to do it in one patch. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paul Durrant <paul@xen.org> Message-ID: <20231205182011.1976568-7-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-12-21 22:49:27 +01:00
Stefan Hajnoczi	019f8c19df	Migration patches for rc3: - One more memleak regression fix from Het -----BEGIN PGP SIGNATURE----- iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZWoLbRIccGV0ZXJ4QHJl ZGhhdC5jb20ACgkQO1/MzfOr1wahYwD+OsD7CaZYjkl9KSooRfblEenD6SdfhAdC oZc07f2UxocA/0s1keDBZUUcZOiGYPDFV5his4Jw4F+RRD1YIpVWZg4J =T0/r -----END PGP SIGNATURE----- Merge tag 'migration-20231201-pull-request' of https://github.com/xzpeter/qemu into staging Migration patches for rc3: - One more memleak regression fix from Het # -----BEGIN PGP SIGNATURE----- # # iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZWoLbRIccGV0ZXJ4QHJl # ZGhhdC5jb20ACgkQO1/MzfOr1wahYwD+OsD7CaZYjkl9KSooRfblEenD6SdfhAdC # oZc07f2UxocA/0s1keDBZUUcZOiGYPDFV5his4Jw4F+RRD1YIpVWZg4J # =T0/r # -----END PGP SIGNATURE----- # gpg: Signature made Fri 01 Dec 2023 11:35:57 EST # gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706 # gpg: issuer "peterx@redhat.com" # gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [full] # gpg: aka "Peter Xu <peterx@redhat.com>" [full] # Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706 * tag 'migration-20231201-pull-request' of https://github.com/xzpeter/qemu: migration: Plug memory leak with migration URIs Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-12-04 08:01:24 -05:00
Het Gala	bc1d54ee51	migration: Plug memory leak with migration URIs migrate_uri_parse() allocates memory to 'channel' if the user opts for old syntax - uri, which is leaked because there is no code for freeing 'channel'. So, free channel to avoid memory leak in case where 'channels' is empty and uri parsing is required. Fixes: `5994024f` ("migration: Implement MigrateChannelList to qmp migration flow") Signed-off-by: Het Gala <het.gala@nutanix.com> Suggested-by: Markus Armbruster <armbru@redhat.com> Tested-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Link: https://lore.kernel.org/r/20231129204301.131228-1-het.gala@nutanix.com Signed-off-by: Peter Xu <peterx@redhat.com>	2023-12-01 11:01:28 -05:00
Zongmin Zhou	41581265aa	migration: free 'saddr' since be no longer used Since socket_parse() will allocate memory for 'saddr',and its value will pass to 'addr' that allocated by migrate_uri_parse(), then 'saddr' will no longer used,need to free. But due to 'saddr->u' is shallow copying the contents of the union, the members of this union containing allocated strings,and will be used after that. So just free 'saddr' itself without doing a deep free on the contents of the SocketAddress. Fixes: `72a8192e22` ("migration: convert migration 'uri' into 'MigrateAddress'") Signed-off-by: Zongmin Zhou<zhouzongmin@kylinos.cn> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231120031428.908295-1-zhouzongmin@kylinos.cn>	2023-11-30 09:51:24 +01:00
Fabiano Rosas	0a08c7947b	migration/multifd: Stop setting p->ioc before connecting This is being shadowed but the assignments at multifd_channel_connect() and multifd_tls_channel_connect() . Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-ID: <20231110200241.20679-2-farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-11-30 09:50:10 +01:00
Michael Tokarev	e3fc69343c	migration/rdma.c: spelling fix: asume Fixes: `67c31c9c1a` "migration: Don't abuse qemu_file transferred for RDMA" Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2023-11-15 11:59:54 +03:00
Kevin Wolf	f5a3a270fe	block: Mark bdrv_filter_bs() and callers GRAPH_RDLOCK This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_filter_bs() need to hold a reader lock for the graph because it calls bdrv_filter_child(), which accesses bs->file/backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20231027155333.420094-4-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-11-07 19:14:19 +01:00
Stefan Hajnoczi	bb59f3548f	vfio queue: * Support for non 64b IOVA space * Introduction of a PCIIOMMUOps callback structure to ease future extensions * Fix for a buffer overrun when writing the VF token * PPC cleanups preparing ground for IOMMUFD support -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmVI+bIACgkQUaNDx8/7 7KHW4g/9FmgX0k2Elm1BAul3slJtuBT8/iHKfK19rhXICxhxS5xBWJA8FmosTWAT 91YqQJhOHARxLd9VROfv8Fq8sAo+Ys8bP3PTXh5satjY5gR9YtmMSVqvsAVLn7lv a/0xp7wPJt2UeKzvRNUqFXNr7yHPwxFxbJbmmAJbNte8p+TfE2qvojbJnu7BjJbg sTtS/vFWNJwtuNYTkMRoiZaUKEoEZ8LnslOqKUjgeO59g4i3Dq8e2JCmHANPFWUK cWmr7AqcXgXEnLSDWTtfN53bjcSCYkFVb4WV4Wv1/7hUF5jQ4UR0l3B64xWe0M3/ Prak3bWOM/o7JwLBsgaWPngXA9V0WFBTXVF4x5qTwhuR1sSV8MxUvTKxI+qqiEzA FjU89oSZ+zXId/hEUuTL6vn1Th8/6mwD0L9ORchNOQUKzCjBzI4MVPB09nM3AdPC LGThlufsZktdoU2KjMHpc+gMIXQYsxkgvm07K5iZTZ5eJ4tV5KB0aPvTZppGUxe1 YY9og9F3hxjDHQtEuSY2rzBQI7nrUpd1ZI5ut/3ZgDWkqD6aGRtMme4n4GsGsYb2 Ht9+d2RL9S8uPUh+7rV8K/N3+vXgXRaEYTuAScKtflEbA7YnZA5nUdMng8x0kMTQ Y73XCd4UGWDfSSZsgaIHGkM/MRIHgmlrfcwPkWqWW9vF+92O6Hw= =/Du0 -----END PGP SIGNATURE----- Merge tag 'pull-vfio-20231106' of https://github.com/legoater/qemu into staging vfio queue: * Support for non 64b IOVA space * Introduction of a PCIIOMMUOps callback structure to ease future extensions * Fix for a buffer overrun when writing the VF token * PPC cleanups preparing ground for IOMMUFD support # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmVI+bIACgkQUaNDx8/7 # 7KHW4g/9FmgX0k2Elm1BAul3slJtuBT8/iHKfK19rhXICxhxS5xBWJA8FmosTWAT # 91YqQJhOHARxLd9VROfv8Fq8sAo+Ys8bP3PTXh5satjY5gR9YtmMSVqvsAVLn7lv # a/0xp7wPJt2UeKzvRNUqFXNr7yHPwxFxbJbmmAJbNte8p+TfE2qvojbJnu7BjJbg # sTtS/vFWNJwtuNYTkMRoiZaUKEoEZ8LnslOqKUjgeO59g4i3Dq8e2JCmHANPFWUK # cWmr7AqcXgXEnLSDWTtfN53bjcSCYkFVb4WV4Wv1/7hUF5jQ4UR0l3B64xWe0M3/ # Prak3bWOM/o7JwLBsgaWPngXA9V0WFBTXVF4x5qTwhuR1sSV8MxUvTKxI+qqiEzA # FjU89oSZ+zXId/hEUuTL6vn1Th8/6mwD0L9ORchNOQUKzCjBzI4MVPB09nM3AdPC # LGThlufsZktdoU2KjMHpc+gMIXQYsxkgvm07K5iZTZ5eJ4tV5KB0aPvTZppGUxe1 # YY9og9F3hxjDHQtEuSY2rzBQI7nrUpd1ZI5ut/3ZgDWkqD6aGRtMme4n4GsGsYb2 # Ht9+d2RL9S8uPUh+7rV8K/N3+vXgXRaEYTuAScKtflEbA7YnZA5nUdMng8x0kMTQ # Y73XCd4UGWDfSSZsgaIHGkM/MRIHgmlrfcwPkWqWW9vF+92O6Hw= # =/Du0 # -----END PGP SIGNATURE----- # gpg: Signature made Mon 06 Nov 2023 22:35:30 HKT # gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1 # gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [unknown] # gpg: aka "Cédric Le Goater <clg@kaod.org>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1 * tag 'pull-vfio-20231106' of https://github.com/legoater/qemu: (22 commits) vfio/common: Move vfio_host_win_add/del into spapr.c vfio/spapr: Make vfio_spapr_create/remove_window static vfio/container: Move spapr specific init/deinit into spapr.c vfio/container: Move vfio_container_add/del_section_window into spapr.c vfio/container: Move IBM EEH related functions into spapr_pci_vfio.c util/uuid: Define UUID_STR_LEN from UUID_NONE string util/uuid: Remove UUID_FMT_LEN vfio/pci: Fix buffer overrun when writing the VF token util/uuid: Add UUID_STR_LEN definition hw/pci: modify pci_setup_iommu() to set PCIIOMMUOps test: Add some tests for range and resv-mem helpers virtio-iommu: Consolidate host reserved regions and property set ones virtio-iommu: Implement set_iova_ranges() callback virtio-iommu: Record whether a probe request has been issued range: Introduce range_inverse_array() virtio-iommu: Introduce per IOMMUDevice reserved regions util/reserved-region: Add new ReservedRegion helpers range: Make range_compare() public virtio-iommu: Rename reserved_regions into prop_resv_regions vfio: Collect container iova range info ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-11-07 09:41:52 +08:00
Juan Quintela	0983125b40	migration: Unlock mutex in error case We were not unlocking bitmap mutex on the error case. To fix it forever change to enclose the code with WITH_QEMU_LOCK_GUARD(). Coverity CID 1523750. Fixes: `a2326705e5` ("migration: Stop migration immediately in RDMA error paths") Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231103074245.55166-1-quintela@redhat.com>	2023-11-03 10:48:37 +01:00
Cédric Le Goater	721da0396c	util/uuid: Add UUID_STR_LEN definition qemu_uuid_unparse() includes a trailing NUL when writing the uuid string and the buffer size should be UUID_FMT_LEN + 1 bytes. Add a define for this size and use it where required. Cc: Fam Zheng <fam@euphon.net> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: "Denis V. Lunev" <den@openvz.org> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-11-03 09:20:31 +01:00
Het Gala	967f2de5c9	migration: Implement MigrateChannelList to hmp migration flow. Integrate MigrateChannelList with all transport backends (socket, exec and rdma) for both src and dest migration endpoints for hmp migration. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-ID: <20231023182053.8711-14-farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-11-02 11:35:04 +01:00
Het Gala	5994024fd1	migration: Implement MigrateChannelList to qmp migration flow. Integrate MigrateChannelList with all transport backends (socket, exec and rdma) for both src and dest migration endpoints for qmp migration. For current series, limit the size of MigrateChannelList to single element (single interface) as runtime check. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-13-farosas@suse.de>	2023-11-02 11:35:04 +01:00
Het Gala	d95533e1cd	migration: modify migration_channels_and_uri_compatible() for new QAPI syntax migration_channels_and_uri_compatible() check for transport mechanism suitable for multifd migration gets executed when the caller calls old uri syntax. It needs it to be run when using the modern MigrateChannel QAPI syntax too. After URI -> 'MigrateChannel' : migration_channels_and_uri_compatible() -> migration_channels_and_transport_compatible() passes object as argument and check for valid transport mechanism. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-12-farosas@suse.de>	2023-11-02 11:35:04 +01:00
Het Gala	074dbce5fc	migration: New migrate and migrate-incoming argument 'channels' MigrateChannelList allows to connect accross multiple interfaces. Add MigrateChannelList struct as argument to migration QAPIs. We plan to include multiple channels in future, to connnect multiple interfaces. Hence, we choose 'MigrateChannelList' as the new argument over 'MigrateChannel' to make migration QAPIs future proof. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-10-farosas@suse.de>	2023-11-02 11:35:04 +01:00
Fabiano Rosas	02afba63e9	migration: Convert the file backend to the new QAPI syntax Convert the file: URI to accept a FileMigrationArgs to be compatible with the new migration QAPI. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-9-farosas@suse.de>	2023-11-02 11:35:04 +01:00
Het Gala	cbab4face5	migration: convert exec backend to accept MigrateAddress. Exec transport backend for 'migrate'/'migrate-incoming' QAPIs accept new wire protocol of MigrateAddress struct. It is achived by parsing 'uri' string and storing migration parameters required for exec connection into strList struct. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-8-farosas@suse.de>	2023-11-02 11:35:04 +01:00
Het Gala	3fa9642ff7	migration: convert rdma backend to accept MigrateAddress RDMA based transport backend for 'migrate'/'migrate-incoming' QAPIs accept new wire protocol of MigrateAddress struct. It is achived by parsing 'uri' string and storing migration parameters required for RDMA connection into well defined InetSocketAddress struct. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-7-farosas@suse.de>	2023-11-02 11:35:03 +01:00
Het Gala	34dfc5e407	migration: convert socket backend to accept MigrateAddress Socket transport backend for 'migrate'/'migrate-incoming' QAPIs accept new wire protocol of MigrateAddress struct. It is achived by parsing 'uri' string and storing migration parameters required for socket connection into well defined SocketAddress struct. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-6-farosas@suse.de>	2023-11-02 11:35:03 +01:00
Het Gala	72a8192e22	migration: convert migration 'uri' into 'MigrateAddress' This patch parses 'migrate' and 'migrate-incoming' QAPI's 'uri' string containing migration connection related information and stores them inside well defined 'MigrateAddress' struct. Fabiano fixed for "file" transport. Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com> Signed-off-by: Het Gala <het.gala@nutanix.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231023182053.8711-4-farosas@suse.de> Message-ID: <20231023182053.8711-5-farosas@suse.de>	2023-11-02 11:35:03 +01:00
Peter Xu	88577f3242	migration: Change ram_dirty_bitmap_reload() retval to bool Now we have a Error** passed into the return path thread stack, which is even clearer than an int retval. Change ram_dirty_bitmap_reload() and the callers to use a bool instead to replace errnos. Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231017202633.296756-5-peterx@redhat.com>	2023-11-02 11:35:03 +01:00
Peter Xu	f8c543e808	migration: Allow network to fail even during recovery Normally the postcopy recover phase should only exist for a super short period, that's the duration when QEMU is trying to recover from an interrupted postcopy migration, during which handshake will be carried out for continuing the procedure with state changes from PAUSED -> RECOVER -> POSTCOPY_ACTIVE again. Here RECOVER phase should be super small, that happens right after the admin specified a new but working network link for QEMU to reconnect to dest QEMU. However there can still be case where the channel is broken in this small RECOVER window. If it happens, with current code there's no way the src QEMU can got kicked out of RECOVER stage. No way either to retry the recover in another channel when established. This patch allows the RECOVER phase to fail itself too - we're mostly ready, just some small things missing, e.g. properly kick the main migration thread out when sleeping on rp_sem when we found that we're at RECOVER stage. When this happens, it fails the RECOVER itself, and rollback to PAUSED stage. Then the user can retry another round of recovery. To make it even stronger, teach QMP command migrate-pause to explicitly kick src/dst QEMU out when needed, so even if for some reason the migration thread didn't got kicked out already by a failing rethrn-path thread, the admin can also kick it out. This will be an super, super corner case, but still try to cover that. One can try to test this with two proxy channels for migration: (a) socat unix-listen:/tmp/src.sock,reuseaddr,fork tcp:localhost:10000 (b) socat tcp-listen:10000,reuseaddr,fork unix:/tmp/dst.sock So the migration channel will be: (a) (b) src -> /tmp/src.sock -> tcp:10000 -> /tmp/dst.sock -> dst Then to make QEMU hang at RECOVER stage, one can do below: (1) stop the postcopy using QMP command postcopy-pause (2) kill the 2nd proxy (b) (3) try to recover the postcopy using /tmp/src.sock on src (4) src QEMU will go into RECOVER stage but won't be able to continue from there, because the channel is actually broken at (b) Before this patch, step (4) will make src QEMU stuck in RECOVER stage, without a way to kick the QEMU out or continue the postcopy again. After this patch, (4) will quickly fail qemu and bounce back to PAUSED stage. Admin can also kick QEMU from (4) into PAUSED when needed using migrate-pause when needed. After bouncing back to PAUSED stage, one can recover again. Reported-by: Xiaohui Li <xiaohli@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2111332 Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231017202633.296756-3-peterx@redhat.com>	2023-11-02 11:35:03 +01:00
Peter Xu	7aa6070d09	migration: Refactor error handling in source return path rp_state.error was a boolean used to show error happened in return path thread. That's not only duplicating error reporting (migrate_set_error), but also not good enough in that we only do error_report() and set it to true, we never can keep a history of the exact error and show it in query-migrate. To make this better, a few things done: - Use error_setg() rather than error_report() across the whole lifecycle of return path thread, keeping the error in an Error*. - With above, no need to have mark_source_rp_bad(), remove it, alongside with rp_state.error itself. - Use migrate_set_error() to apply that captured error to the global migration object when error occured in this thread. - Do the same when detected qemufile error in source return path We need to re-export qemu_file_get_error_obj() to do the last one. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231017202633.296756-2-peterx@redhat.com>	2023-11-02 11:35:03 +01:00
Steve Sistare	fa3673e497	migration: per-mode blockers Extend the blocker interface so that a blocker can be registered for one or more migration modes. The existing interfaces register a blocker for all modes, and the new interfaces take a varargs list of modes. Internally, maintain a separate blocker list per mode. The same Error object may be added to multiple lists. When a block is deleted, it is removed from every list, and the Error is freed. No functional change until a new mode is added. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1698263069-406971-3-git-send-email-steven.sistare@oracle.com>	2023-11-01 16:13:59 +01:00
Steve Sistare	eea1e5c9d6	migration: mode parameter Create a mode migration parameter that can be used to select alternate migration algorithms. The default mode is normal, representing the current migration algorithm, and does not need to be explicitly set. No functional change until a new mode is added, except that the mode is shown by the 'info migrate' command. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1698263069-406971-2-git-send-email-steven.sistare@oracle.com>	2023-11-01 16:13:58 +01:00
Peter Xu	3e5f3bcdc2	migration: Add tracepoints for downtime checkpoints This patch is inspired by Joao Martin's patch here: https://lore.kernel.org/r/20230926161841.98464-1-joao.m.martins@oracle.com Add tracepoints for major downtime checkpoints on both src and dst. They share the same tracepoint with a string showing its stage. Besides the checkpoints in the previous patch, this patch also added destination checkpoints. On src, we have these checkpoints added: - src-downtime-start: right before vm stops on src - src-vm-stopped: after vm is fully stopped - src-iterable-saved: after all iterables saved (END sections) - src-non-iterable-saved: after all non-iterable saved (FULL sections) - src-downtime-stop: migration fully completed On dst, we have these checkpoints added: - dst-precopy-loadvm-completes: after loadvm all done for precopy - dst-precopy-bh-: record BH steps to resume VM for precopy - dst-postcopy-bh-: record BH steps to resume VM for postcopy On dst side, we don't have a good way to trace total time consumed by iterable or non-iterable for now. We can mark it by 1st time receiving a FULL / END section, but rather than that let's just rely on the other tracepoints added for vmstates to back up the information. With this patch, one can enable "vmstate_downtime*" tracepoints and it'll enable all tracepoints for downtime measurements necessary. Drop loadvm_postcopy_handle_run_bh() tracepoint alongside, because they service the same purpose, which was only for postcopy. We then have unified prefix for all downtime relevant tracepoints. Co-developed-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231030163346.765724-6-peterx@redhat.com>	2023-11-01 16:13:58 +01:00
Peter Xu	93bdf888fa	migration: migration_stop_vm() helper Provide a helper for non-COLO use case of migration to stop a VM. This prepares for adding some downtime relevant tracepoints to migration, where they may or may not apply to COLO. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231030163346.765724-5-peterx@redhat.com>	2023-11-01 16:13:58 +01:00
Peter Xu	3c80f14272	migration: Add per vmstate downtime tracepoints We have a bunch of savevm_section* tracepoints, they're good to analyze migration stream, but not always suitable if someone would like to analyze the migration downtime. Two major problems: - savevm_section* tracepoints are dumping all sections, we only care about the sections that contribute to the downtime - They don't have an identifier to show the type of sections, so no way to filter downtime information either easily. We can add type into the tracepoints, but instead of doing so, this patch kept them untouched, instead of adding a bunch of downtime specific tracepoints, so one can enable "vmstate_downtime*" tracepoints and get a full picture of how the downtime is distributed across iterative and non-iterative vmstate save/load. Note that here both save() and load() need to be traced, because both of them may contribute to the downtime. The contribution is not a simple "add them together", though: consider when the src is doing a save() of device1 while the dest can be load()ing for device2, so they can happen concurrently. Tracking both sides make sense because device load() and save() can be imbalanced, one device can save() super fast, but load() super slow, vice versa. We can't figure that out without tracing both. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231030163346.765724-4-peterx@redhat.com>	2023-11-01 16:13:58 +01:00
Peter Xu	e22ffad03a	migration: Add migration_downtime_start\|end() helpers Unify the three users on recording downtimes with the same pair of helpers. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231030163346.765724-3-peterx@redhat.com>	2023-11-01 16:13:58 +01:00
Peter Xu	62f5da7dd1	migration: Set downtime_start even for postcopy Postcopy calculates its downtime separately. It always sets MigrationState.downtime properly, but not MigrationState.downtime_start. Make postcopy do the same as other modes on properly recording the timestamp when the VM is going to be stopped. Drop the temporary variable in postcopy_start() along the way. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231030163346.765724-2-peterx@redhat.com>	2023-11-01 16:13:58 +01:00
Peter Xu	caa91b3c44	migration: Check in savevm_state_handler_insert for dups Before finally register one SaveStateEntry, we detect for duplicated entries. This could be helpful to notify us asap instead of get silent migration failures which could be hard to diagnose. For example, this patch will generate a message like this (if without previous fixes on x2apic) as long as we wants to boot a VM instance with "-smp 200,maxcpus=288,sockets=2,cores=72,threads=2" and QEMU will bail out even before VM starts: savevm_state_handler_insert: Detected duplicate SaveStateEntry: id=apic, instance_id=0x0 Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231020090731.28701-10-quintela@redhat.com>	2023-11-01 16:13:58 +01:00
Juan Quintela	485fb95546	migration: Hack to maintain backwards compatibility for ppc Current code does: - register pre_2_10_vmstate_dummy_icp with "icp/server" and instance dependinfg on cpu number - for newer machines, it register vmstate_icp with "icp/server" name and instance 0 - now it unregisters "icp/server" for the 1st instance. This is wrong at many levels: - we shouldn't have two VMSTATEDescriptions with the same name - In case this is the only solution that we can came with, it needs to be: * register pre_2_10_vmstate_dummy_icp * unregister pre_2_10_vmstate_dummy_icp * register real vmstate_icp Created vmstate_replace_hack_for_ppc() with warnings left and right that it is a hack. CC: Cedric Le Goater <clg@kaod.org> CC: Daniel Henrique Barboza <danielhb413@gmail.com> CC: David Gibson <david@gibson.dropbear.id.au> CC: Greg Kurz <groug@kaod.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231020090731.28701-8-quintela@redhat.com>	2023-11-01 16:13:58 +01:00
Juan Quintela	be07a0ed22	qemu-file: Make qemu_fflush() return errors This let us simplify code of this shape. qemu_fflush(f); int ret = qemu_file_get_error(f); if (ret) { return ret; } into: int ret = qemu_fflush(f); if (ret) { return ret; } I updated all callers where there is any error check. qemu_fclose() don't need to check for f->last_error because qemu_fflush() returns it at the beggining of the function. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231025091117.6342-13-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-10-31 08:44:33 +01:00
Juan Quintela	0f8596180a	migration: Remove transferred atomic counter After last commit, it is a write only variable. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231025091117.6342-12-quintela@redhat.com>	2023-10-31 08:44:33 +01:00
Juan Quintela	897fd8bdce	migration: Use migration_transferred_bytes() There are only two differnces with the old value: - the amount of QEMUFile that hasn't yet been flushed. It can be discussed what is more exact, the new or the old one. - the amount of transferred bytes that we forgot to account for (the newer is better, i.e. exact). Notice that this two values are used to: a - present to the user b - calculate the rate_limit So a few KB here and there is not going to make a difference. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231025091117.6342-11-quintela@redhat.com>	2023-10-31 08:44:33 +01:00
Juan Quintela	fc55cf318a	qemu-file: Simplify qemu_file_get_error() If we pass a NULL error is the same that returning directly the value. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231025091117.6342-10-quintela@redhat.com>	2023-10-31 08:44:33 +01:00
Juan Quintela	0743f41fd2	migration: migration_rate_limit_reset() don't need the QEMUFile Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231025091117.6342-9-quintela@redhat.com>	2023-10-31 08:44:33 +01:00

1 2 3 4 5 ...

2310 Commits