mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Dmitry Frolov	0926c002c7	migration: fix-possible-int-overflow stat64_add() takes uint64_t as 2nd argument, but both "p->next_packet_size" and "p->packet_len" are uint32_t. Thus, their sum may overflow uint32_t. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Dmitry Frolov <frolov@swemel.ru> Link: https://lore.kernel.org/r/20241113140509.325732-2-frolov@swemel.ru Signed-off-by: Peter Xu <peterx@redhat.com>	2024-11-13 13:02:46 -05:00
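The overflow described in 0926c002c7 can be sketched as follows. This is an illustration only, not the QEMU patch: `take_u64()`, `add_narrow()` and `add_wide()` are hypothetical names, and the sketch assumes a platform where `unsigned int` is 32 bits wide, so that adding two `uint32_t` values wraps before the result is widened to `uint64_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a callee that accepts a 64-bit amount. */
static uint64_t take_u64(uint64_t value)
{
    return value;
}

/* The sum is computed in 32-bit unsigned arithmetic and wraps
 * modulo 2^32 before it is converted to uint64_t. */
static uint64_t add_narrow(uint32_t a, uint32_t b)
{
    return take_u64(a + b);
}

/* Widening one operand first makes the addition happen in 64 bits. */
static uint64_t add_wide(uint32_t a, uint32_t b)
{
    return take_u64((uint64_t)a + b);
}

/* Small self-check contrasting the two variants. */
static void overflow_demo(void)
{
    assert(add_narrow(UINT32_MAX, 1) == 0);
    assert(add_wide(UINT32_MAX, 1) == (uint64_t)UINT32_MAX + 1);
}
```

Calling `overflow_demo()` from a `main()` exercises both variants; only the widened form keeps the carry into bit 32.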
Peter Xu	4daff81efb	migration: Check current_migration in migration_is_running() Report shows that commit `34a8892dec` broke iotest 055: https://lore.kernel.org/r/b8806360-a2b6-4608-83a3-db67e264c733@linaro.org Denis Rastyogin reported more such issue: https://lore.kernel.org/r/20241107114256.106831-1-gerben@altlinux.org In this merge, the migration_is_idle() function was replaced with migrate_is_running(). However, the null pointer check for `s` was removed, leading to a dereference of `s` when using qemu-system-x86_64 -hda *.vdi. When replacing migration_is_idle() with "!migration_is_running()", it was overlooked that the idle helper also checks for current_migration being available first. Sample stack dump: migration_is_running is_busy migrate_add_blocker_modes migrate_add_blocker_normal vmdk_open bdrv_open_driver bdrv_open_common bdrv_open_inherit bdrv_open blk_new_open blockdev_init drive_new drive_init_func qemu_opts_foreach configure_blockdev qemu_create_early_backends qemu_init main The check would be there if the whole series was applied, but since the last patches in the previous series rely on some other patches to land first, we need to recover the behavior of migration_is_idle() first before that whole set will be merged. I left migration_is_active / migration_is_device alone, as I don't think it's possible for them to hit uninitialized current_migration. Also they're prone to removal soon from VFIO side. 
Cc: Peter Maydell <peter.maydell@linaro.org> Fixes: `34a8892dec` ("migration: Drop migration_is_idle()") Reported-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Reported-by: Denis Rastyogin <gerben@altlinux.org> Tested-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20241105182725.2393425-1-peterx@redhat.com [peterx: enhance commit msg] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-11-13 13:02:45 -05:00
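The shape of the guard restored by 4daff81efb can be sketched like this. It is a minimal illustration under stated assumptions: `SketchMigration`, `sketch_current` and `sketch_is_running()` are hypothetical stand-ins and do not reproduce QEMU's real `MigrationState` or its state list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not QEMU's real types. */
typedef enum { SKETCH_STATE_NONE, SKETCH_STATE_ACTIVE } SketchState;
typedef struct { SketchState state; } SketchMigration;

/* Stays NULL until the migration object has been created. */
static SketchMigration *sketch_current;

static bool sketch_is_running(void)
{
    SketchMigration *s = sketch_current;

    /* Early callers (e.g. block drivers opened during startup) may
     * run before the object exists; report "not running" instead of
     * dereferencing a NULL pointer. */
    if (!s) {
        return false;
    }
    return s->state == SKETCH_STATE_ACTIVE;
}

/* Small self-check: safe before initialization, accurate after it. */
static void null_guard_demo(void)
{
    SketchMigration m = { SKETCH_STATE_ACTIVE };

    assert(!sketch_is_running());
    sketch_current = &m;
    assert(sketch_is_running());
    sketch_current = NULL;
}
```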
Peter Maydell	6b829602e2	* Various bug fixes * Big cleanup of deprecated machines * Power11 support for spapr * XIVE improvements * Goodbye to Cedric and David as ppc reviewers, thank you both o7 Merge tag 'pull-ppc-for-9.2-1-20241104' of https://gitlab.com/npiggin/qemu into staging # gpg: Signature made Mon 04 Nov 2024 00:15:35 GMT # gpg: using RSA key 4E437DDA56616F4329B0A79567B30276A8621CAE # gpg: Good signature from "Nicholas Piggin <npiggin@gmail.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 4E43 7DDA 5661 6F43 29B0 A795 67B3 0276 A862 1CAE * tag 'pull-ppc-for-9.2-1-20241104' of https://gitlab.com/npiggin/qemu: (67 commits) MAINTAINERS: Remove myself as reviewer MAINTAINERS: Remove myself from XIVE MAINTAINERS: Remove myself from the PowerNV machines hw/ppc: Consolidate ppc440 initial mapping creation functions hw/ppc: Consolidate e500 initial mapping creation functions tests/qtest: Add XIVE tests for the powernv10 machine pnv/xive2: TIMA CI ops using alternative offsets or byte lengths pnv/xive2: TIMA support for 8-byte OS context push for PHYP pnv/xive: Update PIPR when updating CPPR pnv/xive: Add special handling for pool targets ppc/xive2: Support "Pull Thread Context to Odd Thread Reporting Line" ppc/xive2: Change context/ring specific functions to be generic ppc/xive2: Support "Pull Thread Context to Register" operation ppc/xive2: Allow 1-byte write of Target field in TIMA ppc/xive2: Dump the VP-group and crowd tables with 'info pic' ppc/xive2: Dump more NVP state with 'info pic' pnv/xive2: Support for "OS LGS Push" TIMA operation ppc/xive2: Support TIMA "Pull OS Context to Odd Thread Reporting Line" pnv/xive2: Define OGEN field in the TIMA pnv/xive: TIMA patch sets pre-req alignment and formatting changes ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-11-05 10:05:59 +00:00
Harsh Prateek Bora	24ee9229fe	ppc/spapr: remove deprecated machine pseries-2.9 Commit `1392617d35` intended to tag pseries-2.1 - 2.11 machines as deprecated with reasons mentioned in its commit log. Removing pseries-2.9 specific code with this patch for now. While at it, also remove the pre-2.10 migration hacks which now become obsolete. Suggested-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>	2024-11-04 09:10:29 +10:00
Maciej S. Szmigiero	00b4b21653	migration/multifd: Zero p->flags before starting filling a packet This way there aren't stale flags there. p->flags can't contain SYNC to be sent at the next RAM packet since syncs are now handled separately in multifd_send_thread. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Link: https://lore.kernel.org/r/1c96b6cdb797e6f035eb1a4ad9bfc24f4c7f5df8.1730203967.git.maciej.szmigiero@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Maciej S. Szmigiero	b0350c5195	migration/ram: Add load start trace event There's a RAM load complete trace event but there wasn't its start equivalent. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/94ddfa7ecb83a78f73b82867dd30c8767592d257.1730203967.git.maciej.szmigiero@oracle.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Peter Xu	34a8892dec	migration: Drop migration_is_idle() Now with the current migration_is_running(), it will report exactly the opposite of what will be reported by migration_is_idle(). Drop migration_is_idle(), instead use "!migration_is_running()" which should be identical on functionality. In reality, most of the idle check is inverted, so it's even easier to write with "migrate_is_running()" check. Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20241024213056.1395400-6-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Peter Xu	f018eb62b2	migration: Drop migration_is_setup_or_active() This helper is mostly the same as migration_is_running(), except that one has COLO reported as true, the other has CANCELLING reported as true. Per my past years experience on the state changes, none of them should matter. To make it slightly safer, report both COLO || CANCELLING to be true in migration_is_running(), then drop the other one. We kept the 1st only because the name is simpler, and clear enough. Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20241024213056.1395400-5-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Peter Xu	64dcd2c9c6	migration: Unexport ram_mig_init() It's only used within migration/. Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20241024213056.1395400-4-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Peter Xu	a4ddab3581	migration: Unexport dirty_bitmap_mig_init() It's only used within migration/, so it shouldn't be exported. Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20241024213056.1395400-3-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Peter Xu	7fc8beb16e	migration: Take migration object refcount earlier for threads Both migration thread or background snapshot thread will take a refcount of the migration object at the entrance of the thread function. That makes sense, because it protects the object from being freed by the main thread in migration_shutdown() later, but it might still race with it if the thread is scheduled too late. Consider the case right after pthread_create() happened, VM shuts down with the object released, but right after that the migration thread finally got created, referencing MigrationState* in the opaque pointer which is already freed. The only 100% safe way to make sure it won't get freed is taking the refcount right before the thread is created, meanwhile when BQL is held. Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20241024213056.1395400-2-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Thomas Huth	88c3b57f48	migration/dirtyrate: Silence warning about strcpy() on OpenBSD The linker on OpenBSD complains: ld: warning: dirtyrate.c:447 (../src/migration/dirtyrate.c:447)(...): warning: strcpy() is almost always misused, please use strlcpy() It's currently not a real problem in this case since both arrays have the same size (256 bytes). But just in case somebody changes the size of the source array in the future, let's better play safe and use g_strlcpy() here instead, with an additional check that the string has been copied as a whole. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Hyman Huang <yong.huang@smartx.com> Link: https://lore.kernel.org/r/20241022063402.184213-1-thuth@redhat.com [peterx: Fix over-80 chars] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Hyman Huang	52ac968ab2	migration: Support periodic RAMBlock dirty bitmap sync When VM is configured with huge memory, the current throttle logic doesn't look like to scale, because migration_trigger_throttle() is only called for each iteration, so it won't be invoked for a long time if one iteration can take a long time. The periodic dirty sync aims to fix the above issue by synchronizing the ramblock from remote dirty bitmap and, when necessary, triggering the CPU throttle multiple times during a long iteration. This is a trade-off between synchronization overhead and CPU throttle impact. Signed-off-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/f61f1b3653f2acf026901103e1c73d157d38b08f.1729146786.git.yong.huang@smartx.com [peterx: make prev_cnt global, and reset for each migration] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Hyman Huang	6a39ba7cab	migration: Remove "rs" parameter in migration_bitmap_sync_precopy The global static variable ram_state in fact is referred to by the "rs" parameter in migration_bitmap_sync_precopy. For ease of calling by the callees, use the global variable directly in migration_bitmap_sync_precopy and remove "rs" parameter. The migration_bitmap_sync_precopy will be exported in the next commit. Signed-off-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/283c335d61463bf477160da91b24da45cdaf3e43.1729146786.git.yong.huang@smartx.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Hyman Huang	d481cec756	migration: Move cpu-throttle.c from system to migration Move cpu-throttle.c from system to migration since it's only used for migration; this makes us avoid exporting the util functions and variables in misc.h but export them in migration.h when implementing the periodic ramblock dirty sync feature in the upcoming commits. Since CPU throttle timers are only used in migration, move their registry to migration_object_init. Signed-off-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/c1b3efaa0cb49e03d422e9da97bdb65cc3d234d1.1729146786.git.yong.huang@smartx.com [peterx: Fix build on MacOS on cocoa.m, not move cpu-throttle.h yet] [peterx: Fix subject spelling, per pm215] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Hyman Huang	054e5d66e5	migration: Stop CPU throttling conditionally Since CPU throttling only occurs when auto-converge is on, stop it conditionally. Signed-off-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/f0c787080bb9ab0c37952f0ca5bfaa525d5ddd14.1729146786.git.yong.huang@smartx.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Hanna Czenczek	37dfcba1a0	migration: Ensure vmstate_save() sets errp migration/savevm.c contains some calls to vmstate_save() that are followed by migrate_set_error() if the integer return value indicates an error. migrate_set_error() requires that the `Error *` object passed to it is set. Therefore, vmstate_save() is assumed to always set errp on error. Right now, that assumption is not met: vmstate_save_state_v() (called internally by vmstate_save()) will not set errp if vmstate_subsection_save() or vmsd->post_save() fail. Fix that by adding an errp parameter to vmstate_subsection_save(), and by generating a generic error in case post_save() fails (as is already done for pre_save()). Without this patch, qemu will crash after vmstate_subsection_save() or post_save() have failed inside of a vmstate_save() call (unless migrate_set_error() then happen to discard the new error because s->error is already set). This happens e.g. when receiving the state from a virtio-fs back-end (virtiofsd) fails. Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Link: https://lore.kernel.org/r/20241015170437.310358-1-hreitz@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Peter Xu	e620b1e477	migration: Put thread names together with macros Keep migration thread names together, so it's easier to see a list of all possible migration threads. Still two functional changes below besides the macro definitions: - There's one dirty rate thread that we overlooked before, now we add that too and name it as "mig/dirtyrate" following the old rules. - The old name "mig/src/rp-thr" has "-thr" but it may not be useful if it's a thread name anyway, while "rp" can be slightly hard to read. Taking this chance to rename it to "mig/src/return", hopefully a better name. Reviewed-by: Fabiano Rosas <farosas@suse.de> Acked-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Link: https://lore.kernel.org/r/20241011153652.517440-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Peter Xu	6dd4f44c4f	migration: Cleanup migrate_fd_cleanup() on accessing to_dst_file The cleanup function can in many cases needs cleanup on its own. The major thing we want to do here is not referencing to_dst_file when without the file mutex. When at it, touch things elsewhere too to make it look slightly better in general. One thing to mention is, migration_thread has its own "running" boolean, so it doesn't need to rely on to_dst_file being non-NULL. Multifd has a dependency so it needs to be skipped if to_dst_file is not yet set; add a richer comment for such reason. Resolves: Coverity CID 1527402 Reported-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240919163042.116767-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Yuan Liu	2e49d6a20b	migration/multifd: fix build error when qpl compression is enabled The page_size member has been removed from the MultiFDSendParams and MultiFDRecvParams. The function multifd_ram_page_size is used to provide the page size in the multifd compressor. Fixes: `90fa121c6c` ("migration/multifd: Inline page_size and page_count") Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Link: https://lore.kernel.org/r/20241008104527.3516755-1-yuan1.liu@intel.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-09 08:30:53 -04:00
Dr. David Alan Gilbert	3ba55a33e8	migration/postcopy: Use uffd helpers Use the uffd_copy_page, uffd_zero_page and uffd_wakeup helpers rather than calling ioctl ourselves. They return -errno on error, and print an error_report themselves. I think this actually makes postcopy_place_page more consistent in its callers. Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240919134626.166183-7-dave@treblig.org [peterx: fix i386 build] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-08 15:28:55 -04:00
Dr. David Alan Gilbert	6242b36102	migration: Remove unused socket_send_channel_create_sync socket_send_channel_create_sync only use was removed by `d0edb8a173` ("migration: Create the postcopy preempt channel asynchronously") Remove it. Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240919134626.166183-5-dave@treblig.org Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-08 15:28:55 -04:00
Fabiano Rosas	73581a041e	migration: Deprecate zero-blocks capability The zero-blocks capability was meant to be used along with the block migration, which has been removed already in commit `eef0bae3a7` ("migration: Remove block migration"). Setting zero-blocks is currently a noop, but the outright removal of the capability would cause an error in case some users are still setting it. Put the capability through the deprecation process. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240919134626.166183-4-dave@treblig.org Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-08 15:28:55 -04:00
Dr. David Alan Gilbert	21ed5ff606	migration: Remove unused migrate_zero_blocks migrate_zero_blocks is unused since `eef0bae3a7` ("migration: Remove block migration") Remove it. Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240919134626.166183-3-dave@treblig.org Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-08 15:28:55 -04:00
Dr. David Alan Gilbert	a5d8d13842	migration: Remove migrate_cap_set migrate_cap_set has been unused since `18d154f575` ("migration: Remove 'blk/-b' option from migrate commands") Remove it. Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240919134626.166183-2-dave@treblig.org Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-08 15:28:55 -04:00
Fabiano Rosas	68e0fca625	migration/multifd: Ensure packet->ramblock is null-terminated Coverity points out that the current usage of strncpy to write the ramblock name allows the field to not have an ending '\0' in case idstr is already not null-terminated (e.g. if it's larger than 256 bytes). This is currently harmless because the packet->ramblock field is never touched again on the source side. The destination side reads only up to the field's size from the stream and forces the last byte to be 0. We're still open to a programming error in the future in case this field is ever passed into a function that expects a null-terminated string. Change from strncpy to QEMU's pstrcpy, which puts a '\0' at the end of the string and doesn't fill the extra space with zeros. (there's no spillage between iterations of fill_packet because after commit `87bb9e953e` ("migration/multifd: Isolate ram pages packet data") the packet is always zeroed before filling) Resolves: Coverity CID 1560071 Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240919150611.17074-1-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-08 15:28:55 -04:00
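The termination issue raised in 68e0fca625 can be reproduced in isolation. The sketch below uses only the standard C library; `copy_terminated()` is a hypothetical helper that approximates the behaviour the commit message attributes to pstrcpy (always terminate, do not zero-fill the remainder) and is not QEMU's implementation. Some compilers may warn about the deliberately truncating strncpy() call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most dst_size - 1 bytes and always
 * write a terminating '\0'; the rest of dst is left untouched. */
static void copy_terminated(char *dst, size_t dst_size, const char *src)
{
    size_t n;

    if (dst_size == 0) {
        return;
    }
    n = strlen(src);
    if (n > dst_size - 1) {
        n = dst_size - 1;
    }
    memcpy(dst, src, n);
    dst[n] = '\0';
}

/* Returns nonzero if a '\0' occurs within the first size bytes. */
static int has_terminator(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Small self-check contrasting strncpy() with the helper. */
static void termination_demo(void)
{
    char a[4];
    char b[4];

    /* Source is longer than the buffer: strncpy() fills all four
     * bytes and writes no terminator. */
    strncpy(a, "abcdef", sizeof(a));
    assert(!has_terminator(a, sizeof(a)));

    /* The helper truncates but still terminates. */
    copy_terminated(b, sizeof(b), "abcdef");
    assert(strcmp(b, "abc") == 0);
}
```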
Marc-André Lureau	85f99eb2cb	migration: fix -Werror=maybe-uninitialized false-positive ../migration/ram.c:1873:23: error: ‘dirty’ may be used uninitialized [-Werror=maybe-uninitialized] When 'block' != NULL, 'dirty' is initialized. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Acked-by: Peter Xu <peterx@redhat.com>	2024-10-02 16:14:29 +04:00
Marc-André Lureau	7cea863719	migration: fix -Werror=maybe-uninitialized false-positives ../migration/dirtyrate.c:186:5: error: ‘records’ may be used uninitialized [-Werror=maybe-uninitialized] ../migration/dirtyrate.c:168:12: error: ‘gen_id’ may be used uninitialized [-Werror=maybe-uninitialized] ../migration/migration.c:2273:5: error: ‘file’ may be used uninitialized [-Werror=maybe-uninitialized] Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Hyman Huang <yong.huang@smartx.com>	2024-10-02 16:14:29 +04:00
Pierrick Bouvier	d13526f77a	migration: remove return after g_assert_not_reached() This patch is part of a series that moves towards a consistent use of g_assert_not_reached() rather than an ad hoc mix of different assertion mechanisms. Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20240919044641.386068-31-pierrick.bouvier@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2024-09-24 13:53:35 +02:00
Pierrick Bouvier	fe1f1a8070	migration: replace assert(false) with g_assert_not_reached() This patch is part of a series that moves towards a consistent use of g_assert_not_reached() rather than an ad hoc mix of different assertion mechanisms. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Message-ID: <20240919044641.386068-14-pierrick.bouvier@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2024-09-24 13:53:35 +02:00
Pierrick Bouvier	0c79effdc7	migration: replace assert(0) with g_assert_not_reached() This patch is part of a series that moves towards a consistent use of g_assert_not_reached() rather than an ad hoc mix of different assertion mechanisms. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20240919044641.386068-5-pierrick.bouvier@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2024-09-24 13:53:35 +02:00
Fabiano Rosas	4ce5622908	migration/multifd: Fix rb->receivedmap cleanup race Fix a segmentation fault in multifd when rb->receivedmap is cleared too early. After commit `5ef7e26bdb` ("migration/multifd: solve zero page causing multiple page faults"), multifd started using the rb->receivedmap bitmap, which belongs to ram.c and is initialized and freed from the ram SaveVMHandlers. Multifd threads are live until migration_incoming_state_destroy(), which is called after qemu_loadvm_state_cleanup(), leading to a crash when accessing rb->receivedmap. process_incoming_migration_co() ... qemu_loadvm_state() multifd_nocomp_recv() qemu_loadvm_state_cleanup() ramblock_recv_bitmap_set_offset() rb->receivedmap = NULL set_bit_atomic(..., rb->receivedmap) ... migration_incoming_state_destroy() multifd_recv_cleanup() multifd_recv_terminate_threads(NULL) Move the loadvm cleanup into migration_incoming_state_destroy(), after multifd_recv_cleanup() to ensure multifd threads have already exited when rb->receivedmap is cleared. Adjust the postcopy listen thread comment to indicate that we still want to skip the cpu synchronization. CC: qemu-stable@nongnu.org Fixes: `5ef7e26bdb` ("migration/multifd: solve zero page causing multiple page faults") Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240917185802.15619-3-farosas@suse.de [peterx: added comment in migration_incoming_state_destroy()] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-09-18 14:27:39 -04:00
Fabiano Rosas	90a384d461	migration/savevm: Remove extra load cleanup calls There are two qemu_loadvm_state_cleanup() calls that were introduced when qemu_loadvm_state_setup() was still called before loading the configuration section, so there was state to be cleaned up if the header checks failed. However, commit `9e14b84908` ("migration/savevm: load_header before load_setup") has moved that configuration section part to qemu_loadvm_state_header() which now happens before qemu_loadvm_state_setup(). Remove the cleanup calls that are now misplaced. Note that we didn't use Fixes because it's benign to cleanup() even if setup() is not invoked. So this patch is not needed for stable, as it falls into cleanup category. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240917185802.15619-2-farosas@suse.de [peterx: added last paragraph of commit message] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-09-18 14:27:39 -04:00
Stefan Weil	cb0ed522a5	migration/multifd: Fix loop conditions in multifd_zstd_send_prepare and multifd_zstd_recv GitHub's CodeQL reports four critical errors which are fixed by this commit: Unsigned difference expression compared to zero An expression (u - v > 0) with unsigned values u, v is only false if u == v, so all changed expressions did not work as expected. Signed-off-by: Stefan Weil <sw@weilnetz.de> Link: https://lore.kernel.org/r/20240910054138.1458555-1-sw@weilnetz.de [peterx: Fix mangled email for author] Signed-off-by: Peter Xu <peterx@redhat.com>	2024-09-18 14:27:24 -04:00
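The CodeQL finding behind cb0ed522a5 comes down to unsigned wraparound. The following sketch is illustrative only: the function names are hypothetical and the conditions are not copied from the zstd code in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Pattern flagged by the analysis: with unsigned operands the
 * difference wraps around, so this is false only when size == pos. */
static bool remaining_by_difference(size_t size, size_t pos)
{
    return size - pos > 0;
}

/* Comparing the operands directly avoids the wraparound. */
static bool remaining_by_comparison(size_t size, size_t pos)
{
    return size > pos;
}

/* Small self-check: the two forms disagree once pos passes size. */
static void unsigned_demo(void)
{
    assert(remaining_by_difference(4, 8));   /* wrapped, still "true" */
    assert(!remaining_by_comparison(4, 8));
    assert(!remaining_by_difference(4, 4));
    assert(!remaining_by_comparison(4, 4));
}
```

The two forms agree whenever `pos <= size`; they diverge only after `pos` has overshot `size`, which is the case a loop condition is usually meant to stop on.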
Peter Xu	561ce01493	migration/multifd: Fix build for qatzip The qatzip series was based on an older commit, it applied cleanly even though it has conflicts. Neither CI nor myself found the build will break as it's skipped by default when qatzip library was missing. Fix the build issues. No need to copy stable as it just landed 9.2. Cc: Yichen Wang <yichen.wang@bytedance.com> Cc: Bryan Zhang <bryan.zhang@bytedance.com> Cc: Hao Xiang <hao.xiang@linux.dev> Cc: Yuan Liu <yuan1.liu@intel.com> Fixes: `80484f9459` ("migration: Introduce 'qatzip' compression method") Link: https://lore.kernel.org/r/20240910210450.3835123-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-09-17 17:50:45 -04:00
Bryan Zhang	80484f9459	migration: Introduce 'qatzip' compression method Adds support for 'qatzip' as an option for the multifd compression method parameter, and implements using QAT for 'qatzip' compression and decompression. Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Prasad Pandit <pjp@fedoraproject.org> Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com> Signed-off-by: Hao Xiang <hao.xiang@linux.dev> Signed-off-by: Yichen Wang <yichen.wang@bytedance.com> Link: https://lore.kernel.org/r/20240830232722.58272-5-yichen.wang@bytedance.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-09-09 10:55:40 -04:00
Bryan Zhang	86c6eb1f39	migration: Add migration parameters for QATzip Adds support for migration parameters to control QATzip compression level. Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Bryan Zhang <bryan.zhang@bytedance.com> Signed-off-by: Hao Xiang <hao.xiang@linux.dev> Signed-off-by: Yichen Wang <yichen.wang@bytedance.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Prasad Pandit <pjp@fedoraproject.org> Link: https://lore.kernel.org/r/20240830232722.58272-4-yichen.wang@bytedance.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-09-09 10:55:39 -04:00
Fabiano Rosas	62e1af13bb	migration/multifd: Add documentation for multifd methods Add documentation clarifying the usage of the multifd methods. The general idea is that the client code calls into multifd to trigger send/recv of data and multifd then calls these hooks back from the worker threads at opportune moments so the client can process a portion of the data. Suggested-by: Peter Xu <peterx@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:37 -03:00
Fabiano Rosas	90e0eeb99b	migration/multifd: Add a couple of asserts for p->iov Check that p->iov is indeed always allocated and freed by the MultiFDMethods hooks. Suggested-by: Peter Xu <peterx@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:37 -03:00
Fabiano Rosas	405e352d28	migration/multifd: Fix p->iov leak in multifd-uadk.c The send_cleanup() hook should free the p->iov that was allocated at send_setup(). This was missed because the UADK code is conditional on the presence of the accelerator, so it's not tested by default. Fixes: `819dd20636` ("migration/multifd: Add UADK initialization") Reported-by: Peter Xu <peterx@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:36 -03:00
Fabiano Rosas	81b0ed8ad8	migration/multifd: Stop changing the packet on recv side As observed by Philippe, the multifd_ram_unfill_packet() function currently leaves the MultiFDPacket structure with mixed endianness. This is harmless, but ultimately not very clean. Stop touching the received packet and do the necessary work using stack variables instead. While here tweak the error strings and fix the space before semicolons. Also remove the "100 times bigger" comment because it's just one possible explanation for a size mismatch and it doesn't even match the code. CC: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:36 -03:00
Fabiano Rosas	308d165c77	migration/multifd: Make MultiFDMethods const The methods are defined at module_init time and don't ever change. Make them const. Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:36 -03:00
Fabiano Rosas	40c9471e40	migration/multifd: Move nocomp code into multifd-nocomp.c In preparation for adding new payload types to multifd, move most of the no-compression code into multifd-nocomp.c. Let's try to keep a semblance of layering by not mixing general multifd control flow with the details of transmitting pages of ram. There are still some pieces leftover, namely the p->normal, p->zero, etc variables that we use for zero page tracking and the packet allocation which is heavily dependent on the ram code. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:36 -03:00
Fabiano Rosas	dc6327d99c	migration/multifd: Register nocomp ops dynamically Prior to moving the ram code into multifd-nocomp.c, change the code to register the nocomp ops dynamically so we don't need to have the ops structure defined in multifd.c. While here, move the ops struct initialization to the end of the file to make the next diff cleaner. Reviewed-by: Prasad Pandit <pjp@fedoraproject.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:36 -03:00
Fabiano Rosas	6f848dac4a	migration/multifd: Standardize on multifd ops names Add the multifd_ prefix to all functions and remove the useless docstrings. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:36 -03:00
Fabiano Rosas	a0c78d815c	migration/multifd: Allow multifd sync without flush Separate the multifd sync from flushing the client data to the channels. These two operations are closely related but not strictly necessary to be executed together. The multifd sync is intrinsic to how multifd works. The multiple channels operate independently and may finish IO out of order in relation to each other. This applies also between the source and destination QEMU. Flushing the data that is left in the client-owned data structures (e.g. MultiFDPages_t) prior to sync is usually the right thing to do, but that is particular to how the ram migration is implemented with several passes over dirty data. Make these two routines separate, allowing future code to call the sync by itself if needed. This also allows the usage of multifd_ram_send to be isolated to ram code. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:36 -03:00
Fabiano Rosas	a71ef5c7f3	migration/multifd: Replace multifd_send_state->pages with client data Multifd currently has a simple scheduling mechanism that distributes work to the various channels by keeping storage space within each channel and an extra space that is given to the client. Each time the client fills the space with data and calls into multifd, that space is given to the next idle channel and a free storage space is taken from the channel and given to the client for the next iteration. This means we always need (#multifd_channels + 1) memory slots to operate multifd. This is fine, except that the presence of this one extra memory slot doesn't allow different types of payloads to be processed at the same time in different channels, i.e. the data type of multifd_send_state->pages needs to be the same as p->pages. For each new data type different from MultiFDPages_t that is to be handled, this logic would need to be duplicated by adding new fields to multifd_send_state, to the channels and to multifd_send_pages(). Fix this situation by moving the extra slot into the client and using only the generic type MultiFDSendData in the multifd core. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:35 -03:00
Fabiano Rosas	d7e58f412c	migration/multifd: Don't send ram data during SYNC Skip saving and loading any ram data in the packet in the case of a SYNC. This fixes a shortcoming of the current code which requires a reset of the MultiFDPages_t fields right after the previous pending_job finishes, otherwise the very next job might be a SYNC and multifd_send_fill_packet() will put the stale values in the packet. By not calling multifd_ram_fill_packet(), we can stop resetting MultiFDPages_t in the multifd core and leave that to the client code. Actually moving the reset function is not yet done because pages->num==0 is used by the client code to determine whether the MultiFDPages_t needs to be flushed. The subsequent patches will replace that with a generic flag that is not dependent on MultiFDPages_t. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:35 -03:00
Fabiano Rosas	87bb9e953e	migration/multifd: Isolate ram pages packet data While we cannot yet disentangle the multifd packet from page data, we can make the code a bit cleaner by setting the page-related fields in a separate function. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:35 -03:00
Fabiano Rosas	96d396bf50	migration/multifd: Remove total pages tracing The total_normal_pages and total_zero_pages elements are used only for the end tracepoints of the multifd threads. These are not super useful since they record per-channel numbers and are just the sum of all the pages that are transmitted per-packet, for which we already have tracepoints. Remove the totals from the tracing. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>	2024-09-03 16:24:35 -03:00
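
The message of commit a71ef5c7f3 above describes a slot-swapping scheme: the client fills one spare slot, hands it to an idle channel, and takes that channel's free slot in return, so (#multifd_channels + 1) slots are always in play. The following is a minimal sketch of that idea only, not QEMU's code. The names Slot, Channel and send_slot() are hypothetical and chosen for illustration; the real implementation uses MultiFDSendData and also has to deal with locking and thread wakeups, which are left out here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a generic payload slot (not MultiFDSendData). */
typedef struct {
    int id;        /* identifies the slot, for demonstration only */
    size_t used;   /* bytes the client has filled in */
} Slot;

/* Hypothetical per-channel state: one owned slot plus a busy flag. */
typedef struct {
    Slot *data;    /* slot currently owned by the channel */
    int busy;      /* nonzero while the channel holds unsent data */
} Channel;

/*
 * Hand the client's filled slot to the first idle channel and give that
 * channel's free slot back to the client for the next iteration.
 * Returns the index of the channel used, or -1 if all channels are busy
 * (in which case the client keeps its slot).
 */
int send_slot(Channel *chans, int n, Slot **client)
{
    assert(chans != NULL && client != NULL && *client != NULL);

    for (int i = 0; i < n; i++) {
        if (!chans[i].busy) {
            Slot *free_slot = chans[i].data;

            chans[i].data = *client;   /* channel takes the filled slot */
            chans[i].busy = 1;
            *client = free_slot;       /* client gets the free slot back */
            (*client)->used = 0;       /* start the next iteration empty */
            return i;
        }
    }
    return -1;
}
```

With two channels and three slots, the client starts out holding the third slot. After it fills that slot and calls send_slot(), channel 0 owns the filled slot and the client holds the slot that channel 0 owned before. Because the core only swaps pointers to a generic slot type, it does not need to know what kind of payload the client put inside.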

2373 Commits