mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Juan Quintela	4703d1958c	migration: Simplify decompress_data_with_multi_threads() Doing a break to do another break is just confused. Just call return when we know we want to return. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Message-ID: <20230613145757.10131-14-quintela@redhat.com>	2023-10-17 22:14:51 +02:00
Juan Quintela	1fd03d41b8	migration: Move update_compress_threads_counts() to ram-compress.c Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Message-ID: <20230613145757.10131-9-quintela@redhat.com>	2023-10-17 22:14:51 +02:00
Juan Quintela	f504789de5	migration: Create ram_compressed_pages() Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Message-ID: <20230613145757.10131-8-quintela@redhat.com>	2023-10-17 22:14:51 +02:00
Juan Quintela	6f60900573	migration: Create populate_compress() So we don't have to access compression_counters from outside ram-compress.c. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Message-ID: <20230613145757.10131-7-quintela@redhat.com>	2023-10-17 22:14:51 +02:00
Juan Quintela	809f188a1a	migration: Move compression_counters cleanup ram-compress.c Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Message-ID: <20230613145757.10131-6-quintela@redhat.com>	2023-10-17 22:14:51 +02:00
Juan Quintela	b88a3306fd	migration: RDMA is not compatible with anything else So give an error instead of just ignoring the other methods. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Message-ID: <20230613145757.10131-4-quintela@redhat.com>	2023-10-17 22:14:51 +02:00
Fabiano Rosas	967e388987	migration/multifd: Clarify Error usage in multifd_channel_connect The function is currently called from two sites, one always gives it a NULL Error and the other always gives it a non-NULL Error. In the non-NULL case, all it does it trace the error and return. One of the callers already have tracing, add a tracepoint to the other and stop passing the error into the function. Cc: Markus Armbruster <armbru@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231012134343.23757-4-farosas@suse.de>	2023-10-17 09:25:14 +02:00
Fabiano Rosas	ee8a7c9c46	migration/multifd: Unify multifd_send_thread error paths The preferred usage of the Error type is to always set both the return code and the error when a failure happens. As all code called from the send thread follows this pattern, we'll always have the return code and the error set at the same time. Aside from the convention, in this piece of code this must be the case, otherwise the if (ret != 0) would be exiting the thread without calling multifd_send_terminate_threads() which is incorrect. Unify both paths to make it clear that both are taken when there's an error. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231012134343.23757-3-farosas@suse.de>	2023-10-17 09:25:14 +02:00
Fabiano Rosas	0e92f64448	migration/multifd: Remove direct "socket" references We're about to enable support for other transports in multifd, so remove direct references to sockets. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231012134343.23757-2-farosas@suse.de>	2023-10-17 09:25:14 +02:00
Fabiano Rosas	8697eb8577	migration/ram: Merge save_zero_page functions We don't need to do this in two pieces. One single function makes it easier to grasp, specially since it removes the indirection on the return value handling. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011184604.32364-7-farosas@suse.de>	2023-10-17 09:25:14 +02:00
Fabiano Rosas	ccc09db87c	migration/ram: Move xbzrle zero page handling into save_zero_page It makes a bit more sense to have the zero page handling of xbzrle right where we save the zero page. Also invert the exit condition to remove one level of indentation which makes the next patch easier to grasp. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011184604.32364-6-farosas@suse.de>	2023-10-17 09:25:14 +02:00
Fabiano Rosas	1e43f165d0	migration/ram: Stop passing QEMUFile around in save_zero_page We don't need the QEMUFile when we're already passing the PageSearchStatus. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011184604.32364-5-farosas@suse.de>	2023-10-17 09:25:14 +02:00
Fabiano Rosas	8f47d4ee43	migration/ram: Remove RAMState from xbzrle_cache_zero_page 'rs' is not used in that function. It's a leftover from commit `9360447d34` ("ram: Use MigrationStats for statistics"). Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011184604.32364-4-farosas@suse.de>	2023-10-17 09:25:14 +02:00
Nikolay Borisov	2f5ced5b93	migration/ram: Refactor precopy ram loading code Extract the ramblock parsing code into a routine that operates on the sequence of headers from the stream and another the parses the individual ramblock. This makes ram_load_precopy() easier to comprehend. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011184604.32364-3-farosas@suse.de>	2023-10-17 09:25:14 +02:00
Elena Ufimtseva	1618f55221	multifd: reset next_packet_len after sending pages Sometimes multifd sends just sync packet with no pages (normal_num is 0). In this case the old value is being preserved and being accounted for while only packet_len is being transferred. Reset it to 0 after sending and accounting for. Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011184358.97349-5-elena.ufimtseva@oracle.com>	2023-10-17 09:25:13 +02:00
Elena Ufimtseva	68b6e00048	multifd: fix counters in multifd_send_thread Previous commit `cbec7eb768` "migration/multifd: Compute transferred bytes correctly" removed accounting for packet_len in non-rdma case, but the next_packet_size only accounts for pages, not for the header packet (normal_pages * PAGE_SIZE) that is being sent as iov[0]. The packet_len part should be added to account for the size of MultiFDPacket and the array of the offsets. Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011184358.97349-4-elena.ufimtseva@oracle.com>	2023-10-17 09:25:13 +02:00
Elena Ufimtseva	60c7981aa3	migration: check for rate_limit_max for RATE_LIMIT_DISABLED In migration rate limiting atomic operations are used to read the rate limit variables and transferred bytes and they are expensive. Check first if rate_limit_max is equal to RATE_LIMIT_DISABLED and return false immediately if so. Note that with this patch we will also will stop flushing by not calling qemu_fflush() from migration_transferred_bytes() if the migration rate is not exceeded. This should be fine since migration thread calls in the loop migration_update_counters from migration_rate_limit() that calls the migration_transferred_bytes() and flushes there. Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011184358.97349-2-elena.ufimtseva@oracle.com>	2023-10-17 09:25:13 +02:00
Juan Quintela	8f5a7faa4e	migration/rdma: Remove all "ret" variables that are used only once Change code that is: int ret; ... ret = foo(); if (ret[ < 0]?) { to: if (foo()[ < 0]) { Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011203527.9061-14-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Juan Quintela	14e2fcbbf8	migration/rdma: Declare for index variables local Declare all variables that are only used inside a for loop inside the for statement. This makes clear that they are not used outside of the for loop. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011203527.9061-13-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Juan Quintela	ebdb85f9d1	migration/rdma: Use i as for index instead of idx Once there, all the uses are local to the for, so declare the variable inside the for statement. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011203527.9061-12-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Juan Quintela	a4832d299d	migration/rdma: Check sooner if we are in postcopy for save_page() Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011203527.9061-11-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Juan Quintela	b1b3838722	migration/rdma: Remove qemu_ prefix from exported functions Functions are long enough even without this. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011203527.9061-10-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Juan Quintela	10cb3336b1	migration/rdma: Move rdma constants from qemu-file.h to rdma.h Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011203527.9061-9-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Juan Quintela	8b670f48ed	qemu-file: Remove QEMUFileHooks The only user was rdma, and its use is gone. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011203527.9061-8-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Juan Quintela	e493008d50	migration/rdma: Create rdma_control_save_page() The only user of ram_control_save_page() and save_page() hook was rdma. Just move the function to rdma.c, rename it to rdma_control_save_page(). Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011203527.9061-7-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Juan Quintela	a6323300e8	migration/rdma: Unfold hook_ram_load() There is only one flag called with: RAM_CONTROL_BLOCK_REG. Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011203527.9061-6-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Juan Quintela	f6d6c089b7	migration/rdma: Remove all uses of RAM_CONTROL_HOOK Instead of going through ram_control_load_hook(), call qemu_rdma_registration_handle() directly. Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011203527.9061-5-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Juan Quintela	5f5b8858dc	migration/rdma: Unfold ram_control_after_iterate() Once there: - Remove unused data parameter - unfold it in its callers - change all callers to call qemu_rdma_registration_stop() - We need to call QIO_CHANNEL_RDMA() after we check for migrate_rdma() Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011203527.9061-4-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Juan Quintela	48408174a7	migration/rdma: Unfold ram_control_before_iterate() Once there: - Remove unused data parameter - unfold it in its callers. - change all callers to call qemu_rdma_registration_start() - We need to call QIO_CHANNEL_RDMA() after we check for migrate_rdma() Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011203527.9061-3-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Juan Quintela	27fd25b0fb	migration: Create migrate_rdma() Helper to say if we are doing a migration over rdma. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011203527.9061-2-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Juan Quintela	d4f34485ca	migration: Non multifd migration don't care about multifd flushes RDMA was having trouble because migrate_multifd_flush_after_each_section() can only be true or false, but we don't want to send any flush when we are not in multifd migration. CC: Fabiano Rosas <farosas@suse.de Fixes: `294e5a4034` ("multifd: Only flush once each full round of memory") Reported-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011205548.10571-2-quintela@redhat.com>	2023-10-17 09:25:13 +02:00
Fiona Ebner	930e239d11	migration: hold the BQL during setup This is intended to be a semantic revert of commit `9b09503752` ("migration: run setup callbacks out of big lock"). There have been so many changes since that commit (e.g. a new setup callback dirty_bitmap_save_setup() that also needs to be adapted now), it's easier to do the revert manually. For snapshots, the bdrv_writev_vmstate() function is used during setup (in QIOChannelBlock backing the QEMUFile), but not holding the BQL while calling it could lead to an assertion failure. To understand how, first note the following: 1. Generated coroutine wrappers for block layer functions spawn the coroutine and use AIO_WAIT_WHILE()/aio_poll() to wait for it. 2. If the host OS switches threads at an inconvenient time, it can happen that a bottom half scheduled for the main thread's AioContext is executed as part of a vCPU thread's aio_poll(). An example leading to the assertion failure is as follows: main thread: 1. A snapshot-save QMP command gets issued. 2. snapshot_save_job_bh() is scheduled. vCPU thread: 3. aio_poll() for the main thread's AioContext is called (e.g. when the guest writes to a pflash device, as part of blk_pwrite which is a generated coroutine wrapper). 4. snapshot_save_job_bh() is executed as part of aio_poll(). 3. qemu_savevm_state() is called. 4. qemu_mutex_unlock_iothread() is called. Now qemu_get_current_aio_context() returns 0x0. 5. bdrv_writev_vmstate() is executed during the usual savevm setup via qemu_fflush(). But this function is a generated coroutine wrapper, so it uses AIO_WAIT_WHILE. There, the assertion assert(qemu_get_current_aio_context() == qemu_get_aio_context()); will fail. To fix it, ensure that the BQL is held during setup. While it would only be needed for snapshots, adapting migration too avoids additional logic for conditional locking/unlocking in the setup callbacks. Writing the header could (in theory) also trigger qemu_fflush() and thus bdrv_writev_vmstate(), so the locked section also covers the qemu_savevm_state_header() call, even for migration for consistency. The section around multifd_send_sync_main() needs to be unlocked to avoid a deadlock. In particular, the multifd_save_setup() function calls socket_send_channel_create() using multifd_new_send_channel_async() as a callback and then waits for the callback to signal via the channels_ready semaphore. The connection happens via qio_task_run_in_thread(), but the callback is only executed via qio_task_thread_result() which is scheduled for the main event loop. Without unlocking the section, the main thread would never get to process the task result and the callback meaning there would be no signal via the channels_ready semaphore. The comment in ram_init_bitmaps() was introduced by `4987783400` ("migration: fix incorrect memory_global_dirty_log_start outside BQL") and is removed, because it referred to the qemu_mutex_lock_iothread() call. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231013105839.415989-1-f.ebner@proxmox.com>	2023-10-17 09:25:13 +02:00
Nikolay Borisov	2aae1eb8da	migration: Add the configuration vmstate to the json writer Make the migration json writer part of MigrationState struct, allowing the 'configuration' object be serialized to json. This will facilitate the parsing of the 'configuration' object in the next patch that fixes analyze-migration.py for arm. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231009184326.15777-2-farosas@suse.de>	2023-10-17 09:14:32 +02:00
Dmitry Frolov	f75ed59f40	migration: fix RAMBlock add NULL check qemu_ram_block_from_host() may return NULL, which will be dereferenced w/o check. Usualy return value is checked for this function. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Dmitry Frolov <frolov@swemel.ru> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231010104851.802947-1-frolov@swemel.ru>	2023-10-17 09:14:32 +02:00
Peter Xu	8b2395970a	migration: Allow user to specify available switchover bandwidth Migration bandwidth is a very important value to live migration. It's because it's one of the major factors that we'll make decision on when to switchover to destination in a precopy process. This value is currently estimated by QEMU during the whole live migration process by monitoring how fast we were sending the data. This can be the most accurate bandwidth if in the ideal world, where we're always feeding unlimited data to the migration channel, and then it'll be limited to the bandwidth that is available. However in reality it may be very different, e.g., over a 10Gbps network we can see query-migrate showing migration bandwidth of only a few tens of MB/s just because there are plenty of other things the migration thread might be doing. For example, the migration thread can be busy scanning zero pages, or it can be fetching dirty bitmap from other external dirty sources (like vhost or KVM). It means we may not be pushing data as much as possible to migration channel, so the bandwidth estimated from "how many data we sent in the channel" can be dramatically inaccurate sometimes. With that, the decision to switchover will be affected, by assuming that we may not be able to switchover at all with such a low bandwidth, but in reality we can. The migration may not even converge at all with the downtime specified, with that wrong estimation of bandwidth, keeping iterations forever with a low estimation of bandwidth. The issue is QEMU itself may not be able to avoid those uncertainties on measuing the real "available migration bandwidth". At least not something I can think of so far. One way to fix this is when the user is fully aware of the available bandwidth, then we can allow the user to help providing an accurate value. For example, if the user has a dedicated channel of 10Gbps for migration for this specific VM, the user can specify this bandwidth so QEMU can always do the calculation based on this fact, trusting the user as long as specified. It may not be the exact bandwidth when switching over (in which case qemu will push migration data as fast as possible), but much better than QEMU trying to wildly guess, especially when very wrong. A new parameter "avail-switchover-bandwidth" is introduced just for this. So when the user specified this parameter, instead of trusting the estimated value from QEMU itself (based on the QEMUFile send speed), it trusts the user more by using this value to decide when to switchover, assuming that we'll have such bandwidth available then. Note that specifying this value will not throttle the bandwidth for switchover yet, so QEMU will always use the full bandwidth possible for sending switchover data, assuming that should always be the most important way to use the network at that time. This can resolve issues like "unconvergence migration" which is caused by hilarious low "migration bandwidth" detected for whatever reason. Reported-by: Zhiyi Guo <zhguo@redhat.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231010221922.40638-1-peterx@redhat.com>	2023-10-17 09:14:32 +02:00
Philippe Mathieu-Daudé	1a36e4c9d0	migration: Use g_autofree to simplify ram_dirty_bitmap_reload() Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231011023627.86691-1-philmd@linaro.org>	2023-10-17 09:14:32 +02:00
Wei Wang	d7f5a04320	migration: refactor migration_completion Current migration_completion function is a bit long. Refactor the long implementation into different subfunctions: - migration_completion_precopy: completion code related to precopy - migration_completion_postcopy: completion code related to postcopy Rename await_return_path_close_on_source to close_return_path_on_source: It is renamed to match with open_return_path_on_source. This improves readability and is easier for future updates (e.g. add new subfunctions when completion code related to new features are needed). No functional changes intended. Signed-off-by: Wei Wang <wei.w.wang@intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230804093053.5037-1-wei.w.wang@intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-10-17 09:14:32 +02:00
Kevin Wolf	2b3912f135	block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCK This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_first_blk() and bdrv_is_root_node() need to hold a reader lock for the graph. These functions are the only functions in block-backend.c that access the parent list of a node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230929145157.45443-5-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-10-12 16:31:33 +02:00
Peter Xu	5e79a4bf03	migration: Add migration_rp_wait\|kick() It's just a simple wrapper for rp_sem on either wait() or kick(), make it even clearer on how it is used. Prepared to be used even for other things. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Peter Xu <peterx@redhat.com> Message-ID: <20231004220240.167175-8-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-10-11 11:17:05 +02:00
Peter Xu	1015ff5476	migration: Remember num of ramblocks to sync during recovery Instead of only relying on the count of rp_sem, make the counter be part of RAMState so it can be used in both threads to synchronize on the process. rp_sem will be further reused in follow up patches, as a way to kick the main thread, e.g., on recovery failures. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231004220240.167175-7-peterx@redhat.com>	2023-10-11 11:17:05 +02:00
Peter Xu	f4b897f485	qemufile: Always return a verbose error There're a lot of cases where we only have an errno set in last_error but without a detailed error description. When this happens, try to generate an error contains the errno as a descriptive error. This will be helpful in cases where one relies on the Error. E.g., migration state only caches Error in MigrationState.error. With this, we'll display correct error messages in e.g. query-migrate when the error was only set by qemu_file_set_error(). Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231004220240.167175-6-peterx@redhat.com>	2023-10-11 11:17:05 +02:00
Peter Xu	2b2f6f411e	migration: Introduce migrate_has_error() Introduce a helper to detect whether MigrationState.error is set for whatever reason. This is preparation work for any thread (e.g. source return path thread) to setup errors in an unified way to MigrationState, rather than relying on its own way to set errors (mark_source_rp_bad()). Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231004220240.167175-3-peterx@redhat.com>	2023-10-11 11:17:05 +02:00
Peter Xu	c94143e587	migration: Display error in query-migrate irrelevant of status Display it as long as being set, irrelevant of FAILED status. E.g., it may also be applicable to PAUSED stage of postcopy, to provide hint on what has gone wrong. The error_mutex seems to be overlooked when referencing the error, add it to be very safe. This will change QAPI behavior by showing up error message outside !FAILED status, but it's intended and doesn't expect to break anyone. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404 Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231004220240.167175-2-peterx@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	2c88739cfd	migration/rdma: Replace flawed device detail dump by tracing qemu_rdma_dump_id() dumps RDMA device details to stdout. rdma_start_outgoing_migration() calls it via qemu_rdma_source_init() and qemu_rdma_resolve_host() to show source device details. rdma_start_incoming_migration() arranges its call via rdma_accept_incoming_migration() and qemu_rdma_accept() to show destination device details. Two issues: 1. rdma_start_outgoing_migration() can run in HMP context. The information should arguably go the monitor, not stdout. 2. ibv_query_port() failure is reported as error. Its callers remain unaware of this failure (qemu_rdma_dump_id() can't fail), so reporting this to the user as an error is problematic. Fixable, but the device detail dump is noise, except when troubleshooting. Tracing is a better fit. Similar function qemu_rdma_dump_id() was converted to tracing in commit `733252deb8` (Tracify migration/rdma.c). Convert qemu_rdma_dump_id(), too. While there, touch up qemu_rdma_dump_gid()'s outdated comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-54-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	ff4c919459	migration/rdma: Use error_report() & friends instead of stderr error_report() obeys -msg, reports the current error location if any, and reports to the current monitor if any. Reporting to stderr directly with fprintf() or perror() is wrong, because it loses all this. Fix the offenders. Bonus: resolves a FIXME about problematic use of errno. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-53-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	5cec563d0c	migration/rdma: Downgrade qemu_rdma_cleanup() errors to warnings Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. qemu_rdma_source_init(), qemu_rdma_connect(), rdma_start_incoming_migration(), and rdma_start_outgoing_migration() violate this principle: they call error_report() via qemu_rdma_cleanup(). Moreover, qemu_rdma_cleanup() can't fail. It is called on error paths, and QIOChannel close and finalization. Are the conditions it reports really errors? I doubt it. Downgrade qemu_rdma_cleanup()'s errors to warnings. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-52-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	b765d21e4a	migration/rdma: Silence qemu_rdma_register_and_get_keys() Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. qemu_rdma_write_one() violates this principle: it reports errors to stderr via qemu_rdma_register_and_get_keys(). I elected not to investigate how callers handle the error, i.e. precise impact is not known. Clean this up: silence qemu_rdma_register_and_get_keys(). I believe the caller's error reports suffice. If they don't, we need to convert to Error instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-51-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	7555c7713d	migration/rdma: Silence qemu_rdma_block_for_wrid() Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. qemu_rdma_post_send_control(), qemu_rdma_exchange_get_response(), and qemu_rdma_write_one() violate this principle: they call error_report(), fprintf(stderr, ...), and perror() via qemu_rdma_block_for_wrid(), qemu_rdma_poll(), and qemu_rdma_wait_comp_channel(). I elected not to investigate how callers handle the error, i.e. precise impact is not known. Clean this up by dropping the error reporting from qemu_rdma_poll(), qemu_rdma_wait_comp_channel(), and qemu_rdma_block_for_wrid(). I believe the callers' error reports suffice. If they don't, we need to convert to Error instead. Bonus: resolves a FIXME about problematic use of errno. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-50-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	8dee156c1d	migration/rdma: Don't report received completion events as error When qemu_rdma_wait_comp_channel() receives an event from the completion channel, it reports an error "receive cm event while wait comp channel,cm event is T", where T is the numeric event type. However, the function fails only when T is a disconnect or device removal. Events other than these two are not actually an error, and reporting them as an error is wrong. If we need to report them to the user, we should use something else, and what to use depends on why we need to report them to the user. For now, report this error only when the function actually fails. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-49-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	01efb10637	migration/rdma: Silence qemu_rdma_reg_control() Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. qemu_rdma_source_init() and qemu_rdma_accept() violate this principle: they call error_report() via qemu_rdma_reg_control(). I elected not to investigate how callers handle the error, i.e. precise impact is not known. Clean this up by dropping the error reporting from qemu_rdma_reg_control(). I believe the callers' error reports suffice. If they don't, we need to convert to Error instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-48-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	35b1561e3e	migration/rdma: Silence qemu_rdma_connect() Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. qemu_rdma_connect() violates this principle: it calls error_report() and perror(). I elected not to investigate how callers handle the error, i.e. precise impact is not known. Clean this up: replace perror() by changing error_setg() to error_setg_errno(), and drop error_report(). I believe the callers' error reports suffice then. If they don't, we need to convert to Error instead. Bonus: resolves a FIXME about problematic use of errno. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-47-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	e6696d3ee9	migration/rdma: Silence qemu_rdma_resolve_host() Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. qemu_rdma_resolve_host() violates this principle: it calls error_report(). Clean this up: drop error_report(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-46-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	07d5b94653	migration/rdma: Convert qemu_rdma_alloc_pd_cq() to Error Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. qemu_rdma_source_init() violates this principle: it calls error_report() via qemu_rdma_alloc_pd_cq(). I elected not to investigate how callers handle the error, i.e. precise impact is not known. Clean this up by converting qemu_rdma_alloc_pd_cq() to Error. The conversion loses a piece of advice on one of two failure paths: Your mlock() limits may be too low. Please check $ ulimit -a # and search for 'ulimit -l' in the output Not worth retaining. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-45-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	3c0c3eba8d	migration/rdma: Convert qemu_rdma_post_recv_control() to Error Just for symmetry with qemu_rdma_post_send_control(). Error messages lose detail I consider of no use to users. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-44-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	f3805964f8	migration/rdma: Convert qemu_rdma_post_send_control() to Error Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. qemu_rdma_exchange_send() violates this principle: it calls error_report() via qemu_rdma_post_send_control(). I elected not to investigate how callers handle the error, i.e. precise impact is not known. Clean this up by converting qemu_rdma_post_send_control() to Error. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-43-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	446e559c43	migration/rdma: Convert qemu_rdma_write() to Error Just for consistency with qemu_rdma_write_one() and qemu_rdma_write_flush(), and for slightly simpler code. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-42-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	557c34ca60	migration/rdma: Convert qemu_rdma_write_one() to Error Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. qemu_rdma_write_flush() violates this principle: it calls error_report() via qemu_rdma_write_one(). I elected not to investigate how callers handle the error, i.e. precise impact is not known. Clean this up by converting qemu_rdma_write_one() to Error. Bonus: resolves a FIXME about problematic use of errno. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-41-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	5609547781	migration/rdma: Convert qemu_rdma_write_flush() to Error Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. qio_channel_rdma_writev() violates this principle: it calls error_report() via qemu_rdma_write_flush(). I elected not to investigate how callers handle the error, i.e. precise impact is not known. Clean this up by converting qemu_rdma_write_flush() to Error. Necessitates setting an error when qemu_rdma_write_one() failed. Since this error will go away later in this series, simply use "FIXME temporary error message" there. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-40-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	de1aa35f8d	migration/rdma: Convert qemu_rdma_reg_whole_ram_blocks() to Error Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. qemu_rdma_exchange_send() violates this principle: it calls error_report() via callback qemu_rdma_reg_whole_ram_blocks(). I elected not to investigate how callers handle the error, i.e. precise impact is not known. Clean this up by converting the callback to Error. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-39-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	3765ec1f43	migration/rdma: Convert qemu_rdma_exchange_get_response() to Error Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. qemu_rdma_exchange_send() and qemu_rdma_exchange_recv() violate this principle: they call error_report() via qemu_rdma_exchange_get_response(). I elected not to investigate how callers handle the error, i.e. precise impact is not known. Clean this up by converting qemu_rdma_exchange_get_response() to Error. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-38-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	c4c78dce0b	migration/rdma: Convert qemu_rdma_exchange_send() to Error Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. qio_channel_rdma_writev() violates this principle: it calls error_report() via qemu_rdma_exchange_send(). I elected not to investigate how callers handle the error, i.e. precise impact is not known. Clean this up by converting qemu_rdma_exchange_send() to Error. Necessitates setting an error when qemu_rdma_post_recv_control(), callback(), or qemu_rdma_exchange_get_response() failed. Since these errors will go away later in this series, simply use "FIXME temporary error message" there. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-37-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	96f363d839	migration/rdma: Convert qemu_rdma_exchange_recv() to Error Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. qio_channel_rdma_readv() violates this principle: it calls error_report() via qemu_rdma_exchange_recv(). I elected not to investigate how callers handle the error, i.e. precise impact is not known. Clean this up by converting qemu_rdma_exchange_recv() to Error. Necessitates setting an error when qemu_rdma_exchange_get_response() failed. Since this error will go away later in this series, simply use "FIXME temporary error message" there. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-36-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	dcf07e72a4	migration/rdma: Drop "@errp is clear" guards around error_setg() These guards are all redundant now. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-35-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	071d5ae4f3	migration/rdma: Fix error handling around rdma_getaddrinfo() qemu_rdma_resolve_host() and qemu_rdma_dest_init() iterate over addresses to find one that works, holding onto the first Error from qemu_rdma_broken_ipv6_kernel() for use when no address works. Issues: 1. If @errp was &error_abort or &error_fatal, we'd terminate instead of trying the next address. Can't actually happen, since no caller passes these arguments. 2. When @errp is a pointer to a variable containing NULL, and qemu_rdma_broken_ipv6_kernel() fails, the variable no longer contains NULL. Subsequent iterations pass it again, violating Error usage rules. Dangerous, as setting an error would then trip error_setv()'s assertion. Works only because qemu_rdma_broken_ipv6_kernel() and the code following the loops carefully avoids setting a second error. 3. If qemu_rdma_broken_ipv6_kernel() fails, and then a later iteration finds a working address, @errp still holds the first error from qemu_rdma_broken_ipv6_kernel(). If we then run into another error, we report the qemu_rdma_broken_ipv6_kernel() failure instead. 4. If we don't run into another error, we leak the Error object. Use a local error variable, and propagate to @errp. This fixes 3. and also cleans up 1 and partly 2. Free this error when we have a working address. This fixes 4. Pass the local error variable to qemu_rdma_broken_ipv6_kernel() only until it fails. Pass null on any later iterations. This cleans up the remainder of 2. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-34-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	8fd471bd77	migration/rdma: Retire macro ERROR() ERROR() has become "error_setg() unless an error has been set already". Hiding the conditional in the macro is in the way of further work. Replace the macro uses by their expansion, and delete the macro. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-33-armbru@redhat.com>	2023-10-11 11:17:04 +02:00
Markus Armbruster	1718f238d1	migration/rdma: Delete inappropriate error_report() in macro ERROR() Functions that use an Error **errp parameter to return errors should not also report them to the user, because reporting is the caller's job. When the caller does, the error is reported twice. When it doesn't (because it recovered from the error), there is no error to report, i.e. the report is bogus. Macro ERROR() violates this principle. Delete the error_report() there. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Tested-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-32-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	e518b0050d	migration/rdma: Plug a memory leak and improve a message When migration capability @rdma-pin-all is true, but the server cannot honor it, qemu_rdma_connect() calls macro ERROR(), then returns success. ERROR() sets an error. Since qemu_rdma_connect() returns success, its caller rdma_start_outgoing_migration() duly assumes @errp is still clear. The Error object leaks. ERROR() additionally reports the situation to the user as an error: RDMA ERROR: Server cannot support pinning all memory. Will register memory dynamically. Is this an error or not? It actually isn't; we disable @rdma-pin-all and carry on. "Correcting" the user's configuration decisions that way feels problematic, but that's a topic for another day. Replace ERROR() by warn_report(). This plugs the memory leak, and emits a clearer message to the user. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-31-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	4a10217962	migration/rdma: Check negative error values the same way everywhere When a function returns 0 on success, negative value on error, checking for non-zero suffices, but checking for negative is clearer. So do that. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-30-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	c0d77702d2	migration/rdma: Drop superfluous assignments to @ret Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-29-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	b86c94a49e	migration/rdma: Replace int error_state by bool errored All we do with the value of RDMAContext member @error_state is test whether it's zero. Change to bool and rename to @errored. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-28-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	ec48697453	migration/rdma: Dumb down remaining int error values to -1 This is just to make the error value more obvious. Callers don't mind. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-27-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	8c6513f750	migration/rdma: Return -1 instead of negative errno code Several functions return negative errno codes on failure. Callers check for specific codes exactly never. For some of the functions, callers couldn't check even if they wanted to, because the functions also return negative values that aren't errno codes, leaving readers confused on what the function actually returns. Clean up and simplify: return -1 instead of negative errno code. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-26-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	0724982221	migration/rdma: Fix rdma_getaddrinfo() error checking rdma_getaddrinfo() returns 0 on success. On error, it returns one of the EAI_ error codes like getaddrinfo() does, or -1 with errno set. This is broken by design: POSIX implicitly specifies the EAI_ error codes to be non-zero, no more. They could clash with -1. Nothing we can do about this design flaw. Both callers of rdma_getaddrinfo() only recognize negative values as error. Works only because systems elect to make the EAI_ error codes negative. Best not to rely on that: change the callers to treat any non-zero value as failure. Also change them to return -1 instead of the value received from getaddrinfo() on failure, to avoid positive error values. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-25-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	0110c6b86a	migration/rdma: Fix QEMUFileHooks method return values The QEMUFileHooks methods don't come with a written contract. Digging through the code calling them, we find: * save_page(): Negative values RAM_SAVE_CONTROL_DELAYED and RAM_SAVE_CONTROL_NOT_SUPP are special. Any other negative value is an unspecified error. qemu_rdma_save_page() returns -EIO or rdma->error_state on error. I believe the latter is always negative. Nothing stops either of them to clash with the special values, though. Feels unlikely, but fix it anyway to return only the special values and -1. * before_ram_iterate(), after_ram_iterate(): Negative value means error. qemu_rdma_registration_start() and qemu_rdma_registration_stop() comply as far as I can tell. Make them comply obviously, by returning -1 on error. * hook_ram_load: Negative value means error. rdma_load_hook() already returns -1 on error. Leave it alone. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-24-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	d63f4016b1	migration/rdma: Drop dead qemu_rdma_data_init() code for !@host_port qemu_rdma_data_init() neglects to set an Error when it fails because @host_port is null. Fortunately, no caller passes null, so this is merely a latent bug. Drop the flawed code handling null argument. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-23-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	f35c0d9b07	migration/rdma: Fix qemu_get_cm_event_timeout() to always set error qemu_get_cm_event_timeout() neglects to set an error when it fails because rdma_get_cm_event() fails. Harmless, as its caller qemu_rdma_connect() substitutes a generic error then. Fix it anyway. qemu_rdma_connect() also sets the generic error when its own call of rdma_get_cm_event() fails. Make the error handling more obvious: set a specific error right after rdma_get_cm_event() fails. Delete the generic error. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-22-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	142bd685ae	migration/rdma: Fix qemu_rdma_broken_ipv6_kernel() to set error qemu_rdma_resolve_host() and qemu_rdma_dest_init() try addresses until they find on that works. If none works, they return the first Error set by qemu_rdma_broken_ipv6_kernel(), or else return a generic one. qemu_rdma_broken_ipv6_kernel() neglects to set an Error when ibv_open_device() fails. If a later address fails differently, we use that Error instead, or else the generic one. Harmless enough, but needs fixing all the same. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-21-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	de3e05e8b9	migration/rdma: Replace dangerous macro CHECK_ERROR_STATE() Hiding return statements in macros is a bad idea. Use a function instead, and open code the return part. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-20-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	8e262e0b3d	migration/rdma: Fix io_writev(), io_readv() methods to obey contract QIOChannelClass methods qio_channel_rdma_readv() and qio_channel_rdma_writev() violate their method contract when rdma->error_state is non-zero: 1. They return whatever is in rdma->error_state then. Only -1 will be fine. -2 will be misinterpreted as "would block". Anything less than -2 isn't defined in the contract. A positive value would be misinterpreted as success, but I believe that's not actually possible. 2. They neglect to set an error then. If something up the call stack dereferences the error when failure is returned, it will crash. If it ignores the return value and checks the error instead, it will miss the error. Crap like this happens when return statements hide in macros, especially when their uses are far away from the definition. I elected not to investigate how callers are impacted. Expand the two bad macro uses, so we can set an error and return -1. The next commit will then get rid of the macro altogether. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-19-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	1b6e1da6e7	migration/rdma: Ditch useless numeric error codes in error messages Several error messages include numeric error codes returned by failed functions: * ibv_poll_cq() returns an unspecified negative value. Useless. * rdma_accept and rdma_get_cm_event() return -1. Useless. * qemu_rdma_poll() returns either -1 or an unspecified negative value. Useless. * qemu_rdma_block_for_wrid(), qemu_rdma_write_flush(), qemu_rdma_exchange_send(), qemu_rdma_exchange_recv(), qemu_rdma_write() return a negative value that may or may not be an errno value. While reporting human-readable errno information (which a number is not) can be useful, reporting an error code that may or may not be an errno value is useless. Drop these error codes from the error messages. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-18-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	0bc2604508	migration/rdma: Fix or document problematic uses of errno We use errno after calling Libibverbs functions that are not documented to set errno (manual page does not mention errno), or where the documentation is unclear ("returns [...] the value of errno on failure"). While this could be read as "sets errno and returns it", a glance at the source code[] kills that hope: static inline int ibv_post_send(struct ibv_qp qp, struct ibv_send_wr wr, struct ibv_send_wr bad_wr) { return qp->context->ops.post_send(qp, wr, bad_wr); } The callback can be static int mana_post_send(struct ibv_qp ibqp, struct ibv_send_wr wr, struct ibv_send_wr bad) { / This version of driver supports RAW QP only. * Posting WR is done directly in the application. / return EOPNOTSUPP; } Neither of them touches errno. One of these errno uses is easy to fix, so do that now. Several more will go away later in the series; add temporary FIXME commments. Three will remain; add TODO comments. TODO, not FIXME, because the bug might be in Libibverbs documentation. [] https://github.com/linux-rdma/rdma-core.git commit 55fa316b4b18f258d8ac1ceb4aa5a7a35b094dcf Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-17-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	89997ac318	migration/rdma: Use bool for two RDMAContext flags @error_reported and @received_error are flags. The latter is even assigned bool true. Change them from int to bool. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-16-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	6a3792d78d	migration/rdma: Make qemu_rdma_buffer_mergeable() return bool qemu_rdma_buffer_mergeable() is semantically a predicate. It returns int 0 or 1. Return bool instead, and fix the function name's spelling. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-15-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	87e6bdabf0	migration/rdma: Drop qemu_rdma_search_ram_block() error handling qemu_rdma_search_ram_block() can't fail. Return void, and drop the unreachable error handling. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-14-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	0610d7a1d8	migration/rdma: Drop rdma_add_block() error handling rdma_add_block() can't fail. Return void, and drop the unreachable error handling. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-13-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	b16defbbfe	migration/rdma: Eliminate error_propagate() When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-12-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	3c03f21cb5	migration/rdma: Put @errp parameter last include/qapi/error.h demands: * - Functions that use Error to report errors have an Error *errp parameter. It should be the last parameter, except for functions * taking variable arguments. qemu_rdma_connect() does not conform. Clean it up. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-11-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	9a6afb1170	migration/rdma: Fix qemu_rdma_accept() to return failure on errors qemu_rdma_accept() returns 0 in some cases even when it didn't complete its job due to errors. Impact is not obvious. I figure the caller will soon fail again with a misleading error message. Fix it to return -1 on any failure. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-10-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	36cc822d85	migration/rdma: Give qio_channel_rdma_source_funcs internal linkage Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-9-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	8ff58b05a3	migration/rdma: Clean up two more harmless signed vs. unsigned issues qemu_rdma_exchange_get_response() compares int parameter @expecting with uint32_t head->type. Actual arguments are non-negative enumeration constants, RDMAControlHeader uint32_t member type, or qemu_rdma_exchange_recv() int parameter expecting. Actual arguments for the latter are non-negative enumeration constants. Change both parameters to uint32_t. In qio_channel_rdma_readv(), loop control variable @i is ssize_t, and counts from 0 up to @niov, which is size_t. Change @i to size_t. While there, make qio_channel_rdma_readv() and qio_channel_rdma_writev() more consistent: change the former's @done to ssize_t, and delete the latter's useless initialization of @len. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-8-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	25352b371b	migration/rdma: Fix unwanted integer truncation qio_channel_rdma_readv() assigns the size_t value of qemu_rdma_fill() to an int variable before it adds it to @done / subtracts it from @want, both size_t. Truncation when qemu_rdma_fill() copies more than INT_MAX bytes. Seems vanishingly unlikely, but needs fixing all the same. Fixes: `6ddd2d76ca` (migration: convert RDMA to use QIOChannel interface) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-7-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	87a24ca3f2	migration/rdma: Consistently use uint64_t for work request IDs We use int instead of uint64_t in a few places. Change them to uint64_t. This cleans up a comparison of signed qemu_rdma_block_for_wrid() parameter @wrid_requested with unsigned @wr_id. Harmless, because the actual arguments are non-negative enumeration constants. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-6-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	b5631d5bda	migration/rdma: Drop fragile wr_id formatting wrid_desc[] uses 4001 pointers to map four integer values to strings. print_wrid() accesses wrid_desc[] out of bounds when passed a negative argument. It returns null for values 2..1999 and 2001..3999. qemu_rdma_poll() and qemu_rdma_block_for_wrid() print wrid_desc[wr_id] and passes print_wrid(wr_id) to tracepoints. Could conceivably crash trying to format a null string. I believe access out of bounds is not possible. Not worth cleaning up. Dumb down to show just numeric wr_id. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-5-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	1720a2a875	migration/rdma: Clean up rdma_delete_block()'s return type rdma_delete_block() always returns 0, which its only caller ignores. Return void instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-4-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	c07d19622c	migration/rdma: Clean up qemu_rdma_data_init()'s return type qemu_rdma_data_init() return type is void . It actually returns RDMAContext , and all its callers assign the value to an RDMAContext . Unclean. Return RDMAContext instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-3-armbru@redhat.com>	2023-10-11 11:17:03 +02:00
Markus Armbruster	b72eacf3c0	migration/rdma: Clean up qemu_rdma_poll()'s return type qemu_rdma_poll()'s return type is uint64_t, even though it returns 0, -1, or @ret, which is int. Its callers assign the return value to int variables, then check whether it's negative. Unclean. Return int instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230928132019.2544702-2-armbru@redhat.com>	2023-10-11 11:17:02 +02:00
Peter Xu	0e99bb8f54	migration: Allow RECOVER->PAUSED convertion for dest qemu There's a bug on dest that if a double fault triggered on dest qemu (a network issue during postcopy-recover), we won't set PAUSED correctly because we assumed we always came from ACTIVE. Fix that by always overwriting the state to PAUSE. We could also check for these two states, but maybe it's an overkill. We did the same on the src QEMU to unconditionally switch to PAUSE anyway. Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231004220240.167175-10-peterx@redhat.com>	2023-10-11 11:17:02 +02:00
Fabiano Rosas	4111a732e8	migration: Set migration status early in incoming side We are sending a migration event of MIGRATION_STATUS_SETUP at qemu_start_incoming_migration but never actually setting the state. This creates a window between qmp_migrate_incoming and process_incoming_migration_co where the migration status is still MIGRATION_STATUS_NONE. Calling query-migrate during this time will return an empty response even though the incoming migration command has already been issued. Commit `7cf1fe6d68` ("migration: Add migration events on target side") has added support to the 'events' capability to the incoming part of migration, but chose to send the SETUP event without setting the state. I'm assuming this was a mistake. This introduces a change in behavior, any QMP client waiting for the SETUP event will hang, unless it has previously enabled the 'events' capability. Having the capability enabled is sufficient to continue to receive the event. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230712190742.22294-5-farosas@suse.de>	2023-10-11 11:17:02 +02:00
Peter Xu	86dec715a7	migration/qmp: Fix crash on setting tls-authz with null QEMU will crash if anyone tries to set tls-authz (which is a type StrOrNull) with 'null' value. Fix it in the easy way by converting it to qstring just like the other two tls parameters. Cc: qemu-stable@nongnu.org # v4.0+ Fixes: `d2f1d29b95` ("migration: add support for a "tls-authz" migration parameter") Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230905162335.235619-2-peterx@redhat.com>	2023-10-11 11:17:02 +02:00
Andrei Gudkov	320a6ccc76	migration/dirtyrate: use QEMU_CLOCK_HOST to report start-time Currently query-dirty-rate uses QEMU_CLOCK_REALTIME as the source for start-time field. This translates to clock_gettime(CLOCK_MONOTONIC), i.e. number of seconds since host boot. This is not very useful. The only reasonable use case of start-time I can imagine is to check whether previously completed measurements are too old or not. But this makes sense only if start-time is reported as host wall-clock time. This patch replaces source of start-time from QEMU_CLOCK_REALTIME to QEMU_CLOCK_HOST. Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com> Reviewed-by: Hyman Huang <yong.huang@smartx.com> Message-Id: <399861531e3b24a1ecea2ba453fb2c3d129fb03a.1693905328.git.gudkov.andrei@huawei.com> Signed-off-by: Hyman Huang <yong.huang@smartx.com>	2023-10-10 08:04:12 +08:00
Andrei Gudkov	34a68001f1	migration/calc-dirty-rate: millisecond-granularity period This patch allows to measure dirty page rate for sub-second intervals of time. An optional argument is introduced -- calc-time-unit. For example: {"execute": "calc-dirty-rate", "arguments": {"calc-time": 500, "calc-time-unit": "millisecond"} } Millisecond granularity allows to make predictions whether migration will succeed or not. To do this, calculate dirty rate with calc-time set to max allowed downtime (e.g. 300ms), convert measured rate into volume of dirtied memory, and divide by network throughput. If the value is lower than max allowed downtime, then migration will converge. Measurement results for single thread randomly writing to a 1/4/24GiB memory region: +----------------+-----------------------------------------------+ \| calc-time \| dirty rate MiB/s \| \| (milliseconds) +----------------+---------------+--------------+ \| \| theoretical \| page-sampling \| dirty-bitmap \| \| \| (at 3M wr/sec) \| \| \| +----------------+----------------+---------------+--------------+ \| 1GiB \| +----------------+----------------+---------------+--------------+ \| 100 \| 6996 \| 7100 \| 3192 \| \| 200 \| 4606 \| 4660 \| 2655 \| \| 300 \| 3305 \| 3280 \| 2371 \| \| 400 \| 2534 \| 2525 \| 2154 \| \| 500 \| 2041 \| 2044 \| 1871 \| \| 750 \| 1365 \| 1341 \| 1358 \| \| 1000 \| 1024 \| 1052 \| 1025 \| \| 1500 \| 683 \| 678 \| 684 \| \| 2000 \| 512 \| 507 \| 513 \| +----------------+----------------+---------------+--------------+ \| 4GiB \| +----------------+----------------+---------------+--------------+ \| 100 \| 10232 \| 8880 \| 4070 \| \| 200 \| 8954 \| 8049 \| 3195 \| \| 300 \| 7889 \| 7193 \| 2881 \| \| 400 \| 6996 \| 6530 \| 2700 \| \| 500 \| 6245 \| 5772 \| 2312 \| \| 750 \| 4829 \| 4586 \| 2465 \| \| 1000 \| 3865 \| 3780 \| 2178 \| \| 1500 \| 2694 \| 2633 \| 2004 \| \| 2000 \| 2041 \| 2031 \| 1789 \| +----------------+----------------+---------------+--------------+ \| 24GiB \| +----------------+----------------+---------------+--------------+ \| 100 \| 11495 \| 8640 \| 5597 \| \| 200 \| 11226 \| 8616 \| 3527 \| \| 300 \| 10965 \| 8386 \| 2355 \| \| 400 \| 10713 \| 8370 \| 2179 \| \| 500 \| 10469 \| 8196 \| 2098 \| \| 750 \| 9890 \| 7885 \| 2556 \| \| 1000 \| 9354 \| 7506 \| 2084 \| \| 1500 \| 8397 \| 6944 \| 2075 \| \| 2000 \| 7574 \| 6402 \| 2062 \| +----------------+----------------+---------------+--------------+ Theoretical values are computed according to the following formula: size * (1 - (1-(4096/size))^(timewps)) / (time 2^20), where size is in bytes, time is in seconds, and wps is number of writes per second. Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com> Reviewed-by: Hyman Huang <yong.huang@smartx.com> Message-Id: <d802e6b8053eb60fbec1a784cf86f67d9528e0a8.1693895970.git.gudkov.andrei@huawei.com> Signed-off-by: Hyman Huang <yong.huang@smartx.com>	2023-10-10 08:03:50 +08:00
Peter Xu	579cedf430	migration: Unify and trace vmstate field_exists() checks For both save/load we actually share the logic on deciding whether a field should exist. Merge the checks into a helper and use it for both save and load. When doing so, add documentations and reformat the code to make it much easier to read. The real benefit here (besides code cleanups) is we add a trace-point for this; this is a known spot where we can easily break migration compatibilities between binaries, and this trace point will be critical for us to identify such issues. For example, this will be handy when debugging things like: https://gitlab.com/qemu-project/qemu/-/issues/932 Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230906204722.514474-1-peterx@redhat.com>	2023-10-04 13:19:47 +02:00
Steve Sistare	385f510df5	migration: file URI offset Allow an offset option to be specified as part of the file URI, in the form "file:filename,offset=offset", where offset accepts the common size suffixes, or the 0x prefix, but not both. Migration data is written to and read from the file starting at offset. If unspecified, it defaults to 0. This is needed by libvirt to store its own data at the head of the file. Suggested-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1694182931-61390-3-git-send-email-steven.sistare@oracle.com>	2023-10-04 13:18:08 +02:00
Steve Sistare	2a9e2e595f	migration: file URI Extend the migration URI to support file:<filename>. This can be used for any migration scenario that does not require a reverse path. It can be used as an alternative to 'exec:cat > file' in minimized containers that do not contain /bin/sh, and it is easier to use than the fd:<fdname> URI. It can be used in HMP commands, and as a qemu command-line parameter. For best performance, guest ram should be shared and x-ignore-shared should be true, so guest pages are not written to the file, in which case the guest may remain running. If ram is not so configured, then the user is advised to stop the guest first. Otherwise, a busy guest may re-dirty the same page, causing it to be appended to the file multiple times, and the file may grow unboundedly. That issue is being addressed in the "fixed-ram" patch series. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Tested-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1694182931-61390-2-git-send-email-steven.sistare@oracle.com>	2023-10-04 13:16:58 +02:00
Li Zhijian	2ada4b63f1	migration/rdma: zore out head.repeat to make the error more clear Previously, we got a confusion error that complains the RDMAControlHeader.repeat: qemu-system-x86_64: rdma: Too many requests in this message (3638950032).Bailing. Actually, it's caused by an unexpected RDMAControlHeader.type. After this patch, error will become: qemu-system-x86_64: Unknown control message QEMU FILE Reviewed-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230926100103.201564-2-lizhijian@fujitsu.com>	2023-10-04 10:54:41 +02:00
Tejus GK	848a050342	migration: Update error description outside migration.c A few code paths exist in the source code,where a migration is marked as failed via MIGRATION_STATUS_FAILED, but the failure happens outside of migration.c In such cases, an error_report() call is made, however the current MigrationState is never updated with the error description, and hence clients like libvirt never know the actual reason for the failure. This patch covers such cases outside of migration.c and updates the error description at the appropriate places. Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Tejus GK <tejus.gk@nutanix.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231003065538.244752-3-tejus.gk@nutanix.com>	2023-10-04 10:54:40 +02:00
Tejus GK	969298f9d7	migration/vmstate: Introduce vmstate_save_state_with_err Currently, a few code paths exist in the function vmstate_save_state_v, which ultimately leads to a migration failure. However, an update in the current MigrationState for the error description is never done. vmstate.c somehow doesn't seem to allow the use of migrate_set_error due to some dependencies for unit tests. Hence, this patch introduces a new function vmstate_save_state_with_err, which will eventually propagate the error message to savevm.c where a migrate_set_error call can be eventually done. Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Tejus GK <tejus.gk@nutanix.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231003065538.244752-2-tejus.gk@nutanix.com>	2023-10-04 10:54:40 +02:00
Stefan Hajnoczi	50d0bfd0ed	Migration Pull request (20231002) In this migration pull request: - Refactor repeated call of yank_unregister_instance (tejus) - More migraton-test changes Please, apply. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmUatX4ACgkQ9IfvGFhy 1yMlbQ/+Kp7m1Mr5LUM/8mvh9LZTVvWauBHch1pdvpCsJO+Grdtv6MtZL5UKT2ue xYksZvf/rT4bdt2H1lSsG1o2GOcIf4qyWICgYNDo8peaxm1IrvgAbimaWHWLeORX sBxKcBBuTac55vmEKzbPSbwGCGGTU/11UGXQ4ruGN3Hwbd2JZHAK6GxGIzANToZc JtwBr/31SxJ2YndNLaPMEnD3cHbRbD2UyODeTt1KI5LdTGgXHoB6PgCk2AMQP1Ko LlaPLsrEKC06h2CJ27BB36CNVEGMN2iFa3aKz1FC85Oj2ckatspAFw78t9guj6eM MYxn0ipSsjjWjMsc3zEDxi7JrA///5bp1e6e7WdLpOaMBPpV4xuvVvA6Aku2es7D fMPOMdftBp6rrXp8edBMTs1sOHdE1k8ZsyJ90m96ckjfLX39TPAiJRm4pWD2UuP5 Wjr+/IU+LEp/KCqimMj0kYMRz4rM3PP8hOakPZLiRR5ZG6sgbHZK44iPXB/Udz/g TCZ87siIpI8YHb3WCaO5CvbdjPrszg1j9v7RimtDeGLDR/hNokkQ1EEeszDTGpgt xst4S4wVmex2jYyi53woH4V1p8anP7iqa8elPehAaYPobp47pmBV53ZaSwibqzPN TmO7P9rfyQGCiXXZRvrAQJa+gmAkQlSEI7mSssV77pU+1gdEj9c= =hD/8 -----END PGP SIGNATURE----- Merge tag 'migration-20231002-pull-request' of https://gitlab.com/juan.quintela/qemu into staging Migration Pull request (20231002) In this migration pull request: - Refactor repeated call of yank_unregister_instance (tejus) - More migraton-test changes Please, apply. # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmUatX4ACgkQ9IfvGFhy # 1yMlbQ/+Kp7m1Mr5LUM/8mvh9LZTVvWauBHch1pdvpCsJO+Grdtv6MtZL5UKT2ue # xYksZvf/rT4bdt2H1lSsG1o2GOcIf4qyWICgYNDo8peaxm1IrvgAbimaWHWLeORX # sBxKcBBuTac55vmEKzbPSbwGCGGTU/11UGXQ4ruGN3Hwbd2JZHAK6GxGIzANToZc # JtwBr/31SxJ2YndNLaPMEnD3cHbRbD2UyODeTt1KI5LdTGgXHoB6PgCk2AMQP1Ko # LlaPLsrEKC06h2CJ27BB36CNVEGMN2iFa3aKz1FC85Oj2ckatspAFw78t9guj6eM # MYxn0ipSsjjWjMsc3zEDxi7JrA///5bp1e6e7WdLpOaMBPpV4xuvVvA6Aku2es7D # fMPOMdftBp6rrXp8edBMTs1sOHdE1k8ZsyJ90m96ckjfLX39TPAiJRm4pWD2UuP5 # Wjr+/IU+LEp/KCqimMj0kYMRz4rM3PP8hOakPZLiRR5ZG6sgbHZK44iPXB/Udz/g # TCZ87siIpI8YHb3WCaO5CvbdjPrszg1j9v7RimtDeGLDR/hNokkQ1EEeszDTGpgt # xst4S4wVmex2jYyi53woH4V1p8anP7iqa8elPehAaYPobp47pmBV53ZaSwibqzPN # TmO7P9rfyQGCiXXZRvrAQJa+gmAkQlSEI7mSssV77pU+1gdEj9c= # =hD/8 # -----END PGP SIGNATURE----- # gpg: Signature made Mon 02 Oct 2023 08:20:14 EDT # gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723 # gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full] # gpg: aka "Juan Quintela <quintela@trasno.org>" [full] # Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723 * tag 'migration-20231002-pull-request' of https://gitlab.com/juan.quintela/qemu: migration/rdma: Simplify the function that saves a page migration: Remove unused qemu_file_credit_transfer() migration/rdma: Don't use imaginary transfers migration/rdma: Remove QEMUFile parameter when not used migration/RDMA: It is accounting for zero/normal pages in two places migration: Don't abuse qemu_file transferred for RDMA migration: Use qemu_file_transferred_noflush() for block migration. migration: Refactor repeated call of yank_unregister_instance migration-test: simplify shmem_opts handling migration-test: dirtylimit checks for x86_64 arch before migration-test: Add bootfile_create/delete() functions migration-test: bootpath is the same for all tests and for all archs migration-test: Create kvm_opts Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-10-02 14:42:44 -04:00
Juan Quintela	9c53d369e5	migration/rdma: Simplify the function that saves a page When we sent a page through QEMUFile hooks (RDMA) there are three posiblities: - We are not using RDMA. return RAM_SAVE_CONTROL_DELAYED and control_save_page() returns false to let anything else to proceed. - There is one error but we are using RDMA. Then we return a negative value, control_save_page() needs to return true. - Everything goes well and RDMA start the sent of the page asynchronously. It returns RAM_SAVE_CONTROL_DELAYED and we need to return 1 for ram_save_page_legacy. Clear? I know, I know, the interface is as bad as it gets. I think that now it is a bit clearer, but this needs to be done some other way. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-16-quintela@redhat.com>	2023-09-29 18:13:53 +02:00
Juan Quintela	9f51fe9239	migration: Remove unused qemu_file_credit_transfer() After this change, nothing abuses QEMUFile to account for data transferrefd during migration. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-15-quintela@redhat.com>	2023-09-29 18:13:36 +02:00
Juan Quintela	2ebe5d4d5a	migration/rdma: Don't use imaginary transfers RDMA protocol is completely asynchronous, so in qemu_rdma_save_page() they "invent" that a byte has been transferred. And then they call qemu_file_credit_transfer() and ram_transferred_add() with that byte. Just remove that calls as nothing has been sent. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-14-quintela@redhat.com>	2023-09-29 18:11:21 +02:00
Juan Quintela	e33780351c	migration/rdma: Remove QEMUFile parameter when not used Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-13-quintela@redhat.com>	2023-09-29 18:11:21 +02:00
Juan Quintela	19df4f3226	migration/RDMA: It is accounting for zero/normal pages in two places Remove the one in control_save_page(). Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-12-quintela@redhat.com>	2023-09-29 18:11:21 +02:00
Juan Quintela	67c31c9c1a	migration: Don't abuse qemu_file transferred for RDMA Just create a variable for it, the same way that multifd does. This way it is safe to use for other thread, etc, etc. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-11-quintela@redhat.com>	2023-09-29 18:11:21 +02:00
Juan Quintela	f16ecfa9f9	migration: Use qemu_file_transferred_noflush() for block migration. We only care about the amount of bytes transferred. Flushing is done by the system somewhere else. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230530183941.7223-4-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-09-29 17:05:23 +02:00
Tejus GK	f4e1b61336	migration: Refactor repeated call of yank_unregister_instance In the function qmp_migrate(), yank_unregister_instance() gets called twice which isn't required. Hence, refactoring it so that it gets called during the local_error cleanup. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Tejus GK <tejus.gk@nutanix.com> Message-ID: <20230621130940.178659-3-tejus.gk@nutanix.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-09-29 17:05:23 +02:00
Markus Armbruster	7f3de3f02f	migration: Clean up local variable shadowing Local variables shadowing other local variables or parameters make the code needlessly hard to understand. Tracked down with -Wshadow=local. Clean up: delete inner declarations when they are actually redundant, else rename variables. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Message-ID: <20230921121312.1301864-3-armbru@redhat.com>	2023-09-29 08:13:57 +02:00
Markus Armbruster	bbde656263	migration/rdma: Fix save_page method to fail on polling error qemu_rdma_save_page() reports polling error with error_report(), then succeeds anyway. This is because the variable holding the polling status shadows the variable the function returns. The latter remains zero. Broken since day one, and duplicated more recently. Fixes: `2da776db48` (rdma: core logic) Fixes: `b390afd8c5` (migration/rdma: Fix out of order wrid) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Message-ID: <20230921121312.1301864-2-armbru@redhat.com>	2023-09-29 08:13:57 +02:00
Fabiano Rosas	36e9aab3c5	migration: Move return path cleanup to main migration thread Now that the return path thread is allowed to finish during a paused migration, we can move the cleanup of the QEMUFiles to the main migration thread. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-9-farosas@suse.de>	2023-09-27 13:58:02 -04:00
Fabiano Rosas	ef796ee93b	migration: Replace the return path retry logic Replace the return path retry logic with finishing and restarting the thread. This fixes a race when resuming the migration that leads to a segfault. Currently when doing postcopy we consider that an IO error on the return path file could be due to a network intermittency. We then keep the thread alive but have it do cleanup of the 'from_dst_file' and wait on the 'postcopy_pause_rp' semaphore. When the user issues a migrate resume, a new return path is opened and the thread is allowed to continue. There's a race condition in the above mechanism. It is possible for the new return path file to be setup before the cleanup code in the return path thread has had a chance to run, leading to the new file being closed and the pointer set to NULL. When the thread is released after the resume, it tries to dereference 'from_dst_file' and crashes: Thread 7 "return path" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fffd1dbf700 (LWP 9611)] 0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154 154 return f->last_error; (gdb) bt #0 0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154 #1 0x00005555560e4983 in qemu_file_get_error (f=0x0) at ../migration/qemu-file.c:206 #2 0x0000555555b9a1df in source_return_path_thread (opaque=0x555556e06000) at ../migration/migration.c:1876 #3 0x000055555602e14f in qemu_thread_start (args=0x55555782e780) at ../util/qemu-thread-posix.c:541 #4 0x00007ffff38d76ea in start_thread (arg=0x7fffd1dbf700) at pthread_create.c:477 #5 0x00007ffff35efa6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 Here's the race (important bit is open_return_path happening before migration_release_dst_files): migration \| qmp \| return path --------------------------+-----------------------------+--------------------------------- qmp_migrate_pause() shutdown(ms->to_dst_file) f->last_error = -EIO migrate_detect_error() postcopy_pause() set_state(PAUSED) wait(postcopy_pause_sem) qmp_migrate(resume) migrate_fd_connect() resume = state == PAUSED open_return_path <-- TOO SOON! set_state(RECOVER) post(postcopy_pause_sem) (incoming closes to_src_file) res = qemu_file_get_error(rp) migration_release_dst_files() ms->rp_state.from_dst_file = NULL post(postcopy_pause_rp_sem) postcopy_pause_return_path_thread() wait(postcopy_pause_rp_sem) rp = ms->rp_state.from_dst_file goto retry qemu_file_get_error(rp) SIGSEGV ------------------------------------------------------------------------------------------- We can keep the retry logic without having the thread alive and waiting. The only piece of data used by it is the 'from_dst_file' and it is only allowed to proceed after a migrate resume is issued and the semaphore released at migrate_fd_connect(). Move the retry logic to outside the thread by waiting for the thread to finish before pausing the migration. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-8-farosas@suse.de>	2023-09-27 13:58:02 -04:00
Fabiano Rosas	d50f5dc075	migration: Consolidate return path closing code We'll start calling the await_return_path_close_on_source() function from other parts of the code, so move all of the related checks and tracepoints into it. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-7-farosas@suse.de>	2023-09-27 13:58:02 -04:00
Fabiano Rosas	b3b101157d	migration: Remove redundant cleanup of postcopy_qemufile_src This file is owned by the return path thread which is already doing cleanup. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-6-farosas@suse.de>	2023-09-27 13:58:02 -04:00
Fabiano Rosas	7478fb0df9	migration: Fix possible race when shutting down to_dst_file It's not safe to call qemu_file_shutdown() on the to_dst_file without first checking for the file's presence under the lock. The cleanup of this file happens at postcopy_pause() and migrate_fd_cleanup() which are not necessarily running in the same thread as migrate_fd_cancel(). Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-5-farosas@suse.de>	2023-09-27 13:58:02 -04:00
Fabiano Rosas	639decf529	migration: Fix possible races when shutting down the return path We cannot call qemu_file_shutdown() on the return path file without taking the file lock. The return path thread could be running it's cleanup code and have just cleared the from_dst_file pointer. Checking ms->to_dst_file for errors could also race with migrate_fd_cleanup() which clears the to_dst_file pointer. Protect both accesses by taking the file lock. This was caught by inspection, it should be rare, but the next patches will start calling this code from other places, so let's do the correct thing. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-4-farosas@suse.de>	2023-09-27 13:58:02 -04:00
Fabiano Rosas	28a8347281	migration: Fix possible race when setting rp_state.error We don't need to set the rp_state.error right after a shutdown because qemu_file_shutdown() always sets the QEMUFile error, so the return path thread would have seen it and set the rp error itself. Setting the error outside of the thread is also racy because the thread could clear it after we set it. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-3-farosas@suse.de>	2023-09-27 13:58:02 -04:00
Peter Xu	cf02f29e1e	migration: Fix race that dest preempt thread close too early We hit intermit CI issue on failing at migration-test over the unit test preempt/plain: qemu-system-x86_64: Unable to read from socket: Connection reset by peer Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1 ** ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0) (test program exited with status code -6) Fabiano debugged into it and found that the preempt thread can quit even without receiving all the pages, which can cause guest not receiving all the pages and corrupt the guest memory. To make sure preempt thread finished receiving all the pages, we can rely on the page_requested_count being zero because preempt channel will only receive requested page faults. Note, not all the faulted pages are required to be sent via the preempt channel/thread; imagine the case when a requested page is just queued into the background main channel for migration, the src qemu will just still send it via the background channel. Here instead of spinning over reading the count, we add a condvar so the main thread can wait on it if that unusual case happened, without burning the cpu for no good reason, even if the duration is short; so even if we spin in this rare case is probably fine. It's just better to not do so. The condvar is only used when that special case is triggered. Some memory ordering trick is needed to guarantee it from happening (against the preempt thread status field), so the main thread will always get a kick when that triggers correctly. Closes: https://gitlab.com/qemu-project/qemu/-/issues/1886 Debugged-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230918172822.19052-2-farosas@suse.de>	2023-09-27 13:58:02 -04:00
Avihai Horon	08fc4cb517	migration: Add .save_prepare() handler to struct SaveVMHandlers Add a new .save_prepare() handler to struct SaveVMHandlers. This handler is called early, even before migration starts, and can be used by devices to perform early checks. Refactor migrate_init() to be able to return errors and call .save_prepare() from there. Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-09-11 08:34:06 +02:00
Avihai Horon	f543aa222d	migration: Move more initializations to migrate_init() Initialization of mig_stats, compression_counters and VFIO bytes transferred is hard-coded in migration code path and snapshot code path. Make the code cleaner by initializing them in migrate_init(). Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-09-11 08:34:06 +02:00
Avihai Horon	38c482b477	migration: Add migration prefix to functions in target.c The functions in target.c are not static, yet they don't have a proper migration prefix. Add such prefix. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-09-11 08:34:06 +02:00
Stefan Hajnoczi	06e0f098d6	io: follow coroutine AioContext in qio_channel_yield() The ongoing QEMU multi-queue block layer effort makes it possible for multiple threads to process I/O in parallel. The nbd block driver is not compatible with the multi-queue block layer yet because QIOChannel cannot be used easily from coroutines running in multiple threads. This series changes the QIOChannel API to make that possible. In the current API, calling qio_channel_attach_aio_context() sets the AioContext where qio_channel_yield() installs an fd handler prior to yielding: qio_channel_attach_aio_context(ioc, my_ctx); ... qio_channel_yield(ioc); // my_ctx is used here ... qio_channel_detach_aio_context(ioc); This API design has limitations: reading and writing must be done in the same AioContext and moving between AioContexts involves a cumbersome sequence of API calls that is not suitable for doing on a per-request basis. There is no fundamental reason why a QIOChannel needs to run within the same AioContext every time qio_channel_yield() is called. QIOChannel only uses the AioContext while inside qio_channel_yield(). The rest of the time, QIOChannel is independent of any AioContext. In the new API, qio_channel_yield() queries the AioContext from the current coroutine using qemu_coroutine_get_aio_context(). There is no need to explicitly attach/detach AioContexts anymore and qio_channel_attach_aio_context() and qio_channel_detach_aio_context() are gone. One coroutine can read from the QIOChannel while another coroutine writes from a different AioContext. This API change allows the nbd block driver to use QIOChannel from any thread. It's important to keep in mind that the block driver already synchronizes QIOChannel access and ensures that two coroutines never read simultaneously or write simultaneously. This patch updates all users of qio_channel_attach_aio_context() to the new API. Most conversions are simple, but vhost-user-server requires a new qemu_coroutine_yield() call to quiesce the vu_client_trip() coroutine when not attached to any AioContext. While the API is has become simpler, there is one wart: QIOChannel has a special case for the iohandler AioContext (used for handlers that must not run in nested event loops). I didn't find an elegant way preserve that behavior, so I added a new API called qio_channel_set_follow_coroutine_ctx(ioc, true\|false) for opting in to the new AioContext model. By default QIOChannel uses the iohandler AioHandler. Code that formerly called qio_channel_attach_aio_context() now calls qio_channel_set_follow_coroutine_ctx(ioc, true) once after the QIOChannel is created. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20230830224802.493686-5-stefanha@redhat.com> [eblake: also fix migration/rdma.c] Signed-off-by: Eric Blake <eblake@redhat.com>	2023-09-07 20:32:11 -05:00
Stefan Hajnoczi	156618d9ea	Pull request v3: - Drop UFS emulation due to CI failures - Add "aio-posix: zero out io_uring sqe user_data" -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmTvLIEACgkQnKSrs4Gr c8itVggAka3RMkEclbeW7JKJBOolm3oUuJTobV8oJfDNMQ8mmom9JkXVUctyPWQT EF+oeqZz1omjr0Dk7YEA2toCahTbXm/UsG7i6cZg8JXPl6e9sOne0j+p5zO5x/kc YlG43SBQJHdp/BfTm/gvwUh0W2on0wadaeEV82m3ZyIrZGTgNcrC1p1gj5dwF5VX SqW02mgALETECyJpo8O7y9vNUYGxEtETG9jzAhtrugGpYk4bPeXlm/rc+2zwV+ET YCnfUvhjhlu5vS4nkta6natg0If16ODjy35vWYm/aGlgveGTqQq9HWgTL71eNuxm Smn+hJHuvkyBclKjbGiiO1W1MuG1/g== =UvNK -----END PGP SIGNATURE----- Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging Pull request v3: - Drop UFS emulation due to CI failures - Add "aio-posix: zero out io_uring sqe user_data" # -----BEGIN PGP SIGNATURE----- # # iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmTvLIEACgkQnKSrs4Gr # c8itVggAka3RMkEclbeW7JKJBOolm3oUuJTobV8oJfDNMQ8mmom9JkXVUctyPWQT # EF+oeqZz1omjr0Dk7YEA2toCahTbXm/UsG7i6cZg8JXPl6e9sOne0j+p5zO5x/kc # YlG43SBQJHdp/BfTm/gvwUh0W2on0wadaeEV82m3ZyIrZGTgNcrC1p1gj5dwF5VX # SqW02mgALETECyJpo8O7y9vNUYGxEtETG9jzAhtrugGpYk4bPeXlm/rc+2zwV+ET # YCnfUvhjhlu5vS4nkta6natg0If16ODjy35vWYm/aGlgveGTqQq9HWgTL71eNuxm # Smn+hJHuvkyBclKjbGiiO1W1MuG1/g== # =UvNK # -----END PGP SIGNATURE----- # gpg: Signature made Wed 30 Aug 2023 07:48:17 EDT # gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [ultimate] # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [ultimate] # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * tag 'block-pull-request' of https://gitlab.com/stefanha/qemu: aio-posix: zero out io_uring sqe user_data tests/qemu-iotests/197: add testcase for CoR with subclusters block/io: align requests to subcluster_size block: add subcluster_size field to BlockDriverInfo block-migration: Ensure we don't crash during migration cleanup Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-08-30 09:20:27 -04:00
Fabiano Rosas	f187609f27	block-migration: Ensure we don't crash during migration cleanup We can fail the blk_insert_bs() at init_blk_migration(), leaving the BlkMigDevState without a dirty_bitmap and BlockDriverState. Account for the possibly missing elements when doing cleanup. Fix the following crashes: Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. 0x0000555555ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359 359 BlockDriverState *bs = bitmap->bs; #0 0x0000555555ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359 #1 0x0000555555bba331 in unset_dirty_tracking () at ../migration/block.c:371 #2 0x0000555555bbad98 in block_migration_cleanup_bmds () at ../migration/block.c:681 Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. 0x0000555555e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073 7073 QLIST_FOREACH_SAFE(blocker, &bs->op_blockers[op], list, next) { #0 0x0000555555e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073 #1 0x0000555555e9734a in bdrv_op_unblock_all (bs=0x0, reason=0x0) at ../block.c:7095 #2 0x0000555555bbae13 in block_migration_cleanup_bmds () at ../migration/block.c:690 Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-id: 20230731203338.27581-1-farosas@suse.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-08-30 07:39:10 -04:00
Andrei Gudkov	3eb82637fb	migration/dirtyrate: Fix precision losses and g_usleep overshoot Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com> Reviewed-by: Hyman Huang <yong.huang@smartx.com> Message-Id: <8ddb0d40d143f77aab8f602bd494e01e5fa01614.1691161009.git.gudkov.andrei@huawei.com> Signed-off-by: Hyman Huang <yong.huang@smartx.com>	2023-08-29 10:19:03 +08:00
Juan Quintela	697c4c86ab	migration/rdma: Split qemu_fopen_rdma() into input/output functions This is how everything else in QEMUFile is structured. As a bonus they are three less lines of code. Reviewed-by: Peter Xu <peterx@redhat.com> Message-ID: <20230530183941.7223-17-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Juan Quintela	ac6f48e15d	qemu-file: Make qemu_file_get_error_obj() static It was not used outside of qemu_file.c anyways. Reviewed-by: Peter Xu <peterx@redhat.com> Message-ID: <20230530183941.7223-21-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Juan Quintela	8c5ee0bfb8	qemu-file: Simplify qemu_file_shutdown() Reviewed-by: Peter Xu <peterx@redhat.com> Message-ID: <20230530183941.7223-20-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Juan Quintela	9ccf83f486	qemu_file: Make qemu_file_is_writable() static It is not used outside of qemu_file, and it shouldn't. Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230530183941.7223-19-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Juan Quintela	cf786549ce	migration: Change qemu_file_transferred to noflush We do a qemu_fclose() just after that, that also does a qemu_fflush(), so remove one qemu_fflush(). Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20230530183941.7223-3-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Juan Quintela	fc95c63b60	qemu-file: Rename qemu_file_transferred_ fast -> noflush Fast don't say much. Noflush indicates more clearly that it is like qemu_file_transferred but without the flush. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20230530183941.7223-2-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Wei Wang	82137e6c8c	migration: enforce multifd and postcopy preempt to be set before incoming qemu_start_incoming_migration needs to check the number of multifd channels or postcopy ram channels to configure the backlog parameter (i.e. the maximum length to which the queue of pending connections for sockfd may grow) of listen(). So enforce the usage of postcopy-preempt and multifd as below: - need to use "-incoming defer" on the destination; and - set_capability and set_parameter need to be done before migrate_incoming Otherwise, disable the use of the features and report error messages to remind users to adjust the commands. Signed-off-by: Wei Wang <wei.w.wang@intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-ID: <20230606101910.20456-2-wei.w.wang@intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Acked-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Tejus GK	908927db28	migration: Update error description whenever migration fails There are places in migration.c where the migration is marked failed with MIGRATION_STATUS_FAILED, but the failure reason is never updated. Hence libvirt doesn't know why the migration failed when it queries for it. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Tejus GK <tejus.gk@nutanix.com> Message-ID: <20230621130940.178659-2-tejus.gk@nutanix.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)	15699cf542	migration: Extend query-migrate to provide dirty page limit info Extend query-migrate to provide throttle time and estimated ring full time with dirty-limit capability enabled, through which we can observe if dirty limit take effect during live migration. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-ID: <168733225273.5845.15871826788879741674-8@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)	acac51ba24	migration: Implement dirty-limit convergence algo Implement dirty-limit convergence algo for live migration, which is kind of like auto-converge algo but using dirty-limit instead of cpu throttle to make migration convergent. Enable dirty page limit if dirty_rate_high_cnt greater than 2 when dirty-limit capability enabled, Disable dirty-limit if migration be canceled. Note that "set_vcpu_dirty_limit", "cancel_vcpu_dirty_limit" commands are not allowed during dirty-limit live migration. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-ID: <168733225273.5845.15871826788879741674-7@git.sr.ht> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)	310ad5625e	migration: Put the detection logic before auto-converge checking This commit is prepared for the implementation of dirty-limit convergence algo. The detection logic of throttling condition can apply to both auto-converge and dirty-limit algo, putting it's position before the checking logic for auto-converge feature. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-ID: <168733225273.5845.15871826788879741674-6@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)	bb9993c672	migration: Refactor auto-converge capability logic Check if block migration is running before throttling guest down in auto-converge way. Note that this modification is kind of like code clean, because block migration does not depend on auto-converge capability, so the order of checks can be adjusted. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <168618975839.6361.17407633874747688653-5@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)	dc62395557	migration: Introduce dirty-limit capability Introduce migration dirty-limit capability, which can be turned on before live migration and limit dirty page rate durty live migration. Introduce migrate_dirty_limit function to help check if dirty-limit capability enabled during live migration. Meanwhile, refactor vcpu_dirty_rate_stat_collect so that period can be configured instead of hardcoded. dirty-limit capability is kind of like auto-converge but using dirty limit instead of traditional cpu-throttle to throttle guest down. To enable this feature, turn on the dirty-limit capability before live migration using migrate-set-capabilities, and set the parameters "x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably to speed up convergence. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <168618975839.6361.17407633874747688653-4@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)	09f9ec9913	qapi/migration: Introduce vcpu-dirty-limit parameters Introduce "vcpu-dirty-limit" migration parameter used to limit dirty page rate during live migration. "vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are two dirty-limit-related migration parameters, which can be set before and during live migration by qmp migrate-set-parameters. This two parameters are used to help implement the dirty page rate limit algo of migration. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <168618975839.6361.17407633874747688653-3@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)	4d80785719	qapi/migration: Introduce x-vcpu-dirty-limit-period parameter Introduce "x-vcpu-dirty-limit-period" migration experimental parameter, which is in the range of 1 to 1000ms and used to make dirtyrate calculation period configurable. Currently with the "x-vcpu-dirty-limit-period" varies, the total time of live migration changes, test results show the optimal value of "x-vcpu-dirty-limit-period" ranges from 500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made stable once it proves best value can not be determined with developer's experiments. Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <168618975839.6361.17407633874747688653-2@git.sr.ht> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Fabiano Rosas	01ec0f3a92	migration/multifd: Protect accesses to migration_threads This doubly linked list is common for all the multifd and migration threads so we need to avoid concurrent access. Add a mutex to protect the data from concurrent access. This fixes a crash when removing two MigrationThread objects from the list at the same time during cleanup of multifd threads. Fixes: `671326201d` ("migration: Introduce interface query-migrationthreads") Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230607161306.31425-3-farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Fabiano Rosas	788fa68041	migration/multifd: Rename threadinfo.c functions We're about to add more functions to this file so make it use the same coding style as the rest of the code. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20230607161306.31425-2-farosas@suse.de> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-07-26 10:55:56 +02:00
Michael Tokarev	d8b71d96b3	migration: spelling fixes Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Fabiano Rosas <farosas@suse.de>	2023-07-25 17:13:20 +03:00
David Hildenbrand	f161c88a03	migration/ram: Expose ramblock_is_ignored() as migrate_ram_is_ignored() virtio-mem wants to know whether it should not mess with the RAMBlock content (e.g., discard RAM, preallocate memory) on incoming migration. So let's expose that function as migrate_ram_is_ignored() in migration/misc.h Message-ID: <20230706075612.67404-4-david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-07-12 09:25:37 +02:00
Laszlo Ersek	aaf26bd382	migration: unexport migrate_fd_error() The only migrate_fd_error() call sites are in "migration/migration.c", which is also where we define migrate_fd_error(). Make the function static, and remove its declaration from "migration/migration.h". Cc: Juan Quintela <quintela@redhat.com> (maintainer:Migration) Cc: Leonardo Bras <leobras@redhat.com> (reviewer:Migration) Cc: Peter Xu <peterx@redhat.com> (reviewer:Migration) Cc: qemu-trivial@nongnu.org Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404 Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2023-07-08 07:24:38 +03:00
Laszlo Ersek	8c69ae9eff	migration: factor out "resume_requested" in qmp_migrate() It cuts back on those awkward, duplicated !(has_resume && resume) expressions. Cc: Juan Quintela <quintela@redhat.com> (maintainer:Migration) Cc: Leonardo Bras <leobras@redhat.com> (reviewer:Migration) Cc: Peter Xu <peterx@redhat.com> (reviewer:Migration) Cc: qemu-trivial@nongnu.org Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404 Signed-off-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2023-07-08 07:24:38 +03:00
Avihai Horon	808642a2f6	vfio/migration: Reset bytes_transferred properly Currently, VFIO bytes_transferred is not reset properly: 1. bytes_transferred is not reset after a VM snapshot (so a migration following a snapshot will report incorrect value). 2. bytes_transferred is a single counter for all VFIO devices, however upon migration failure it is reset multiple times, by each VFIO device. Fix it by introducing a new function vfio_reset_bytes_transferred() and calling it during migration and snapshot start. Remove existing bytes_transferred reset in VFIO migration state notifier, which is not needed anymore. Fixes: `3710586caa` ("qapi: Add VFIO devices migration stats in Migration stats") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-06-30 06:02:51 +02:00
Avihai Horon	538ef4fe2f	migration: Enable switchover ack capability Now that switchover ack logic has been implemented, enable the capability. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Tested-by: YangHang Liu <yanghliu@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-06-30 06:02:51 +02:00
Avihai Horon	1b4adb10f8	migration: Implement switchover ack logic Implement switchover ack logic. This prevents the source from stopping the VM and completing the migration until an ACK is received from the destination that it's OK to do so. To achieve this, a new SaveVMHandlers handler switchover_ack_needed() and a new return path message MIG_RP_MSG_SWITCHOVER_ACK are added. The switchover_ack_needed() handler is called during migration setup in the destination to check if switchover ack is used by the migrated device. When switchover is approved by all migrated devices in the destination that support this capability, the MIG_RP_MSG_SWITCHOVER_ACK return path message is sent to the source to notify it that it's OK to do switchover. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Peter Xu <peterx@redhat.com> Tested-by: YangHang Liu <yanghliu@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-06-30 06:02:51 +02:00
Avihai Horon	6574232fff	migration: Add switchover ack capability Migration downtime estimation is calculated based on bandwidth and remaining migration data. This assumes that loading of migration data in the destination takes a negligible amount of time and that downtime depends only on network speed. While this may be true for RAM, it's not necessarily true for other migrated devices. For example, loading the data of a VFIO device in the destination might require from the device to allocate resources, prepare internal data structures and so on. These operations can take a significant amount of time which can increase migration downtime. This patch adds a new capability "switchover ack" that prevents the source from stopping the VM and completing the migration until an ACK is received from the destination that it's OK to do so. This can be used by migrated devices in various ways to reduce downtime. For example, a device can send initial precopy metadata to pre-allocate resources in the destination and use this capability to make sure that the pre-allocation is completed before the source VM is stopped, so it will have full effect. This new capability relies on the return path capability to communicate from the destination back to the source. The actual implementation of the capability will be added in the following patches. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Tested-by: YangHang Liu <yanghliu@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-06-30 06:02:51 +02:00
Philippe Mathieu-Daudé	de6cd7599b	meson: Replace softmmu_ss -> system_ss We use the user_ss[] array to hold the user emulation sources, and the softmmu_ss[] array to hold the system emulation ones. Hold the latter in the 'system_ss[]' array for parity with user emulation. Mechanical change doing: $ sed -i -e s/softmmu_ss/system_ss/g $(git grep -l softmmu_ss) Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230613133347.82210-10-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-06-20 10:01:30 +02:00
Philippe Mathieu-Daudé	c7b64948f8	meson: Replace CONFIG_SOFTMMU -> CONFIG_SYSTEM_ONLY Since we might have user emulation with softmmu, use the clearer 'CONFIG_SYSTEM_ONLY' key to check for system emulation. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230613133347.82210-9-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-06-20 10:01:30 +02:00
Steve Sistare	b0182e537e	exec/memory: Introduce RAM_NAMED_FILE flag migrate_ignore_shared() is an optimization that avoids copying memory that is visible and can be mapped on the target. However, a memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED flag set is not migrated when migrate_ignore_shared() is true. This is wrong, because the block has no named backing store, and its contents will be lost. To fix, ignore shared memory iff it is a named file. Define a new flag RAM_NAMED_FILE to distinguish this case. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <1686151116-253260-1-git-send-email-steven.sistare@oracle.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-06-13 11:28:58 +02:00
Philippe Mathieu-Daudé	7d5b0d6864	bulk: Remove pointless QOM casts Mechanical change running Coccinelle spatch with content generated from the qom-cast-macro-clean-cocci-gen.py added in the previous commit. Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230601093452.38972-3-philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-06-05 20:48:34 +02:00
Fiona Ebner	3a8b81f2e6	migration: stop tracking ram writes when cancelling background migration Currently, it is only done when the iteration finishes successfully. Not cleaning up the userfaultfd write protection can lead to symptoms/issues such as the process hanging in memmove or GDB not being able to attach. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-Id: <20230526115908.196171-1-f.ebner@proxmox.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-06-02 01:03:19 +02:00
Vladimir Sementsov-Ogievskiy	a4c6275aa1	migration: restore vmstate on migration failure 1. Otherwise failed migration just drops guest-panicked state, which is not good for management software. 2. We do keep different paused states like guest-panicked during migration with help of global_state state. 3. We do restore running state on source when migration is cancelled or failed. 4. "postmigrate" state is documented as "guest is paused following a successful 'migrate'", so originally it's only for successful path and we never documented current behavior. Let's restore paused states like guest-panicked in case of cancel or fail too. Allow same transitions like for inmigrate state. This commit changes the behavior that was introduced by commit `42da5550d6` "migration: set state to post-migrate on failure" and provides a bit different fix on related https://bugzilla.redhat.com/show_bug.cgi?id=1355683 Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230517123752.21615-6-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-06-02 01:03:19 +02:00
Vladimir Sementsov-Ogievskiy	f4584076fc	migration: switch from .vm_was_running to .vm_old_state No logic change here, only refactoring. That's a preparation for next commit where we finally restore the stopped vm state on migration failure or cancellation. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230517123752.21615-5-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-06-02 01:03:19 +02:00
Vladimir Sementsov-Ogievskiy	c33f1829f8	migration: never fail in global_state_store() Actually global_state_store() can never fail. Let's get rid of extra error paths. To make things clear, use new runstate_get() and use same approach for global_state_store() and global_state_store_running(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230517123752.21615-3-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-06-02 01:03:19 +02:00
Stefan Hajnoczi	60f782b6b7	aio: remove aio_disable_external() API All callers now pass is_external=false to aio_set_fd_handler() and aio_set_event_notifier(). The aio_disable_external() API that temporarily disables fd handlers that were registered is_external=true is therefore dead code. Remove aio_disable_external(), aio_enable_external(), and the is_external arguments to aio_set_fd_handler() and aio_set_event_notifier(). The entire test-fdmon-epoll test is removed because its sole purpose was testing aio_disable_external(). Parts of this patch were generated using the following coccinelle (https://coccinelle.lip6.fr/) semantic patch: @@ expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque; @@ - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque) + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque) @@ expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready; @@ - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready) + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready) Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230516190238.8401-21-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-05-30 17:37:26 +02:00
Richard Henderson	b5c0d842d6	migration: Build migration_files once The items in migration_files are built for libmigration and included info softmmu_ss from there; no need to also include them directly. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-05-23 16:51:18 -07:00
Richard Henderson	7ba7db9fa1	migration/xbzrle: Use i386 host/cpuinfo.h Perform the function selection once, and only if CONFIG_AVX512_OPT is enabled. Centralize the selection to xbzrle.c, instead of spreading the init across 3 files. Remove xbzrle-bench.c. The benefit of being able to benchmark the different implementations is less important than not peeking into the internals of the implementation. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-05-23 16:51:18 -07:00
Richard Henderson	1b48d0abdf	migration/xbzrle: Shuffle function order Place the CONFIG_AVX512BW_OPT block at the top, which will aid function selection in the next patch. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-05-23 16:51:18 -07:00
Richard Henderson	146f515110	Migration Pull request Hi Based on latest reviewed parts of migration: - Disable colo (vladimir) - Migration atomic counters (juan) Please apply. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmRmXJUACgkQ9IfvGFhy 1yNRAxAAjDYJELL34Qovt/WE9qKhYJEvIUGTl1IMWJ22YMFnqIFKRdka57dWoU3P 7EK1BHmokEEtzGT7Fe1ecERXsOwQIJDIkDTJ5g8Oc8Jt1iqY1AC8h5T+LghijCar mbZ6qWHaSjsg2lmek/xc9quymzFGGK36PSyB5WkaLRviKQn4RIkEDpUaWny7nDbA Q8zJJpBqNFqKfC5/DN0ePa3QQscXQJhey3nxqFd8hYp8RFNIV5UJVW5Lf6ombtK7 atgdWC4ckkfO2z3OsghKeo/UiMFWpPktgBVVMhDLmk+P/E6czc2gfzD6SCvrPKTj XowI8hro22HVmq9bEY8PtbjMOfpxrAxer+tM2KR/0O9l3UzUacFsi7KGqCJ1/trQ 1tSDjlgyczb8GOgLwwxj8XE+jPHPfVrzCNfDqrBKBNxz6nnZSdZUwhV5mG8FdVtm oVVV96BIrNXLl/lIxYIFD/Zyvl8/lrSWQdLkEHTzihYQeXaQfyvPVbV/dOLT4sii YUuGCuEhF+DW/qz43G1krwq5/bfxsiZoQzrMV/Odtf0wYQKkabA3KNBIda/vxBCR dsLQ7QtmOwKmCzjqw4LUov9vDNYOYr98o7ZqwJ3qeKL4QgFwtEZUFO3VW6UR8fnF arVXiTn9wVlkTpu4sT5hLm9400iadhX4Fppji7Ce0tUpLbWbghA= =3x32 -----END PGP SIGNATURE----- Merge tag 'migration-20230518-pull-request' of https://gitlab.com/juan.quintela/qemu into staging Migration Pull request Hi Based on latest reviewed parts of migration: - Disable colo (vladimir) - Migration atomic counters (juan) Please apply. # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmRmXJUACgkQ9IfvGFhy # 1yNRAxAAjDYJELL34Qovt/WE9qKhYJEvIUGTl1IMWJ22YMFnqIFKRdka57dWoU3P # 7EK1BHmokEEtzGT7Fe1ecERXsOwQIJDIkDTJ5g8Oc8Jt1iqY1AC8h5T+LghijCar # mbZ6qWHaSjsg2lmek/xc9quymzFGGK36PSyB5WkaLRviKQn4RIkEDpUaWny7nDbA # Q8zJJpBqNFqKfC5/DN0ePa3QQscXQJhey3nxqFd8hYp8RFNIV5UJVW5Lf6ombtK7 # atgdWC4ckkfO2z3OsghKeo/UiMFWpPktgBVVMhDLmk+P/E6czc2gfzD6SCvrPKTj # XowI8hro22HVmq9bEY8PtbjMOfpxrAxer+tM2KR/0O9l3UzUacFsi7KGqCJ1/trQ # 1tSDjlgyczb8GOgLwwxj8XE+jPHPfVrzCNfDqrBKBNxz6nnZSdZUwhV5mG8FdVtm # oVVV96BIrNXLl/lIxYIFD/Zyvl8/lrSWQdLkEHTzihYQeXaQfyvPVbV/dOLT4sii # YUuGCuEhF+DW/qz43G1krwq5/bfxsiZoQzrMV/Odtf0wYQKkabA3KNBIda/vxBCR # dsLQ7QtmOwKmCzjqw4LUov9vDNYOYr98o7ZqwJ3qeKL4QgFwtEZUFO3VW6UR8fnF # arVXiTn9wVlkTpu4sT5hLm9400iadhX4Fppji7Ce0tUpLbWbghA= # =3x32 # -----END PGP SIGNATURE----- # gpg: Signature made Thu 18 May 2023 10:12:53 AM PDT # gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723 # gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [undefined] # gpg: aka "Juan Quintela <quintela@trasno.org>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723 * tag 'migration-20230518-pull-request' of https://gitlab.com/juan.quintela/qemu: migration: Fix duplicated included in meson.build migration/multifd: Compute transferred bytes correctly migration: We don't need the field rate_limit_used anymore migration: Use migration_transferred_bytes() to calculate rate_limit migration: Add a trace for migration_transferred_bytes migration: Move migration_total_bytes() to migration-stats.c migration: Move rate_limit_max and rate_limit_used to migration_stats qemu-file: Account for rate_limit usage on qemu_fflush() migration: Don't use INT64_MAX for unlimited rate migration: process_incoming_migration_co(): move colo part to colo migration: split migration_incoming_co configure: add --disable-colo-proxy option Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-05-18 11:07:06 -07:00
Juan Quintela	ba9d2cbc01	migration: Fix duplicated included in meson.build This is the commint with the merge error (not in the submited patch). commit `52623f23b0` Author: Lukas Straub <lukasstraub2@web.de> Date: Thu Apr 20 11:48:35 2023 +0200 ram-compress.c: Make target independent Make ram-compress.c target independent. Fixes: `52623f23b0` Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20230509170217.83246-1-quintela@redhat.com>	2023-05-18 18:41:53 +02:00
Juan Quintela	cbec7eb768	migration/multifd: Compute transferred bytes correctly In the past, we had to put the in the main thread all the operations related with sizes due to qemu_file not beeing thread safe. As now all counters are atomic, we can update the counters just after the do the write. As an aditional bonus, we are able to use the right value for the compression methods. Right now we were assuming that there were no compression at all. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20230515195709.63843-17-quintela@redhat.com>	2023-05-18 18:41:46 +02:00
Juan Quintela	bd7ceaf6d5	migration: We don't need the field rate_limit_used anymore Since previous commit, we calculate how much data we have send with migration_transferred_bytes() so no need to maintain this counter and remember to always update it. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20230515195709.63843-10-quintela@redhat.com>	2023-05-18 18:40:51 +02:00
Juan Quintela	813cd61669	migration: Use migration_transferred_bytes() to calculate rate_limit Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20230515195709.63843-9-quintela@redhat.com>	2023-05-18 18:40:51 +02:00
Juan Quintela	3db9c05a90	migration: Add a trace for migration_transferred_bytes Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20230515195709.63843-8-quintela@redhat.com>	2023-05-18 18:40:51 +02:00
Juan Quintela	99319e2daf	migration: Move migration_total_bytes() to migration-stats.c Once there rename it to migration_transferred_bytes() and pass a QEMUFile instead of a migration object. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20230515195709.63843-7-quintela@redhat.com>	2023-05-18 18:40:51 +02:00
Juan Quintela	e1fde0e038	migration: Move rate_limit_max and rate_limit_used to migration_stats These way we can make them atomic and use this functions from any place. I also moved all functions that use rate_limit to migration-stats. Functions got renamed, they are not qemu_file anymore. qemu_file_rate_limit -> migration_rate_exceeded qemu_file_set_rate_limit -> migration_rate_set qemu_file_get_rate_limit -> migration_rate_get qemu_file_reset_rate_limit -> migration_rate_reset qemu_file_acct_rate_limit -> migration_rate_account. Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515195709.63843-6-quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-05-18 18:40:51 +02:00
Juan Quintela	de37f8b9c2	qemu-file: Account for rate_limit usage on qemu_fflush() That is the moment we know we have transferred something. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20230515195709.63843-5-quintela@redhat.com>	2023-05-18 18:40:51 +02:00
Juan Quintela	8e4b2a7059	migration: Don't use INT64_MAX for unlimited rate Define and use RATE_LIMIT_DISABLED instead. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Message-Id: <20230515195709.63843-2-quintela@redhat.com>	2023-05-18 18:40:51 +02:00
Vladimir Sementsov-Ogievskiy	d0a14a2ba0	migration: process_incoming_migration_co(): move colo part to colo Let's make better public interface for COLO: instead of colo_process_incoming_thread and not trivial logic around creating the thread let's make simple colo_incoming_co(), hiding implementation from generic code. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515130640.46035-4-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-05-18 18:40:51 +02:00
Vladimir Sementsov-Ogievskiy	dd42ce24a3	migration: split migration_incoming_co Originally, migration_incoming_co was introduced by `25d0c16f62` "migration: Switch to COLO process after finishing loadvm" to be able to enter from COLO code to one specific yield point, added by `25d0c16f62`. Later in `923709896b` "migration: poll the cm event for destination qemu" we reused this variable to wake the migration incoming coroutine from RDMA code. That was doubtful idea. Entering coroutines is a very fragile thing: you should be absolutely sure which yield point you are going to enter. I don't know how much is it safe to enter during qemu_loadvm_state() which I think what RDMA want to do. But for sure RDMA shouldn't enter the special COLO-related yield-point. As well, COLO code doesn't want to enter during qemu_loadvm_state(), it want to enter it's own specific yield-point. As well, when in `8e48ac9586` "COLO: Add block replication into colo process" we added bdrv_invalidate_cache_all() call (now it's called activate_all()) it became possible to enter the migration incoming coroutine during that call which is wrong too. So, let't make these things separate and disjoint: loadvm_co for RDMA, non-NULL during qemu_loadvm_state(), and colo_incoming_co for COLO, non-NULL only around specific yield. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230515130640.46035-3-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-05-18 18:40:51 +02:00
Gavin Shan	1e493be587	migration: Add last stage indicator to global dirty log The global dirty log synchronization is used when KVM and dirty ring are enabled. There is a particularity for ARM64 where the backup bitmap is used to track dirty pages in non-running-vcpu situations. It means the dirty ring works with the combination of ring buffer and backup bitmap. The dirty bits in the backup bitmap needs to collected in the last stage of live migration. In order to identify the last stage of live migration and pass it down, an extra parameter is added to the relevant functions and callbacks. This last stage indicator isn't used until the dirty ring is enabled in the subsequent patches. No functional change intended. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Tested-by: Zhenyu Zhang <zhenyzha@redhat.com> Message-Id: <20230509022122.20888-2-gshan@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-05-18 08:53:50 +02:00
Philippe Mathieu-Daudé	1e05888ab5	sysemu/kvm: Remove unused headers All types used are forward-declared in "qemu/typedefs.h". Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230405160454.97436-2-philmd@linaro.org> [thuth: Add hw/core/cpu.h to migration/dirtyrate.c to fix compile failure] Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-05-16 09:13:34 +02:00
Juan Quintela	6da835d42a	qemu-file: Remove total from qemu_file_total_transferred_*() Function is already quite long. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20230508130909.65420-7-quintela@redhat.com>	2023-05-15 13:46:14 +02:00
Juan Quintela	f87e4d6d43	qemu-file: Make rate_limit_used an uint64_t Change all the functions that use it. It was already passed as uint64_t. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20230508130909.65420-6-quintela@redhat.com>	2023-05-15 13:45:33 +02:00
Juan Quintela	bffc0441d5	qemu-file: make qemu_file_[sg]et_rate_limit() use an uint64_t It is really size_t. Everything else uses uint64_t, so move this to uint64_t as well. A size can't be negative anyways. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20230508130909.65420-5-quintela@redhat.com>	2023-05-15 13:44:38 +02:00
Juan Quintela	9d3ebbe217	migration: We set the rate_limit by a second That the implementation does the check every 100 milliseconds is an implementation detail that shouldn't be seen on the interfaz. Notice that all callers of qemu_file_set_rate_limit() used the division or pass 0, so this change is a NOP. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20230508130909.65420-4-quintela@redhat.com>	2023-05-15 13:44:07 +02:00
Juan Quintela	52d01d4a5d	migration: A rate limit value of 0 is valid And it is the best way to not have rate_limit. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20230508130909.65420-2-quintela@redhat.com>	2023-05-15 13:42:07 +02:00
Juan Quintela	dc2836c380	migration: Make dirtyrate.c target independent After the previous two patches, there is nothing else that is target specific. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230511141208.17779-6-quintela@redhat.com>	2023-05-15 10:33:05 +02:00
Juan Quintela	148b1ad83c	migration: Teach dirtyrate about qemu_target_page_bits() Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230511141208.17779-5-quintela@redhat.com>	2023-05-15 10:33:05 +02:00
Juan Quintela	edd83a70dc	migration: Teach dirtyrate about qemu_target_page_size() Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230511141208.17779-4-quintela@redhat.com>	2023-05-15 10:33:04 +02:00
Juan Quintela	beeda9b7cd	Use new created qemu_target_pages_to_MiB() Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230511141208.17779-3-quintela@redhat.com>	2023-05-15 10:33:04 +02:00
Andrei Gudkov	00a3f9c60a	migration/calc-dirty-rate: replaced CRC32 with xxHash This significantly reduces overhead of dirty page rate calculation in sampling mode. Tested using 32GiB VM on E5-2690 CPU. With CRC32: total_pages=8388608 sampled_pages=16384 millis=71 With xxHash: total_pages=8388608 sampled_pages=16384 millis=14 Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com> Message-Id: <cd115a89fc81d5f2eeb4ea7d57a98b84f794f340.1682598010.git.gudkov.andrei@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-05-15 10:33:03 +02:00
Jamie Iles	370ed60029	cpu: expose qemu_cpu_list_lock for lock-guard use Expose qemu_cpu_list_lock globally so that we can use WITH_QEMU_LOCK_GUARD and QEMU_LOCK_GUARD to simplify a few code paths now and in future. Signed-off-by: Jamie Iles <quic_jiles@quicinc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230427020925.51003-2-quic_jiles@quicinc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-05-11 09:53:41 +01:00
Vladimir Sementsov-Ogievskiy	121ccedc2b	migration: block incoming colo when capability is disabled We generally require same set of capabilities on source and target. Let's require x-colo capability to use COLO on target. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Message-Id: <20230428194928.1426370-11-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-05-10 18:48:12 +02:00
Vladimir Sementsov-Ogievskiy	d70178a88f	migration: disallow change capabilities in COLO state COLO is not listed as running state in migrate_is_running(), so, it's theoretically possible to disable colo capability in COLO state and the unexpected error in migration_iteration_finish() is reachable. Let's disallow that in qmp_migrate_set_capabilities. Than the error becomes absolutely unreachable: we can get into COLO state only with enabled capability and can't disable it while we are in COLO state. So substitute the error by simple assertion. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20230428194928.1426370-10-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-05-10 18:48:12 +02:00
Vladimir Sementsov-Ogievskiy	ecbfec6d77	migration: process_incoming_migration_co: simplify code flow around ret Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Message-Id: <20230428194928.1426370-7-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-05-10 18:48:11 +02:00
Vladimir Sementsov-Ogievskiy	1d4cfcd409	migration: drop colo_incoming_thread from MigrationIncomingState have_colo_incoming_thread variable is unused. colo_incoming_thread can be local. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Message-Id: <20230428194928.1426370-6-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-05-10 18:48:11 +02:00
Vladimir Sementsov-Ogievskiy	51e47cf860	build: move COLO under CONFIG_REPLICATION We don't allow to use x-colo capability when replication is not configured. So, no reason to build COLO when replication is disabled, it's unusable in this case. Note also that the check in migrate_caps_check() is not the only restriction: some functions in migration/colo.c will just abort if called with not defined CONFIG_REPLICATION, for example: migration_iteration_finish() case MIGRATION_STATUS_COLO: migrate_start_colo_process() colo_process_checkpoint() abort() It could probably make sense to have possibility to enable COLO without REPLICATION, but this requires deeper audit of colo & replication code, which may be done later if needed. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Acked-by: Dr. David Alan Gilbert <dave@treblig.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20230428194928.1426370-4-vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-05-10 18:48:11 +02:00

... 2 3 4 5 6 ...

2173 Commits