mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Wei Yang	6a88eb2b08	migration: use migration_in_postcopy() to check POSTCOPY_ACTIVE Use common helper function to check the state. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190719071129.11880-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	52aec70923	migration/postcopy: start_postcopy could be true only when migrate_postcopy() return true There is only one place that sets start_postcopy to true, qmp_migrate_start_postcopy(), which makes sure start_postcopy can be set to true only when migrate_postcopy() returns true. So start_postcopy being true implies that migrate_postcopy() is true. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190718083747.5859-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	305b6f8431	migration/postcopy: PostcopyState is already set in loadvm_postcopy_handle_advise() PostcopyState is already set to ADVISE at the beginning of loadvm_postcopy_handle_advise(). Remove the redundant set. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190711080816.6405-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	e326767b45	migration/savevm: move non SaveStateEntry condition check out of iteration in_postcopy and iterable_only are not SaveStateEntry specific, it would be more proper to check them out of iteration. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190709140924.13291-4-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	622a80c955	migration/savevm: split qemu_savevm_state_complete_precopy() into two parts This is a preparation patch for further cleanup. No functional change, just wrap two major part of qemu_savevm_state_complete_precopy() into function. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190709140924.13291-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	4e455d51ef	migration/savevm: flush file for iterable_only case It would be proper to flush file even for iterable_only case. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190709140924.13291-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	8996604fe6	migration/postcopy: do_fixup is true when host_offset is non-zero This means it is not necessary to spare an extra variable to hold this condition. Using host_offset directly is fine. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190710050814.31344-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	e927a03317	migration/postcopy: reduce one operation to calculate fixup_start_addr Use the same way for run_end to calculate run_start, which saves one operation. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190710050814.31344-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	a162b572e9	migration/postcopy: discard_length must not be 0 Since we break the loop when there is no more page to discard, we are sure the following process would find some page to discard. It is not necessary to check it again. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190627020822.15485-4-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	33a5cb6202	migration/postcopy: break the loop when there is no more page to discard When one is equal to or bigger than end, it means there is no page to discard. Just break the loop in this case instead of processing it. No functional change, just a small refactor. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190627020822.15485-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	0abfff9ea7	migration/postcopy: the valid condition is one less then end If one equals end, it means we have gone through the whole bitmap. Use a stricter check to skip an unnecessary condition. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190627020822.15485-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	640dfb14db	migration: consolidate time info into populate_time_info Consolidate time information fill up into its function for better readability. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190716005411.4156-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Yury Kotov	3d661c8ab1	migration: Add error_desc for file channel errors Currently, there is no information about the error if an outgoing migration fails because of file channel errors. Example (QMP session): -> { "execute": "migrate", "arguments": { "uri": "exec:head -c 1" }} <- { "return": {} } ... -> { "execute": "query-migrate" } <- { "return": { "status": "failed" }} // There is no error description And even in QEMU's output there is nothing. This patch 1) Adds errp to most of the QEMUFileOps 2) Adds qemu_file_get_error_obj/qemu_file_set_error_obj 3) And finally uses qemu_file_get_error_obj in migration.c And now, the status for the mentioned failure will be: -> { "execute": "query-migrate" } <- { "return": { "status": "failed", "error-desc": "Unable to write to command: Broken pipe" }} Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190422103420.15686-1-yury-kotov@yandex-team.ru> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Ivan Ren	f193bc0c53	migration: fix migrate_cancel multifd migration leads destination hung forever When migrate_cancel a multifd migration, if run sequence like this: [source] [destination] multifd_send_sync_main[finish] multifd_recv_thread wait &p->sem_sync shutdown to_dst_file detect error from_src_file send RAM_SAVE_FLAG_EOS[fail] [no chance to run multifd_recv_sync_main] multifd_load_cleanup join multifd receive thread forever will lead destination qemu hung at following stack: pthread_join qemu_thread_join multifd_load_cleanup process_incoming_migration_co coroutine_trampoline Signed-off-by: Ivan Ren <ivanren@tencent.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <1561468699-9819-4-git-send-email-ivanren@tencent.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-24 14:47:21 +02:00
Juan Quintela	3c3ca25d1f	migration: Make explicit that we are quitting multifd We add a bool to indicate that. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-24 14:47:12 +02:00
Ivan Ren	a3ec6b7d23	migration: fix migrate_cancel leads live_migration thread hung forever When we 'migrate_cancel' a multifd migration, the live_migration thread may hang forever at some points, because multifd_send_thread has already exited on a socket error: 1. multifd_send_pages may hang at qemu_sem_wait(&multifd_send_state->channels_ready) 2. multifd_send_sync_main may hang at qemu_sem_wait(&multifd_send_state->sem_sync) Signed-off-by: Ivan Ren <ivanren@tencent.com> Message-Id: <1561468699-9819-3-git-send-email-ivanren@tencent.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> --- Remove spurious not needed bits	2019-07-24 14:47:02 +02:00
Ivan Ren	713f762a31	migration: fix migrate_cancel leads live_migration thread endless loop When we 'migrate_cancel' a multifd migration, live_migration thread may go into endless loop in multifd_send_pages functions. Reproduce steps: (qemu) migrate_set_capability multifd on (qemu) migrate -d url (qemu) [wait a while] (qemu) migrate_cancel Then may get live_migration 100% cpu usage in following stack: pthread_mutex_lock qemu_mutex_lock_impl multifd_send_pages multifd_queue_page ram_save_multifd_page ram_save_target_page ram_save_host_page ram_find_and_save_block ram_find_and_save_block ram_save_iterate qemu_savevm_state_iterate migration_iteration_run migration_thread qemu_thread_start start_thread clone Signed-off-by: Ivan Ren <ivanren@tencent.com> Message-Id: <1561468699-9819-2-git-send-email-ivanren@tencent.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-24 14:46:51 +02:00
Ivan Ren	40c4d4a835	migration: always initial RAMBlock.bmap to 1 for new migration Reproduce the problem: migrate, migrate_cancel, migrate. An error happens during memory migration. The reason is as follows: 1. qemu start, ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] all set to 1 by a series of cpu_physical_memory_set_dirty_range 2. migration start:ram_init_bitmaps - memory_global_dirty_log_start: begin log dirty - memory_global_dirty_log_sync: sync dirty bitmap to ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] - migration_bitmap_sync_range: sync ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] to RAMBlock.bmap and ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] is set to zero 3. migration data... 4. migrate_cancel, will stop log dirty 5. migration start:ram_init_bitmaps - memory_global_dirty_log_start: begin log dirty - memory_global_dirty_log_sync: sync dirty bitmap to ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] - migration_bitmap_sync_range: sync ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] to RAMBlock.bmap and ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] is set to zero Here RAMBlock.bmap only has the newly logged dirty pages and doesn't contain all guest pages. Signed-off-by: Ivan Ren <ivanren@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <1563115879-2715-1-git-send-email-ivanren@tencent.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:47:47 +02:00
Wei Yang	40277ca807	migration/postcopy: remove redundant cpu_synchronize_all_post_init cpu_synchronize_all_post_init() is called twice in loadvm_postcopy_handle_run_bh(), so remove one redundant call. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190715080751.24304-1-richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:45:59 +02:00
Wei Yang	89dab31b27	migration/postcopy: fix document of postcopy_send_discard_bm_ram() Commit `6b6712efcc` ('ram: Split dirty bitmap by RAMBlock') changed the parameters of postcopy_send_discard_bm_ram() but left the documentation untouched. This patch corrects the documentation and fixes two typos by hand. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190715020549.15018-1-richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:45:22 +02:00
Peng Tao	b17fbbe55c	migration: allow private destination ram with x-ignore-shared By removing the share ram check, qemu is able to migrate to private destination ram when x-ignore-shared capability is on. Then we can create multiple destination VMs based on the same source VM. This changes the x-ignore-shared migration capability to work similar to Lai's original bypass-shared-memory work(https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00003.html) which enables kata containers (https://katacontainers.io) to implement the VM templating feature. An example usage in kata containers(https://katacontainers.io): 1. Start the source VM: qemu-system-x86 -m 2G \ -object memory-backend-file,id=mem0,size=2G,share=on,mem-path=/tmpfs/template-memory \ -numa node,memdev=mem0 2. Stop the template VM, set migration x-ignore-shared capability, migrate "exec:cat>/tmpfs/state", quit it 3. Start target VM: qemu-system-x86 -m 2G \ -object memory-backend-file,id=mem0,size=2G,share=off,mem-path=/tmpfs/template-memory \ -numa node,memdev=mem0 \ -incoming defer 4. connect to target VM qmp, set migration x-ignore-shared capability, migrate_incoming "exec:cat /tmpfs/state" 5. create more target VMs repeating 3 and 4 Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Yury Kotov <yury-kotov@yandex-team.ru> Cc: Jiangshan Lai <laijs@hyper.sh> Cc: Xu Wang <xu@hyper.sh> Signed-off-by: Peng Tao <tao.peng@linux.alibaba.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1560494113-1141-1-git-send-email-tao.peng@linux.alibaba.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:03 +02:00
Peter Xu	002cad6b16	migration: Split log_clear() into smaller chunks Currently we are doing log_clear() right after log_sync() which mostly keeps the old behavior when log_clear() was still part of log_sync(). This patch tries to further optimize the migration log_clear() code path to split huge log_clear()s into smaller chunks. We do this by splitting the whole guest memory region into memory chunks, whose size is decided by MigrationState.clear_bitmap_shift (an example will be given below). With that, we don't do the dirty bitmap clear operation on the remote node (e.g., KVM) when we fetch the dirty bitmap, instead we explicitly clear the dirty bitmap for a memory chunk the first time we send a page in that chunk. Here comes an example. Assuming the guest has 64G memory, then before this patch the KVM ioctl KVM_CLEAR_DIRTY_LOG will be a single one covering 64G memory. After the patch, assuming the clear bitmap shift is 18, the memory chunk size on x86_64 will be (1UL << 18) * 4K = 1GB. Then instead of sending a big 64G ioctl, we'll send 64 small ioctls, each covering 1G of the guest memory. Each of the 64 small ioctls is only sent if any page in that chunk is about to be sent right away. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190603065056.25211-12-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:03 +02:00
Peter Xu	267691b65c	migration: No need to take rcu during sync_dirty_bitmap cpu_physical_memory_sync_dirty_bitmap() has one RAMBlock* as parameter, which means that it must be with RCU read lock held already. Taking it again inside seems redundant. Removing it. Instead comment on the functions about the RCU read lock. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20190603065056.25211-2-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:02 +02:00
Wei Yang	422314e751	migration/ram.c: reset complete_round when we gets a queued page In case we get a queued page, the order of blocks is interrupted. We may not rely on the complete_round flag to say we have already searched all the blocks on the list. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190605010828.6969-1-richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:02 +02:00
Wei Yang	77568ea7f8	migration/multifd: sync packet_num after all thread are done Notification from recv threads is not ordered, which means we may be notified by one MultiFDRecvParams but adjust packet_num for another. Move the adjustment to after we are sure all recv threads are synced. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20190604023540.26532-1-richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:02 +02:00
Wei Yang	ca35380390	migration/xbzrle: update cache and current_data in one place When we are not in the last_stage, we need to update the cache if the page is not the same. Currently this procedure is scattered in two places and mixed with the encoding status check. This patch extracts this general step to make the code a little easier to read. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190610004159.20966-1-richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:02 +02:00
Wei Yang	b6526c4b21	migration/multifd: call multifd_send_sync_main when sending RAM_SAVE_FLAG_EOS On receiving RAM_SAVE_FLAG_EOS, multifd_recv_sync_main() is called to synchronize receive threads. The current synchronization mechanism is to wait for each channel's sem_sync semaphore. This semaphore is triggered by a packet with the MULTIFD_FLAG_SYNC flag. However, in the current implementation, we don't call multifd_send_sync_main() to send such a packet when blk_mig_bulk_active() is true. As a result, the receive threads won't notify multifd_recv_sync_main() via sem_sync, and multifd_recv_sync_main() will wait there forever. [Note]: the normal migration test works, but the blk_mig_bulk_active() case was not tested, since it is not clear how to produce this situation. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190612014337.11255-1-richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:02 +02:00
Juan Quintela	8ebad0f7a7	migration: fix multifd_recv event typo It uses num in multifd_send(). Make it coherent. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Wei Yang <richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:01 +02:00
Like Xu	5cc8767d05	general: Replace global smp variables with smp machine properties Basically, the context could get the MachineState reference via call chains or unrecommended qdev_get_machine() in !CONFIG_USER_ONLY mode. A local variable of the same name would be introduced in the declaration phase out of less effort OR replace it on the spot if it's only used once in the context. No semantic changes. Signed-off-by: Like Xu <like.xu@linux.intel.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190518205428.90532-4-like.xu@linux.intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2019-07-05 17:07:36 -03:00
Alex Bennée	1f4abd81f7	migration: move port_attr inside CONFIG_LINUX Otherwise the FreeBSD compiler complains about an unused variable. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>	2019-07-04 19:23:07 +01:00
Zhang Chen	0e8818f023	migration/colo.c: Add missed filter notify for Xen COLO. We need to notify net filter to do checkpoint for Xen COLO, like KVM side. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2019-07-02 10:21:07 +08:00
Markus Armbruster	a8d2532645	Include qemu-common.h exactly where needed No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]	2019-06-12 13:20:20 +02:00
Markus Armbruster	0b8fa32f55	Include qemu/module.h where needed, drop it from qemu-common.h Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-4-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c; ui/cocoa.m fixed up]	2019-06-12 13:18:33 +02:00
Peter Maydell	0d74f3b427	Trivial fixes 06/06/2019 -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAlz4844SHGxhdXJlbnRA dml2aWVyLmV1AAoJEPMMOL0/L748FtwQALCoNnMrEY4mnmHy0dnEQPRFcPMKa9pp 3lqpmxLHAkSsWFKmmLKPteZhUroBmzXPa91984hhQiglcMMMsIPy+A+x1QBj7Yt2 KeEKpIdSS6Qi4T72zVOtO4MR1pCeKUYHY8ICn/rqAkpkA/lt5DuX2xJSepgrSdAI /JgpawJ4Rz95x5rCLuy/t5egtKVYVhauv4EbQ9PeaFhSlwoKNYbc6qAZSvs8pr9n H4W8DgtI35wPj4zE3i9bbmnUUxCUMj6MjkIm/jTB5qewY/I+llb27CN2Uq1yvRKW ANGbGW3rVwQe8p6kbVcM7CDbawm4J0c59w/4mUTa3BRRuAj4KtHTeghXALHLn/gv aO90oZKGd2xGxpSMAapzgebNezUQxFFoRWhyI4o8N+SWEpoRbHkxDwrk2WlKXsCR xRYOensU17NOKMJ32AbUReC2/m7D71EH3723aVzd2O5nuIHlsEG2CYjlzjXFB4X8 wPbaigcqpDEMwLTt3kYy4TrghFdcSaAYepmqXJ9D9UONMOrnRhhR9GkvEmCB4Eus BJanLE0xp59KTVDZ5c/v6+44P/RQ04aD2oFh0bMKlNb4+cfbQlq2odoiCPcUKimh XCCbYbwJFRRkgGTh9pFMgMqH9zX8HTgG/2Zp2VGFnYXnJzz/AuvgH9k5Pc4P3C/A lOCSBWpxR6bk =wmCY -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/vivier2/tags/trivial-branch-pull-request' into staging Trivial fixes 06/06/2019 # gpg: Signature made Thu 06 Jun 2019 12:05:50 BST # gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C # gpg: issuer "laurent@vivier.eu" # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full] # gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full] # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full] # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * remotes/vivier2/tags/trivial-branch-pull-request: hw/watchdog/wdt_i6300esb: Use DEVICE() macro to access DeviceState.qdev hw/scsi: Use the QOM BUS() macro to access BusState.qbus hw/sd: Use the QOM BUS() macro to access BusState.qbus hw/audio/ac97: Use the QOM DEVICE() macro to access DeviceState.qdev hw/vfio/pci: Use the QOM DEVICE() macro to access DeviceState.qdev hw/usb-storage: Use the QOM DEVICE() macro to access DeviceState.qdev hw/isa: Use the QOM DEVICE() macro to access DeviceState.qdev hw/s390x/event-facility: Use the QOM BUS() macro to access BusState.qbus hw/pci-bridge: Use the QOM BUS() macro to access BusState.qbus hw/scsi/vmw_pvscsi: Use qbus_reset_all() directly docs/devel/build-system: Update an example test: Fix make target check-report.tap util: Adjust qemu_guest_getrandom_nofail for Coverity vhost: fix incorrect print type migration: fix a typo hw/rdma: Delete unused headers inclusion Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-06-06 14:09:14 +01:00
Li Qiang	ff1543af22	migration: fix a typo 'postocpy' should be 'postcopy'. CC: qemu-trivial@nongnu.org Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190525062832.18009-1-liq3ea@163.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2019-06-06 11:17:32 +02:00
Wei Yang	0315851938	migratioin/ram: leave RAMBlock->bmap blank on allocating During migration, we would sync bitmap from ram_list.dirty_memory to RAMBlock.bmap in cpu_physical_memory_sync_dirty_bitmap(). Since we set RAMBlock.bmap and ram_list.dirty_memory both to all 1, this means at the first round this sync is meaningless and is a duplicated work. Leaving RAMBlock->bmap blank on allocating would have a side effect on migration_dirty_pages, since it is calculated from the result of cpu_physical_memory_sync_dirty_bitmap(). To keep it right, we need to set migration_dirty_pages to 0 in ram_state_init(). Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-06-05 12:44:03 +02:00
Yury Kotov	61053d4826	migration: Fix fd protocol for incoming defer Currently, incoming migration through fd supports only the command-line case: E.g. fork(); fd = open(); exec("qemu ... -incoming fd:%d", fd); It's possible to use add-fd commands to pass an fd for migration, but it's an invalid case: add-fd works with fdsets but not with particular fds. To work with getfd in incoming defer it's enough to use monitor_fd_param instead of strtol. monitor_fd_param supports both cases: * fd:123 * fd:fd_name (added by getfd). The use of monitor_fd_param also improves error messages. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-06-05 12:43:55 +02:00
Wei Yang	f38d7fbc01	migration/ram.c: multifd_send_state->count is not really used Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-06-05 12:42:54 +02:00
Wei Yang	7d4eaace46	migration/ram.c: MultiFDSendParams.sem_sync is not really used Besides init and destroy, MultiFDSendParams.sem_sync is not really used. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-06-05 12:42:39 +02:00
Kevin Wolf	d861ab3acf	block: Add BlockBackend.ctx This adds a new parameter to blk_new() which requires its callers to declare from which AioContext this BlockBackend is going to be used (or the locks of which AioContext need to be taken anyway). The given context is only stored and kept up to date when changing AioContexts. Actually applying the stored AioContext to the root node is saved for another commit. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-04 15:22:22 +02:00
John Snow	592203e7cf	migration/dirty-bitmaps: change bitmap enumeration method Shift from looking at every root BDS to every BDS. This will migrate bitmaps that are attached to blockdev created nodes instead of just ones attached to emulated storage devices. Note that this will not migrate anonymous or internal-use bitmaps, as those are defined as having no name. This will also fix the Coverity issues Peter Maydell has been asking about for the past several releases, as well as fixing a real bug. Reported-by: Peter Maydell <peter.maydell@linaro.org> Reported-by: Coverity 😅 Reported-by: aihua liang <aliang@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 20190514201926.10407-1-jsnow@redhat.com Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1652490 Fixes: Coverity CID 1390625 CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com>	2019-05-28 19:33:31 -04:00
Greg Kurz	b6eca81e1b	migration: Fix typo in migrate_add_blocker() error message Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <155800428514.543845.17558475870097990036.stgit@bahia.lan> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2019-05-22 17:35:27 +02:00
Wei Yang	a5f7b1a63c	migration/ram.c: fix typos in comments Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190510233729.15554-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-05-14 19:00:04 +01:00
Yury Kotov	fd392cfa8e	migration: Fix use-after-free during process exit It fixes heap-use-after-free which was found by clang's ASAN. Control flow of this use-after-free: main_thread: * Got SIGTERM and completes main loop * Calls migration_shutdown - migrate_fd_cancel (so, migration_thread begins to complete) - object_unref(OBJECT(current_migration)); migration_thread: * migration_iteration_finish -> schedule cleanup bh * object_unref(OBJECT(s)); (Now, current_migration is freed) * exits main_thread: * Calls vm_shutdown -> drain bdrvs -> main loop -> cleanup_bh -> use after free If you want to reproduce, these couple of sleeps will help: vl.c:4613: migration_shutdown(); + sleep(2); migration.c:3269: + sleep(1); trace_migration_thread_after_loop(); migration_iteration_finish(s); Original output: qemu-system-x86_64: terminating on signal 15 from pid 31980 (<unknown process>) ================================================================= ==31958==ERROR: AddressSanitizer: heap-use-after-free on address 0x61900001d210 at pc 0x555558a535ca bp 0x7fffffffb190 sp 0x7fffffffb188 READ of size 8 at 0x61900001d210 thread T0 (qemu-vm-0) #0 0x555558a535c9 in migrate_fd_cleanup migration/migration.c:1502:23 #1 0x5555594fde0a in aio_bh_call util/async.c:90:5 #2 0x5555594fe522 in aio_bh_poll util/async.c:118:13 #3 0x555559524783 in aio_poll util/aio-posix.c:725:17 #4 0x555559504fb3 in aio_wait_bh_oneshot util/aio-wait.c:71:5 #5 0x5555573bddf6 in virtio_blk_data_plane_stop hw/block/dataplane/virtio-blk.c:282:5 #6 0x5555589d5c09 in virtio_bus_stop_ioeventfd hw/virtio/virtio-bus.c:246:9 #7 0x5555589e9917 in virtio_pci_stop_ioeventfd hw/virtio/virtio-pci.c:287:5 #8 0x5555589e22bf in virtio_pci_vmstate_change hw/virtio/virtio-pci.c:1072:9 #9 0x555557628931 in virtio_vmstate_change hw/virtio/virtio.c:2257:9 #10 0x555557c36713 in vm_state_notify vl.c:1605:9 #11 0x55555716ef53 in do_vm_stop cpus.c:1074:9 #12 0x55555716eeff in vm_shutdown cpus.c:1092:12 #13 0x555557c4283e in main vl.c:4617:5 #14 0x7fffdfdb482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) #15 0x555556ecb118 in _start (x86_64-softmmu/qemu-system-x86_64+0x1977118) 0x61900001d210 is located 144 bytes inside of 952-byte region [0x61900001d180,0x61900001d538) freed by thread T6 (live_migration) here: #0 0x555556f76782 in __interceptor_free /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:124:3 #1 0x555558d5fa94 in object_finalize qom/object.c:618:9 #2 0x555558d57651 in object_unref qom/object.c:1068:9 #3 0x555558a55588 in migration_thread migration/migration.c:3272:5 #4 0x5555595393f2 in qemu_thread_start util/qemu-thread-posix.c:502:9 #5 0x7fffe057f6b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9) previously allocated by thread T0 (qemu-vm-0) here: #0 0x555556f76b03 in __interceptor_malloc /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:146:3 #1 0x7ffff6ee37b8 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4f7b8) #2 0x555558d58031 in object_new qom/object.c:640:12 #3 0x555558a31f21 in migration_object_init migration/migration.c:139:25 #4 0x555557c41398 in main vl.c:4320:5 #5 0x7fffdfdb482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) Thread T6 (live_migration) created by T0 (qemu-vm-0) here: #0 0x555556f5f0dd in pthread_create /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_interceptors.cc:210:3 #1 0x555559538cf9 in qemu_thread_create util/qemu-thread-posix.c:539:11 #2 0x555558a53304 in migrate_fd_connect migration/migration.c:3332:5 #3 0x555558a72bd8 in migration_channel_connect migration/channel.c:92:5 #4 0x555558a6ef87 in exec_start_outgoing_migration migration/exec.c:42:5 #5 0x555558a4f3c2 in qmp_migrate migration/migration.c:1922:9 #6 0x555558bb4f6a in qmp_marshal_migrate qapi/qapi-commands-migration.c:607:5 #7 0x555559363738 in do_qmp_dispatch qapi/qmp-dispatch.c:131:5 #8 0x555559362a15 in qmp_dispatch qapi/qmp-dispatch.c:174:11 #9 0x5555571bac15 in monitor_qmp_dispatch monitor.c:4124:11 #10 0x55555719a22d in monitor_qmp_bh_dispatcher monitor.c:4207:9 #11 0x5555594fde0a in aio_bh_call util/async.c:90:5 #12 0x5555594fe522 in aio_bh_poll util/async.c:118:13 #13 0x5555595201e0 in aio_dispatch util/aio-posix.c:460:5 #14 0x555559503553 in aio_ctx_dispatch util/async.c:261:5 #15 0x7ffff6ede196 in g_main_context_dispatch (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4a196) SUMMARY: AddressSanitizer: heap-use-after-free migration/migration.c:1502:23 in migrate_fd_cleanup Shadow bytes around the buggy address: 0x0c327fffb9f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c327fffba30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd =>0x0c327fffba40: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c327fffba90: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==31958==ABORTING Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190408113343.2370-1-yury-kotov@yandex-team.ru> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixed up comment formatting	2019-05-14 18:59:54 +01:00
Wei Yang	16015d32e4	migration/savevm: wrap into qemu_loadvm_state_header() On the source side, we have qemu_savevm_state_header() to send related data, while on the receiving side those steps are scattered in qemu_loadvm_state(). This patch wraps those related steps into qemu_loadvm_state_header() to make the code easier to read. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190424004700.12766-5-richardw.yang@linux.intel.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-05-14 17:33:35 +01:00
Wei Yang	9e14b84908	migration/savevm: load_header before load_setup In migration_thread() and qemu_savevm_state(), we savevm_state in following sequence: qemu_savevm_state_header(f); qemu_savevm_state_setup(f); Then it would be more proper to loadvm_state in the save sequence. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190424004700.12766-4-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-05-14 17:33:35 +01:00
Wei Yang	5351e69af8	migration/savevm: remove duplicate check of migration_is_blocked Current call flow of save_snapshot is: save_snapshot migration_is_blocked qemu_savevm_state migration_is_blocked Since qemu_savevm_state is only called in save_snapshot, this means migration_is_blocked has been already checked. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190424004700.12766-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-05-14 17:33:35 +01:00
Yi Wang	4633456ced	migration: update comments of migration bitmap Since the ram bitmap and the unsent bitmap are split by RAMBlock in commit `6b6712e`, it's better to update the comments about them. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Message-Id: <1555311089-18610-1-git-send-email-wang.yi59@zte.com.cn> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-05-14 17:33:35 +01:00
Wei Yang	bf21297923	migration/ram.c: start of migration_bitmap_sync_range is always 0 We can eliminate to pass 0. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190430034412.12935-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-05-14 17:33:35 +01:00
Zhang Chen	c0913d1dfd	migration/colo.c: Remove redundant input parameter colo_do_failover does not need the input parameter. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Message-Id: <20190426090730.2691-2-chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-05-14 17:33:35 +01:00
Cole Robinson	aded9dfa74	migration: savevm: fix error code with migration blockers The only caller that checks the error code is looking for != 0, so returning false is incorrect. Fixes: `5aaac46793` "migration: savevm: consult migration blockers" Signed-off-by: Cole Robinson <crobinso@redhat.com> Message-Id: <b991a4d0e6c4253bc08b2794c6084be55fc72e1d.1554851834.git.crobinso@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-05-14 17:33:35 +01:00
Wei Yang	f2dd7eddf2	vmstate: check subsection_found is enough subsection_found being true implies vmdesc is not NULL. This patch removes the additional check on vmdesc and renames subsection_found to vmdesc_has_subsections to make it more self-explanatory. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190403011016.12549-1-richardw.yang@linux.intel.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-05-14 17:33:35 +01:00
Wei Yang	15d2d64cf5	migration: remove not used field xfer_limit MigrationState->xfer_limit is only set to 0 in migrate_init(). Remove this unnecessary field. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190326055726.10539-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-05-14 17:33:35 +01:00
Wei Yang	a94cd7b8ab	migration: not necessary to check ops again During each iteration, se->ops is checked before each loop. So it is not necessary to check it again and simplify the following check a little. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190327013130.26259-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-05-14 17:33:35 +01:00
Peter Maydell	f151f8aca5	migration/ram.c: Fix use-after-free in multifd_recv_unfill_packet() Coverity points out (CID 1400442) that in this code: if (packet->pages_alloc > p->pages->allocated) { multifd_pages_clear(p->pages); multifd_pages_init(packet->pages_alloc); } we free p->pages in multifd_pages_clear() but continue to use it in the following code. We also leak memory, because multifd_pages_init() returns the pointer to a new MultiFDPages_t struct but we are ignoring its return value. Fix both of these bugs by adding the missing assignment of the newly created struct to p->pages. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-id: 20190409151830.6024-1-peter.maydell@linaro.org Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-04-09 20:46:34 +01:00
Zhang Chen	c6e5bafb6f	migration/ram.c: Fix codes conflict about bitmap_mutex I found upstream codes conflict with COLO and lead to crash, and I located to this patch: commit `386a907b37` Author: Wei Wang <wei.w.wang@intel.com> Date: Tue Dec 11 16:24:49 2018 +0800 migration: use bitmap_mutex in migration_bitmap_clear_dirty My colleague Wei's patch add bitmap_mutex in migration_bitmap_clear_dirty, but COLO didn't initialize the bitmap_mutex. So we always get an error when COLO start up. like that: qemu-system-x86_64: util/qemu-thread-posix.c:64: qemu_mutex_lock_impl: Assertion `mutex->initialized' failed. This patch add the bitmap_mutex initialize and destroy in COLO lifecycle. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Message-Id: <20190329222951.28945-1-chen.zhang@intel.com> Reviewed-by: Wei Wang <wei.w.wang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-04-05 15:29:48 +01:00
Markus Armbruster	daff7f0bbe	migration: Support adding migration blockers earlier migrate_add_blocker() asserts we have a current_migration object, in migrate_get_current(). We do only after migration_object_init(). This contributes to the following dependency cycle: * configure_blockdev() must run before machine_set_property() so machine properties can refer to block backends * machine_set_property() before configure_accelerator() so machine properties like kvm-irqchip get applied * configure_accelerator() before migration_object_init() so that Xen's accelerator compat properties get applied. * migration_object_init() before configure_blockdev() so configure_blockdev() can add migration blockers The cycle was closed when recent commit `cda4aa9a5a` "Create block backends before setting machine properties" added the first dependency, and satisfied it by violating the last one. Broke block backends that add migration blockers, as demonstrated by qemu-iotests 055. To fix it, break the last dependency: make migrate_add_blocker() usable before migration_object_init(). The previous commit already removed the use of migrate_get_current() from migrate_add_blocker() itself. Didn't quite do the trick, as there's another one hiding in migration_is_idle(). The use there isn't actually necessary: when no migration object has been created yet, migration is surely idle. Make migration_is_idle() return true then. Fixes: `cda4aa9a5a` Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190401090827.20793-4-armbru@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>	2019-04-02 13:49:36 +02:00
Markus Armbruster	811f865271	Revert "migration: move only_migratable to MigrationState" This reverts commit `3df663e575`. This reverts commit `b605c47b57`. Command line option --only-migratable is for disallowing any configuration that can block migration. Initially, --only-migratable set global variable @only_migratable. Commit `3df663e575` "migration: move only_migratable to MigrationState" replaced it by MigrationState member @only_migratable. That was a mistake. First, it doesn't make sense on the design level. MigrationState captures the state of an individual migration, but --only-migratable isn't a property of an individual migration, it's a restriction on QEMU configuration. With fault tolerance, we could have several migrations at once. --only-migratable would certainly protect all of them. Storing it in MigrationState feels inappropriate. Second, it contributes to a dependency cycle that manifests itself as a bug now. Putting @only_migratable into MigrationState means its available only after migration_object_init(). We can't set it before migration_object_init(), so we delay setting it with a global property (this is fixup commit `b605c47b57` "migration: fix handling for --only-migratable"). We can't get it before migration_object_init(), so anything that uses it can only run afterwards. Since migrate_add_blocker() needs to obey --only-migratable, any code adding migration blockers can run only afterwards. This contributes to the following dependency cycle: * configure_blockdev() must run before machine_set_property() so machine properties can refer to block backends * machine_set_property() before configure_accelerator() so machine properties like kvm-irqchip get applied * configure_accelerator() before migration_object_init() so that Xen's accelerator compat properties get applied. 
* migration_object_init() before configure_blockdev() so configure_blockdev() can add migration blockers The cycle was closed when recent commit `cda4aa9a5a` "Create block backends before setting machine properties" added the first dependency, and satisfied it by violating the last one. Broke block backends that add migration blockers. Moving @only_migratable into MigrationState was a mistake. Revert it. This doesn't quite break the "migration_object_init() before configure_blockdev()" dependency, since migrate_add_blocker() still has another dependency on migration_object_init(). To be addressed in the next commit. Note that the reverted commit made -only-migratable sugar for -global migration.only-migratable=on below the hood. Documentation has only ever mentioned -only-migratable. This commit removes the arcane & undocumented alternative to -only-migratable again. Nobody should be using it. Conflicts: include/migration/misc.h migration/migration.c migration/migration.h vl.c Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190401090827.20793-3-armbru@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>	2019-04-02 13:38:05 +02:00
Peter Maydell	7e9a2137ce	Pull request - Rebase last pull request - Drop multifd - several other minor fixesLaLaLa Merge remote-tracking branch 'remotes/juanquintela/tags/migration-pull-request' into staging Pull request - Rebase last pull request - Drop multifd - several other minor fixesLaLaLa # gpg: Signature made Mon 25 Mar 2019 17:46:29 GMT # gpg: using RSA key F487EF185872D723 # gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full] # gpg: aka "Juan Quintela <quintela@trasno.org>" [full] # Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723 * remotes/juanquintela/tags/migration-pull-request: migration/postcopy: Update the bandwidth during postcopy Migration/colo.c: Make user obtain the last COLO mode info after failover Migration/colo.c: Add the necessary checks for colo_do_failover Migration/colo.c: Add new COLOExitReason to handle all failover state Migration/colo.c: Fix COLO failover status error migration/rdma: Check qemu_rdma_init_one_block migration: add support for a "tls-authz" migration parameter multifd: Drop x- multifd: Add some padding multifd: Change default packet size multifd: Be flexible about packet size multifd: Drop 
x-multifd-page-count parameter multifd: Create new next_packet_size field multifd: Rename "size" member to pages_alloc multifd: Only send pages when packet are not empty Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-03-25 18:15:43 +00:00
Dr. David Alan Gilbert	c38c1c142e	migration/postcopy: Update the bandwidth during postcopy The recently added max-postcopy-bandwidth parameter is only read at the transition from precopy->postcopy, whereas the older max-bandwidth parameter updates the migration bandwidth when changed even if the migration is already running. Fix this discrepancy so that: a) You can change the bandwidth during postcopy by setting max-postcopy-bandwidth b) Changing max-bandwidth during postcopy has no effect (it currently changes the postcopy bandwidth which isn't expected). Fixes: `7e555c6c` bz: https://bugzilla.redhat.com/show_bug.cgi?id=1686321 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:46:03 +01:00
Zhang Chen	5ed0deca41	Migration/colo.c: Make user obtain the last COLO mode info after failover Add last_colo_mode to save the status after failover. This patch solves the issue where a user wants to get the last COLO mode using query_colo_status after failover. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:45:46 +01:00
Zhang Chen	82cd368ccd	Migration/colo.c: Add the necessary checks for colo_do_failover Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:45:38 +01:00
Zhang Chen	3a43ac4757	Migration/colo.c: Add new COLOExitReason to handle all failover state In this patch we add the processing state for COLOExitReason, because we have to identify COLO in the failover processing state or failover error state. In the way, we can handle all the failover state. We have improved the description of the COLOExitReason by the way. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:45:30 +01:00
Zhang Chen	1fe6ab267f	Migration/colo.c: Fix COLO failover status error When COLO failover has finished, the status is FAILOVER_STATUS_COMPLETED. The original code misunderstood FAILOVER_STATUS_REQUIRE. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:45:23 +01:00
Dr. David Alan Gilbert	281496bb8a	migration/rdma: Check qemu_rdma_init_one_block Actually it can't fail at the moment, but Coverity moans that it's the only place it's not checked, and it's an easy check. Reported-by: Coverity (CID 1399413) Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:45:10 +01:00
Daniel P. Berrange	d2f1d29b95	migration: add support for a "tls-authz" migration parameter The QEMU instance that runs as the server for the migration data transport (ie the target QEMU) needs to be able to configure access control so it can prevent unauthorized clients initiating an incoming migration. This adds a new 'tls-authz' migration parameter that is used to provide the QOM ID of a QAuthZ subclass instance that provides the access control check. This is checked against the x509 certificate obtained during the TLS handshake. For example, when starting a QEMU for incoming migration, it is possible to give an example identity of the source QEMU that is intended to be connecting later: $QEMU \ -monitor stdio \ -incoming defer \ ...other args... (qemu) object_add tls-creds-x509,id=tls0,dir=/home/berrange/qemutls,\ endpoint=server,verify-peer=yes \ (qemu) object_add authz-simple,id=auth0,identity=CN=laptop.example.com,,\ O=Example Org,,L=London,,ST=London,,C=GB \ (qemu) migrate_incoming tcp:localhost:9000 Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:13:47 +01:00
Juan Quintela	cbfd6c957a	multifd: Drop x- We make it supported from now on. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:13:45 +01:00
Juan Quintela	5fbd8b4bbb	multifd: Add some padding Add some padding. MultifdInit_t is padded to 64 bytes. MultiFDPacket_t is padded to 320bytes (64 * 5). Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:13:44 +01:00
Juan Quintela	4b0c72645c	multifd: Change default packet size We moved from 64KB to 512KB, as it makes less locking contention without any downside in testing. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:13:43 +01:00
Juan Quintela	7ed379b286	multifd: Be flexible about packet size This way we can change the packet size in the future and everything will work. We choose an arbitrary big number (100 times configured size) as a limit about how big we will reallocate. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:13:42 +01:00
Juan Quintela	efd1a1d640	multifd: Drop x-multifd-page-count parameter Libvirt doesn't want to expose (and explain) it. From now on we measure the size of packets in bytes instead of pages, so it is the same independently of architecture. We choose the page size of x86. Notice that in the following patch we make this variable. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:13:41 +01:00
Juan Quintela	2a34ee593b	multifd: Create new next_packet_size field We need to send this field when we add compression support. As we are still on x- stage, we can do this kind of changes. Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:13:39 +01:00
Juan Quintela	6f86269295	multifd: Rename "size" member to pages_alloc It really indicates what is the number of allocated pages for one packet. Once there rename "used" to "pages_used". Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:13:38 +01:00
Juan Quintela	ad24c7cb59	multifd: Only send pages when packet are not empty We send packages without pages sometimes for synchronization. The iov functions do the right thing, but we will be changing this code in future patches. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-03-25 18:13:37 +01:00
Markus Armbruster	dec9776049	trace-events: Fix attribution of trace points to source Some trace points are attributed to the wrong source file. Happens when we neglect to update trace-events for code motion, or add events in the wrong place, or misspell the file name. Clean up with help of cleanup-trace-events.pl. Same funnies as in the previous commit, of course. Manually shorten its change to linux-user/trace-events to */signal.c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-id: 20190314180929.27722-6-armbru@redhat.com Message-Id: <20190314180929.27722-6-armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-03-22 16:18:07 +00:00
Markus Armbruster	500016e5db	trace-events: Shorten file names in comments We spell out sub/dir/ in sub/dir/trace-events' comments pointing to source files. That's because when trace-events got split up, the comments were moved verbatim. Delete the sub/dir/ part from these comments. Gets rid of several misspellings. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190314180929.27722-3-armbru@redhat.com Message-Id: <20190314180929.27722-3-armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-03-22 16:18:07 +00:00
Peter Maydell	f6c63c0dbf	* ASAN fixes Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * ASAN fixes # gpg: Signature made Tue 12 Mar 2019 14:35:59 GMT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: test-migration: fix memory leak migration: fix memory leak test-bdrv-graph-mod: fix Error leak test-char: fix undefined behavior Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-03-14 12:02:12 +00:00
Eric Blake	796a3798ab	bitmaps: Fix typo in function name Commit `a88b179f` introduced the ability to set and query bitmap persistence, but with an atypical spelling. Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 20190308205845.25734-1-eblake@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-03-12 12:05:49 -04:00
John Snow	3ae96d6684	block/dirty-bitmaps: add block_dirty_bitmap_check function Instead of checking against busy, inconsistent, or read only directly, use a check function with permissions bits that let us streamline the checks without reproducing them in many places. Included in this patch are permissions changes that simply add the inconsistent check to existing permissions call spots, without addressing existing bugs. In general, this means that busy+readonly checks become BDRV_BITMAP_DEFAULT, which checks against all three conditions. busy-only checks become BDRV_BITMAP_ALLOW_RO. Notably, remove allows inconsistent bitmaps, so it doesn't follow the pattern. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190301191545.8728-4-jsnow@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-03-12 12:05:49 -04:00
John Snow	27a1b301a4	block/dirty-bitmaps: unify qmp_locked and user_locked calls These mean the same thing now. Unify them and rename the merged call bdrv_dirty_bitmap_busy to indicate semantically what we are describing, as well as help disambiguate from the various _locked and _unlocked versions of bitmap helpers that refer to mutex locks. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190223000614.13894-8-jsnow@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-03-12 12:05:48 -04:00
John Snow	50a47257f8	block/dirty-bitmaps: rename frozen predicate helper "Frozen" was a good description a long time ago, but it isn't adequate now. Rename the frozen predicate to has_successor to make the semantics of the predicate more clear to outside callers. In the process, remove some calls to frozen() that no longer semantically make sense. For bdrv_enable_dirty_bitmap_locked and bdrv_disable_dirty_bitmap_locked, it doesn't make sense to prohibit QEMU internals from performing this action when we only wished to prohibit QMP users from issuing these commands. All of the QMP API commands for bitmap manipulation already check against user_locked() to prohibit these actions. Several other assertions really want to check that the bitmap isn't in-use by another operation -- use the bitmap_user_locked function for this instead, which presently also checks for has_successor. This leaves some redundant checks of has_successor through different helpers that are addressed in forthcoming patches. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190223000614.13894-3-jsnow@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-03-12 12:05:48 -04:00
Paolo Bonzini	5e78bc6a47	migration: fix memory leak Reported by ASAN. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-03-12 15:18:40 +01:00
Marc-André Lureau	d890344166	slirp: use libslirp migration code slirp migration code uses QEMU vmstate so far, when building WITH_QEMU. Introduce slirp_state_{load,save,version}() functions to move the state saving handling to libslirp side. So far, the bitstream compatibility should remain equal with current QEMU, as this is effectively using the same code, with the same format etc. When libslirp is made standalone, we will need some mechanism to ensure bitstream compatibility regardless of the libslirp version installed. See the FIXME note in the code. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20190212162524.31504-3-marcandre.lureau@redhat.com> Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>	2019-03-07 12:46:31 +01:00
Zhang Chen	db00972922	Migration/colo.c: Make COLO node running after failover Delay to close COLO for auto start VM after failover. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190303145021.2962-4-chen.zhang@intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-03-06 10:49:18 +00:00
Zhang Chen	b8b5734b09	Migration/colo.c: Fix double close bug when occur COLO failover In migration_incoming_state_destroy(void) will check the mis->to_src_file to double close the mis->to_src_file when occur COLO failover. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190303145021.2962-2-chen.zhang@intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-03-06 10:49:18 +00:00
Wei Wang	6eeb63f740	migration/ram.c: add the free page optimization enable flag This patch adds the free page optimization enable flag, and a function to set this flag. When the free page optimization is enabled, not all the pages are needed to be sent in the bulk stage. Why use a new flag, instead of directly disabling ram_bulk_stage when the optimization is running? Thanks for Peter Xu's reminder that disabling ram_bulk_stage will affect the use of compression. Please see save_page_use_compression. When xbzrle and compression are used, if free page optimization causes the ram_bulk_stage to be disabled, save_page_use_compression will return false, which disables the use of compression. That is, if free page optimization avoids the sending of half of the guest pages, the other half of pages loses the benefits of compression in the meantime. Using a new flag to let migration_bitmap_find_dirty skip the free pages in the bulk stage will avoid the above issue. Signed-off-by: Wei Wang <wei.w.wang@intel.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Juan Quintela <quintela@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Peter Xu <peterx@redhat.com> Message-Id: <1544516693-5395-7-git-send-email-wei.w.wang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-03-06 10:49:18 +00:00
Wei Wang	bd2270608f	migration/ram.c: add a notifier chain for precopy This patch adds a notifier chain for the memory precopy. This enables various precopy optimizations to be invoked at specific places. Signed-off-by: Wei Wang <wei.w.wang@intel.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Juan Quintela <quintela@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Peter Xu <peterx@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <1544516693-5395-6-git-send-email-wei.w.wang@intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-03-06 10:49:18 +00:00
Wei Wang	6bcb05fc42	migration: API to clear bits of guest free pages from the dirty bitmap This patch adds an API to clear bits corresponding to guest free pages from the dirty bitmap. Split the free page block if it crosses the QEMU RAMBlock boundary. Signed-off-by: Wei Wang <wei.w.wang@intel.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Juan Quintela <quintela@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Peter Xu <peterx@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <1544516693-5395-5-git-send-email-wei.w.wang@intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-03-06 10:49:18 +00:00
Wei Wang	386a907b37	migration: use bitmap_mutex in migration_bitmap_clear_dirty The bitmap mutex is used to synchronize threads to update the dirty bitmap and the migration_dirty_pages counter. For example, the free page optimization clears bits of free pages from the bitmap in an iothread context. This patch makes migration_bitmap_clear_dirty update the bitmap and counter under the mutex. Signed-off-by: Wei Wang <wei.w.wang@intel.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> CC: Juan Quintela <quintela@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Peter Xu <peterx@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <1544516693-5395-4-git-send-email-wei.w.wang@intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-03-06 10:49:18 +00:00
Juan Quintela	9aca82ba31	migration: Create socket-address parameter It will be used to store the uri parameters. We want this only for tcp, so we don't set it for other uris. We need it to know what port is migration running. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Removed DummyStruct as suggested by Eric & Markus --	2019-03-06 10:49:17 +00:00
Yury Kotov	6cafc8e4dd	migration: Add capabilities validation Currently we don't check which capabilities are set in the source QEMU. We just expect that the target QEMU has the same enabled capabilities. Add explicit validation for capabilities to make sure that the target VM has them too. This is enabled for only new capabilities to keep compatibility. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190215174548.2630-6-yury-kotov@yandex-team.ru> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Manual merge	2019-03-06 10:49:17 +00:00
Yury Kotov	fbd162e629	migration: Add an ability to ignore shared RAM blocks If ignore-shared capability is set then skip shared RAMBlocks during the RAM migration. Also, move qemu_ram_foreach_migratable_block (and rename) to the migration code, because it requires access to the migration capabilities. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190215174548.2630-4-yury-kotov@yandex-team.ru> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-03-06 10:49:17 +00:00
Yury Kotov	18269069c3	migration: Introduce ignore-shared capability We want to use local migration to update QEMU for running guests. In this case we don't need to migrate shared (file backed) RAM. So, add a capability to ignore such blocks during live migration. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190215174548.2630-3-yury-kotov@yandex-team.ru> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-03-06 10:49:17 +00:00
Yury Kotov	754cb9c0eb	exec: Change RAMBlockIterFunc definition Currently, qemu_ram_foreach_* calls RAMBlockIterFunc with many block-specific arguments. But often iter func needs RAMBlock. This refactoring is needed for fast access to RAMBlock flags from qemu_ram_foreach_block's callback. The only way to achieve this now is to call qemu_ram_block_from_host (which also enumerates blocks). So, this patch reduces complexity of qemu_ram_foreach_block() -> cb() -> qemu_ram_block_from_host() from O(n^2) to O(n). Fix RAMBlockIterFunc definition and add some functions to read RAMBlock fields which were passed. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190215174548.2630-2-yury-kotov@yandex-team.ru> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-03-06 10:49:17 +00:00
Marcel Apfelbaum	9589e76301	migration/rdma: clang compilation fix Configuring QEMU with: ../configure --cc=clang --enable-rdma Leads to compilation error: CC migration/rdma.o CC migration/block.o qemu/migration/rdma.c:3615:58: error: taking address of packed member 'rkey' of class or structure 'RDMARegisterResult' may result in an unaligned pointer value [-Werror,-Waddress-of-packed-member] (uintptr_t)host_addr, NULL, &reg_result->rkey, ^~~~~~~~~~~~~~~~ Fix it by using a temp local variable. Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Message-Id: <20190304184923.24215-1-marcel.apfelbaum@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-03-06 10:49:17 +00:00
Dr. David Alan Gilbert	892ae715b6	migration: Cleanup during exit Currently we cleanup the migration object as we exit main after the main_loop finishes; however if there's a migration running things get messy and we can end up with the migration thread still trying to access freed structures. We now take a ref to the object around the migration thread itself, so the act of dropping the ref during exit doesn't cause us to lose the state until the thread quits. Cancelling the migration during migration also tries to get the thread to quit. We do this a bit earlier; so hopefully migration gets out of the way before all the devices etc are freed. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Tested-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20190227164900.16378-1-dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-03-06 10:49:17 +00:00
Dr. David Alan Gilbert	cf75e26849	migration/rdma: Fix qemu_rdma_cleanup null check If the migration fails before the channel is open (e.g. a bad address) we end up in the cleanup with rdma->channel==NULL. Spotted by Coverity: CID 1398634 Fixes: `fbbaacab27` Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190214185351.5927-1-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-03-06 10:49:17 +00:00
Dr. David Alan Gilbert	c3c5eae6ac	migration: Fix cancel state During a cancelled migration there's a race where the fd can go into an error state before we get back around the migration loop and migration_detect_error transitions from cancelling->failed. Check for cancelled/cancelling and don't change the state. Red Hat bug: https://bugzilla.redhat.com/show_bug.cgi?id=1608649 Fixes: `b23c2ade25` Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190219195928.12289-1-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>	2019-03-06 10:49:17 +00:00
Dr. David Alan Gilbert	7659505c16	migration: Switch to using announce timer Switch the announcements to using the new announce timer. Move the code that does it to announce.c rather than savevm because it really has nothing to do with the actual migration. Migration starts the announce from bh's and so they're all in the main thread/bql, and so there's never any racing with the timers themselves. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2019-03-05 11:27:41 +08:00
Dr. David Alan Gilbert	ee3d96baf3	migration: Add announce parameters Add migration parameters that control RARP/GARP announcement timeouts. Based on earlier patches by myself and Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2019-03-05 11:27:41 +08:00
Dr. David Alan Gilbert	50510ea2c2	net: Introduce announce timer The 'announce timer' will be used by migration, and explicit requests for qemu to perform network announces. Based on the work by Germano Veit Michel <germano@redhat.com> and Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2019-03-05 11:27:41 +08:00
Vladimir Sementsov-Ogievskiy	f556f37b11	migration/block: use qemu_iovec_init_buf Use new qemu_iovec_init_buf() instead of qemu_iovec_init_external( ... , 1), which simplifies the code. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20190218140926.333779-14-vsementsov@virtuozzo.com Message-Id: <20190218140926.333779-14-vsementsov@virtuozzo.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-02-22 09:42:13 +00:00
Xiao Guangrong	aecbfe9c64	migration: introduce pages-per-second It introduces a new statistic, pages-per-second, as bandwidth or mbps is not enough to measure the performance of posting pages out as we have compression, xbzrle, which can significantly reduce the amount of the data size, instead, pages-per-second is the one we want Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com> Message-Id: <20190111063732.10484-2-xiaoguangrong@tencent.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> With typo's Eric spotted fixed	2019-01-23 15:51:47 +00:00
Marc-André Lureau	de22ded044	vmstate: constify SaveVMHandlers Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20181114133139.27346-1-marcandre.lureau@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-01-23 15:51:47 +00:00
Dr. David Alan Gilbert	fbbaacab27	migration/rdma: unregister fd handler Unregister the fd handler before we destroy the channel, otherwise we've got a race where we might land in the fd handler just as we're closing the device. (The race is quite data dependent, you just have to have the right set of devices for it to trigger). Corresponds to RH bz: https://bugzilla.redhat.com/show_bug.cgi?id=1666601 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190122173111.29821-1-dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-01-23 15:51:32 +00:00
Fei Li	6d99c2d41c	migration: unify error handling for process_incoming_migration_co In the current code, if process_incoming_migration_co() fails we do the same error handling: set the error state, close the source file, do the cleanup for multifd, and then exit(EXIT_FAILURE). To make the code clearer, add a "goto fail" to unify the error handling. Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Fei Li <fli@suse.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190113140849.38339-6-lifei1214@126.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-01-23 15:02:07 +00:00
Fei Li	91b02dc750	migration: add more error handling for postcopy_ram_enable_notify Call postcopy_ram_incoming_cleanup() to do the cleanup when postcopy_ram_enable_notify fails. Besides, report the error message when qemu_ram_foreach_migratable_block() fails. Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Fei Li <fli@suse.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190113140849.38339-5-lifei1214@126.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-01-23 15:02:07 +00:00
Fei Li	1398b2e3fe	migration: multifd_save_cleanup() can't fail, simplify multifd_save_cleanup() takes an Error ** argument and returns an error code even though it can't actually fail. Its callers dutifully check for failure. Remove the useless argument and return value, and simplify the callers. Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Signed-off-by: Fei Li <fli@suse.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190113140849.38339-4-lifei1214@126.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-01-23 15:02:07 +00:00
Fei Li	49ed0d24a4	migration: fix the multifd code when receiving less channels In our current code, when multifd is used during migration, if there is an error before the destination receives all new channels, the source keeps running, however the destination does not exit but keeps waiting until the source is killed deliberately. Fix this by dumping the specific error and let users decide whether to quit from the destination side when failing to receive packet via some channel. And update the comment for multifd_recv_new_channel(). Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Signed-off-by: Fei Li <fli@suse.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20190113140849.38339-3-lifei1214@126.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-01-23 15:02:07 +00:00
Aaron Lindsay	8c07559fc7	migration: Add post_save function to VMStateDescription In some cases it may be helpful to modify state before saving it for migration, and then modify the state back after it has been saved. The existing pre_save function provides half of this functionality. This patch adds a post_save function to provide the second half. Signed-off-by: Aaron Lindsay <aclindsa@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-id: 20181211151945.29137-2-aaron@os.amperecomputing.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-21 10:38:55 +00:00
Philippe Mathieu-Daudé	a346af9c88	migration: Use strnlen() for fixed-size string GCC 8 introduced the -Wstringop-overflow, which detects buffer overflow by string-modifying functions declared in <string.h>, such as strncpy(), used in global_state_store_running(). GCC indeed found an incorrect use of strlen(), because this array is loaded by VMSTATE_BUFFER(runstate, GlobalState) then parsed using qapi_enum_parse which does not get the buffer length. Use strnlen() which returns sizeof(s->runstate) if the array is not NUL-terminated, assert the size is within range, and enforce the array to be NUL-terminated to avoid an overflow in qapi_enum_parse(). This fixes: CC migration/global_state.o qemu/migration/global_state.c: In function 'global_state_pre_save': qemu/migration/global_state.c:109:15: error: 'strlen' argument 1 declared attribute 'nonstring' [-Werror=stringop-overflow=] s->size = strlen((char *)s->runstate) + 1; ^~~~~~~~~~~~~~~~~~~~~~~~~~~ qemu/migration/global_state.c:24:13: note: argument 'runstate' declared here uint8_t runstate[100] QEMU_NONSTRING; ^~~~~~~~ cc1: all warnings being treated as errors make: *** [qemu/rules.mak:69: migration/global_state.o] Error 1 Suggested-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-01-17 21:10:57 -05:00
Marc-André Lureau	0a5526a18b	migration: Fix stringop-truncation warning GCC 8 added a -Wstringop-truncation warning: The -Wstringop-truncation warning added in GCC 8.0 via r254630 for bug 81117 is specifically intended to highlight likely unintended uses of the strncpy function that truncate the terminating NUL character from the source string. This new warning leads to compilation failures: CC migration/global_state.o qemu/migration/global_state.c: In function 'global_state_store_running': qemu/migration/global_state.c:45:5: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation] strncpy((char *)global_state.runstate, state, sizeof(global_state.runstate)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ make: *** [qemu/rules.mak:69: migration/global_state.o] Error 1 Adding an assert is enough to silence GCC. (alternatively, we could hard-code "running") Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> [PMD: More verbose commit message] Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-01-17 21:10:57 -05:00
Paolo Bonzini	b58deb344d	qemu/queue.h: leave head structs anonymous unless necessary Most list head structs need not be given a name. In most cases the name is given just in case one is going to use QTAILQ_LAST, QTAILQ_PREV or reverse iteration, but this does not apply to lists of other kinds, and even for QTAILQ in practice this is only rarely needed. In addition, we will soon reimplement those macros completely so that they do not need a name for the head struct. So clean up everything, not giving a name except in the rare case where it is necessary. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-01-11 15:46:55 +01:00
Daniel Henrique Barboza	fb06411210	qmp hmp: Make system_wakeup check wake-up support and run state The qmp/hmp command 'system_wakeup' is simply a direct call to 'qemu_system_wakeup_request' from vl.c. This function verifies if runstate is SUSPENDED and if the wake up reason is valid before proceeding. However, no error or warning is thrown if any of those pre-requirements isn't met. There is no way for the caller to differentiate between a successful wakeup or an error state caused when trying to wake up a guest that wasn't suspended. This means that system_wakeup is silently failing, which can be considered a bug. Adding error handling isn't an API break in this case - applications that didn't check the result will remain broken, the ones that check it will have a chance to deal with it. Adding to that, the commit before previous created a new QMP API called query-current-machine, with a new flag called wakeup-suspend-support, that indicates if the guest has the capability of waking up from suspended state. Although such guest will never reach SUSPENDED state and erroring it out in this scenario would suffice, it is more informative for the user to differentiate between a failure because the guest isn't suspended versus a failure because the guest does not have support for wake up at all. All this considered, this patch changes qmp_system_wakeup to check if the guest is capable of waking up from suspend, and if it is suspended. 
After this patch, this is the output of system_wakeup in a guest that does not have wake-up from suspend support (ppc64): (qemu) system_wakeup wake-up from suspend is not supported by this guest (qemu) And this is the output of system_wakeup in a x86 guest that has the support but isn't suspended: (qemu) system_wakeup Unable to wake up: guest is not in suspended state (qemu) Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20181205194701.17836-4-danielhb413@gmail.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2018-12-18 07:55:47 +01:00
Marc-André Lureau	335d10cd8e	qapi: add conditions to REPLICATION type/commands on the schema Add #if defined(CONFIG_REPLICATION) in generated code, and adjust the code accordingly. Made conditional: * xen-set-replication, query-xen-replication-status, xen-colo-do-checkpoint Before the patch, we first register the commands unconditionally in generated code (requires a stub), then conditionally unregister in qmp_unregister_commands_hack(). Afterwards, we register only when CONFIG_REPLICATION. The command fails exactly the same, with CommandNotFound. Improvement, because now query-qmp-schema is accurate, and we're one step closer to killing qmp_unregister_commands_hack(). * enum BlockdevDriver value "replication" in command blockdev-add * BlockdevOptions variant @replication and related structures. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20181213123724.4866-23-marcandre.lureau@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2018-12-14 06:52:48 +01:00
Marc-André Lureau	03fee66fde	vmstate: constify VMStateField Because they are supposed to remain const. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20181114132931.22624-1-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-11-27 15:35:15 +01:00
Paolo Bonzini	5aaac46793	migration: savevm: consult migration blockers There is really no difference between live migration and savevm, except that savevm does not require bdrv_invalidate_cache to be implemented by all disks. However, it is unlikely that savevm is used with anything except qcow2 disks, so the penalty is small and worth the improvement in catching bad usage of savevm. Only one place was taking care of savevm when adding a migration blocker, and it can be removed. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-11-27 15:06:14 +01:00
Zhang Chen	7e934f5b27	migration/migration.c: Add COLO dependency checks Current COLO mode(independent disk mode) need replication module work together. Suggested by Dr. David Alan Gilbert <dgilbert@redhat.com>. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Message-Id: <20181114190912.7242-1-chen.zhang@intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-11-21 11:38:12 +00:00
Zhang Chen	3ebb9c4f52	migration/colo.c: Fix compilation issue when disable replication This compilation issue will occur when user use --disable-replication to config Qemu. Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Message-Id: <20181101021226.6353-1-zhangckid@gmail.com> Tested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-11-21 11:20:14 +00:00
Jia Lina	3d63da16fb	migration: avoid segmentation fault when taking a snapshot of a VM which is being migrated During an active background migration, taking a snapshot will trigger a segmentation fault. As the snapshot clears the "current_migration" struct and updates "to_dst_file" before it finds out that there is a migration task, migration accesses the null pointer in the "current_migration" struct and qemu eventually crashes. Signed-off-by: Jia Lina <jialina01@baidu.com> Signed-off-by: Chai Wen <chaiwen@baidu.com> Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Message-Id: <20181026083620.10172-1-jialina01@baidu.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-10-31 09:38:59 +00:00
Vladimir Sementsov-Ogievskiy	9c98f145df	dirty-bitmaps: clean-up bitmaps loading and migration logic This patch aims to bring the following behavior: 1. We don't load bitmaps, when started in inactive mode. It's the case of incoming migration. In this case we wait for bitmaps migration through migration channel (if 'dirty-bitmaps' capability is enabled) or for invalidation (to load bitmaps from the image). 2. We don't remove persistent bitmaps on inactivation. Instead, we only remove bitmaps after storing. This is the only way to restore bitmaps, if we decided to resume source after [failed] migration with 'dirty-bitmaps' capability enabled (which means, that bitmaps were not stored). 3. We load bitmaps on open and any invalidation, it's ok for all cases: - normal open - migration target invalidation with dirty-bitmaps capability (bitmaps are migrating through migration channel, they are not stored, so they should have IN_USE flag set and will be skipped when loading. However, it would fail if bitmaps are read-only[1]) - migration target invalidation without dirty-bitmaps capability (normal load of the bitmaps, if migrated with shared storage) - source invalidation with dirty-bitmaps capability (skip because IN_USE) - source invalidation without dirty-bitmaps capability (bitmaps were dropped, reload them) [1]: to accurately handle this, migration of read-only bitmaps is explicitly forbidden in this patch. New mechanism for not storing bitmaps when migrating with dirty-bitmaps capability is introduced: migration field in BdrvDirtyBitmap. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: John Snow <jsnow@redhat.com>	2018-10-29 16:23:17 -04:00
John Snow	993edc0ce0	block/dirty-bitmaps: add user_locked status checker Instead of both frozen and qmp_locked checks, wrap it into one check. frozen implies the bitmap is split in two (for backup), and shouldn't be modified. qmp_locked implies it's being used by another operation, like being exported over NBD. In both cases it means we shouldn't allow the user to modify it in any meaningful way. Replace any usages where we check both frozen and qmp_locked with the new check. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20181002230218.13949-2-jsnow@redhat.com [w/edits Suggested-By: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>] Signed-off-by: John Snow <jsnow@redhat.com>	2018-10-29 16:23:16 -04:00
Peter Maydell	13399aad4f	Error reporting patches for 2018-10-22 Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2018-10-22' into staging Error reporting patches for 2018-10-22 # gpg: Signature made Mon 22 Oct 2018 13:20:23 BST # gpg: using RSA key 3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-error-2018-10-22: (40 commits) error: Drop bogus "use error_setg() instead" admonitions vpc: Fail open on bad header checksum block: Clean up bdrv_img_create()'s error reporting vl: Simplify call of parse_name() vl: Fix exit status for -drive format=help blockdev: Convert drive_new() to Error vl: Assert drive_new() does not fail in default_drive() fsdev: Clean up error reporting in qemu_fsdev_add() spice: Clean up error reporting in add_channel() tpm: Clean up error reporting in tpm_init_tpmdev() numa: Clean up error reporting in parse_numa() vnc: Clean up error reporting in vnc_init_func() ui: Convert vnc_display_init(), init_keyboard_layout() to Error ui/keymaps: Fix handling of erroneous include files vl: Clean up error reporting in device_init_func() vl: Clean up error reporting in parse_fw_cfg() vl: Clean up error reporting in mon_init_func() vl: Clean up error reporting in machine_set_property() vl: Clean up error reporting in chardev_init_func() qom: Clean up error reporting in user_creatable_add_opts_foreach() ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-10-23 17:20:23 +01:00
Markus Armbruster	4dd32b3dda	migration: Fix !replay_can_snapshot() error handling Calling error_report() in a function that takes an Error ** argument is suspicious. save_snapshot() and load_snapshot() do that, and then fail without setting an error. Wrong. The HMP commands survive this unscathed, since hmp_handle_error() does nothing when no error has been set. Callers main() (on behalf of -loadvm) and replay_vmstate_init() crash, but I'm not sure the error is possible there. Screwed up when commit `377b21ccea` (v2.12.0) added incorrect error handling right next to correct examples. Fix by calling error_setg() instead of error_report(). Fixes: `377b21ccea` Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20181017082702.5581-13-armbru@redhat.com>	2018-10-19 14:51:34 +02:00
Markus Armbruster	4b5766488f	error: Fix use of error_prepend() with &error_fatal, &error_abort From include/qapi/error.h: * Pass an existing error to the caller with the message modified: * error_propagate(errp, err); * error_prepend(errp, "Could not frobnicate '%s': ", name); Fei Li pointed out that doing error_propagate() first doesn't work well when @errp is &error_fatal or &error_abort: the error_prepend() is never reached. Since I doubt fixing the documentation will stop people from getting it wrong, introduce error_propagate_prepend(), in the hope that it lures people away from using its constituents in the wrong order. Update the instructions in error.h accordingly. Convert existing error_prepend() next to error_propagate to error_propagate_prepend(). If any of these get reached with &error_fatal or &error_abort, the error messages improve. I didn't check whether that's the case anywhere. Cc: Fei Li <fli@suse.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20181017082702.5581-2-armbru@redhat.com>	2018-10-19 14:51:34 +02:00
zhanghailiang	2518aec192	COLO: quick failover process by kick COLO thread The COLO thread may be sleeping at qemu_sem_wait(&s->colo_checkpoint_sem) when failover work begins. It's better to wake it up to speed up the process. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
zhanghailiang	7b3435309d	COLO: notify net filters about checkpoint/failover event Notify all net filters about the checkpoint and failover event. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
zhanghailiang	d1955d2219	COLO: flush host dirty ram from cache Don't need to flush all VM's ram from cache, only flush the dirty pages since last checkpoint Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	3f6df99d9d	savevm: split the process of different stages for loadvm/savevm There are several stages during loadvm/savevm process. In different stage, migration incoming processes different types of sections. We want to control these stages more accuracy, it will benefit COLO performance, we don't have to save type of QEMU_VM_SECTION_START sections everytime while do checkpoint, besides, we want to separate the process of saving/loading memory and devices state. So we add three new helper functions: qemu_load_device_state() and qemu_savevm_live_state() to achieve different process during migration. Besides, we make qemu_loadvm_state_main() and qemu_save_device_state() public, and simplify the codes of qemu_save_device_state() by calling the wrapper qemu_savevm_state_header(). Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	f56c0065b8	qapi: Add new command to query colo status Libvirt or other high level software can use this command query colo status. You can test this command like that: {'execute':'query-colo-status'} Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	41b6b77921	qapi/migration.json: Rename COLO unknown mode to none mode. Suggested by Markus Armbruster rename COLO unknown mode to none mode. Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
zhanghailiang	9ecff6d66e	qmp event: Add COLO_EXIT event to notify users while exited COLO If some errors happen during VM's COLO FT stage, it's important to notify the users of this event. Together with 'x-colo-lost-heartbeat', Users can intervene in COLO's failover work immediately. If users don't want to get involved in COLO's failover verdict, it is still necessary to notify users that we exited COLO mode. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	e6f4aa188c	COLO: Flush memory data from ram cache During the time of VM's running, PVM may dirty some pages, we will transfer PVM's dirty pages to SVM and store them into SVM's RAM cache at next checkpoint time. So, the content of SVM's RAM cache will always be same with PVM's memory after checkpoint. Instead of flushing all content of PVM's RAM cache into SVM's MEMORY, we do this in a more efficient way: Only flush any page that dirtied by PVM since last checkpoint. In this way, we can ensure SVM's memory same with PVM's. Besides, we must ensure flush RAM cache before load device state. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	7d9acafa2c	ram/COLO: Record the dirty pages that SVM received We record the address of the dirty pages that received, it will help flushing pages that cached into SVM. Here, it is a trick, we record dirty pages by re-using migration dirty bitmap. In the later patch, we will start the dirty log for SVM, just like migration, in this way, we can record both the dirty pages caused by PVM and SVM, we only flush those dirty pages from RAM cache while do checkpoint. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	13af18f222	COLO: Load dirty pages into SVM's RAM cache firstly We should not load PVM's state directly into SVM, because there maybe some errors happen when SVM is receiving data, which will break SVM. We need to ensure receiving all data before load the state into SVM. We use an extra memory to cache these data (PVM's ram). The ram cache in secondary side is initially the same as SVM/PVM's memory. And in the process of checkpoint, we cache the dirty pages of PVM into this ram cache firstly, so this ram cache always the same as PVM's memory at every checkpoint, then we flush this cached ram to SVM after we receive all PVM's state. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	aad555c229	COLO: Remove colo_state migration struct We need to know if migration is going into COLO state for incoming side before start normal migration. Instead by using the VMStateDescription to send colo_state from source side to destination side, we use MIG_CMD_ENABLE_COLO to indicate whether COLO is enabled or not. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	8e48ac9586	COLO: Add block replication into colo process Make sure master start block replication after slave's block replication started. Besides, we need to activate VM's blocks before goes into COLO state. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	131b2153fc	COLO: integrate colo compare with colo frame For COLO FT, both the PVM and SVM run at the same time, only sync the state while it needs. So here, let SVM runs while not doing checkpoint, change DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200*100. Besides, we forgot to release colo_checkpoint_semd and colo_delay_timer, fix them here. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Ilya Maximets	55d0fe8254	migration: Stop postcopy fault thread before notifying POSTCOPY_NOTIFY_INBOUND_END handlers will remove userfault fds from the postcopy_remote_fds array which could be still in use by the fault thread. Let's stop the thread before notification to avoid possible accessing wrong memory. Fixes: `46343570c0` ("vhost+postcopy: Wire up POSTCOPY_END notify") Cc: qemu-stable@nongnu.org Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Message-Id: <20181008160536.6332-2-i.maximets@samsung.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-10-11 19:58:26 +01:00
Peter Maydell	341ba0df4c	migration/ram.c: Avoid taking address of fields in packed MultiFDInit_t struct Taking the address of a field in a packed struct is a bad idea, because it might not be actually aligned enough for that pointer type (and thus cause a crash on dereference on some host architectures). Newer versions of clang warn about this: migration/ram.c:651:19: warning: taking address of packed member 'magic' of class or structure 'MultiFDInit_t' may result in an unaligned pointer value [-Waddress-of-packed-member] migration/ram.c:652:19: warning: taking address of packed member 'version' of class or structure 'MultiFDInit_t' may result in an unaligned pointer value [-Waddress-of-packed-member] migration/ram.c:737:19: warning: taking address of packed member 'magic' of class or structure 'MultiFDPacket_t' may result in an unaligned pointer value [-Waddress-of-packed-member] migration/ram.c:745:19: warning: taking address of packed member 'version' of class or structure 'MultiFDPacket_t' may result in an unaligned pointer value [-Waddress-of-packed-member] migration/ram.c:755:19: warning: taking address of packed member 'size' of class or structure 'MultiFDPacket_t' may result in an unaligned pointer value [-Waddress-of-packed-member] Avoid the bug by not using the "modify in place" byteswapping functions. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20180925161924.7832-1-peter.maydell@linaro.org> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-09-26 17:29:01 +01:00
Fei Li	05306935b1	migration: fix the compression code Add judgement in compress_threads_save_cleanup() to check whether the static CompressParam *comp_param has been allocated. If not, just return; or else segmentation fault will occur when using the NULL comp_param's parameters. One test case can reproduce this is: set the compression on and migrate to a wrong nonexistent host IP address. Our current code does not judge before handling comp_param[idx]'s quit and cond that whether they have been initialized. If not initialized, "qemu_mutex_lock_impl: Assertion `mutex->initialized' failed." will occur. Fix this by squashing the terminate_compression_threads() into compress_threads_save_cleanup() and employing the existing judgement condition. One test case can reproduce this error is: set the compression on and fail to fully setup the default eight compression thread in compress_threads_save_setup(). Signed-off-by: Fei Li <fli@suse.com> Message-Id: <20180925091440.18910-1-fli@suse.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-09-26 17:29:01 +01:00
Marc-André Lureau	0284a2a81c	migration: fix QEMUFile leak Spotted by ASAN while running: $ tests/migration-test -p /x86_64/migration/postcopy/recovery ================================================================= ==18034==ERROR: LeakSanitizer: detected memory leaks Direct leak of 33864 byte(s) in 1 object(s) allocated from: #0 0x7f3da7f31e50 in calloc (/lib64/libasan.so.5+0xeee50) #1 0x7f3da644441d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d) #2 0x55af9db15440 in qemu_fopen_channel_input /home/elmarco/src/qemu/migration/qemu-file-channel.c:183 #3 0x55af9db15413 in channel_get_output_return_path /home/elmarco/src/qemu/migration/qemu-file-channel.c:159 #4 0x55af9db0d4ac in qemu_file_get_return_path /home/elmarco/src/qemu/migration/qemu-file.c:78 #5 0x55af9dad5e4f in open_return_path_on_source /home/elmarco/src/qemu/migration/migration.c:2295 #6 0x55af9dadb3bf in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:3111 #7 0x55af9dae1bf3 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:91 #8 0x55af9daddeca in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:108 #9 0x55af9e13d3db in qio_task_complete /home/elmarco/src/qemu/io/task.c:158 #10 0x55af9e13ca03 in qio_task_thread_result /home/elmarco/src/qemu/io/task.c:89 #11 0x7f3da643b1ca in g_idle_dispatch gmain.c:5535 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180925092245.29565-1-marcandre.lureau@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-09-26 17:29:01 +01:00
Dr. David Alan Gilbert	096c83b721	migration: cleanup in error paths in loadvm There's a couple of error paths in qemu_loadvm_state which happen early on but after we've initialised the load state; that needs to be cleaned up otherwise we can hit asserts if the state gets reinitialised later. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20180914170430.54271-3-dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-09-26 17:29:01 +01:00
Dr. David Alan Gilbert	9cf4bb8730	migration/postcopy: Clear have_listen_thread Clear have_listen_thread when we exit the thread. The fallout from this was that various things thought there was an ongoing postcopy after the postcopy had finished. The case that failed was postcopy->savevm->loadvm. This corresponds to RH bug https://bugzilla.redhat.com/show_bug.cgi?id=1608765 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20180914170430.54271-2-dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-09-26 17:29:01 +01:00
Xiao Guangrong	32b054954f	migration: use save_page_use_compression in flush_compressed_data It avoids to touch compression locks if xbzrle and compression are both enabled Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20180906070101.27280-4-xiaoguangrong@tencent.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-09-26 12:27:43 +01:00
Xiao Guangrong	76e030004f	migration: show the statistics of compression Currently, it includes: pages: amount of pages compressed and transferred to the target VM busy: amount of count that no free thread to compress data busy-rate: rate of thread busy compressed-size: amount of bytes after compression compression-rate: rate of compressed size Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com> Message-Id: <20180906070101.27280-3-xiaoguangrong@tencent.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-09-26 12:27:27 +01:00
Xiao Guangrong	48df9d8002	migration: do not flush_compressed_data at the end of iteration flush_compressed_data() needs to wait all compression threads to finish their work, after that all threads are free until the migration feeds new request to them, reducing its call can improve the throughput and use CPU resource more effectively We do not need to flush all threads at the end of iteration, the data can be kept locally until the memory block is changed or memory migration starts over in that case we will meet a dirtied page which may still exists in compression threads's ring Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20180906070101.27280-2-xiaoguangrong@tencent.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-09-26 12:26:58 +01:00
Jose Ricardo Ziviani	827beacb47	Add a hint message to loadvm and exits on failure This patch adds a small hint for the failure case of the load snapshot process. It may be useful for users to remember that the VM configuration has changed between the save and load processes. (qemu) loadvm vm-20180903083641 Unknown savevm section or instance 'cpu_common' 4. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices Error -22 while loading VM state (qemu) device_add host-spapr-cpu-core,core-id=4 (qemu) loadvm vm-20180903083641 (qemu) c (qemu) info status VM status: running It also exits Qemu if the snapshot cannot be loaded before reaching the main loop (-loadvm in the command line). $ qemu-system-ppc64 ... -loadvm vm-20180903083641 qemu-system-ppc64: Unknown savevm section or instance 'cpu_common' 4. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices qemu-system-ppc64: Error -22 while loading VM state $ Signed-off-by: Jose Ricardo Ziviani <joserz@linux.ibm.com> Message-Id: <20180903162613.15877-1-joserz@linux.ibm.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-09-26 12:26:38 +01:00
Xiao Guangrong	e8f3735fa3	migration: handle the error condition properly ram_find_and_save_block() can return negative if any error happens, however, it is completely ignored in current code Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20180903092644.25812-5-xiaoguangrong@tencent.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-09-26 12:22:21 +01:00
Xiao Guangrong	be8b02edae	migration: fix calculating xbzrle_counters.cache_miss_rate As Peter pointed out: | - xbzrle_counters.cache_miss is done in save_xbzrle_page(), so it's | per-guest-page granularity | | - RAMState.iterations is done for each ram_find_and_save_block(), so | it's per-host-page granularity | | An example is that when we migrate a 2M huge page in the guest, we | will only increase the RAMState.iterations by 1 (since | ram_find_and_save_block() will be called once), but we might increase | xbzrle_counters.cache_miss for 2M/4K=512 times (we'll call | save_xbzrle_page() that many times) if all the pages got cache miss. | Then IMHO the cache miss rate will be 512/1=51200% (while it should | actually be just 100% cache miss). And he also suggested as xbzrle_counters.cache_miss_rate is the only user of rs->iterations we can adapt it to count target guest page numbers After that, rename 'iterations' to 'target_page_count' to better reflect its meaning Suggested-by: Peter Xu <peterx@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com> Message-Id: <20180903092644.25812-3-xiaoguangrong@tencent.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-09-26 12:21:56 +01:00


1149 Commits