slirp migration code uses QEMU vmstate so far, when building WITH_QEMU.
Introduce slirp_state_{load,save,version}() functions to move the
state saving handling to libslirp side.
So far, the bitstream compatibility should remain equal with current
QEMU, as this is effectively using the same code, with the same format
etc. When libslirp is made standalone, we will need some mechanism to
ensure bitstream compatibility regardless of the libslirp version
installed. See the FIXME note in the code.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190212162524.31504-3-marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Delay to close COLO for auto start VM after failover.
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190303145021.2962-4-chen.zhang@intel.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
In migration_incoming_state_destroy(void) will check the mis->to_src_file
to double close the mis->to_src_file when occur COLO failover.
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190303145021.2962-2-chen.zhang@intel.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This patch adds the free page optimization enable flag, and a function
to set this flag. When the free page optimization is enabled, not all
the pages are needed to be sent in the bulk stage.
Why using a new flag, instead of directly disabling ram_bulk_stage when
the optimization is running?
Thanks for Peter Xu's reminder that disabling ram_bulk_stage will affect
the use of compression. Please see save_page_use_compression. When
xbzrle and compression are used, if free page optimizaion causes the
ram_bulk_stage to be disabled, save_page_use_compression will return
false, which disables the use of compression. That is, if free page
optimization avoids the sending of half of the guest pages, the other
half of pages loses the benefits of compression in the meantime. Using a
new flag to let migration_bitmap_find_dirty skip the free pages in the
bulk stage will avoid the above issue.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Message-Id: <1544516693-5395-7-git-send-email-wei.w.wang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This patch adds a notifier chain for the memory precopy. This enables various
precopy optimizations to be invoked at specific places.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <1544516693-5395-6-git-send-email-wei.w.wang@intel.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This patch adds an API to clear bits corresponding to guest free pages
from the dirty bitmap. Spilt the free page block if it crosses the QEMU
RAMBlock boundary.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <1544516693-5395-5-git-send-email-wei.w.wang@intel.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The bitmap mutex is used to synchronize threads to update the dirty
bitmap and the migration_dirty_pages counter. For example, the free
page optimization clears bits of free pages from the bitmap in an
iothread context. This patch makes migration_bitmap_clear_dirty update
the bitmap and counter under the mutex.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <1544516693-5395-4-git-send-email-wei.w.wang@intel.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It will be used to store the uri parameters. We want this only for
tcp, so we don't set it for other uris. We need it to know what port
is migration running.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
dgilbert: Removed DummyStruct as suggested by Eric & Markus
--
Currently we don't check which capabilities set in the source QEMU.
We just expect that the target QEMU has the same enabled capabilities.
Add explicit validation for capabilities to make sure that the target VM
has them too. This is enabled for only new capabilities to keep compatibily.
Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Message-Id: <20190215174548.2630-6-yury-kotov@yandex-team.ru>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
dgilbert: Manual merge
If ignore-shared capability is set then skip shared RAMBlocks during the
RAM migration.
Also, move qemu_ram_foreach_migratable_block (and rename) to the
migration code, because it requires access to the migration capabilities.
Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Message-Id: <20190215174548.2630-4-yury-kotov@yandex-team.ru>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We want to use local migration to update QEMU for running guests.
In this case we don't need to migrate shared (file backed) RAM.
So, add a capability to ignore such blocks during live migration.
Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Message-Id: <20190215174548.2630-3-yury-kotov@yandex-team.ru>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Currently, qemu_ram_foreach_* calls RAMBlockIterFunc with many
block-specific arguments. But often iter func needs RAMBlock*.
This refactoring is needed for fast access to RAMBlock flags from
qemu_ram_foreach_block's callback. The only way to achieve this now
is to call qemu_ram_block_from_host (which also enumerates blocks).
So, this patch reduces complexity of
qemu_ram_foreach_block() -> cb() -> qemu_ram_block_from_host()
from O(n^2) to O(n).
Fix RAMBlockIterFunc definition and add some functions to read
RAMBlock* fields witch were passed.
Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Message-Id: <20190215174548.2630-2-yury-kotov@yandex-team.ru>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Configuring QEMU with:
../configure --cc=clang --enable-rdma
Leads to compilation error:
CC migration/rdma.o
CC migration/block.o
qemu/migration/rdma.c:3615:58: error: taking address of packed member 'rkey' of class or structure
'RDMARegisterResult' may result in an unaligned pointer value [-Werror,-Waddress-of-packed-member]
(uintptr_t)host_addr, NULL, ®_result->rkey,
^~~~~~~~~~~~~~~~
Fix it by using a temp local variable.
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Message-Id: <20190304184923.24215-1-marcel.apfelbaum@gmail.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Currently we cleanup the migration object as we exit main after the
main_loop finishes; however if there's a migration running things
get messy and we can end up with the migration thread still trying
to access freed structures.
We now take a ref to the object around the migration thread itself,
so the act of dropping the ref during exit doesn't cause us to lose
the state until the thread quits.
Cancelling the migration during migration also tries to get the thread
to quit.
We do this a bit earlier; so hopefully migration gets out of the way
before all the devices etc are freed.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190227164900.16378-1-dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
If the migration fails before the channel is open (e.g. a bad
address) we end up in the cleanup with rdma->channel==NULL.
Spotted by Coverity: CID 1398634
Fixes: fbbaacab27
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190214185351.5927-1-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
During a cancelled migration there's a race where the fd can
go into an error state before we get back around the migration loop
and migration_detect_error transitions from cancelling->failed.
Check for cancelled/cancelling and don't change the state.
Red Hat bug: https://bugzilla.redhat.com/show_bug.cgi?id=1608649
Fixes: b23c2ade25
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190219195928.12289-1-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Switch the announcements to using the new announce timer.
Move the code that does it to announce.c rather than savevm
because it really has nothing to do with the actual migration.
Migration starts the announce from bh's and so they're all
in the main thread/bql, and so there's never any racing with
the timers themselves.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Add migration parameters that control RARP/GARP announcement timeouts.
Based on earlier patches by myself and
Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The 'announce timer' will be used by migration, and explicit
requests for qemu to perform network announces.
Based on the work by Germano Veit Michel <germano@redhat.com>
and Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Use new qemu_iovec_init_buf() instead of
qemu_iovec_init_external( ... , 1), which simplifies the code.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20190218140926.333779-14-vsementsov@virtuozzo.com
Message-Id: <20190218140926.333779-14-vsementsov@virtuozzo.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
It introduces a new statistic, pages-per-second, as bandwidth or mbps is
not enough to measure the performance of posting pages out as we have
compression, xbzrle, which can significantly reduce the amount of the
data size, instead, pages-per-second is the one we want
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20190111063732.10484-2-xiaoguangrong@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
With typo's Eric spotted fixed
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181114133139.27346-1-marcandre.lureau@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Unregister the fd handler before we destroy the channel,
otherwise we've got a race where we might land in the
fd handler just as we're closing the device.
(The race is quite data dependent, you just have to have
the right set of devices for it to trigger).
Corresponds to RH bz: https://bugzilla.redhat.com/show_bug.cgi?id=1666601
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190122173111.29821-1-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
In the current code, if process_incoming_migration_co() fails we do
the same error handing: set the error state, close the source file,
do the cleanup for multifd, and then exit(EXIT_FAILURE). To make the
code clearer, add a "goto fail" to unify the error handling.
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190113140849.38339-6-lifei1214@126.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Call postcopy_ram_incoming_cleanup() to do the cleanup when
postcopy_ram_enable_notify fails. Besides, report the error
message when qemu_ram_foreach_migratable_block() fails.
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190113140849.38339-5-lifei1214@126.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
multifd_save_cleanup() takes an Error ** argument and returns an
error code even though it can't actually fail. Its callers
dutifully check for failure. Remove the useless argument and return
value, and simplify the callers.
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20190113140849.38339-4-lifei1214@126.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
In our current code, when multifd is used during migration, if there
is an error before the destination receives all new channels, the
source keeps running, however the destination does not exit but keeps
waiting until the source is killed deliberately.
Fix this by dumping the specific error and let users decide whether
to quit from the destination side when failing to receive packet via
some channel. And update the comment for multifd_recv_new_channel().
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190113140849.38339-3-lifei1214@126.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
In some cases it may be helpful to modify state before saving it for
migration, and then modify the state back after it has been saved. The
existing pre_save function provides half of this functionality. This
patch adds a post_save function to provide the second half.
Signed-off-by: Aaron Lindsay <aclindsa@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 20181211151945.29137-2-aaron@os.amperecomputing.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
GCC 8 introduced the -Wstringop-overflow, which detect buffer overflow
by string-modifying functions declared in <string.h>, such strncpy(),
used in global_state_store_running().
GCC indeed found an incorrect use of strlen(), because this array
is loaded by VMSTATE_BUFFER(runstate, GlobalState) then parsed
using qapi_enum_parse which does not get the buffer length.
Use strnlen() which returns sizeof(s->runstate) if the array is not
NUL-terminated, assert the size is within range, and enforce the array
to be NUL-terminated to avoid an overflow in qapi_enum_parse().
This fixes:
CC migration/global_state.o
qemu/migration/global_state.c: In function 'global_state_pre_save':
qemu/migration/global_state.c:109:15: error: 'strlen' argument 1 declared attribute 'nonstring' [-Werror=stringop-overflow=]
s->size = strlen((char *)s->runstate) + 1;
^~~~~~~~~~~~~~~~~~~~~~~~~~~
qemu/migration/global_state.c:24:13: note: argument 'runstate' declared here
uint8_t runstate[100] QEMU_NONSTRING;
^~~~~~~~
cc1: all warnings being treated as errors
make: *** [qemu/rules.mak:69: migration/global_state.o] Error 1
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
GCC 8 added a -Wstringop-truncation warning:
The -Wstringop-truncation warning added in GCC 8.0 via r254630 for
bug 81117 is specifically intended to highlight likely unintended
uses of the strncpy function that truncate the terminating NUL
character from the source string.
This new warning leads to compilation failures:
CC migration/global_state.o
qemu/migration/global_state.c: In function 'global_state_store_running':
qemu/migration/global_state.c:45:5: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
strncpy((char *)global_state.runstate, state, sizeof(global_state.runstate));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
make: *** [qemu/rules.mak:69: migration/global_state.o] Error 1
Adding an assert is enough to silence GCC.
(alternatively, we could hard-code "running")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[PMD: More verbose commit message]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Most list head structs need not be given a name. In most cases the
name is given just in case one is going to use QTAILQ_LAST, QTAILQ_PREV
or reverse iteration, but this does not apply to lists of other kinds,
and even for QTAILQ in practice this is only rarely needed. In addition,
we will soon reimplement those macros completely so that they do not
need a name for the head struct. So clean up everything, not giving a
name except in the rare case where it is necessary.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The qmp/hmp command 'system_wakeup' is simply a direct call to
'qemu_system_wakeup_request' from vl.c. This function verifies if
runstate is SUSPENDED and if the wake up reason is valid before
proceeding. However, no error or warning is thrown if any of those
pre-requirements isn't met. There is no way for the caller to
differentiate between a successful wakeup or an error state caused
when trying to wake up a guest that wasn't suspended.
This means that system_wakeup is silently failing, which can be
considered a bug. Adding error handling isn't an API break in this
case - applications that didn't check the result will remain broken,
the ones that check it will have a chance to deal with it.
Adding to that, the commit before previous created a new QMP API called
query-current-machine, with a new flag called wakeup-suspend-support,
that indicates if the guest has the capability of waking up from suspended
state. Although such guest will never reach SUSPENDED state and erroring
it out in this scenario would suffice, it is more informative for the user
to differentiate between a failure because the guest isn't suspended versus
a failure because the guest does not have support for wake up at all.
All this considered, this patch changes qmp_system_wakeup to check if
the guest is capable of waking up from suspend, and if it is suspended.
After this patch, this is the output of system_wakeup in a guest that
does not have wake-up from suspend support (ppc64):
(qemu) system_wakeup
wake-up from suspend is not supported by this guest
(qemu)
And this is the output of system_wakeup in a x86 guest that has the
support but isn't suspended:
(qemu) system_wakeup
Unable to wake up: guest is not in suspended state
(qemu)
Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20181205194701.17836-4-danielhb413@gmail.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Add #if defined(CONFIG_REPLICATION) in generated code, and adjust the
code accordingly.
Made conditional:
* xen-set-replication, query-xen-replication-status,
xen-colo-do-checkpoint
Before the patch, we first register the commands unconditionally in
generated code (requires a stub), then conditionally unregister in
qmp_unregister_commands_hack().
Afterwards, we register only when CONFIG_REPLICATION. The command
fails exactly the same, with CommandNotFound.
Improvement, because now query-qmp-schema is accurate, and we're one
step closer to killing qmp_unregister_commands_hack().
* enum BlockdevDriver value "replication" in command blockdev-add
* BlockdevOptions variant @replication
and related structures.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20181213123724.4866-23-marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Because they are supposed to remain const.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181114132931.22624-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is really no difference between live migration and savevm, except
that savevm does not require bdrv_invalidate_cache to be implemented
by all disks. However, it is unlikely that savevm is used with anything
except qcow2 disks, so the penalty is small and worth the improvement
in catching bad usage of savevm.
Only one place was taking care of savevm when adding a migration blocker,
and it can be removed.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Current COLO mode(independent disk mode) need replication module work
together. Suggested by Dr. David Alan Gilbert <dgilbert@redhat.com>.
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Message-Id: <20181114190912.7242-1-chen.zhang@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This compilation issue will occur when user use --disable-replication
to config Qemu.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Message-Id: <20181101021226.6353-1-zhangckid@gmail.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
During an active background migration, snapshot will trigger a
segmentfault. As snapshot clears the "current_migration" struct
and updates "to_dst_file" before it finds out that there is a
migration task, Migration accesses the null pointer in
"current_migration" struct and qemu crashes eventually.
Signed-off-by: Jia Lina <jialina01@baidu.com>
Signed-off-by: Chai Wen <chaiwen@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Message-Id: <20181026083620.10172-1-jialina01@baidu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This patch aims to bring the following behavior:
1. We don't load bitmaps, when started in inactive mode. It's the case
of incoming migration. In this case we wait for bitmaps migration
through migration channel (if 'dirty-bitmaps' capability is enabled) or
for invalidation (to load bitmaps from the image).
2. We don't remove persistent bitmaps on inactivation. Instead, we only
remove bitmaps after storing. This is the only way to restore bitmaps,
if we decided to resume source after [failed] migration with
'dirty-bitmaps' capability enabled (which means, that bitmaps were not
stored).
3. We load bitmaps on open and any invalidation, it's ok for all cases:
- normal open
- migration target invalidation with dirty-bitmaps capability
(bitmaps are migrating through migration channel, the are not
stored, so they should have IN_USE flag set and will be skipped
when loading. However, it would fail if bitmaps are read-only[1])
- migration target invalidation without dirty-bitmaps capability
(normal load of the bitmaps, if migrated with shared storage)
- source invalidation with dirty-bitmaps capability
(skip because IN_USE)
- source invalidation without dirty-bitmaps capability
(bitmaps were dropped, reload them)
[1]: to accurately handle this, migration of read-only bitmaps is
explicitly forbidden in this patch.
New mechanism for not storing bitmaps when migrate with dirty-bitmaps
capability is introduced: migration filed in BdrvDirtyBitmap.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Instead of both frozen and qmp_locked checks, wrap it into one check.
frozen implies the bitmap is split in two (for backup), and shouldn't
be modified. qmp_locked implies it's being used by another operation,
like being exported over NBD. In both cases it means we shouldn't allow
the user to modify it in any meaningful way.
Replace any usages where we check both frozen and qmp_locked with the
new check.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20181002230218.13949-2-jsnow@redhat.com
[w/edits Suggested-By: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>]
Signed-off-by: John Snow <jsnow@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbzcCHAAoJEDhwtADrkYZT3YsP/2qE4HNY/htj3IP6vNJuSaqw
CLPRTz7zWmUBTE6FqSkvLsq3X2BMFFLeaIPA9EFcbyn2km6qPqBYgg9ElXXvPZBm
6hDeRIoC8FdRD0Apozd5MGC94/lE47PheDRV8V+4KrGLaaMXEPxMZ0wP4AfdS5pS
6Pt2xuF7nPu1+OWVxMk0fXadGjGLEuOQQmTh3B21J5RaynQ3gtd6h7XFC/LJyOGG
LC/6GyPc0h7KU83VnvrRjH/EOpu1wENgrsvWsS0sem8op35Z+i9jU5BfCp4qFkDy
gCHHUEyEeyexS+W+Tj87eBtK2gfrqQx9ovo8CIsWcUwpKbdD6AMK4FKGsDNMNHab
Kg5u/M+O8nHCB7DuursF+3mqEbZHb05cfKe6JEtiq49EuORMV5hp4Ap966noSwTw
UEU0NJNA1p8EdmXVudyyyYR7wpoSSmZpoenA+bJ3nthK8K0KcU4RUGk6ZEbxfJy+
7ENl+3R2IxmxzgXv/x0tz0uFisaVW1rltTXtMte+ElQsO0qy74iHdfR7JHsmLxj9
CO/ABMVoYsWq2OJv8pWLrdKpT4v3HQLJdHhknyu0ZcJGDyICqX29ULLEhPrNEZvW
rxVxAkiemlaqxlUjbrM46CDQQm+w03OCnk7aCYcV4oK+u5+o3mCag705gMPErapZ
6uOE3fAjiWw43sA31mek
=kPZX
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2018-10-22' into staging
Error reporting patches for 2018-10-22
# gpg: Signature made Mon 22 Oct 2018 13:20:23 BST
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-error-2018-10-22: (40 commits)
error: Drop bogus "use error_setg() instead" admonitions
vpc: Fail open on bad header checksum
block: Clean up bdrv_img_create()'s error reporting
vl: Simplify call of parse_name()
vl: Fix exit status for -drive format=help
blockdev: Convert drive_new() to Error
vl: Assert drive_new() does not fail in default_drive()
fsdev: Clean up error reporting in qemu_fsdev_add()
spice: Clean up error reporting in add_channel()
tpm: Clean up error reporting in tpm_init_tpmdev()
numa: Clean up error reporting in parse_numa()
vnc: Clean up error reporting in vnc_init_func()
ui: Convert vnc_display_init(), init_keyboard_layout() to Error
ui/keymaps: Fix handling of erroneous include files
vl: Clean up error reporting in device_init_func()
vl: Clean up error reporting in parse_fw_cfg()
vl: Clean up error reporting in mon_init_func()
vl: Clean up error reporting in machine_set_property()
vl: Clean up error reporting in chardev_init_func()
qom: Clean up error reporting in user_creatable_add_opts_foreach()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Calling error_report() in a function that takes an Error ** argument
is suspicious. save_snapshot() and load_snapshot() do that, and then
fail without setting an error. Wrong. The HMP commands survive this
unscathed, since hmp_handle_error() does nothing when no error has
been set. Callers main() (on behalf of -loadvm) and
replay_vmstate_init() crash, but I'm not sure the error is possible
there.
Screwed up when commit 377b21ccea (v2.12.0) added incorrect error
handling right next to correct examples. Fix by calling error_setg()
instead of error_report().
Fixes: 377b21ccea
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181017082702.5581-13-armbru@redhat.com>
From include/qapi/error.h:
* Pass an existing error to the caller with the message modified:
* error_propagate(errp, err);
* error_prepend(errp, "Could not frobnicate '%s': ", name);
Fei Li pointed out that doing error_propagate() first doesn't work
well when @errp is &error_fatal or &error_abort: the error_prepend()
is never reached.
Since I doubt fixing the documentation will stop people from getting
it wrong, introduce error_propagate_prepend(), in the hope that it
lures people away from using its constituents in the wrong order.
Update the instructions in error.h accordingly.
Convert existing error_prepend() next to error_propagate to
error_propagate_prepend(). If any of these get reached with
&error_fatal or &error_abort, the error messages improve. I didn't
check whether that's the case anywhere.
Cc: Fei Li <fli@suse.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181017082702.5581-2-armbru@redhat.com>
COLO thread may sleep at qemu_sem_wait(&s->colo_checkpoint_sem),
while failover works begin, It's better to wakeup it to quick
the process.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Notify all net filters about the checkpoint and failover event.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Don't need to flush all VM's ram from cache, only
flush the dirty pages since last checkpoint
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
There are several stages during loadvm/savevm process. In different stage,
migration incoming processes different types of sections.
We want to control these stages more accuracy, it will benefit COLO
performance, we don't have to save type of QEMU_VM_SECTION_START
sections everytime while do checkpoint, besides, we want to separate
the process of saving/loading memory and devices state.
So we add three new helper functions: qemu_load_device_state() and
qemu_savevm_live_state() to achieve different process during migration.
Besides, we make qemu_loadvm_state_main() and qemu_save_device_state()
public, and simplify the codes of qemu_save_device_state() by calling the
wrapper qemu_savevm_state_header().
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Libvirt or other high level software can use this command query colo status.
You can test this command like that:
{'execute':'query-colo-status'}
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Suggested by Markus Armbruster rename COLO unknown mode to none mode.
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
If some errors happen during VM's COLO FT stage, it's important to
notify the users of this event. Together with 'x-colo-lost-heartbeat',
Users can intervene in COLO's failover work immediately.
If users don't want to get involved in COLO's failover verdict,
it is still necessary to notify users that we exited COLO mode.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
During the time of VM's running, PVM may dirty some pages, we will transfer
PVM's dirty pages to SVM and store them into SVM's RAM cache at next checkpoint
time. So, the content of SVM's RAM cache will always be same with PVM's memory
after checkpoint.
Instead of flushing all content of PVM's RAM cache into SVM's MEMORY,
we do this in a more efficient way:
Only flush any page that dirtied by PVM since last checkpoint.
In this way, we can ensure SVM's memory same with PVM's.
Besides, we must ensure flush RAM cache before load device state.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
We record the address of the dirty pages that received,
it will help flushing pages that cached into SVM.
Here, it is a trick, we record dirty pages by re-using migration
dirty bitmap. In the later patch, we will start the dirty log
for SVM, just like migration, in this way, we can record both
the dirty pages caused by PVM and SVM, we only flush those dirty
pages from RAM cache while do checkpoint.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
We should not load PVM's state directly into SVM, because there maybe some
errors happen when SVM is receving data, which will break SVM.
We need to ensure receving all data before load the state into SVM. We use
an extra memory to cache these data (PVM's ram). The ram cache in secondary side
is initially the same as SVM/PVM's memory. And in the process of checkpoint,
we cache the dirty pages of PVM into this ram cache firstly, so this ram cache
always the same as PVM's memory at every checkpoint, then we flush this cached ram
to SVM after we receive all PVM's state.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
We need to know if migration is going into COLO state for
incoming side before start normal migration.
Instead by using the VMStateDescription to send colo_state
from source side to destination side, we use MIG_CMD_ENABLE_COLO
to indicate whether COLO is enabled or not.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Make sure master start block replication after slave's block
replication started.
Besides, we need to activate VM's blocks before goes into
COLO state.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
For COLO FT, both the PVM and SVM run at the same time,
only sync the state while it needs.
So here, let SVM runs while not doing checkpoint, change
DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200*100.
Besides, we forgot to release colo_checkpoint_semd and
colo_delay_timer, fix them here.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
POSTCOPY_NOTIFY_INBOUND_END handlers will remove userfault fds
from the postcopy_remote_fds array which could be still in
use by the fault thread. Let's stop the thread before
notification to avoid possible accessing wrong memory.
Fixes: 46343570c0 ("vhost+postcopy: Wire up POSTCOPY_END notify")
Cc: qemu-stable@nongnu.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Message-Id: <20181008160536.6332-2-i.maximets@samsung.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Taking the address of a field in a packed struct is a bad idea, because
it might not be actually aligned enough for that pointer type (and
thus cause a crash on dereference on some host architectures). Newer
versions of clang warn about this:
migration/ram.c:651:19: warning: taking address of packed member 'magic' of class or structure 'MultiFDInit_t' may result in an unaligned pointer value [-Waddress-of-packed-member]
migration/ram.c:652:19: warning: taking address of packed member 'version' of class or structure 'MultiFDInit_t' may result in an unaligned pointer value [-Waddress-of-packed-member]
migration/ram.c:737:19: warning: taking address of packed member 'magic' of class or structure 'MultiFDPacket_t' may result in an unaligned pointer value [-Waddress-of-packed-member]
migration/ram.c:745:19: warning: taking address of packed member 'version' of class or structure 'MultiFDPacket_t' may result in an unaligned pointer value [-Waddress-of-packed-member]
migration/ram.c:755:19: warning: taking address of packed member 'size' of class or structure 'MultiFDPacket_t' may result in an unaligned pointer value [-Waddress-of-packed-member]
Avoid the bug by not using the "modify in place" byteswapping
functions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180925161924.7832-1-peter.maydell@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Add judgement in compress_threads_save_cleanup() to check whether the
static CompressParam *comp_param has been allocated. If not, just
return; or else segmentation fault will occur when using the NULL
comp_param's parameters. One test case can reproduce this is: set
the compression on and migrate to a wrong nonexistent host IP address.
Our current code does not judge before handling comp_param[idx]'s quit
and cond that whether they have been initialized. If not initialized,
"qemu_mutex_lock_impl: Assertion `mutex->initialized' failed." will
occur. Fix this by squashing the terminate_compression_threads() into
compress_threads_save_cleanup() and employing the existing judgement
condition. One test case can reproduce this error is: set the
compression on and fail to fully setup the default eight compression
thread in compress_threads_save_setup().
Signed-off-by: Fei Li <fli@suse.com>
Message-Id: <20180925091440.18910-1-fli@suse.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Spotted by ASAN while running:
$ tests/migration-test -p /x86_64/migration/postcopy/recovery
=================================================================
==18034==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 33864 byte(s) in 1 object(s) allocated from:
#0 0x7f3da7f31e50 in calloc (/lib64/libasan.so.5+0xeee50)
#1 0x7f3da644441d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d)
#2 0x55af9db15440 in qemu_fopen_channel_input /home/elmarco/src/qemu/migration/qemu-file-channel.c:183
#3 0x55af9db15413 in channel_get_output_return_path /home/elmarco/src/qemu/migration/qemu-file-channel.c:159
#4 0x55af9db0d4ac in qemu_file_get_return_path /home/elmarco/src/qemu/migration/qemu-file.c:78
#5 0x55af9dad5e4f in open_return_path_on_source /home/elmarco/src/qemu/migration/migration.c:2295
#6 0x55af9dadb3bf in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:3111
#7 0x55af9dae1bf3 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:91
#8 0x55af9daddeca in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:108
#9 0x55af9e13d3db in qio_task_complete /home/elmarco/src/qemu/io/task.c:158
#10 0x55af9e13ca03 in qio_task_thread_result /home/elmarco/src/qemu/io/task.c:89
#11 0x7f3da643b1ca in g_idle_dispatch gmain.c:5535
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180925092245.29565-1-marcandre.lureau@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
There's a couple of error paths in qemu_loadvm_state
which happen early on but after we've initialised the
load state; that needs to be cleaned up otherwise
we can hit asserts if the state gets reinitialised later.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180914170430.54271-3-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Clear have_listen_thread when we exit the thread.
The fallout from this was that various things thought there was
an ongoing postcopy after the postcopy had finished.
The case that failed was postcopy->savevm->loadvm.
This corresponds to RH bug https://bugzilla.redhat.com/show_bug.cgi?id=1608765
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180914170430.54271-2-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It avoids to touch compression locks if xbzrle and compression
are both enabled
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20180906070101.27280-4-xiaoguangrong@tencent.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Currently, it includes:
pages: amount of pages compressed and transferred to the target VM
busy: amount of count that no free thread to compress data
busy-rate: rate of thread busy
compressed-size: amount of bytes after compression
compression-rate: rate of compressed size
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180906070101.27280-3-xiaoguangrong@tencent.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
flush_compressed_data() needs to wait all compression threads to
finish their work, after that all threads are free until the
migration feeds new request to them, reducing its call can improve
the throughput and use CPU resource more effectively
We do not need to flush all threads at the end of iteration, the
data can be kept locally until the memory block is changed or
memory migration starts over in that case we will meet a dirtied
page which may still exists in compression threads's ring
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20180906070101.27280-2-xiaoguangrong@tencent.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This patch adds a small hint for the failure case of the load snapshot
process. It may be useful for users to remember that the VM
configuration has changed between the save and load processes.
(qemu) loadvm vm-20180903083641
Unknown savevm section or instance 'cpu_common' 4.
Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
Error -22 while loading VM state
(qemu) device_add host-spapr-cpu-core,core-id=4
(qemu) loadvm vm-20180903083641
(qemu) c
(qemu) info status
VM status: running
It also exits Qemu if the snapshot cannot be loaded before reaching the
main loop (-loadvm in the command line).
$ qemu-system-ppc64 ... -loadvm vm-20180903083641
qemu-system-ppc64: Unknown savevm section or instance 'cpu_common' 4.
Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
qemu-system-ppc64: Error -22 while loading VM state
$
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.ibm.com>
Message-Id: <20180903162613.15877-1-joserz@linux.ibm.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
ram_find_and_save_block() can return negative if any error hanppens,
however, it is completely ignored in current code
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20180903092644.25812-5-xiaoguangrong@tencent.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
As Peter pointed out:
| - xbzrle_counters.cache_miss is done in save_xbzrle_page(), so it's
| per-guest-page granularity
|
| - RAMState.iterations is done for each ram_find_and_save_block(), so
| it's per-host-page granularity
|
| An example is that when we migrate a 2M huge page in the guest, we
| will only increase the RAMState.iterations by 1 (since
| ram_find_and_save_block() will be called once), but we might increase
| xbzrle_counters.cache_miss for 2M/4K=512 times (we'll call
| save_xbzrle_page() that many times) if all the pages got cache miss.
| Then IMHO the cache miss rate will be 512/1=51200% (while it should
| actually be just 100% cache miss).
And he also suggested as xbzrle_counters.cache_miss_rate is the only
user of rs->iterations we can adapt it to count target guest page
numbers
After that, rename 'iterations' to 'target_page_count' to better reflect
its meaning
Suggested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180903092644.25812-3-xiaoguangrong@tencent.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Clang correctly errors out moaning that rdma_return_path
is used uninitialised in the earlier error paths.
Make it NULL so that the error path ignores it.
Fixes: 55cc1b5937
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20180830173657.22939-1-dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The generated qapi_event_send_FOO() take an Error ** argument. They
can't actually fail, because all they do with the argument is passing it
to functions that can't fail: the QObject output visitor, and the
@qmp_emit callback, which is either monitor_qapi_event_queue() or
event_test_emit().
Drop the argument, and pass &error_abort to the QObject output visitor
and @qmp_emit instead.
Suggested-by: Eric Blake <eblake@redhat.com>
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180815133747.25032-4-peterx@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Commit message rewritten, update to qapi-code-gen.txt corrected]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
While the qemu_balloon_inhibit() interface appears rather general purpose,
postcopy uses it in a last-caller-wins approach with no guarantee of balanced
inhibits and de-inhibits. Wrap postcopy's usage of the inhibitor to give it
one vote overall, using the same last-caller-wins approach as previously
implemented at the balloon level.
Fixes: 01ccbec7bd ("balloon: Allow multiple inhibit users")
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Try to hold src_page_req_mutex only if the queue is not
empty
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Detecting zero page is not a light work, moving it to the thread to
speed the main thread up, btw, handling ram_release_pages() for the
zero page is moved to the thread as well
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
It is not used and cleans the code up a little
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
It will be used by the compression threads
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The compressed page is not normal page
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Instead of putting the main thread to sleep state to wait for
free compression thread, we can directly post it out as normal
page that reduces the latency and uses CPUs more efficiently
A parameter, compress-wait-thread, is introduced, it can be
enabled if the user really wants the old behavior
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The destination qemu only poll the comp_channel->fd in
qemu_rdma_wait_comp_channel. But when source qemu disconnnect
the rdma connection, the destination qemu should be notified.
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Because RDMA QIOChannel not implement shutdown function,
If the to_dst_file was set error, the return path thread
will wait forever. and the migration thread will wait
return path thread exit.
the backtrace of return path thread is:
(gdb) bt
#0 0x00007f372a76bb0f in ppoll () from /lib64/libc.so.6
#1 0x000000000071dc24 in qemu_poll_ns (fds=0x7ef7091d0580, nfds=2, timeout=100000000)
at qemu-timer.c:325
#2 0x00000000006b2fba in qemu_rdma_wait_comp_channel (rdma=0xd424000)
at migration/rdma.c:1501
#3 0x00000000006b3191 in qemu_rdma_block_for_wrid (rdma=0xd424000, wrid_requested=4000,
byte_len=0x7ef7091d0640) at migration/rdma.c:1580
#4 0x00000000006b3638 in qemu_rdma_exchange_get_response (rdma=0xd424000,
head=0x7ef7091d0720, expecting=3, idx=0) at migration/rdma.c:1726
#5 0x00000000006b3ad6 in qemu_rdma_exchange_recv (rdma=0xd424000, head=0x7ef7091d0720,
expecting=3) at migration/rdma.c:1903
#6 0x00000000006b5d03 in qemu_rdma_get_buffer (opaque=0x6a57dc0, buf=0x5c80030 "", pos=8,
size=32768) at migration/rdma.c:2714
#7 0x00000000006a9635 in qemu_fill_buffer (f=0x5c80000) at migration/qemu-file.c:232
#8 0x00000000006a9ecd in qemu_peek_byte (f=0x5c80000, offset=0)
at migration/qemu-file.c:502
#9 0x00000000006a9f1f in qemu_get_byte (f=0x5c80000) at migration/qemu-file.c:515
#10 0x00000000006aa162 in qemu_get_be16 (f=0x5c80000) at migration/qemu-file.c:591
#11 0x00000000006a46d3 in source_return_path_thread (
opaque=0xd826a0 <current_migration.37100>) at migration/migration.c:1331
#12 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0
#13 0x00007f372a77635d in clone () from /lib64/libc.so.6
the backtrace of migration thread is:
(gdb) bt
#0 0x00007f372aa4af57 in pthread_join () from /lib64/libpthread.so.0
#1 0x00000000007d5711 in qemu_thread_join (thread=0xd826f8 <current_migration.37100+88>)
at util/qemu-thread-posix.c:504
#2 0x00000000006a4bc5 in await_return_path_close_on_source (
ms=0xd826a0 <current_migration.37100>) at migration/migration.c:1460
#3 0x00000000006a53e4 in migration_completion (s=0xd826a0 <current_migration.37100>,
current_active_state=4, old_vm_running=0x7ef7089cf976, start_time=0x7ef7089cf980)
at migration/migration.c:1695
#4 0x00000000006a5c54 in migration_thread (opaque=0xd826a0 <current_migration.37100>)
at migration/migration.c:1837
#5 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f372a77635d in clone () from /lib64/libc.so.6
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
If the peer qemu is crashed, the qemu_rdma_wait_comp_channel function
maybe loop forever. so we should also poll the cm event fd, and when
receive RDMA_CM_EVENT_DISCONNECTED and RDMA_CM_EVENT_DEVICE_REMOVAL,
we consider some error happened.
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Signed-off-by: Gal Shachaf <galsha@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
when qio_channel_read return QIO_CHANNEL_ERR_BLOCK, the source qemu crash.
The backtrace is:
(gdb) bt
#0 0x00007fb20aba91d7 in raise () from /lib64/libc.so.6
#1 0x00007fb20abaa8c8 in abort () from /lib64/libc.so.6
#2 0x00007fb20aba2146 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007fb20aba21f2 in __assert_fail () from /lib64/libc.so.6
#4 0x00000000008dba2d in qio_channel_yield (ioc=0x22f9e20, condition=G_IO_IN) at io/channel.c:460
#5 0x00000000007a870b in channel_get_buffer (opaque=0x22f9e20, buf=0x3d54038 "", pos=0, size=32768)
at migration/qemu-file-channel.c:83
#6 0x00000000007a70f6 in qemu_fill_buffer (f=0x3d54000) at migration/qemu-file.c:299
#7 0x00000000007a79d0 in qemu_peek_byte (f=0x3d54000, offset=0) at migration/qemu-file.c:562
#8 0x00000000007a7a22 in qemu_get_byte (f=0x3d54000) at migration/qemu-file.c:575
#9 0x00000000007a7c46 in qemu_get_be16 (f=0x3d54000) at migration/qemu-file.c:647
#10 0x0000000000796db7 in source_return_path_thread (opaque=0x2242280) at migration/migration.c:1794
#11 0x00000000009428fa in qemu_thread_start (args=0x3e58420) at util/qemu-thread-posix.c:504
#12 0x00007fb20af3ddc5 in start_thread () from /lib64/libpthread.so.0
#13 0x00007fb20ac6b74d in clone () from /lib64/libc.so.6
This patch fixed by invoke qio_channel_yield only when qemu_in_coroutine().
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
if qio_channel_rdma_readv return QIO_CHANNEL_ERR_BLOCK, the destination qemu
crash.
The backtrace is:
(gdb) bt
#0 0x0000000000000000 in ?? ()
#1 0x00000000008db50e in qio_channel_set_aio_fd_handler (ioc=0x38111e0, ctx=0x3726080,
io_read=0x8db841 <qio_channel_restart_read>, io_write=0x0, opaque=0x38111e0) at io/channel.c:
#2 0x00000000008db952 in qio_channel_set_aio_fd_handlers (ioc=0x38111e0) at io/channel.c:438
#3 0x00000000008dbab4 in qio_channel_yield (ioc=0x38111e0, condition=G_IO_IN) at io/channel.c:47
#4 0x00000000007a870b in channel_get_buffer (opaque=0x38111e0, buf=0x440c038 "", pos=0, size=327
at migration/qemu-file-channel.c:83
#5 0x00000000007a70f6 in qemu_fill_buffer (f=0x440c000) at migration/qemu-file.c:299
#6 0x00000000007a79d0 in qemu_peek_byte (f=0x440c000, offset=0) at migration/qemu-file.c:562
#7 0x00000000007a7a22 in qemu_get_byte (f=0x440c000) at migration/qemu-file.c:575
#8 0x00000000007a7c78 in qemu_get_be32 (f=0x440c000) at migration/qemu-file.c:655
#9 0x00000000007a0508 in qemu_loadvm_state (f=0x440c000) at migration/savevm.c:2126
#10 0x0000000000794141 in process_incoming_migration_co (opaque=0x0) at migration/migration.c:366
#11 0x000000000095c598 in coroutine_trampoline (i0=84033984, i1=0) at util/coroutine-ucontext.c:1
#12 0x00007f9c0db56d40 in ?? () from /lib64/libc.so.6
#13 0x00007f96fe858760 in ?? ()
#14 0x0000000000000000 in ?? ()
RDMA QIOChannel not implement io_set_aio_fd_handler. so
qio_channel_set_aio_fd_handler will access NULL pointer.
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
During incoming postcopy, the destination qemu will invoke
qemu_rdma_wait_comp_channel in a seprate thread. So does not use rdma
yield, and poll the completion channel fd instead.
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This patch implements bi-directional RDMA QIOChannel. Because different
threads may access RDMAQIOChannel currently, this patch use RCU to protect it.
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
If start a RDMA migration with postcopy enabled, the source qemu
establish a dedicated connection for return path.
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
RDMA WRITE operations are performed with no notification to the destination
qemu, then the destination qemu can not wakeup. This patch disable RDMA WRITE
after postcopy started.
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Currently, the default maximum CPU throttle for migration is
99(CPU_THROTTLE_PCT_MAX). This is too big and can make a remarkable
performance effect for the guest. We see a lot of packets latency
exceed 500ms when the CPU_THROTTLE_PCT_MAX reached. This patch set
adds a new max-cpu-throttle parameter to limit the CPU throttle.
Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Currently the vmstate subsection handling code treats a subsection
with no 'needed' function pointer as if it were the subsection
list terminator, so the subsection is never transferred and nor
is any subsection following it in the list.
Handle NULL 'needed' function pointers in subsections in the same
way that we do for top level VMStateDescription structures:
treat the subsection as always being needed.
This doesn't change behaviour for the current set of devices
in the tree, because all subsections declare a 'needed' function.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Because we need to make sure the pmem kind memory data is synced
after migration, we choose to call pmem_persist() when the migration
finish. This will make sure the data of pmem is safe and will not
lose if power is off.
Signed-off-by: Junyan He <junyan.he@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The nvdimm kind memory does not support post copy now.
We disable post copy if we have nvdimm memory and print some
log hint to user.
Signed-off-by: Junyan He <junyan.he@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
migrate_fd_connect duplicate initialize expected_downtime and cleanup_bh.
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Message-Id: <1532434585-14732-2-git-send-email-lidongchen@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Postcopy recovery won't work well with release-ram capability since
release-ram will drop the page buffer as long as the page is put into
the send buffer. So if there is a network failure happened, any page
buffers that have not yet reached the destination VM but have already
been sent from the source VM will be lost forever. Let's refuse the
client from resuming such a postcopy migration. Luckily release-ram was
designed to only be used when src and destination VMs are on the same
host, so it should be fine.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180723123305.24792-3-peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We shouldn't update the received bitmap if we're the source VM. This
fixes a breakage when release-ram is enabled on postcopy.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180723123305.24792-2-peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We've been getting the warning:
migration_iteration_finish: Unknown ending state 2
on a cancel.
I think that's originally due to 39b9e17905c; although
I've only seen the warning, I think that in some cases
that we could find the VM stays paused after a cancel where
it should restart.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180719092257.12703-1-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
I would guess it won't happen normally, but this should ease Coverity.
>>> CID 1394385: Integer handling issues (OVERFLOW_BEFORE_WIDEN)
>>> Potentially overflowing expression "pages->used * 8192U" with type "unsigned int" (32 bits, unsigned) is evaluated using 32-bit arithmetic, and then used in a context that expects an expression of type "uint64_t" (64 bits, unsigned).
854 transferred = pages->used * TARGET_PAGE_SIZE + p->packet_len;
Fixes: CID 1394385
CC: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180720034713.11711-1-peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It was accidently added before MIG_CMD_PACKAGED so it might break
command compatibility when we run postcopy migration between old/new
QEMUs. Fix that up quickly before the QEMU 3.0 release.
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180710094424.30754-1-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
These two states will be missing when doing "query-migrate" on
destination VM. Add these states so that we can get the query results
as expected.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180710091902.28780-5-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The calculation on size of received bitmap is incorrect for postcopy
recovery. Here we wanted to let the size to cover all the valid bits in
the bitmap, we should use DIV_ROUND_UP() instead of a division.
For example, a RAMBlock with size=4K (which contains only one single 4K
page) will have nbits=1, then nbits/8=0, then the real bitmap won't be
sent to source at all.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180710091902.28780-4-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We were checking against -EIO, assuming that it will cover all IO
failures. But actually it is not. One example is that in
qemu_loadvm_section_start_full() we can have tons of places that will
return -EINVAL even if the error is caused by IO failures on the
network.
Let's loosen the recovery check logic here to cover all the error cases
happened by removing the explicit check against -EIO. After all we
won't lose anything here if any other failure happened.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180710091902.28780-3-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Firstly, renaming the old matching_page_sizes variable to
matches_target_page_size, which suites more to what it did (it only
checks against target page size rather than multiple page sizes).
Meanwhile, simplify the check logic a bit, and enhance the comments.
Should have no functional change.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180710091902.28780-2-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This is the 2nd patch to unbreak postcopy recovery.
Let's unify the migration_incoming_process() call at a single place
rather than calling it in connection setup codes. This fixes a problem
that we will go into incoming migration procedure even if we are trying
to recovery from a paused postcopy migration.
Fixes: 36c2f8be2c ("migration: Delay start of migration main routines")
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180627132246.5576-5-peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The whole postcopy recovery logic was accidentally broken. We need to
fix it in two steps.
This is the first step that we should do the recovery when needed. It
was bypassed before after commit 36c2f8be2c.
Introduce postcopy_try_recovery() helper for the postcopy recovery
logic. Call it both in migration_fd_process_incoming() and
migration_ioc_process_incoming().
Fixes: 36c2f8be2c ("migration: Delay start of migration main routines")
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180627132246.5576-4-peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Move the call to migration_incoming_process() out of multifd code. It's
a bit strange that we can migration generic calls in multifd code.
Instead, let multifd_recv_new_channel() return a boolean showing whether
it's ready to continue the incoming migration.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180627132246.5576-3-peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Before this patch we firstly setup the postcopy-paused state then we
clean up the QEMUFile handles. That can be racy if there is a very fast
"migrate-recover" command running in parallel. Fix that up.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180627132246.5576-2-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Bitmap lock/unlock were added to bdrv_enable_dirty_bitmap in
8b1402ce80, but some places were not updated correspondingly, which
leads to trying to take this lock twice, which is dead-lock. Fix this.
Actually, iotest 199 (about dirty bitmap postcopy migration) is broken
now, and this fixes it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20180625165745.25259-3-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
The way we determine if we can start the incoming migration was
changed to use migration_has_all_channels() in:
commit 428d89084c
Author: Juan Quintela <quintela@redhat.com>
Date: Mon Jul 24 13:06:25 2017 +0200
migration: Create migration_has_all_channels
This method in turn calls multifd_recv_all_channels_created()
which is hardcoded to always return 'true' when multifd is
not in use. This is a latent bug...
...activated in a following commit where that return result
ends up acting as the flag to indicate whether it is possible
to start processing the migration:
commit 36c2f8be2c
Author: Juan Quintela <quintela@redhat.com>
Date: Wed Mar 7 08:40:52 2018 +0100
migration: Delay start of migration main routines
This means that if channel initialization fails with normal
migration, it'll never notice and attempt to start the
incoming migration regardless and crash on a NULL pointer.
This can be seen, for example, if a client connects to a server
requiring TLS, but has an invalid x509 certificate:
qemu-system-x86_64: The certificate hasn't got a known issuer
qemu-system-x86_64: migration/migration.c:386: process_incoming_migration_co: Assertion `mis->from_src_file' failed.
#0 0x00007fffebd24f2b in raise () at /lib64/libc.so.6
#1 0x00007fffebd0f561 in abort () at /lib64/libc.so.6
#2 0x00007fffebd0f431 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
#3 0x00007fffebd1d692 in () at /lib64/libc.so.6
#4 0x0000555555ad027e in process_incoming_migration_co (opaque=<optimized out>) at migration/migration.c:386
#5 0x0000555555c45e8b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:116
#6 0x00007fffebd3a6a0 in __start_context () at /lib64/libc.so.6
#7 0x0000000000000000 in ()
To handle the non-multifd case, we check whether mis->from_src_file
is non-NULL. With this in place, the migration server drops the
rejected client and stays around waiting for another, hopefully
valid, client to arrive.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180619163552.18206-1-berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Not needed. Don't expose last_ram_page().
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180620202736.21399-1-david@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
We have to flush() the QEMUFile because now we sent really few data
through that channel.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We know quit with shutdwon in the QIO.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
--
Add comment
Use shutdown() instead of unref()
We have three conditions here:
- channel fails -> error
- we have to quit: we close the channel and reads fails
- normal read that success, we are in bussiness
So forget the complications of waiting in a semaphore.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The function still don't use multifd, but we have simplified
ram_save_page, xbzrle and RDMA stuff is gone. We have added a new
counter.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
--
Add last_page parameter
Add commets for done and address
Remove multifd field, it is the same than normal pages
Merge next patch, now we send multiple pages at a time
Remove counter for multifd pages, it is identical to normal pages
Use iovec's instead of creating the equivalent.
Clear memory used by pages (dave)
Use g_new0(danp)
define MULTIFD_CONTINUE
now pages member is a pointer
Fix off-by-one in number of pages in one packet
Remove RAM_SAVE_FLAG_MULTIFD_PAGE
s/multifd_pages_t/MultiFDPages_t/
add comment explaining what it means
This will include how many bytes they are sent through multifd.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We synchronize all threads each RAM_SAVE_FLAG_EOS. Bitmap
synchronizations don't happen inside a ram section, so we are safe
about two channels trying to overwrite the same memory.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
--
seq needs to be atomic now, will also be accessed from main thread.
Fix the if (true || ...) leftover
We are back to non-atomics
Either for quit, sync or packet, we first wake them.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We want to know how many pages/packets each channel has sent. Add
counters for those.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
--
sort trace-events (dave)
Right now we use the "position" inside the QEMUFile, but things like
RDMA already do weird things to be able to maintain that counter
right, and multifd will have some similar problems.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We used to include in this calculation the setup time, but that can be
quite big in rdma or multifd.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We still don't put anything there.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
--
fix magic (dave)
check offset/ramblock (dave)
s/seq/packet_num/ and make it 64bit
We only create/destry the page list here. We will use it later.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Only one existing trace event uses a floating point type. Unfortunately
float and double cannot be supported since SystemTap does not have
floating point types.
Remove float and double from the whitelist and document this limitation.
Update the migrate_transferred trace event to use uint64_t instead of
double.
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-id: 20180621150254.4922-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
expected_downtime value is not accurate with dirty_pages_rate * page_size,
using ram_bytes_remaining() would yeild it resonable.
consider to read the remaining ram just after having updated the dirty
pages count later migration_bitmap_sync_range() in migration_bitmap_sync()
and reuse the `remaining` field in ram_counters to hold ram_bytes_remaining()
for calculating expected_downtime.
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20180612085009.17594-2-bala24@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Use the 'urgent request' mechanism added in the previous patch
for entries added to the postcopy request queue for RAM. Ignore
the rate limiting while we have requests.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180613102642.23995-4-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Rate limiting sleeps the migration thread for a while when it runs
out of bandwidth; but sometimes we want to wake up to get on with
something more urgent (like a postcopy request). Here we use
a semaphore with a timedwait instead of a simple sleep; Incrementing
the sempahore will wake it up sooner. Anything that consumes
these urgent events must decrement the sempahore.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180613102642.23995-3-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Limit the background transfer bandwidth during the postcopy
phase to the value set on this new parameter. The default, 0,
corresponds to the existing behaviour which is unlimited bandwidth.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180613102642.23995-2-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It is used to slightly clean the code up, no logic is changed
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180604095520.8563-5-xiaoguangrong@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Sync up xbzrle_cache_miss_prev only after migration iteration goes
forward
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180604095520.8563-4-xiaoguangrong@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
dirty_bitmap_load_header return code is obtained but not handled. Fix
this.
Bug was introduced in b35ebdf076
"migration: add postcopy migration of dirty bitmaps" with the whole
function.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180530112424.204835-1-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The migration code should be using the
RAMBLOCK_FOREACH_MIGRATABLE and qemu_ram_foreach_block_migratable
not the all-block versions; poison them so that we can't accidentally
use them.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180605162545.80778-3-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
There are still a few cases where migration code is using the macros
and functions that do all RAMBlocks rather than just the migratable
blocks; fix those up.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180605162545.80778-2-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Since commit 83ee768d62, we now have two places that define the
QJSON type:
$ git grep 'typedef struct QJSON QJSON'
include/migration/vmstate.h:typedef struct QJSON QJSON;
migration/qjson.h:typedef struct QJSON QJSON;
This breaks docker-test-build@centos6:
In file included from /tmp/qemu-test/src/migration/savevm.c:59:
/tmp/qemu-test/src/migration/qjson.h:16: error: redefinition of typedef
'QJSON'
/tmp/qemu-test/src/include/migration/vmstate.h:30: note: previous
declaration of 'QJSON' was here
make: *** [migration/savevm.o] Error 1
This happens because CentOS 6 has an old GCC 4.4.7. Even if redefining
a typedef with the same type is permitted since GCC 4.6, unless -pedantic
is passed, we don't really need to do that on purpose. Let's have a
single definition in <qemu/typedefs.h> instead.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <152844714981.11789.3657734445739553287.stgit@bahia.lan>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
vDPA support, fix to vhost blk RO bit handling, some include path
cleanups, NFIT ACPI table.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbEXNvAAoJECgfDbjSjVRpc8gH/R8xrcFrV+k9wwbgYcOcGb6Y
LWjseE31pqJcxRV80vLOdzYEuLStZQKQQY7xBDMlA5vdyvZxIA6FLO2IsiJSbFAk
EK8pclwhpwQAahr8BfzenabohBv2UO7zu5+dqSvuJCiMWF3jGtPAIMxInfjXaOZY
odc1zY2D2EgsC7wZZ1hfraRbISBOiRaez9BoGDKPOyBY9G1ASEgxJgleFgoBLfsK
a1XU+fDM6hAVdxftfkTm0nibyf7PWPDyzqghLqjR9WXLvZP3Cqud4p8N29mY51pR
KSTjA4FYk6Z9EVMltyBHfdJs6RQzglKjxcNGdlrvacDfyFi79fGdiosVllrjfJM=
=3+V0
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
acpi, vhost, misc: fixes, features
vDPA support, fix to vhost blk RO bit handling, some include path
cleanups, NFIT ACPI table.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 01 Jun 2018 17:25:19 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (31 commits)
vhost-blk: turn on pre-defined RO feature bit
ACPI testing: test NFIT platform capabilities
nvdimm, acpi: support NFIT platform capabilities
tests/.gitignore: add entry for generated file
arch_init: sort architectures
ui: use local path for local headers
qga: use local path for local headers
colo: use local path for local headers
migration: use local path for local headers
usb: use local path for local headers
sd: fix up include
vhost-scsi: drop an unused include
ppc: use local path for local headers
rocker: drop an unused include
e1000e: use local path for local headers
ioapic: fix up includes
ide: use local path for local headers
display: use local path for local headers
trace: use local path for local headers
migration: drop an unused include
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When cancel migration during RDMA precopy, the source qemu main thread hangs sometime.
The backtrace is:
(gdb) bt
#0 0x00007f249eabd43d in write () from /lib64/libpthread.so.0
#1 0x00007f24a1ce98e4 in rdma_get_cm_event (channel=0x4675d10, event=0x7ffe2f643dd0) at src/cma.c:2189
#2 0x00000000007b6166 in qemu_rdma_cleanup (rdma=0x6784000) at migration/rdma.c:2296
#3 0x00000000007b7cae in qio_channel_rdma_close (ioc=0x3bfcc30, errp=0x0) at migration/rdma.c:2999
#4 0x00000000008db60e in qio_channel_close (ioc=0x3bfcc30, errp=0x0) at io/channel.c:273
#5 0x00000000007a8765 in channel_close (opaque=0x3bfcc30) at migration/qemu-file-channel.c:98
#6 0x00000000007a71f9 in qemu_fclose (f=0x527c000) at migration/qemu-file.c:334
#7 0x0000000000795b96 in migrate_fd_cleanup (opaque=0x3b46280) at migration/migration.c:1162
#8 0x000000000093a71b in aio_bh_call (bh=0x3db7a20) at util/async.c:90
#9 0x000000000093a7b2 in aio_bh_poll (ctx=0x3b121c0) at util/async.c:118
#10 0x000000000093f2ad in aio_dispatch (ctx=0x3b121c0) at util/aio-posix.c:436
#11 0x000000000093ab41 in aio_ctx_dispatch (source=0x3b121c0, callback=0x0, user_data=0x0)
at util/async.c:261
#12 0x00007f249f73c7aa in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#13 0x000000000093dc5e in glib_pollfds_poll () at util/main-loop.c:215
#14 0x000000000093dd4e in os_host_main_loop_wait (timeout=28000000) at util/main-loop.c:263
#15 0x000000000093de05 in main_loop_wait (nonblocking=0) at util/main-loop.c:522
#16 0x00000000005bc6a5 in main_loop () at vl.c:1944
#17 0x00000000005c39b5 in main (argc=56, argv=0x7ffe2f6443f8, envp=0x3ad0030) at vl.c:4752
It does not get the RDMA_CM_EVENT_DISCONNECTED event after rdma_disconnect sometime.
According to IB Spec once active side send DREQ message, it should wait for DREP message
and only once it arrived it should trigger a DISCONNECT event. DREP message can be dropped
due to network issues.
For that case the spec defines a DREP_timeout state in the CM state machine, if the DREP is
dropped we should get a timeout and a TIMEWAIT_EXIT event will be trigger.
Unfortunately the current kernel CM implementation doesn't include the DREP_timeout state
and in above scenario we will not get DISCONNECT or TIMEWAIT_EXIT events.
So it should not invoke rdma_get_cm_event which may hang forever, and the event channel
is also destroyed in qemu_rdma_cleanup.
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Because qio_channel_rdma_writev and qio_channel_rdma_readv maybe invoked
by different threads concurrently, this patch removes unnecessary variables
len in QIOChannelRDMA and use local variable instead.
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Lidong Chen <jemmy858585@gmail.com>
Activating the block devices causes the locks to be taken on
the backing file. If we're running with -S and the destination libvirt
hasn't started the destination with 'cont', it's expecting the locks are
still untaken.
Don't activate the block devices if we're not going to autostart the VM;
'cont' already will do that anyway. This change is tied to the new
migration capability 'late-block-activate' that defaults to off, keeping
the old behaviour by default.
bz: https://bugzilla.redhat.com/show_bug.cgi?id=1560854
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
On the POWER9 processor, the XIVE interrupt controller can control
interrupt sources using MMIO to trigger events, to EOI or to turn off
the sources. Priority management and interrupt acknowledgment is also
controlled by MMIO in the presenter sub-engine.
These MMIO regions are exposed to guests in QEMU with a set of 'ram
device' memory mappings, similarly to VFIO, and the VMAs are populated
dynamically with the appropriate pages using a fault handler.
But, these regions are an issue for migration. We need to discard the
associated RAMBlocks from the RAM state on the source VM and let the
destination VM rebuild the memory mappings on the new host in the
post_load() operation just before resuming the system.
To achieve this goal, the following introduces a new RAMBlock flag
RAM_MIGRATABLE which is updated in the vmstate_register_ram() and
vmstate_unregister_ram() routines. This flag is then used by the
migration to identify RAMBlocks to discard on the source. Some checks
are also performed on the destination to make sure nothing invalid was
sent.
This change impacts the boston, malta and jazz mips boards for which
migration compatibility is broken.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
QEMU 3.0 enables strict check for compression & decompression to
make the migration more robust, that depends on the source to fix
the internal design which triggers the unexpected error conditions
To make it work for migrating old version QEMU to 2.13 QEMU, we
introduce this parameter to disable the error check on the
destination which is the default behavior of the machine type
which is older than 2.13, alternately, the strict check can be
enabled explicitly as followings:
-M pc-q35-2.11 -global migration.decompress-error-check=true
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
When pulling in headers that are in the same directory as the C file (as
opposed to one in include/), we should use its relative path, without a
directory.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
In the vmstate.h file, we just need a struct name. Use a forward
declaration instead of an include, then adjust the one affected .c file
to include the file that is no longer implicit from the header.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The VMS_STRUCT has no way to specify which version of a structure
to use. Add a type and a new field to allow the specific version
of a structure to be used.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Message-Id: <1524670052-28373-2-git-send-email-minyard@acm.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Buffers allocated with bitmap_new() should be freed with g_free().
Both reported by Coverity:
*** CID 1391300: API usage errors (ALLOC_FREE_MISMATCH)
/migration/ram.c: 3517 in ram_dirty_bitmap_reload()
3511 * the last one to sync, we need to notify the main send thread.
3512 */
3513 ram_dirty_bitmap_reload_notify(s);
3514
3515 ret = 0;
3516 out:
>>> CID 1391300: API usage errors (ALLOC_FREE_MISMATCH)
>>> Calling "free" frees "le_bitmap" using "free" but it should have been freed using "g_free".
3517 free(le_bitmap);
3518 return ret;
3519 }
3520
3521 static int ram_resume_prepare(MigrationState *s, void *opaque)
3522 {
*** CID 1391292: API usage errors (ALLOC_FREE_MISMATCH)
/migration/ram.c: 249 in ramblock_recv_bitmap_send()
243 * Mark as an end, in case the middle part is screwed up due to
244 * some "misterious" reason.
245 */
246 qemu_put_be64(file, RAMBLOCK_RECV_BITMAP_ENDING);
247 qemu_fflush(file);
248
>>> CID 1391292: API usage errors (ALLOC_FREE_MISMATCH)
>>> Calling "free" frees "le_bitmap" using "free" but it should have been freed using "g_free".
249 free(le_bitmap);
250
251 if (qemu_file_get_error(file)) {
252 return qemu_file_get_error(file);
253 }
254
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20180525015042.31778-1-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Commit:
commit 36c2f8be2c
Author: Juan Quintela <quintela@redhat.com>
Date: Wed Mar 7 08:40:52 2018 +0100
migration: Delay start of migration main routines
Missed tcp and fd transports. This fix its.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20180523091411.1073-1-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
During a TLS connect we see:
migration_channel_connect calls
migration_tls_channel_connect
(calls after TLS setup)
migration_channel_connect
My previous error handling fix made migration_channel_connect
call migrate_fd_connect in all cases; unfortunately the above
means it gets called twice and crashes doing double cleanup.
Fixes: 688a3dcba9
Reported-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180430185943.35714-1-dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
rdma_delete_block function deletes RDMALocalBlock base on index field,
but not update the index field. So when next time invoke rdma_delete_block,
it will not work correctly.
If start and cancel migration repeatedly, some RDMALocalBlock not invoke
ibv_dereg_mr to decrease kernel mm_struct vmpin. When vmpin is large than
max locked memory limitation, ibv_reg_mr will failed, and migration can not
start successfully again.
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1525618499-1560-1-git-send-email-lidongchen@tencent.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Lidong Chen <jemmy858585@gmail.com>
It pauses an ongoing migration. Currently it only supports postcopy.
Note that this command will work on either side of the migration.
Basically when we trigger this on one side, it'll interrupt the other
side as well since the other side will get notified on the disconnect
event.
However, it's still possible that the other side is not notified, for
example, when the network is totally broken, or due to some firewall
configuration changes. In that case, we will also need to run the same
command on the other side so both sides will go into the paused state.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-24-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
s/2.12/2.13/
Let's introduce a lock for that QEMUFile since we are going to operate
on it in multiple threads.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-23-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The first allow-oob=true command. It's used on destination side when
the postcopy migration is paused and ready for a recovery. After
execution, a new migration channel will be established for postcopy to
continue.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-21-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
s/2.12/2.13/
Though we may not need it, now we init both the src/dst migration
objects in migration_object_init() so that even incoming migration
object would be thread safe (it was not).
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-20-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Finish the last step to do the final handshake for the recovery.
First source sends one MIG_CMD_RESUME to dst, telling that source is
ready to resume.
Then, dest replies with MIG_RP_MSG_RESUME_ACK to source, telling that
dest is ready to resume (after switch to postcopy-active state).
When source received the RESUME_ACK, it switches its state to
postcopy-active, and finally the recovery is completed.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-19-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
After we updated the dirty bitmaps of ramblocks, we also need to update
the critical fields in RAMState to make sure it is ready for a resume.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-18-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This patch implements the first part of core RAM resume logic for
postcopy. ram_resume_prepare() is provided for the work.
When the migration is interrupted by network failure, the dirty bitmap
on the source side will be meaningless, because even the dirty bit is
cleared, it is still possible that the sent page was lost along the way
to destination. Here instead of continue the migration with the old
dirty bitmap on source, we ask the destination side to send back its
received bitmap, then invert it to be our initial dirty bitmap.
The source side send thread will issue the MIG_CMD_RECV_BITMAP requests,
once per ramblock, to ask for the received bitmap. On destination side,
MIG_RP_MSG_RECV_BITMAP will be issued, along with the requested bitmap.
Data will be received on the return-path thread of source, and the main
migration thread will be notified when all the ramblock bitmaps are
synchronized.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-17-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This is hook function to be called when a postcopy migration wants to
resume from a failure. For each module, it should provide its own
recovery logic before we switch to the postcopy-active state.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-16-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Creating new message to reply for MIG_CMD_POSTCOPY_RESUME. One uint32_t
is used as payload to let the source know whether destination is ready
to continue the migration.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-15-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Introducing this new command to be sent when the source VM is ready to
resume the paused migration. What the destination does here is
basically release the fault thread to continue service page faults.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-14-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Introducing new return path message MIG_RP_MSG_RECV_BITMAP to send
received bitmap of ramblock back to source.
This is the reply message of MIG_CMD_RECV_BITMAP, it contains not only
the header (including the ramblock name), and it was appended with the
whole ramblock received bitmap on the destination side.
When the source receives such a reply message (MIG_RP_MSG_RECV_BITMAP),
it parses it, convert it to the dirty bitmap by inverting the bits.
One thing to mention is that, when we send the recv bitmap, we are doing
these things in extra:
- converting the bitmap to little endian, to support when hosts are
using different endianess on src/dst.
- do proper alignment for 8 bytes, to support when hosts are using
different word size (32/64 bits) on src/dst.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-13-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add a new vm command MIG_CMD_RECV_BITMAP to request received bitmap for
one ramblock.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-12-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
On the destination side, we cannot wake up all the threads when we got
reconnected. The first thing to do is to wake up the main load thread,
so that we can continue to receive valid messages from source again and
reply when needed.
At this point, we switch the destination VM state from postcopy-paused
back to postcopy-recover.
Now we are finally ready to do the resume logic.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-11-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Introducing new migration state "postcopy-recover". If a migration
procedure is paused and the connection is rebuilt afterward
successfully, we'll switch the source VM state from "postcopy-paused" to
the new state "postcopy-recover", then we'll do the resume logic in the
migration thread (along with the return path thread).
This patch only do the state switch on source side. Another following up
patch will handle the state switching on destination side using the same
status bit.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-10-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
s/2.11/2.13/
This patch detects the "resume" flag of migration command, rebuild the
channels only if the flag is set.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-9-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
It will be used when we want to resume one paused migration.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-8-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
s/2.12/2.13/
Allows the fault thread to stop handling page faults temporarily. When
network failure happened (and if we expect a recovery afterwards), we
should not allow the fault thread to continue sending things to source,
instead, it should halt for a while until the connection is rebuilt.
When the dest main thread noticed the failure, it kicks the fault thread
to switch to pause state.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-7-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Let the thread pause for network issues.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-6-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
When there is IO error on the incoming channel (e.g., network down),
instead of bailing out immediately, we allow the dst vm to switch to the
new POSTCOPY_PAUSE state. Currently it is still simple - it waits the
new semaphore, until someone poke it for another attempt.
One note is that here on ram loading thread we cannot detect the
POSTCOPY_ACTIVE state, but we need to detect the more specific
POSTCOPY_INCOMING_RUNNING state, to make sure we have already loaded all
the device states.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-5-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Now when network down for postcopy, the source side will not fail the
migration. Instead we convert the status into this new paused state, and
we will try to wait for a rescue in the future.
If a recovery is detected, migration_thread() will reset its local
variables to prepare for that.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-4-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Introducing a new state "postcopy-paused", which can be used when the
postcopy migration is paused. It is targeted for postcopy network
failure recovery.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-3-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The old incoming migration is running in main thread and default
gcontext. With the new qio_channel_add_watch_full() we can now let it
run in the thread's own gcontext (if there is one).
Currently this patch does nothing alone. But when any of the incoming
migration is run in another iothread (e.g., the upcoming migrate-recover
command), this patch will bind the incoming logic to the iothread
instead of the main thread (which may already get page faulted and
hanged).
RDMA is not considered for now since it's not even using the QIO watch
framework at all.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180502104740.12123-2-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Once there, we don't need the struct names anywhere, just the
typedefs. And now also document all fields.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
--
Be network agnostic.
Add error checking for all values.
We need to make sure that we have started all the multifd threads.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
In both sides. We still don't transmit anything through them.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Once there, make count field to always be accessed with atomic
operations. To make blocking operations, we need to know that the
thread is running, so create a bool to indicate that.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
--
Once here, s/terminate_multifd_*-threads/multifd_*_terminate_threads/
This is consistente with every other function
Fix the bug introduced by da3f56cb2e (migration: remove
ram_save_compressed_page()), It should be 'return' rather than
'res'
Sorry for this stupid mistake :(
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180428081045.8878-1-xiaoguangrong@tencent.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Release buf on error path too.
Bug was introduced in b35ebdf076 "migration: add postcopy
migration of dirty bitmaps" with the whole function.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180427142002.21930-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Now that we can safely call QOBJECT() on QObject * as well as its
subtypes, we can have macros qobject_ref() / qobject_unref() that work
everywhere instead of having to use QINCREF() / QDECREF() for QObject
and qobject_incref() / qobject_decref() for its subtypes.
The replacement is mechanical, except I broke a long line, and added a
cast in monitor_qmp_cleanup_req_queue_locked(). Unlike
qobject_decref(), qobject_unref() doesn't accept void *.
Note that the new macros evaluate their argument exactly once, thus no
need to shout them.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180419150145.24795-4-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Rebased, semantic conflict resolved, commit message improved]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Now, we can reuse the path in ram_save_page() to post the page out
as normal, then the only thing remained in ram_save_compressed_page()
is compression that we can move it out to the caller
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180330075128.26919-11-xiaoguangrong@tencent.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It directly sends the page to the stream neither checking zero nor
using xbzrle or compression
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180330075128.26919-10-xiaoguangrong@tencent.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
save_zero_page() is always our first approach to try, move it to
the common place before calling ram_save_compressed_page
and ram_save_page
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180330075128.26919-9-xiaoguangrong@tencent.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The function is called by both ram_save_page and ram_save_target_page,
so move it to the common caller to cleanup the code
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180330075128.26919-8-xiaoguangrong@tencent.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Move some code from ram_save_target_page() to ram_save_host_page()
to make it be more readable for latter patches that dramatically
clean ram_save_target_page() up
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180330075128.26919-7-xiaoguangrong@tencent.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Abstract the common function control_save_page() to cleanup the code,
no logic is changed
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180330075128.26919-6-xiaoguangrong@tencent.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Currently the page being compressed is allowed to be updated by
the VM on the source QEMU, correspondingly the destination QEMU
just ignores the decompression error. However, we completely miss
the chance to catch real errors, then the VM is corrupted silently
To make the migration more robuster, we copy the page to a buffer
first to avoid it being written by VM, then detect and handle the
errors of both compression and decompression errors properly
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180330075128.26919-5-xiaoguangrong@tencent.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Current code uses uncompress() to decompress memory which manages
memory internally, that causes huge memory is allocated and freed
very frequently, more worse, frequently returning memory to kernel
will flush TLBs
So, we maintain the memory by ourselves and reuse it for each
decompression
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180330075128.26919-4-xiaoguangrong@tencent.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Current code uses compress2() to compress memory which manages memory
internally, that causes huge memory is allocated and freed very
frequently
More worse, frequently returning memory to kernel will flush TLBs
and trigger invalidation callbacks on mmu-notification which
interacts with KVM MMU, that dramatically reduce the performance
of VM
So, we maintain the memory by ourselves and reuse it for each
compression
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180330075128.26919-3-xiaoguangrong@tencent.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
As compression is a heavy work, do not do it in migration thread,
instead, we post it out as a normal page
Reviewed-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
Message-Id: <20180330075128.26919-2-xiaoguangrong@tencent.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Postcopy total blocktime is available on destination side only.
But query-migrate was possible only for source. This patch
adds ability to call query-migrate on destination.
To be able to see postcopy blocktime, need to request postcopy-blocktime
capability.
The query-migrate command will show following sample result:
{"return":
"postcopy-vcpu-blocktime": [115, 100],
"status": "completed",
"postcopy-blocktime": 100
}}
postcopy_vcpu_blocktime contains list, where the first item is the first
vCPU in QEMU.
This patch has a drawback, it combines states of incoming and
outgoing migration. Ongoing migration state will overwrite incoming
state. Looks like better to separate query-migrate for incoming and
outgoing migration or add parameter to indicate type of migration.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1521742647-25550-7-git-send-email-a.perevalov@samsung.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This patch provides blocktime calculation per vCPU,
as a summary and as a overlapped value for all vCPUs.
This approach was suggested by Peter Xu, as an improvements of
previous approch where QEMU kept tree with faulted page address and cpus bitmask
in it. Now QEMU is keeping array with faulted page address as value and vCPU
as index. It helps to find proper vCPU at UFFD_COPY time. Also it keeps
list for blocktime per vCPU (could be traced with page_fault_addr)
Blocktime will not calculated if postcopy_blocktime field of
MigrationIncomingState wasn't initialized.
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1521742647-25550-4-git-send-email-a.perevalov@samsung.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This patch adds request to kernel space for UFFD_FEATURE_THREAD_ID, in
case this feature is provided by kernel.
PostcopyBlocktimeContext is encapsulated inside postcopy-ram.c,
due to it being a postcopy-only feature.
Also it defines PostcopyBlocktimeContext's instance live time.
Information from PostcopyBlocktimeContext instance will be provided
much after postcopy migration end, instance of PostcopyBlocktimeContext
will live till QEMU exit, but part of it (vcpu_addr,
page_fault_vcpu_time) used only during calculation, will be released
when postcopy ended or failed.
To enable postcopy blocktime calculation on destination, need to
request proper compatibility (Patch for documentation will be at the
tail of the patch set).
As an example following command enable that capability, assume QEMU was
started with
-chardev socket,id=charmonitor,path=/var/lib/migrate-vm-monitor.sock
option to control it
[root@host]#printf "{\"execute\" : \"qmp_capabilities\"}\r\n \
{\"execute\": \"migrate-set-capabilities\" , \"arguments\": {
\"capabilities\": [ { \"capability\": \"postcopy-blocktime\", \"state\":
true } ] } }" | nc -U /var/lib/migrate-vm-monitor.sock
Or just with HMP
(qemu) migrate_set_capability postcopy-blocktime on
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1521742647-25550-3-git-send-email-a.perevalov@samsung.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Right now it could be used on destination side to
enable vCPU blocktime calculation for postcopy live migration.
vCPU blocktime - it's time since vCPU thread was put into
interruptible sleep, till memory page was copied and thread awake.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1521742647-25550-2-git-send-email-a.perevalov@samsung.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This reverts commit 0746a92612.
Discussion with kwolf suggests this is actually an API change that
we need to gate on a capability. Push to 2.13.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>