Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
Fixed missing space after comma (fabiano)
Once that we are there, we rename the function to migrate_return_path()
to be consistent with all other capabilities.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Once that we are there, we rename the function to migrate_block()
to be consistent with all other capabilities.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Once that we are there, we rename the function to migrate_xbzrle()
to be consistent with all other capabilities.
We change the type to return bool also for consistency.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Once that we are there, we rename the function to
migrate_zero_copy_send() to be consistent with all other capabilities.
We can remove the CONFIG_LINUX guard. We already check that we can't
setup this capability in migrate_caps_check().
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Once that we are there, we rename the function to migrate_multifd()
to be consistent with all other capabilities.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Once that we are there, we rename the function to migrate_events()
to be consistent with all other capabilities.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Once that we are there, we rename the function to migrate_compress()
to be consistent with all other capabilities.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Once that we are there, we rename the function to migrate_colo() to be
consistent with all other capabilities.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
We move there all capabilities helpers from migration.c.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
Following David advise:
- looked through the history, capabilities are newer than 2012, so we
can remove that bit of the header.
- This part is posterior to Anthony.
Original Author is Orit. Once there,
I put myself. Peter Xu also did quite a bit of work here.
Anyone else wants/needs to be there? I didn't search too hard
because nobody asked before to be added.
What do you think?
And remove the convoluted use of qmp_migrate_set_capabilities() to
enable disable MIGRATION_CAPABILITY_BLOCK.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
It has nothing to do with migration, except for the "migrate" in the
name of the command. Move it with the rest of the ui commands.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
It is only used there, so we can make it static.
Once there, remove spice.h that it is not used.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
fix David Edmonson ui/qemu-spice.h unintended removal
No need to declare a temporary variable.
Suggested-by: Juan Quintela <quintela@redhat.com>
Fixes: 1df36e8c6289 ("migration: Handle block device inactivation failures better")
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
We used to pass the old capabilities array and the new
capabilities as a list.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
It is clear from the context what that means, and such a long name
with the extra long names of the capabilities make very difficilut to
stay inside the 80 columns limit.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Postcopy requires the memory support userfaultfd to work. Right now we
check it but it's a bit too late (when switching to postcopy migration).
Do that early right at enabling of postcopy.
Note that this is still only a best effort because ramblocks can be
dynamically created. We can add check in hostmem creations and fail if
postcopy enabled, but maybe that's too aggressive.
Still, we have chance to fail the most obvious where we know there's an
existing unsupported ramblock.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Consider what happens when performing a migration between two host
machines connected to an NFS server serving multiple block devices to
the guest, when the NFS server becomes unavailable. The migration
attempts to inactivate all block devices on the source (a necessary
step before the destination can take over); but if the NFS server is
non-responsive, the attempt to inactivate can itself fail. When that
happens, the destination fails to get the migrated guest (good,
because the source wasn't able to flush everything properly):
(qemu) qemu-kvm: load of migration failed: Input/output error
at which point, our only hope for the guest is for the source to take
back control. With the current code base, the host outputs a message, but then appears to resume:
(qemu) qemu-kvm: qemu_savevm_state_complete_precopy_non_iterable: bdrv_inactivate_all() failed (-1)
(src qemu)info status
VM status: running
but a second migration attempt now asserts:
(src qemu) qemu-kvm: ../block.c:6738: int bdrv_inactivate_recurse(BlockDriverState *): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
Whether the guest is recoverable on the source after the first failure
is debatable, but what we do not want is to have qemu itself fail due
to an assertion. It looks like the problem is as follows:
In migration.c:migration_completion(), the source sets 'inactivate' to
true (since COLO is not enabled), then tries
savevm.c:qemu_savevm_state_complete_precopy() with a request to
inactivate block devices. In turn, this calls
block.c:bdrv_inactivate_all(), which fails when flushing runs up
against the non-responsive NFS server. With savevm failing, we are
now left in a state where some, but not all, of the block devices have
been inactivated; but migration_completion() then jumps to 'fail'
rather than 'fail_invalidate' and skips an attempt to reclaim those
those disks by calling bdrv_activate_all(). Even if we do attempt to
reclaim disks, we aren't taking note of failure there, either.
Thus, we have reached a state where the migration engine has forgotten
all state about whether a block device is inactive, because we did not
set s->block_inactive in enough places; so migration allows the source
to reach vm_start() and resume execution, violating the block layer
invariant that the guest CPUs should not be restarted while a device
is inactive. Note that the code in migration.c:migrate_fd_cancel()
will also try to reactivate all block devices if s->block_inactive was
set, but because we failed to set that flag after the first failure,
the source assumes it has reclaimed all devices, even though it still
has remaining inactivated devices and does not try again. Normally,
qmp_cont() will also try to reactivate all disks (or correctly fail if
the disks are not reclaimable because NFS is not yet back up), but the
auto-resumption of the source after a migration failure does not go
through qmp_cont(). And because we have left the block layer in an
inconsistent state with devices still inactivated, the later migration
attempt is hitting the assertion failure.
Since it is important to not resume the source with inactive disks,
this patch marks s->block_inactive before attempting inactivation,
rather than after succeeding, in order to prevent any vm_start() until
it has successfully reactivated all devices.
See also https://bugzilla.redhat.com/show_bug.cgi?id=2058982
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Rest of counters that refer to pages has a _pages suffix.
And historically, this showed the number of full pages transferred.
The name "normal" refered to the fact that they were sent without any
optimization (compression, xbzrle, zero_page, ...).
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Rest of counters that refer to pages has a _pages suffix.
And historically, this showed the number of pages composed of the same
character, here comes the name "duplicated". But since years ago, it
refers to the number of zero_pages.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
In the spirit of:
commit 394d323bc3451e4d07f13341cb8817fac8dfbadd
Author: Peter Xu <peterx@redhat.com>
Date: Tue Oct 11 17:55:51 2022 -0400
migration: Use atomic ops properly for page accountings
Reviewed-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Using MgrationStats as type for ram_counters mean that we didn't have
to re-declare each value in another struct. The need of atomic
counters have make us to create MigrationAtomicStats for this atomic
counters.
Create RAMStats type which is a merge of MigrationStats and
MigrationAtomicStats removing unused members.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
Fix typos found by David Edmondson
It does not even pair with a qatomic_mb_set(), so it is clearer to use
load-acquire in this case; they are synonyms.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There should be no paths from a coroutine_fn to aio_poll, however in
practice coroutine_mixed_fn will call aio_poll in the !qemu_in_coroutine()
path. By marking mixed functions, we can track accurately the call paths
that execute entirely in coroutine context, and find more missing
coroutine_fn markers. This results in more accurate checks that
coroutine code does not end up blocking.
If the marking were extended transitively to all functions that call
these ones, static analysis could be done much more efficiently.
However, this is a start and makes it possible to use vrc's path-based
searches to find potential bugs where coroutine_fns call blocking functions.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
I removed that bit on commit:
commit c8df4a7aef
Author: Juan Quintela <quintela@redhat.com>
Date: Mon Oct 3 02:00:03 2022 +0200
migration: Split save_live_pending() into state_pending_*
Fixes: c8df4a7aef
Suggested-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Since ec6f3ab9, migration with compress enabled was broken, because
the compress threads use a dummy QEMUFile which just acts as a
buffer and that commit accidentally changed it to use the outgoing
migration channel instead.
Fix this by using the dummy file again in the compress threads.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
In 8.0 devel window we reworked preempt channel creation, so that there'll
be no race condition when the migration channel and preempt channel got
established in the wrong order in commit 5655aab079.
However no one noticed that the change will also be not compatible with
older qemus, majorly 7.1/7.2 versions where preempt mode started to be
supported.
Leverage the same pre-7.2 flag introduced in the previous patch to recover
the behavior hopefully before 8.0 releases, so we don't break migration
when we migrate from 8.0 to older qemu binaries.
Fixes: 5655aab079 ("migration: Postpone postcopy preempt channel to be after main")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
postcopy_qemufile_src object should be owned by one thread, either the main
thread (e.g. when at the beginning, or at the end of migration), or by the
return path thread (when during a preempt enabled postcopy migration). If
that's not the case the access to the object might be racy.
postcopy_preempt_shutdown_file() can be potentially racy, because it's
called at the end phase of migration on the main thread, however during
which the return path thread hasn't yet been recycled; the recycle happens
in await_return_path_close_on_source() which is after this point.
It means, logically it's posslbe the main thread and the return path thread
are both operating on the same qemufile. While I don't think qemufile is
thread safe at all.
postcopy_preempt_shutdown_file() used to be needed because that's where we
send EOS to dest so that dest can safely shutdown the preempt thread.
To avoid the possible race, remove this only place that a race can happen.
Instead we figure out another way to safely close the preempt thread on
dest.
The core idea during postcopy on deciding "when to stop" is that dest will
send a postcopy SHUT message to src, telling src that all data is there.
Hence to shut the dest preempt thread maybe better to do it directly on
dest node.
This patch proposed such a way that we change postcopy_prio_thread_created
into PreemptThreadStatus, so that we kick the preempt thread on dest qemu
by a sequence of:
mis->preempt_thread_status = PREEMPT_THREAD_QUIT;
qemu_file_shutdown(mis->postcopy_qemufile_dst);
While here shutdown() is probably so far the easiest way to kick preempt
thread from a blocked qemu_get_be64(). Then it reads preempt_thread_status
to make sure it's not a network failure but a willingness to quit the
thread.
We could have avoided that extra status but just rely on migration status.
The problem is postcopy_ram_incoming_cleanup() is just called early enough
so we're still during POSTCOPY_ACTIVE no matter what.. So just make it
simple to have the status introduced.
One flag x-preempt-pre-7-2 is added to keep old pre-7.2 behaviors of
postcopy preempt.
Fixes: 9358982744 ("migration: Send requested page directly in rp-return thread")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Uses of blk_nb_sectors must check whether the result is negative.
Otherwise, underflow can happen. Fortunately, alloc_aio_bitmap()
and bmds_aio_inflight() both have an alternative way to retrieve the
number of sectors in the file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230407153303.391121-6-pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This had been pulled in via qemu/plugin.h from hw/core/cpu.h,
but that will be removed.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230310195252.210956-5-richard.henderson@linaro.org>
[AJB: add various additional cases shown by CI]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230315174331.2959-15-alex.bennee@linaro.org>
Reviewed-by: Emilio Cota <cota@braap.org>
Include CONFIG_DEVICES so that populate_vfio_info is instantiated for
CONFIG_VFIO. Without it, the 'info migrate' command never returns
info about vfio.
Fixes: 43bd0bf30f ("migration: Move populate_vfio_info() into a separate file")
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The p->flags could be updated via the send_prepare callback, e.g. OR-ed
with MULTIFD_FLAG_ZLIB via zlib_send_prepare. Assign p->flags to the
local "flags" before the send_prepare callback could only get partial of
p->flags. Fix it by moving the assignment of p->flags to the local flags
after the callback, so that the correct flags can be traced.
Fixes: ab7cbb0b9a ("multifd: Make no compression operations into its own structure")
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
It's no longer needed since commit
44bcfd45e9 ("migration/rdma: destination: create the return patch after the first accept")
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
xbzrle_encode_buffer_avx512() checks for overflows too scarcely in its
outer loop, causing out-of-bounds writes:
$ ../configure --target-list=aarch64-softmmu --enable-sanitizers --enable-avx512bw
$ make tests/unit/test-xbzrle && ./tests/unit/test-xbzrle
==5518==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x62100000b100 at pc 0x561109a7714d bp 0x7ffed712a440 sp 0x7ffed712a430
WRITE of size 1 at 0x62100000b100 thread T0
#0 0x561109a7714c in uleb128_encode_small ../util/cutils.c:831
#1 0x561109b67f6a in xbzrle_encode_buffer_avx512 ../migration/xbzrle.c:275
#2 0x5611099a7428 in test_encode_decode_overflow ../tests/unit/test-xbzrle.c:153
#3 0x7fb2fb65a58d (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7a58d)
#4 0x7fb2fb65a333 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7a333)
#5 0x7fb2fb65aa79 in g_test_run_suite (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7aa79)
#6 0x7fb2fb65aa94 in g_test_run (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7aa94)
#7 0x5611099a3a23 in main ../tests/unit/test-xbzrle.c:218
#8 0x7fb2fa78c082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082)
#9 0x5611099a608d in _start (/qemu/build/tests/unit/test-xbzrle+0x28408d)
0x62100000b100 is located 0 bytes to the right of 4096-byte region [0x62100000a100,0x62100000b100)
allocated by thread T0 here:
#0 0x7fb2fb823a06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153
#1 0x7fb2fb637ef0 in g_malloc0 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x57ef0)
Fix that by performing the overflow check in the inner loop, instead.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
__builtin_ctzll() produces undefined results when the argument is 0.
This can be seen through test-xbzrle, which produces the following
warning:
../migration/xbzrle.c:265: runtime error: passing zero to ctz(), which is not a valid argument
Replace __builtin_ctzll() with our ctz64() wrapper which properly
handles 0.
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The RDMA code has return-path handling code, but it's only enabled
if postcopy is enabled; if the 'return-path' migration capability
is enabled, the return path is NOT setup but the core migration
code still tries to use it and breaks.
Enable the RDMA return path if either postcopy or the return-path
capability is enabled.
bz: https://bugzilla.redhat.com/show_bug.cgi?id=2063615
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
QEMU main thread will wait until dest preempt channel established during
processing the LISTEN command (within the whole postcopy PACKAGED data), by
waiting on the semaphore postcopy_qemufile_dst_done.
That's racy, because it's possible that the dest QEMU main thread hasn't
yet accept()ed the new connection when processing the LISTEN event. The
sem_wait() will yield the main thread without being able to run anything
else including the accept() of the new socket, which can cause deadlock
within the main thread.
To avoid the race, move the "wait channel" from main thread to the preempt
thread right at the start.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Fixes: 5655aab079 ("migration: Postpone postcopy preempt channel to be after main")
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
* Use cmd instead of /bin/sh on Windows.
* Try to auto-detect cmd.exe's path, but default to a hard-coded path.
Note that this will require that gspawn-win[32|64]-helper.exe and
gspawn-win[32|64]-helper-console.exe are included in the Windows binary
distributions (cc: Stefan Weil).
Signed-off-by: "John Berberian, Jr" <jeb.study@gmail.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>