Commit Graph

1894 Commits

Author SHA1 Message Date
Stefan Hajnoczi
06e0f098d6 io: follow coroutine AioContext in qio_channel_yield()
The ongoing QEMU multi-queue block layer effort makes it possible for multiple
threads to process I/O in parallel. The nbd block driver is not compatible with
the multi-queue block layer yet because QIOChannel cannot be used easily from
coroutines running in multiple threads. This series changes the QIOChannel API
to make that possible.

In the current API, calling qio_channel_attach_aio_context() sets the
AioContext where qio_channel_yield() installs an fd handler prior to yielding:

  qio_channel_attach_aio_context(ioc, my_ctx);
  ...
  qio_channel_yield(ioc); // my_ctx is used here
  ...
  qio_channel_detach_aio_context(ioc);

This API design has limitations: reading and writing must be done in the same
AioContext and moving between AioContexts involves a cumbersome sequence of API
calls that is not suitable for doing on a per-request basis.

There is no fundamental reason why a QIOChannel needs to run within the
same AioContext every time qio_channel_yield() is called. QIOChannel
only uses the AioContext while inside qio_channel_yield(). The rest of
the time, QIOChannel is independent of any AioContext.

In the new API, qio_channel_yield() queries the AioContext from the current
coroutine using qemu_coroutine_get_aio_context(). There is no need to
explicitly attach/detach AioContexts anymore and
qio_channel_attach_aio_context() and qio_channel_detach_aio_context() are gone.
One coroutine can read from the QIOChannel while another coroutine writes from
a different AioContext.

This API change allows the nbd block driver to use QIOChannel from any thread.
It's important to keep in mind that the block driver already synchronizes
QIOChannel access and ensures that two coroutines never read simultaneously or
write simultaneously.

This patch updates all users of qio_channel_attach_aio_context() to the
new API. Most conversions are simple, but vhost-user-server requires a
new qemu_coroutine_yield() call to quiesce the vu_client_trip()
coroutine when not attached to any AioContext.

While the API is has become simpler, there is one wart: QIOChannel has a
special case for the iohandler AioContext (used for handlers that must not run
in nested event loops). I didn't find an elegant way preserve that behavior, so
I added a new API called qio_channel_set_follow_coroutine_ctx(ioc, true|false)
for opting in to the new AioContext model. By default QIOChannel uses the
iohandler AioHandler. Code that formerly called
qio_channel_attach_aio_context() now calls
qio_channel_set_follow_coroutine_ctx(ioc, true) once after the QIOChannel is
created.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20230830224802.493686-5-stefanha@redhat.com>
[eblake: also fix migration/rdma.c]
Signed-off-by: Eric Blake <eblake@redhat.com>
2023-09-07 20:32:11 -05:00
Stefan Hajnoczi
156618d9ea Pull request
v3:
 - Drop UFS emulation due to CI failures
 - Add "aio-posix: zero out io_uring sqe user_data"
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmTvLIEACgkQnKSrs4Gr
 c8itVggAka3RMkEclbeW7JKJBOolm3oUuJTobV8oJfDNMQ8mmom9JkXVUctyPWQT
 EF+oeqZz1omjr0Dk7YEA2toCahTbXm/UsG7i6cZg8JXPl6e9sOne0j+p5zO5x/kc
 YlG43SBQJHdp/BfTm/gvwUh0W2on0wadaeEV82m3ZyIrZGTgNcrC1p1gj5dwF5VX
 SqW02mgALETECyJpo8O7y9vNUYGxEtETG9jzAhtrugGpYk4bPeXlm/rc+2zwV+ET
 YCnfUvhjhlu5vS4nkta6natg0If16ODjy35vWYm/aGlgveGTqQq9HWgTL71eNuxm
 Smn+hJHuvkyBclKjbGiiO1W1MuG1/g==
 =UvNK
 -----END PGP SIGNATURE-----

Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging

Pull request

v3:
- Drop UFS emulation due to CI failures
- Add "aio-posix: zero out io_uring sqe user_data"

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmTvLIEACgkQnKSrs4Gr
# c8itVggAka3RMkEclbeW7JKJBOolm3oUuJTobV8oJfDNMQ8mmom9JkXVUctyPWQT
# EF+oeqZz1omjr0Dk7YEA2toCahTbXm/UsG7i6cZg8JXPl6e9sOne0j+p5zO5x/kc
# YlG43SBQJHdp/BfTm/gvwUh0W2on0wadaeEV82m3ZyIrZGTgNcrC1p1gj5dwF5VX
# SqW02mgALETECyJpo8O7y9vNUYGxEtETG9jzAhtrugGpYk4bPeXlm/rc+2zwV+ET
# YCnfUvhjhlu5vS4nkta6natg0If16ODjy35vWYm/aGlgveGTqQq9HWgTL71eNuxm
# Smn+hJHuvkyBclKjbGiiO1W1MuG1/g==
# =UvNK
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 30 Aug 2023 07:48:17 EDT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [ultimate]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [ultimate]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* tag 'block-pull-request' of https://gitlab.com/stefanha/qemu:
  aio-posix: zero out io_uring sqe user_data
  tests/qemu-iotests/197: add testcase for CoR with subclusters
  block/io: align requests to subcluster_size
  block: add subcluster_size field to BlockDriverInfo
  block-migration: Ensure we don't crash during migration cleanup

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-08-30 09:20:27 -04:00
Fabiano Rosas
f187609f27 block-migration: Ensure we don't crash during migration cleanup
We can fail the blk_insert_bs() at init_blk_migration(), leaving the
BlkMigDevState without a dirty_bitmap and BlockDriverState. Account
for the possibly missing elements when doing cleanup.

Fix the following crashes:

Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359
359         BlockDriverState *bs = bitmap->bs;
 #0  0x0000555555ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359
 #1  0x0000555555bba331 in unset_dirty_tracking () at ../migration/block.c:371
 #2  0x0000555555bbad98 in block_migration_cleanup_bmds () at ../migration/block.c:681

Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073
7073        QLIST_FOREACH_SAFE(blocker, &bs->op_blockers[op], list, next) {
 #0  0x0000555555e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073
 #1  0x0000555555e9734a in bdrv_op_unblock_all (bs=0x0, reason=0x0) at ../block.c:7095
 #2  0x0000555555bbae13 in block_migration_cleanup_bmds () at ../migration/block.c:690

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-id: 20230731203338.27581-1-farosas@suse.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-08-30 07:39:10 -04:00
Andrei Gudkov
3eb82637fb migration/dirtyrate: Fix precision losses and g_usleep overshoot
Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com>
Reviewed-by: Hyman Huang <yong.huang@smartx.com>
Message-Id: <8ddb0d40d143f77aab8f602bd494e01e5fa01614.1691161009.git.gudkov.andrei@huawei.com>
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
2023-08-29 10:19:03 +08:00
Juan Quintela
697c4c86ab migration/rdma: Split qemu_fopen_rdma() into input/output functions
This is how everything else in QEMUFile is structured.
As a bonus they are three less lines of code.

Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20230530183941.7223-17-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Juan Quintela
ac6f48e15d qemu-file: Make qemu_file_get_error_obj() static
It was not used outside of qemu_file.c anyways.

Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20230530183941.7223-21-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Juan Quintela
8c5ee0bfb8 qemu-file: Simplify qemu_file_shutdown()
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20230530183941.7223-20-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Juan Quintela
9ccf83f486 qemu_file: Make qemu_file_is_writable() static
It is not used outside of qemu_file, and it shouldn't.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230530183941.7223-19-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Juan Quintela
cf786549ce migration: Change qemu_file_transferred to noflush
We do a qemu_fclose() just after that, that also does a qemu_fflush(),
so remove one qemu_fflush().

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230530183941.7223-3-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Juan Quintela
fc95c63b60 qemu-file: Rename qemu_file_transferred_ fast -> noflush
Fast don't say much.  Noflush indicates more clearly that it is like
qemu_file_transferred but without the flush.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230530183941.7223-2-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Wei Wang
82137e6c8c migration: enforce multifd and postcopy preempt to be set before incoming
qemu_start_incoming_migration needs to check the number of multifd
channels or postcopy ram channels to configure the backlog parameter (i.e.
the maximum length to which the queue of pending connections for sockfd
may grow) of listen(). So enforce the usage of postcopy-preempt and
multifd as below:
- need to use "-incoming defer" on the destination; and
- set_capability and set_parameter need to be done before migrate_incoming

Otherwise, disable the use of the features and report error messages to
remind users to adjust the commands.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230606101910.20456-2-wei.w.wang@intel.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Acked-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Tejus GK
908927db28 migration: Update error description whenever migration fails
There are places in migration.c where the migration is marked failed with
MIGRATION_STATUS_FAILED, but the failure reason is never updated. Hence
libvirt doesn't know why the migration failed when it queries for it.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Tejus GK <tejus.gk@nutanix.com>
Message-ID: <20230621130940.178659-2-tejus.gk@nutanix.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
15699cf542 migration: Extend query-migrate to provide dirty page limit info
Extend query-migrate to provide throttle time and estimated
ring full time with dirty-limit capability enabled, through which
we can observe if dirty limit take effect during live migration.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <168733225273.5845.15871826788879741674-8@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
acac51ba24 migration: Implement dirty-limit convergence algo
Implement dirty-limit convergence algo for live migration,
which is kind of like auto-converge algo but using dirty-limit
instead of cpu throttle to make migration convergent.

Enable dirty page limit if dirty_rate_high_cnt greater than 2
when dirty-limit capability enabled, Disable dirty-limit if
migration be canceled.

Note that "set_vcpu_dirty_limit", "cancel_vcpu_dirty_limit"
commands are not allowed during dirty-limit live migration.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <168733225273.5845.15871826788879741674-7@git.sr.ht>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
310ad5625e migration: Put the detection logic before auto-converge checking
This commit is prepared for the implementation of dirty-limit
convergence algo.

The detection logic of throttling condition can apply to both
auto-converge and dirty-limit algo, putting it's position
before the checking logic for auto-converge feature.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <168733225273.5845.15871826788879741674-6@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
bb9993c672 migration: Refactor auto-converge capability logic
Check if block migration is running before throttling
guest down in auto-converge way.

Note that this modification is kind of like code clean,
because block migration does not depend on auto-converge
capability, so the order of checks can be adjusted.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-5@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
dc62395557 migration: Introduce dirty-limit capability
Introduce migration dirty-limit capability, which can
be turned on before live migration and limit dirty
page rate durty live migration.

Introduce migrate_dirty_limit function to help check
if dirty-limit capability enabled during live migration.

Meanwhile, refactor vcpu_dirty_rate_stat_collect
so that period can be configured instead of hardcoded.

dirty-limit capability is kind of like auto-converge
but using dirty limit instead of traditional cpu-throttle
to throttle guest down. To enable this feature, turn on
the dirty-limit capability before live migration using
migrate-set-capabilities, and set the parameters
"x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably
to speed up convergence.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-4@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
09f9ec9913 qapi/migration: Introduce vcpu-dirty-limit parameters
Introduce "vcpu-dirty-limit" migration parameter used
to limit dirty page rate during live migration.

"vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are
two dirty-limit-related migration parameters, which can
be set before and during live migration by qmp
migrate-set-parameters.

This two parameters are used to help implement the dirty
page rate limit algo of migration.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-3@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
4d80785719 qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
Introduce "x-vcpu-dirty-limit-period" migration experimental
parameter, which is in the range of 1 to 1000ms and used to
make dirtyrate calculation period configurable.

Currently with the "x-vcpu-dirty-limit-period" varies, the
total time of live migration changes, test results show the
optimal value of "x-vcpu-dirty-limit-period" ranges from
500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made
stable once it proves best value can not be determined with
developer's experiments.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-2@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Fabiano Rosas
01ec0f3a92 migration/multifd: Protect accesses to migration_threads
This doubly linked list is common for all the multifd and migration
threads so we need to avoid concurrent access.

Add a mutex to protect the data from concurrent access. This fixes a
crash when removing two MigrationThread objects from the list at the
same time during cleanup of multifd threads.

Fixes: 671326201d ("migration: Introduce interface query-migrationthreads")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230607161306.31425-3-farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Fabiano Rosas
788fa68041 migration/multifd: Rename threadinfo.c functions
We're about to add more functions to this file so make it use the same
coding style as the rest of the code.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230607161306.31425-2-farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Michael Tokarev
d8b71d96b3 migration: spelling fixes
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
2023-07-25 17:13:20 +03:00
David Hildenbrand
f161c88a03 migration/ram: Expose ramblock_is_ignored() as migrate_ram_is_ignored()
virtio-mem wants to know whether it should not mess with the RAMBlock
content (e.g., discard RAM, preallocate memory) on incoming migration.

So let's expose that function as migrate_ram_is_ignored() in
migration/misc.h

Message-ID: <20230706075612.67404-4-david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-07-12 09:25:37 +02:00
Laszlo Ersek
aaf26bd382 migration: unexport migrate_fd_error()
The only migrate_fd_error() call sites are in "migration/migration.c",
which is also where we define migrate_fd_error(). Make the function
static, and remove its declaration from "migration/migration.h".

Cc: Juan Quintela <quintela@redhat.com> (maintainer:Migration)
Cc: Leonardo Bras <leobras@redhat.com> (reviewer:Migration)
Cc: Peter Xu <peterx@redhat.com> (reviewer:Migration)
Cc: qemu-trivial@nongnu.org
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-08 07:24:38 +03:00
Laszlo Ersek
8c69ae9eff migration: factor out "resume_requested" in qmp_migrate()
It cuts back on those awkward, duplicated !(has_resume && resume)
expressions.

Cc: Juan Quintela <quintela@redhat.com> (maintainer:Migration)
Cc: Leonardo Bras <leobras@redhat.com> (reviewer:Migration)
Cc: Peter Xu <peterx@redhat.com> (reviewer:Migration)
Cc: qemu-trivial@nongnu.org
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-08 07:24:38 +03:00
Avihai Horon
808642a2f6 vfio/migration: Reset bytes_transferred properly
Currently, VFIO bytes_transferred is not reset properly:
1. bytes_transferred is not reset after a VM snapshot (so a migration
   following a snapshot will report incorrect value).
2. bytes_transferred is a single counter for all VFIO devices, however
   upon migration failure it is reset multiple times, by each VFIO
   device.

Fix it by introducing a new function vfio_reset_bytes_transferred() and
calling it during migration and snapshot start.

Remove existing bytes_transferred reset in VFIO migration state
notifier, which is not needed anymore.

Fixes: 3710586caa ("qapi: Add VFIO devices migration stats in Migration stats")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-06-30 06:02:51 +02:00
Avihai Horon
538ef4fe2f migration: Enable switchover ack capability
Now that switchover ack logic has been implemented, enable the
capability.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-06-30 06:02:51 +02:00
Avihai Horon
1b4adb10f8 migration: Implement switchover ack logic
Implement switchover ack logic. This prevents the source from stopping
the VM and completing the migration until an ACK is received from the
destination that it's OK to do so.

To achieve this, a new SaveVMHandlers handler switchover_ack_needed()
and a new return path message MIG_RP_MSG_SWITCHOVER_ACK are added.

The switchover_ack_needed() handler is called during migration setup in
the destination to check if switchover ack is used by the migrated
device.

When switchover is approved by all migrated devices in the destination
that support this capability, the MIG_RP_MSG_SWITCHOVER_ACK return path
message is sent to the source to notify it that it's OK to do
switchover.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-06-30 06:02:51 +02:00
Avihai Horon
6574232fff migration: Add switchover ack capability
Migration downtime estimation is calculated based on bandwidth and
remaining migration data. This assumes that loading of migration data in
the destination takes a negligible amount of time and that downtime
depends only on network speed.

While this may be true for RAM, it's not necessarily true for other
migrated devices. For example, loading the data of a VFIO device in the
destination might require from the device to allocate resources, prepare
internal data structures and so on. These operations can take a
significant amount of time which can increase migration downtime.

This patch adds a new capability "switchover ack" that prevents the
source from stopping the VM and completing the migration until an ACK
is received from the destination that it's OK to do so.

This can be used by migrated devices in various ways to reduce downtime.
For example, a device can send initial precopy metadata to pre-allocate
resources in the destination and use this capability to make sure that
the pre-allocation is completed before the source VM is stopped, so it
will have full effect.

This new capability relies on the return path capability to communicate
from the destination back to the source.

The actual implementation of the capability will be added in the
following patches.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-06-30 06:02:51 +02:00
Philippe Mathieu-Daudé
de6cd7599b meson: Replace softmmu_ss -> system_ss
We use the user_ss[] array to hold the user emulation sources,
and the softmmu_ss[] array to hold the system emulation ones.
Hold the latter in the 'system_ss[]' array for parity with user
emulation.

Mechanical change doing:

  $ sed -i -e s/softmmu_ss/system_ss/g $(git grep -l softmmu_ss)

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-10-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20 10:01:30 +02:00
Philippe Mathieu-Daudé
c7b64948f8 meson: Replace CONFIG_SOFTMMU -> CONFIG_SYSTEM_ONLY
Since we *might* have user emulation with softmmu,
use the clearer 'CONFIG_SYSTEM_ONLY' key to check
for system emulation.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-9-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20 10:01:30 +02:00
Steve Sistare
b0182e537e exec/memory: Introduce RAM_NAMED_FILE flag
migrate_ignore_shared() is an optimization that avoids copying memory
that is visible and can be mapped on the target.  However, a
memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED
flag set is not migrated when migrate_ignore_shared() is true.  This is
wrong, because the block has no named backing store, and its contents will
be lost.  To fix, ignore shared memory iff it is a named file.  Define a
new flag RAM_NAMED_FILE to distinguish this case.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <1686151116-253260-1-git-send-email-steven.sistare@oracle.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-06-13 11:28:58 +02:00
Philippe Mathieu-Daudé
7d5b0d6864 bulk: Remove pointless QOM casts
Mechanical change running Coccinelle spatch with content
generated from the qom-cast-macro-clean-cocci-gen.py added
in the previous commit.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230601093452.38972-3-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-06-05 20:48:34 +02:00
Fiona Ebner
3a8b81f2e6 migration: stop tracking ram writes when cancelling background migration
Currently, it is only done when the iteration finishes successfully.
Not cleaning up the userfaultfd write protection can lead to
symptoms/issues such as the process hanging in memmove or GDB not
being able to attach.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20230526115908.196171-1-f.ebner@proxmox.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-06-02 01:03:19 +02:00
Vladimir Sementsov-Ogievskiy
a4c6275aa1 migration: restore vmstate on migration failure
1. Otherwise failed migration just drops guest-panicked state, which is
   not good for management software.

2. We do keep different paused states like guest-panicked during
   migration with help of global_state state.

3. We do restore running state on source when migration is cancelled or
   failed.

4. "postmigrate" state is documented as "guest is paused following a
   successful 'migrate'", so originally it's only for successful path
   and we never documented current behavior.

Let's restore paused states like guest-panicked in case of cancel or
fail too. Allow same transitions like for inmigrate state.

This commit changes the behavior that was introduced by commit
42da5550d6 "migration: set state to post-migrate on failure" and
provides a bit different fix on related
  https://bugzilla.redhat.com/show_bug.cgi?id=1355683

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230517123752.21615-6-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-06-02 01:03:19 +02:00
Vladimir Sementsov-Ogievskiy
f4584076fc migration: switch from .vm_was_running to .vm_old_state
No logic change here, only refactoring. That's a preparation for next
commit where we finally restore the stopped vm state on migration
failure or cancellation.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230517123752.21615-5-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-06-02 01:03:19 +02:00
Vladimir Sementsov-Ogievskiy
c33f1829f8 migration: never fail in global_state_store()
Actually global_state_store() can never fail. Let's get rid of extra
error paths.

To make things clear, use new runstate_get() and use same approach for
global_state_store() and global_state_store_running().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230517123752.21615-3-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-06-02 01:03:19 +02:00
Stefan Hajnoczi
60f782b6b7 aio: remove aio_disable_external() API
All callers now pass is_external=false to aio_set_fd_handler() and
aio_set_event_notifier(). The aio_disable_external() API that
temporarily disables fd handlers that were registered is_external=true
is therefore dead code.

Remove aio_disable_external(), aio_enable_external(), and the
is_external arguments to aio_set_fd_handler() and
aio_set_event_notifier().

The entire test-fdmon-epoll test is removed because its sole purpose was
testing aio_disable_external().

Parts of this patch were generated using the following coccinelle
(https://coccinelle.lip6.fr/) semantic patch:

  @@
  expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque;
  @@
  - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque)
  + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque)

  @@
  expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready;
  @@
  - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready)
  + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready)

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-21-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-05-30 17:37:26 +02:00
Richard Henderson
b5c0d842d6 migration: Build migration_files once
The items in migration_files are built for libmigration and included
info softmmu_ss from there; no need to also include them directly.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 16:51:18 -07:00
Richard Henderson
7ba7db9fa1 migration/xbzrle: Use i386 host/cpuinfo.h
Perform the function selection once, and only if CONFIG_AVX512_OPT
is enabled.  Centralize the selection to xbzrle.c, instead of
spreading the init across 3 files.

Remove xbzrle-bench.c.  The benefit of being able to benchmark
the different implementations is less important than not peeking
into the internals of the implementation.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 16:51:18 -07:00
Richard Henderson
1b48d0abdf migration/xbzrle: Shuffle function order
Place the CONFIG_AVX512BW_OPT block at the top,
which will aid function selection in the next patch.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 16:51:18 -07:00
Richard Henderson
146f515110 Migration Pull request
Hi
 
 Based on latest reviewed parts of migration:
 - Disable colo (vladimir)
 - Migration atomic counters (juan)
 
 Please apply.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmRmXJUACgkQ9IfvGFhy
 1yNRAxAAjDYJELL34Qovt/WE9qKhYJEvIUGTl1IMWJ22YMFnqIFKRdka57dWoU3P
 7EK1BHmokEEtzGT7Fe1ecERXsOwQIJDIkDTJ5g8Oc8Jt1iqY1AC8h5T+LghijCar
 mbZ6qWHaSjsg2lmek/xc9quymzFGGK36PSyB5WkaLRviKQn4RIkEDpUaWny7nDbA
 Q8zJJpBqNFqKfC5/DN0ePa3QQscXQJhey3nxqFd8hYp8RFNIV5UJVW5Lf6ombtK7
 atgdWC4ckkfO2z3OsghKeo/UiMFWpPktgBVVMhDLmk+P/E6czc2gfzD6SCvrPKTj
 XowI8hro22HVmq9bEY8PtbjMOfpxrAxer+tM2KR/0O9l3UzUacFsi7KGqCJ1/trQ
 1tSDjlgyczb8GOgLwwxj8XE+jPHPfVrzCNfDqrBKBNxz6nnZSdZUwhV5mG8FdVtm
 oVVV96BIrNXLl/lIxYIFD/Zyvl8/lrSWQdLkEHTzihYQeXaQfyvPVbV/dOLT4sii
 YUuGCuEhF+DW/qz43G1krwq5/bfxsiZoQzrMV/Odtf0wYQKkabA3KNBIda/vxBCR
 dsLQ7QtmOwKmCzjqw4LUov9vDNYOYr98o7ZqwJ3qeKL4QgFwtEZUFO3VW6UR8fnF
 arVXiTn9wVlkTpu4sT5hLm9400iadhX4Fppji7Ce0tUpLbWbghA=
 =3x32
 -----END PGP SIGNATURE-----

Merge tag 'migration-20230518-pull-request' of https://gitlab.com/juan.quintela/qemu into staging

Migration Pull request

Hi

Based on latest reviewed parts of migration:
- Disable colo (vladimir)
- Migration atomic counters (juan)

Please apply.

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmRmXJUACgkQ9IfvGFhy
# 1yNRAxAAjDYJELL34Qovt/WE9qKhYJEvIUGTl1IMWJ22YMFnqIFKRdka57dWoU3P
# 7EK1BHmokEEtzGT7Fe1ecERXsOwQIJDIkDTJ5g8Oc8Jt1iqY1AC8h5T+LghijCar
# mbZ6qWHaSjsg2lmek/xc9quymzFGGK36PSyB5WkaLRviKQn4RIkEDpUaWny7nDbA
# Q8zJJpBqNFqKfC5/DN0ePa3QQscXQJhey3nxqFd8hYp8RFNIV5UJVW5Lf6ombtK7
# atgdWC4ckkfO2z3OsghKeo/UiMFWpPktgBVVMhDLmk+P/E6czc2gfzD6SCvrPKTj
# XowI8hro22HVmq9bEY8PtbjMOfpxrAxer+tM2KR/0O9l3UzUacFsi7KGqCJ1/trQ
# 1tSDjlgyczb8GOgLwwxj8XE+jPHPfVrzCNfDqrBKBNxz6nnZSdZUwhV5mG8FdVtm
# oVVV96BIrNXLl/lIxYIFD/Zyvl8/lrSWQdLkEHTzihYQeXaQfyvPVbV/dOLT4sii
# YUuGCuEhF+DW/qz43G1krwq5/bfxsiZoQzrMV/Odtf0wYQKkabA3KNBIda/vxBCR
# dsLQ7QtmOwKmCzjqw4LUov9vDNYOYr98o7ZqwJ3qeKL4QgFwtEZUFO3VW6UR8fnF
# arVXiTn9wVlkTpu4sT5hLm9400iadhX4Fppji7Ce0tUpLbWbghA=
# =3x32
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 18 May 2023 10:12:53 AM PDT
# gpg:                using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [undefined]
# gpg:                 aka "Juan Quintela <quintela@trasno.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* tag 'migration-20230518-pull-request' of https://gitlab.com/juan.quintela/qemu:
  migration: Fix duplicated included in meson.build
  migration/multifd: Compute transferred bytes correctly
  migration: We don't need the field rate_limit_used anymore
  migration: Use migration_transferred_bytes() to calculate rate_limit
  migration: Add a trace for migration_transferred_bytes
  migration: Move migration_total_bytes() to migration-stats.c
  migration: Move rate_limit_max and rate_limit_used to migration_stats
  qemu-file: Account for rate_limit usage on qemu_fflush()
  migration: Don't use INT64_MAX for unlimited rate
  migration: process_incoming_migration_co(): move colo part to colo
  migration: split migration_incoming_co
  configure: add --disable-colo-proxy option

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-18 11:07:06 -07:00
Juan Quintela
ba9d2cbc01 migration: Fix duplicated included in meson.build
This is the commint with the merge error (not in the submited patch).

commit 52623f23b0
Author: Lukas Straub <lukasstraub2@web.de>
Date:   Thu Apr 20 11:48:35 2023 +0200

    ram-compress.c: Make target independent

    Make ram-compress.c target independent.

Fixes: 52623f23b0
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230509170217.83246-1-quintela@redhat.com>
2023-05-18 18:41:53 +02:00
Juan Quintela
cbec7eb768 migration/multifd: Compute transferred bytes correctly
In the past, we had to put the in the main thread all the operations
related with sizes due to qemu_file not beeing thread safe.  As now
all counters are atomic, we can update the counters just after the
do the write.  As an aditional bonus, we are able to use the right
value for the compression methods.  Right now we were assuming that
there were no compression at all.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230515195709.63843-17-quintela@redhat.com>
2023-05-18 18:41:46 +02:00
Juan Quintela
bd7ceaf6d5 migration: We don't need the field rate_limit_used anymore
Since previous commit, we calculate how much data we have send with
migration_transferred_bytes() so no need to maintain this counter and
remember to always update it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230515195709.63843-10-quintela@redhat.com>
2023-05-18 18:40:51 +02:00
Juan Quintela
813cd61669 migration: Use migration_transferred_bytes() to calculate rate_limit
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230515195709.63843-9-quintela@redhat.com>
2023-05-18 18:40:51 +02:00
Juan Quintela
3db9c05a90 migration: Add a trace for migration_transferred_bytes
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230515195709.63843-8-quintela@redhat.com>
2023-05-18 18:40:51 +02:00
Juan Quintela
99319e2daf migration: Move migration_total_bytes() to migration-stats.c
Once there rename it to migration_transferred_bytes() and pass a
QEMUFile instead of a migration object.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230515195709.63843-7-quintela@redhat.com>
2023-05-18 18:40:51 +02:00
Juan Quintela
e1fde0e038 migration: Move rate_limit_max and rate_limit_used to migration_stats
These way we can make them atomic and use this functions from any
place.  I also moved all functions that use rate_limit to
migration-stats.

Functions got renamed, they are not qemu_file anymore.

qemu_file_rate_limit -> migration_rate_exceeded
qemu_file_set_rate_limit -> migration_rate_set
qemu_file_get_rate_limit -> migration_rate_get
qemu_file_reset_rate_limit -> migration_rate_reset
qemu_file_acct_rate_limit -> migration_rate_account.

Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230515195709.63843-6-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-05-18 18:40:51 +02:00
Juan Quintela
de37f8b9c2 qemu-file: Account for rate_limit usage on qemu_fflush()
That is the moment we know we have transferred something.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230515195709.63843-5-quintela@redhat.com>
2023-05-18 18:40:51 +02:00