Commit Graph

1559 Commits

Author SHA1 Message Date
David Hildenbrand
6fee3a1fd9 migration/ram: Handle RAMBlocks with a RamDiscardManager on background snapshots
We already don't ever migrate memory that corresponds to discarded ranges
as managed by a RamDiscardManager responsible for the mapped memory region
of the RAMBlock.

virtio-mem uses this mechanism to logically unplug parts of a RAMBlock.
Right now, we still populate zeropages for the whole usable part of the
RAMBlock, which is undesired because:

1. Even populating the shared zeropage will result in memory getting
   consumed for page tables.
2. Memory backends without a shared zeropage (like hugetlbfs and shmem)
   will populate an actual, fresh page, resulting in an unintended
   memory consumption.

Discarded ("logically unplugged") parts have to remain discarded. As
these pages are never part of the migration stream, there is no need to
track modifications via userfaultfd WP reliably for these parts.

Further, any writes to these ranges by the VM are invalid and the
behavior is undefined.

Note that Linux only supports userfaultfd WP on private anonymous memory
for now, which usually results in the shared zeropage getting populated.
The issue will become more relevant once userfaultfd WP supports shmem
and hugetlb.

Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 22:56:44 +01:00
David Hildenbrand
f7b9dcfbcf migration/ram: Factor out populating pages readable in ram_block_populate_pages()
Let's factor out prefaulting/populating to make further changes easier to
review and add a comment what we are actually expecting to happen. While at
it, use the actual page size of the ramblock, which defaults to
qemu_real_host_page_size for anonymous memory. Further, rename
ram_block_populate_pages() to ram_block_populate_read() as well, to make
it clearer what we are doing.

In the future, we might want to use MADV_POPULATE_READ to speed up
population.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 22:56:44 +01:00
David Hildenbrand
7648297d40 migration: Simplify alignment and alignment checks
Let's use QEMU_ALIGN_DOWN() and friends to make the code a bit easier to
read.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 22:56:44 +01:00
David Hildenbrand
9470c5e082 migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination
Currently, when someone (i.e., the VM) accesses discarded parts inside a
RAMBlock with a RamDiscardManager managing the corresponding mapped memory
region, postcopy will request migration of the corresponding page from the
source. The source, however, will never answer, because it refuses to
migrate such pages with undefined content ("logically unplugged"): the
pages are never dirty, and get_queued_page() will consequently skip
processing these postcopy requests.

Especially reading discarded ("logically unplugged") ranges is supposed to
work in some setups (for example with current virtio-mem), although it
barely ever happens: still, not placing a page would currently stall the
VM, as it cannot make forward progress.

Let's check the state via the RamDiscardManager (the state e.g.,
of virtio-mem is migrated during precopy) and avoid sending a request
that will never get answered. Place a fresh zero page instead to keep
the VM working. This is the same behavior that would happen
automatically without userfaultfd being active, when accessing virtual
memory regions without populated pages -- "populate on demand".

For now, there are valid cases (as documented in the virtio-mem spec) where
a VM might read discarded memory; in the future, we will disallow that.
Then, we might want to handle that case differently, e.g., warning the
user that the VM seems to be mis-behaving.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 22:56:44 +01:00
David Hildenbrand
be39b4cd20 migration/ram: Handle RAMBlocks with a RamDiscardManager on the migration source
We don't want to migrate memory that corresponds to discarded ranges as
managed by a RamDiscardManager responsible for the mapped memory region of
the RAMBlock. The content of these pages is essentially stale and
without any guarantees for the VM ("logically unplugged").

Depending on the underlying memory type, even reading memory might populate
memory on the source, resulting in an undesired memory consumption. Of
course, on the destination, even writing a zeropage consumes memory,
which we also want to avoid (similar to free page hinting).

Currently, virtio-mem tries achieving that goal (not migrating "unplugged"
memory that was discarded) by going via qemu_guest_free_page_hint() - but
it's hackish and incomplete.

For example, background snapshots still end up reading all memory, as
they don't do bitmap syncs. Postcopy recovery code will re-add
previously cleared bits to the dirty bitmap and migrate them.

Let's consult the RamDiscardManager after setting up our dirty bitmap
initially and when postcopy recovery code reinitializes it: clear
corresponding bits in the dirty bitmaps (e.g., of the RAMBlock and inside
KVM). It's important to fixup the dirty bitmap *after* our initial bitmap
sync, such that the corresponding dirty bits in KVM are actually cleared.

As colo is incompatible with discarding of RAM and inhibits it, we don't
have to bother.

Note: if a misbehaving guest would use discarded ranges after migration
started we would still migrate that memory: however, then we already
populated that memory on the migration source.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 22:56:44 +01:00
Peter Xu
60fd680193 migration: Add migrate_add_blocker_internal()
An internal version that removes -only-migratable implications.  It can be used
for temporary migration blockers like dump-guest-memory.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 22:56:44 +01:00
Peter Xu
4c170330aa migration: Make migration blocker work for snapshots too
save_snapshot() checks migration blocker, which looks sane.  At the meantime we
should also teach the blocker add helper to fail if during a snapshot, just
like for migrations.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 22:56:43 +01:00
Hyman Huang(黄勇)
0e21bf2460 migration/dirtyrate: implement dirty-ring dirtyrate calculation
use dirty ring feature to implement dirtyrate calculation.

introduce mode option in qmp calc_dirty_rate to specify what
method should be used when calculating dirtyrate, either
page-sampling or dirty-ring should be passed.

introduce "dirty_ring:-r" option in hmp calc_dirty_rate to
indicate dirty ring method should be used for calculation.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Message-Id: <7db445109bd18125ce8ec86816d14f6ab5de6a7d.1624040308.git.huangy81@chinatelecom.cn>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 22:56:43 +01:00
Hyman Huang(黄勇)
9865d0f68f migration/dirtyrate: move init step of calculation to main thread
since main thread may "query dirty rate" at any time, it's better
to move init step into main thead so that synchronization overhead
between "main" and "get_dirtyrate" can be reduced.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Message-Id: <109f8077518ed2f13068e3bfb10e625e964780f1.1624040308.git.huangy81@chinatelecom.cn>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 22:56:43 +01:00
Hyman Huang(黄勇)
15eb2d644c migration/dirtyrate: adjust order of registering thread
registering get_dirtyrate thread in advance so that both
page-sampling and dirty-ring mode can be covered.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Message-Id: <d7727581a8e86d4a42fc3eacf7f310419b9ebf7e.1624040308.git.huangy81@chinatelecom.cn>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 22:56:43 +01:00
Hyman Huang(黄勇)
71864eadd9 migration/dirtyrate: introduce struct and adjust DirtyRateStat
introduce "DirtyRateMeasureMode" to specify what method should be
used to calculate dirty rate, introduce "DirtyRateVcpu" to store
dirty rate for each vcpu.

use union to store stat data of specific mode

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Message-Id: <661c98c40f40e163aa58334337af8f3ddf41316a.1624040308.git.huangy81@chinatelecom.cn>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 22:56:43 +01:00
Hyman Huang(黄勇)
63b41db4bc memory: make global_dirty_tracking a bitmask
since dirty ring has been introduced, there are two methods
to track dirty pages of vm. it seems that "logging" has
a hint on the method, so rename the global_dirty_log to
global_dirty_tracking would make description more accurate.

dirty rate measurement may start or stop dirty tracking during
calculation. this conflict with migration because stop dirty
tracking make migration leave dirty pages out then that'll be
a problem.

make global_dirty_tracking a bitmask can let both migration and
dirty rate measurement work fine. introduce GLOBAL_DIRTY_MIGRATION
and GLOBAL_DIRTY_DIRTY_RATE to distinguish what current dirty
tracking aims for, migration or dirty rate.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Message-Id: <9c9388657cfa0301bd2c1cfa36e7cf6da4aeca19.1624040308.git.huangy81@chinatelecom.cn>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 22:56:43 +01:00
Li Zhijian
b390afd8c5 migration/rdma: Fix out of order wrid
destination:
../qemu/build/qemu-system-x86_64 -enable-kvm -netdev tap,id=hn0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device e1000,netdev=hn0,mac=50:52:54:00:11:22 -boot c -drive if=none,file=./Fedora-rdma-server-migration.qcow2,id=drive-virtio-disk0 -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -vga qxl -spice streaming-video=filter,port=5902,disable-ticketing -incoming rdma:192.168.22.23:8888
qemu-system-x86_64: -spice streaming-video=filter,port=5902,disable-ticketing: warning: short-form boolean option 'disable-ticketing' deprecated
Please use disable-ticketing=on instead
QEMU 6.0.50 monitor - type 'help' for more information
(qemu) trace-event qemu_rdma_block_for_wrid_miss on
(qemu) dest_init RDMA Device opened: kernel name rxe_eth0 uverbs device name uverbs2, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs2, infiniband class device path /sys/class/infiniband/rxe_eth0, transport: (2) Ethernet
qemu_rdma_block_for_wrid_miss A Wanted wrid CONTROL SEND (2000) but got CONTROL RECV (4000)

source:
../qemu/build/qemu-system-x86_64 -enable-kvm -netdev tap,id=hn0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device e1000,netdev=hn0,mac=50:52:54:00:11:22 -boot c -drive if=none,file=./Fedora-rdma-server.qcow2,id=drive-virtio-disk0 -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -vga qxl -spice streaming-video=filter,port=5901,disable-ticketing -S
qemu-system-x86_64: -spice streaming-video=filter,port=5901,disable-ticketing: warning: short-form boolean option 'disable-ticketing' deprecated
Please use disable-ticketing=on instead
QEMU 6.0.50 monitor - type 'help' for more information
(qemu)
(qemu) trace-event qemu_rdma_block_for_wrid_miss on
(qemu) migrate -d rdma:192.168.22.23:8888
source_resolve_host RDMA Device opened: kernel name rxe_eth0 uverbs device name uverbs2, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs2, infiniband class device path /sys/class/infiniband/rxe_eth0, transport: (2) Ethernet
(qemu) qemu_rdma_block_for_wrid_miss A Wanted wrid WRITE RDMA (1) but got CONTROL RECV (4000)

NOTE: we use soft RoCE as the rdma device.
[root@iaas-rpma images]# rdma link show rxe_eth0/1
link rxe_eth0/1 state ACTIVE physical_state LINK_UP netdev eth0

This migration could not be completed when out of order(OOO) CQ event occurs.
The send queue and receive queue shared a same completion queue, and
qemu_rdma_block_for_wrid() will drop the CQs it's not interested in. But
the dropped CQs by qemu_rdma_block_for_wrid() could be later CQs it wants.
So in this case, qemu_rdma_block_for_wrid() will block forever.

OOO cases will occur in both source side and destination side. And a
forever blocking happens on only SEND and RECV are out of order. OOO between
'WRITE RDMA' and 'RECV' doesn't matter.

below the OOO sequence:
       source                             destination
      rdma_write_one()                   qemu_rdma_registration_handle()
1.    S1: post_recv X                    D1: post_recv Y
2.    wait for recv CQ event X
3.                                       D2: post_send X     ---------------+
4.                                       wait for send CQ send event X (D2) |
5.    recv CQ event X reaches (D2)                                          |
6.  +-S2: post_send Y                                                       |
7.  | wait for send CQ event Y                                              |
8.  |                                    recv CQ event Y (S2) (drop it)     |
9.  +-send CQ event Y reaches (S2)                                          |
10.                                      send CQ event X reaches (D2)  -----+
11.                                      wait recv CQ event Y (dropped by (8))

Although a hardware IB works fine in my a hundred of runs, the IB specification
doesn't guaratee the CQ order in such case.

Here we introduce a independent send completion queue to distinguish
ibv_post_send completion queue from the original mixed completion queue.
It helps us to poll the specific CQE we are really interested in.

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 12:49:29 +01:00
Li Zhijian
911965ace9 migration/rdma: advise prefetch write for ODP region
The responder mr registering with ODP will sent RNR NAK back to
the requester in the face of the page fault.
---------
ibv_poll_cq wc.status=13 RNR retry counter exceeded!
ibv_poll_cq wrid=WRITE RDMA!
---------
ibv_advise_mr(3) helps to make pages present before the actual IO is
conducted so that the responder does page fault as little as possible.

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-10-19 08:39:04 +02:00
Li Zhijian
e2daccb0d0 migration/rdma: Try to register On-Demand Paging memory region
Previously, for the fsdax mem-backend-file, it will register failed with
Operation not supported. In this case, we can try to register it with
On-Demand Paging[1] like what rpma_mr_reg() does on rpma[2].

[1]: https://community.mellanox.com/s/article/understanding-on-demand-paging--odp-x
[2]: http://pmem.io/rpma/manpages/v0.9.0/rpma_mr_reg.3

CC: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-10-19 08:39:04 +02:00
Li Zhijian
5ad15e8614 migration: allow enabling mutilfd for specific protocol only
To: <quintela@redhat.com>, <dgilbert@redhat.com>, <qemu-devel@nongnu.org>
CC: Li Zhijian <lizhijian@cn.fujitsu.com>
Date: Sat, 31 Jul 2021 22:05:52 +0800 (5 weeks, 4 days, 17 hours ago)

And change the default to true so that in '-incoming defer' case, user is able
to change multifd capability.

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-10-19 08:39:04 +02:00
Li Zhijian
b7acd65707 migration: allow multifd for socket protocol only
To: <quintela@redhat.com>, <dgilbert@redhat.com>, <qemu-devel@nongnu.org>
CC: Li Zhijian <lizhijian@cn.fujitsu.com>
Date: Sat, 31 Jul 2021 22:05:51 +0800 (5 weeks, 4 days, 17 hours ago)

multifd with unsupported protocol will cause a segment fault.
(gdb) bt
 #0  0x0000563b4a93faf8 in socket_connect (addr=0x0, errp=0x7f7f02675410) at ../util/qemu-sockets.c:1190
 #1 0x0000563b4a797a03 in qio_channel_socket_connect_sync
(ioc=0x563b4d16e8c0, addr=0x0, errp=0x7f7f02675410) at
../io/channel-socket.c:145
 #2  0x0000563b4a797abf in qio_channel_socket_connect_worker (task=0x563b4cd86c30, opaque=0x0) at ../io/channel-socket.c:168
 #3  0x0000563b4a792631 in qio_task_thread_worker (opaque=0x563b4cd86c30) at ../io/task.c:124
 #4  0x0000563b4a91da69 in qemu_thread_start (args=0x563b4c44bb80) at ../util/qemu-thread-posix.c:541
 #5  0x00007f7fe9b5b3f9 in ?? ()
 #6  0x0000000000000000 in ?? ()

It's enough to check migrate_multifd_is_allowed() in multifd cleanup() and
multifd setup() though there are so many other places using migrate_use_multifd().

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-10-19 08:39:04 +02:00
David Hildenbrand
1230a25f6f migration/ram: Don't passs RAMState to migration_clear_memory_region_dirty_bitmap_*()
The parameter is unused, let's drop it.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-10-19 08:39:04 +02:00
Lukas Straub
e9ab82b858 multifd: Unconditionally unregister yank function
To: qemu-devel <qemu-devel@nongnu.org>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Juan Quintela
 <quintela@redhat.com>, Peter Xu <peterx@redhat.com>, Leonardo Bras Soares
 Passos <lsoaresp@redhat.com>
Date: Wed, 4 Aug 2021 21:26:32 +0200 (5 weeks, 11 hours, 52 minutes ago)

[[PGP Signed Part:No public key for 35AB0B289C5DB258 created at 2021-08-04T21:26:32+0200 using RSA]]
Unconditionally unregister yank function in multifd_load_cleanup().
If it is not unregistered here, it will leak and cause a crash
in yank_unregister_instance(). Now if the ioc is still in use
afterwards, it will only lead to qemu not being able to recover
from a hang related to that ioc.

After checking the code, i am pretty sure that ref is always 1
when arriving here. So all this currently does is remove the
unneeded check.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-10-19 08:39:04 +02:00
Lukas Straub
20171ea895 multifd: Implement yank for multifd send side
To: qemu-devel <qemu-devel@nongnu.org>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Juan Quintela
 <quintela@redhat.com>, Peter Xu <peterx@redhat.com>, Leonardo Bras Soares
 Passos <lsoaresp@redhat.com>
Date: Wed, 1 Sep 2021 17:58:57 +0200 (1 week, 15 hours, 17 minutes ago)

[[PGP Signed Part:No public key for 35AB0B289C5DB258 created at 2021-09-01T17:58:57+0200 using RSA]]
When introducing yank functionality in the migration code I forgot
to cover the multifd send side.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-10-19 08:39:04 +02:00
Emanuele Giuseppe Esposito
68b88468f6 migration: add missing qemu_mutex_lock_iothread in migration_completion
qemu_savevm_state_complete_postcopy assumes the iothread lock (BQL)
to be held, but instead it isn't.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20211005080751.3797161-3-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-05 13:10:29 +02:00
Emanuele Giuseppe Esposito
3c158eba1e migration: block-dirty-bitmap: add missing qemu_mutex_lock_iothread
init_dirty_bitmap_migration assumes the iothread lock (BQL)
to be held, but instead it isn't.

Instead of adding the lock to qemu_savevm_state_setup(),
follow the same pattern as the other ->save_setup callbacks
and lock+unlock inside dirty_bitmap_save_setup().

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211005080751.3797161-2-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-05 13:10:29 +02:00
Markus Armbruster
7d6f6933aa migration: Handle migration_incoming_setup() errors consistently
Commit b673eab4e2 "multifd: Make multifd_load_setup() get an Error
parameter" changed migration_incoming_setup() to take an Error **
argument, and adjusted the callers accordingly.  It neglected to
change adjust multifd_load_setup(): it still exit()s on error.  Clean
that up.

The error now gets propagated up two call chains: via
migration_fd_process_incoming() to rdma_accept_incoming_migration(),
and via migration_ioc_process_incoming() to
migration_channel_process_incoming().  Both chain ends report the
error with error_report_err(), but otherwise ignore it.  Behavioral
change: we no longer exit() on this error.

This is consistent with how we handle other errors here, e.g. from
multifd_recv_new_channel() via migration_ioc_process_incoming() to
migration_channel_process_incoming().  Whether it's consistently right
or consistently wrong I can't tell.

Also clean up the return value from the unusual 0 on success, 1 on
error to the more common true on success, false on error.

Cc: Juan Quintela <quintela@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210720125408.387910-11-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-26 17:15:28 +02:00
Markus Armbruster
f9734d5d40 error: Use error_fatal to simplify obvious fatal errors (again)
We did this with scripts/coccinelle/use-error_fatal.cocci before, in
commit 50beeb6809 and 007b06578a.  This commit cleans up rarer
variations that don't seem worth matching with Coccinelle.

Cc: Thomas Huth <thuth@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210720125408.387910-2-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-08-26 17:15:28 +02:00
Wei Wang
3143577d6a migration: clear the memory region dirty bitmap when skipping free pages
When skipping free pages to send, their corresponding dirty bits in the
memory region dirty bitmap need to be cleared. Otherwise the skipped
pages will be sent in the next round after the migration thread syncs
dirty bits from the memory region dirty bitmap.

Cc: David Hildenbrand <david@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Reported-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Message-Id: <20210722083055.23352-1-wei.w.wang@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-26 12:50:13 +01:00
Peter Xu
39675ffffb migration: Move the yank unregister of channel_close out
It's efficient, but hackish to call yank unregister calls in channel_close(),
especially it'll be hard to debug when qemu crashed with some yank function
leaked.

Remove that hack, but instead explicitly unregister yank functions at the
places where needed, they are:

  (on src)
  - migrate_fd_cleanup
  - postcopy_pause

  (on dst)
  - migration_incoming_state_destroy
  - postcopy_pause_incoming

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210722175841.938739-6-peterx@redhat.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-26 12:45:03 +01:00
Peter Xu
c6ad5be7ae migration: Teach QEMUFile to be QIOChannel-aware
migration uses QIOChannel typed qemufiles.  In follow up patches, we'll need
the capability to identify this fact, so that we can get the backing QIOChannel
from a QEMUFile.

We can also define types for QEMUFile but so far since we only need to be able
to identify QIOChannel, introduce a boolean which is simpler.

Introduce another helper qemu_file_get_ioc() to return the ioc backend of a
qemufile if has_ioc is set.

No functional change.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210722175841.938739-5-peterx@redhat.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-26 12:44:59 +01:00
Peter Xu
18711405b5 migration: Introduce migration_ioc_[un]register_yank()
There're plenty of places in migration/* that checks against either socket or
tls typed ioc for yank operations.  Provide two helpers to hide all these
information.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210722175841.938739-4-peterx@redhat.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-26 12:44:54 +01:00
Peter Xu
43044ac0ee migration: Make from_dst_file accesses thread-safe
Accessing from_dst_file is potentially racy in current code base like below:

  if (s->from_dst_file)
    do_something(s->from_dst_file);

Because from_dst_file can be reset right after the check in another
thread (rp_thread).  One example is migrate_fd_cancel().

Use the same qemu_file_lock to protect it too, just like to_dst_file.

When it's safe to access without lock, comment it.

There's one special reference in migration_thread() that can be replaced by
the newly introduced rp_thread_created flag.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Message-Id: <20210722175841.938739-3-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  with Peter's fixup
2021-07-26 12:44:46 +01:00
Peter Xu
53021ea165 migration: Fix missing join() of rp_thread
It's possible that the migration thread skip the join() of the rp_thread in
below race and crash on src right at finishing migration:

       migration_thread                     rp_thread
       ----------------                     ---------
    migration_completion()
                                        (before rp_thread quits)
                                        from_dst_file=NULL
                                        [thread got scheduled out]
      s->rp_state.from_dst_file==NULL
        (skip join() of rp_thread)
    migrate_fd_cleanup()
      qemu_fclose(s->to_dst_file)
      yank_unregister_instance()
        assert(yank_find_entry())  <------- crash

It could mostly happen with postcopy, but that shouldn't be required, e.g., I
think it could also trigger with MIGRATION_CAPABILITY_RETURN_PATH set.

It's suspected that above race could be the root cause of a recent (but rare)
migration-test break reported by either Dave or PMM:

https://lore.kernel.org/qemu-devel/YPamXAHwan%2FPPXLf@work-vm/

The issue is: from_dst_file is reset in the rp_thread, so if the thread reset
it to NULL fast enough then the migration thread will assume there's no
rp_thread at all.

This could potentially cause more severe issue (e.g. crash) after the yank code.

Fix it by using a boolean to keep "whether we've created rp_thread".

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210722175841.938739-2-peterx@redhat.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-26 12:44:26 +01:00
Peter Xu
63268c4970 migration: Move bitmap_mutex out of migration_bitmap_clear_dirty()
Taking the mutex every time for each dirty bit to clear is too slow, especially
we'll take/release even if the dirty bit is cleared.  So far it's only used to
sync with special cases with qemu_guest_free_page_hint() against migration
thread, nothing really that serious yet.  Let's move the lock to be upper.

There're two callers of migration_bitmap_clear_dirty().

For migration, move it into ram_save_iterate().  With the help of MAX_WAIT
logic, we'll only run ram_save_iterate() for no more than 50ms-ish time, so
taking the lock once there at the entry.  It also means any call sites to
qemu_guest_free_page_hint() can be delayed; but it should be very rare, only
during migration, and I don't see a problem with it.

For COLO, move it up to colo_flush_ram_cache().  I think COLO forgot to take
that lock even when calling ramblock_sync_dirty_bitmap(), where another example
is migration_bitmap_sync() who took it right.  So let the mutex cover both the
ramblock_sync_dirty_bitmap() and migration_bitmap_clear_dirty() calls.

It's even possible to drop the lock so we use atomic operations upon rb->bmap
and the variable migration_dirty_pages.  I didn't do it just to still be safe,
also not predictable whether the frequent atomic ops could bring overhead too
e.g. on huge vms when it happens very often.  When that really comes, we can
keep a local counter and periodically call atomic ops.  Keep it simple for now.

Cc: Wei Wang <wei.w.wang@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: Leonardo Bras Soares Passos <lsoaresp@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210630200805.280905-1-peterx@redhat.com>
Reviewed-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-13 16:21:57 +01:00
Peter Xu
ca7bd0821b migration: Clear error at entry of migrate_fd_connect()
For each "migrate" command, remember to clear the s->error before going on.
For one reason, when there's a new error it'll be still remembered; see
migrate_set_error() who only sets the error if error==NULL.  Meanwhile if a
failed migration completes (e.g., postcopy recovered and finished), we
shouldn't dump an error when calling migrate_fd_cleanup() at last.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210708190653.252961-4-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-13 16:21:57 +01:00
Peter Xu
ca30f24d12 migration: Don't do migrate cleanup if during postcopy resume
Below process could crash qemu with postcopy recovery:

  1. (hmp) migrate -d ..
  2. (hmp) migrate_start_postcopy
  3. [network down, postcopy paused]
  4. (hmp) migrate -r $WRONG_PORT
     when try the recover on an invalid $WRONG_PORT, cleanup_bh will be cleared
  5. (hmp) migrate -r $RIGHT_PORT
     [qemu crash on assert(cleanup_bh)]

The thing is we shouldn't cleanup if it's postcopy resume; the error is set
mostly because the channel is wrong, so we return directly waiting for the user
to retry.

migrate_fd_cleanup() should only be called when migration is cancelled or
completed.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210708190653.252961-3-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-13 16:21:57 +01:00
Peter Xu
2e3e3da3c2 migration: Release return path early for paused postcopy
When postcopy pause triggered, we rely on the migration thread to cleanup the
to_dst_file handle, and the return path thread to cleanup the from_dst_file
handle (which is stored in the local variable "rp").

Within the process, from_dst_file cleanup (qemu_fclose) is postponed until it's
setup again due to a postcopy recovery.

It used to work before yank was born; after yank is introduced we rely on the
refcount of IOC to correctly unregister yank function in channel_close().  If
without the early and on-time release of from_dst_file handle the yank function
will be leftover during paused postcopy.

Without this patch, below steps (quoted from Xiaohui) could trigger qemu src
crash:

  1.Boot vm on src host
  2.Boot vm on dst host
  3.Enable postcopy on src&dst host
  4.Load stressapptest in vm and set postcopy speed to 50M
  5.Start migration from src to dst host, change into postcopy mode when migration is active.
  6.When postcopy is active, down the network card(do migration via this network) on dst host.
  7.Wait untill postcopy is paused on src&dst host.
  8.Before up network card, recover migration on dst host, will get error like following.
  9.Ignore the error of step 8, go on recovering migration on src host:

  After step 9, qemu on src host will core dump after some seconds:
  qemu-kvm: ../util/yank.c:107: yank_unregister_instance: Assertion `QLIST_EMPTY(&entry->yankfns)' failed.
  1.sh: line 38: 44662 Aborted                 (core dumped)

Reported-by: Li Xiaohui <xiaohli@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210708190653.252961-2-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-13 16:21:57 +01:00
Laurent Vivier
a51dcef08b migration: failover: emit a warning when the card is not fully unplugged
When the migration fails or is canceled we wait the end of the unplug
operation to be able to plug it back. But if the unplug operation
is never finished we stop to wait and QEMU emits a warning to inform
the user.

Based-on: 20210629155007.629086-1-lvivier@redhat.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20210701131458.112036-1-lvivier@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-13 16:21:57 +01:00
Li Zhijian
224f364a49 migration/rdma: prevent from double free the same mr
backtrace:
'0x00007ffff5f44ec2 in __ibv_dereg_mr_1_1 (mr=0x7fff1007d390) at /home/lizhijian/rdma-core/libibverbs/verbs.c:478
478             void *addr              = mr->addr;
(gdb) bt
 #0  0x00007ffff5f44ec2 in __ibv_dereg_mr_1_1 (mr=0x7fff1007d390) at /home/lizhijian/rdma-core/libibverbs/verbs.c:478
 #1  0x0000555555891fcc in rdma_delete_block (block=<optimized out>, rdma=0x7fff38176010) at ../migration/rdma.c:691
 #2  qemu_rdma_cleanup (rdma=0x7fff38176010) at ../migration/rdma.c:2365
 #3  0x00005555558925b0 in qio_channel_rdma_close_rcu (rcu=0x555556b8b6c0) at ../migration/rdma.c:3073
 #4  0x0000555555d652a3 in call_rcu_thread (opaque=opaque@entry=0x0) at ../util/rcu.c:281
 #5  0x0000555555d5edf9 in qemu_thread_start (args=0x7fffe88bb4d0) at ../util/qemu-thread-posix.c:541
 #6  0x00007ffff54c73f9 in start_thread () at /lib64/libpthread.so.0
 #7  0x00007ffff53f3b03 in clone () at /lib64/libc.so.6 '

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Message-Id: <20210708144521.1959614-1-lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-13 16:21:57 +01:00
Olaf Hering
179a808045 migration: fix typo in mig_throttle_guest_down comment
Fixes commit 3d0684b2ad ("ram: Update
all functions comments")

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210708162159.18045-1-olaf@aepfle.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-07-09 18:42:46 +02:00
Li Zhijian
eb1960aac1 misc: Remove redundant new line in perror()
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210706094433.1766952-1-lizhijian@cn.fujitsu.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-07-09 18:42:46 +02:00
Li Zhijian
e5f607913c migration/rdma: Use error_report to suppress errno message
Since the prior calls are successful, in this case a errno doesn't
indicate a real error which would just make us confused.

before:
(qemu) migrate -d rdma:192.168.22.23:8888
source_resolve_host RDMA Device opened: kernel name rxe_eth0 uverbs device name uverbs2, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs2, infiniband class device path /sys/class/infiniband/rxe_eth0, transport: (2) Ethernet
rdma_get_cm_event != EVENT_ESTABLISHED after rdma_connect: No space left on device

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Message-Id: <20210628071959.23455-1-lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Laurent Vivier
944bc52842 migration: failover: continue to wait card unplug on error
If the user cancels the migration in the unplug-wait state,
QEMU will try to plug back the card and this fails because the card
is partially unplugged.
To avoid the problem, continue to wait the card unplug, but to
allow the migration to be canceled if the card never finishes to unplug
use a timeout.

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1976852
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210629155007.629086-3-lvivier@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Laurent Vivier
fde93d99d9 migration: move wait-unplug loop to its own function
The loop is used in migration_thread() and bg_migration_thread(),
so we can move it to its own function and call it from these both places.

Moreover, in migration_thread() we have a wrong state transition from
SETUP to ACTIVE while state could be WAIT_UNPLUG. This is correctly
managed in bg_migration_thread() so use this code instead.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20210629155007.629086-2-lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Peter Xu
b7f9afd48e migration: Allow reset of postcopy_recover_triggered when failed
It's possible qemu_start_incoming_migration() failed at any point, when it
happens we should reset postcopy_recover_triggered to false so that the user
can still retry with a saner incoming port.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210629181356.217312-3-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Peter Xu
cc48c587d2 migration: Move yank outside qemu_start_incoming_migration()
Starting from commit b5eea99ec2, qmp_migrate_recover() calls unregister
before calling qemu_start_incoming_migration(). I believe it wanted to mitigate
the next call to yank_register_instance(), but I think that's wrong.

Firstly, if during recover, we should keep the yank instance there, not
"quickly removing and adding it back".

Meanwhile, calling qmp_migrate_recover() twice with b5eea99ec2 will directly
crash the dest qemu (right now it can't; but it'll start to work right after
the next patch) because the 1st call of qmp_migrate_recover() will unregister
permanently when the channel failed to establish, then the 2nd call of
qmp_migrate_recover() crashes at yank_unregister_instance().

This patch fixes it by moving yank ops out of qemu_start_incoming_migration()
into qmp_migrate_incoming.  For qmp_migrate_recover(), drop the unregister of
yank instance too since we keep it there during the recovery phase.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210629181356.217312-2-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Feng Lin
c00d434ac6 migration: fix the memory overwriting risk in add_to_iovec
When testing migration, a Segmentation fault qemu core is generated.
0  error_free (err=0x1)
1  0x00007f8b862df647 in qemu_fclose (f=f@entry=0x55e06c247640)
2  0x00007f8b8516d59a in migrate_fd_cleanup (s=s@entry=0x55e06c0e1ef0)
3  0x00007f8b8516d66c in migrate_fd_cleanup_bh (opaque=0x55e06c0e1ef0)
4  0x00007f8b8626a47f in aio_bh_poll (ctx=ctx@entry=0x55e06b5a16d0)
5  0x00007f8b8626e71f in aio_dispatch (ctx=0x55e06b5a16d0)
6  0x00007f8b8626a33d in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>)
7  0x00007f8b866bdba4 in g_main_context_dispatch ()
8  0x00007f8b8626cde9 in glib_pollfds_poll ()
9  0x00007f8b8626ce62 in os_host_main_loop_wait (timeout=<optimized out>)
10 0x00007f8b8626cffd in main_loop_wait (nonblocking=nonblocking@entry=0)
11 0x00007f8b862ef01f in main_loop ()
Using gdb print the struct QEMUFile f = {
  ...,
  iovcnt = 65, last_error = 21984,
  last_error_obj = 0x1, shutdown = true
}
Well iovcnt is overflow, because the max size of MAX_IOV_SIZE is 64.
struct QEMUFile {
    ...;
    struct iovec iov[MAX_IOV_SIZE];
    unsigned int iovcnt;
    int last_error;
    Error *last_error_obj;
    bool shutdown;
};
iovcnt and last_error is overwrited by add_to_iovec().
Right now, add_to_iovec() increase iovcnt before check the limit.
And it seems that add_to_iovec() assumes that iovcnt will set to zero
in qemu_fflush(). But qemu_fflush() will directly return when f->shutdown
is true.

The situation may occur when libvirtd restart during migration, after
f->shutdown is set, before calling qemu_file_set_error() in
qemu_file_shutdown().

So the safiest way is checking the iovcnt before increasing it.

Signed-off-by: Feng Lin <linfeng23@huawei.com>
Message-Id: <20210625062138.1899-1-linfeng23@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Fix typo in 'writeable' which is actually misnamed 'writable'
2021-07-05 10:51:26 +01:00
Philippe Mathieu-Daudé
5590f65fac migration/tls: Use qcrypto_tls_creds_check_endpoint()
Avoid accessing QCryptoTLSCreds internals by using
the qcrypto_tls_creds_check_endpoint() helper.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-06-29 18:30:20 +01:00
David Hildenbrand
8dbe22c686 memory: Introduce RAM_NORESERVE and wire it up in qemu_ram_mmap()
Let's introduce RAM_NORESERVE, allowing mmap'ing with MAP_NORESERVE. The
new flag has the following semantics:

"
RAM is mmap-ed with MAP_NORESERVE. When set, reserving swap space (or huge
pages if applicable) is skipped: will bail out if not supported. When not
set, the OS will do the reservation, if supported for the memory type.
"

Allow passing it into:
- memory_region_init_ram_nomigrate()
- memory_region_init_resizeable_ram()
- memory_region_init_ram_from_file()

... and teach qemu_ram_mmap() and qemu_anon_ram_alloc() about the flag.
Bail out if the flag is not supported, which is the case right now for
both, POSIX and win32. We will add Linux support next and allow specifying
RAM_NORESERVE via memory backends.

The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-9-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-15 20:27:38 +02:00
Daniel P. Berrangé
85cd1cc668 migration: use GDateTime for formatting timestamp in snapshot names
The GDateTime APIs provided by GLib avoid portability pitfalls, such
as some platforms where 'struct timeval.tv_sec' field is still 'long'
instead of 'time_t'. When combined with automatic cleanup, GDateTime
often results in simpler code too.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-06-14 13:28:50 +01:00
Daniel P. Berrangé
626ff6515d migration: add trace point when vm_stop_force_state fails
This is a critical failure scenario for migration that is hard to
diagnose from existing probes. Most likely it is caused by an error
from bdrv_flush(), but we're not logging the errno anywhere, hence
this new probe.

Reviewed-by: Connor Kuehl <ckuehl@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-06-14 13:28:50 +01:00
Rao, Lei
3ba024457f Remove migrate_set_block_enabled in checkpoint
We can detect disk migration in migrate_prepare, if disk migration
is enabled in COLO mode, we can directly report an error.and there
is no need to disable block migration at every checkpoint.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2021-06-11 10:30:13 +08:00
Peter Xu
a4a571d978 hmp: Add "calc_dirty_rate" and "info dirty_rate" cmds
These two commands are missing when adding the QMP sister commands.
Add them, so developers can play with them easier.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Message-Id: <4cc0039fc3ad6145136770cf3b0f056c09a2910b.1623027729.git.huangy81@chinatelecom.cn>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-06-08 20:18:26 +01:00
Hyman Huang(黄勇)
7afa08cd8f migration/dirtyrate: make sample page count configurable
introduce optional sample-pages argument in calc-dirty-rate,
making sample page count per GB configurable so that more
accurate dirtyrate can be calculated.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Message-Id: <3103453a3b2796f929269c99a6ad81a9a7f1f405.1623027729.git.huangy81@chinatelecom.cn>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Wrapped a couple of long lines
2021-06-08 20:18:25 +01:00
Dr. David Alan Gilbert
a59136f3b1 migration/socket: Close the listener at the end
Delay closing the listener until the cleanup hook at the end; mptcp
needs the listener to stay open while the other paths come in.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210421112834.107651-5-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-06-08 19:36:19 +01:00
Dr. David Alan Gilbert
1df6ddb43b migration: Add cleanup hook for inwards migration
Add a cleanup hook for incoming migration that gets called
at the end as a way for a transport to allow cleanup.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210421112834.107651-4-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-06-08 19:36:18 +01:00
Li Zhijian
6b8c2eb5c6 migration/rdma: Fix cm event use after free
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Message-Id: <20210602023506.3821293-1-lizhijian@cn.fujitsu.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-06-08 18:55:43 +01:00
Leonardo Bras
7de2e85653 yank: Unregister function when using TLS migration
After yank feature was introduced in migration, whenever migration
is started using TLS, the following error happens in both source and
destination hosts:

(qemu) qemu-kvm: ../util/yank.c:107: yank_unregister_instance:
Assertion `QLIST_EMPTY(&entry->yankfns)' failed.

This happens because of a missing yank_unregister_function() when using
qio-channel-tls.

Fix this by also allowing TYPE_QIO_CHANNEL_TLS object type to perform
yank_unregister_function() in channel_close() and multifd_load_cleanup().

Also, inside migration_channel_connect() and
migration_channel_process_incoming() move yank_register_function() so
it only runs once on a TLS migration.

Fixes: b5eea99ec2 ("migration: Add yank feature", 2021-01-13)
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1964326
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Peter Xu <peterx@redhat.com>

--
Changes since v2:
- Dropped all references to ioc->master
- yank_register_function() and yank_unregister_function() now only run
  once in a TLS migration.

Changes since v1:
- Cast p->c to QIOChannelTLS into multifd_load_cleanup()
Message-Id: <20210601054030.1153249-1-leobras.c@gmail.com>

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-06-08 18:50:03 +01:00
Stefano Garzarella
d0fb9657a3 docs: fix references to docs/devel/tracing.rst
Commit e50caf4a5c ("tracing: convert documentation to rST")
converted docs/devel/tracing.txt to docs/devel/tracing.rst.

We still have several references to the old file, so let's fix them
with the following command:

  sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt)

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210517151702.109066-2-sgarzare@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-06-02 06:51:09 +02:00
Peter Maydell
c5847f5e4e Virtiofs, migration and hmp pull 2021-05-26
Fixes for a loadvm regression from Kevin,
 some virtiofsd cleanups from Vivek and Mahmoud, and
 some RDMA migration fixups from Li.
 
 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAmCuiMIACgkQBRYzHrxb
 /edPvxAAimpeeMuCBKPswmuxyAqAoes6FKWNohzCyTCWXEmhvZGFl6SEgMxW0YWh
 DRLfyGUp9CODqNS2oVbj3I7/R5rDvpaqWFJ8GiBH99pc5B+pxYtXjGhjR8JWZ691
 ZvoQsV/nN9dC7woIG5rcus09aPg1X8n8lD2GcV007Gacj3FGGHY6+DOmpfYmrdoG
 QY2OTQ2lMW3l1ACOhtdpWzVKgldYwTDLZsGWuFy+z5b64ijXyImpWa39Hu0HoWdS
 vds5D9GGMfI8kgOlIgsCoC26kre40952VqhOS5ie3axSZt48kufLMUaspiHMiA/B
 mwmrqMrNXoqC7pAWnv8Ur4NjEBlTkPtL4+kIMX0zeqJsRG24Fiee0e42CvNjFixF
 1F419HHKFQPYOvCb/SPLv18gmcyfHdTEbB+yjYIO48ccrPfu/LtTTKG85D1lFw7C
 cy3AWKz6EzYuzs0wtia+0K+5UwOhm2h9Z77kyX5MIfyrb+KWAfupFW6l9NnfmBtp
 +sJBSkIrph6oNmZADx2eSwev+pdlcy0uTOULOiPTF4Iu826LQxoaV9pFcvA+/HXZ
 EbKNrYxDNNYCugLybtULIiFJ8f1i90/UtQlrqAr21deC+gJYliDZScXnzi6lPuv7
 afn4QgMmFLus4ehB0TKcA7QbWL3yyrivXO4TSQNzcZaeXAvHOw0=
 =dIM2
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20210526a' into staging

Virtiofs, migration and hmp pull 2021-05-26

Fixes for a loadvm regression from Kevin,
some virtiofsd cleanups from Vivek and Mahmoud, and
some RDMA migration fixups from Li.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

# gpg: Signature made Wed 26 May 2021 18:43:30 BST
# gpg:                using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20210526a:
  migration/rdma: source: poll cm_event from return path
  migration/rdma: destination: create the return patch after the first accept
  migration/rdma: Fix rdma_addrinfo res leaks
  migration/rdma: cleanup rdma in rdma_start_incoming_migration error path
  migration/rdma: Fix cm_event used before being initialized
  tools/virtiofsd/fuse_opt.c: Replaced a malloc with GLib's g_try_malloc
  tools/virtiofsd/buffer.c: replaced a calloc call with GLib's g_try_new0
  virtiofsd: Set req->reply_sent right after sending reply
  virtiofsd: Check EOF before short read
  virtiofsd: Simplify skip byte logic
  virtiofsd: get rid of in_sg_left variable
  virtiofsd: Use iov_discard_front() to skip bytes
  virtiofsd: Get rid of unreachable code in read
  virtiofsd: Check for EINTR in preadv() and retry
  hmp: Fix loadvm to resume the VM on success instead of failure

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-27 14:57:01 +01:00
Li Zhijian
e49e49dd73 migration/rdma: source: poll cm_event from return path
source side always blocks if postcopy is only enabled at source side.
users are not able to cancel this migration in this case.

Let source side have chance to cancel this migration

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Message-Id: <20210525080552.28259-4-lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Typo fix
2021-05-26 18:39:32 +01:00
Li Zhijian
44bcfd45e9 migration/rdma: destination: create the return patch after the first accept
destination side:
$ build/qemu-system-x86_64 -enable-kvm -netdev tap,id=hn0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device e1000,netdev=hn0,mac=50:52:54:00:11:22 -boot c -drive if=none,file=./Fedora-rdma-server-migration.qcow2,id=drive-virtio-disk0 -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -vga qxl -spice streaming-video=filter,port=5902,disable-ticketing -incoming rdma:192.168.1.10:8888
(qemu) migrate_set_capability postcopy-ram on
(qemu)
dest_init RDMA Device opened: kernel name rocep1s0f0 uverbs device name uverbs0, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs0, infiniband class device path /sys/class/infiniband/rocep1s0f0, transport: (2) Ethernet
Segmentation fault (core dumped)

 (gdb) bt
 #0  qemu_rdma_accept (rdma=0x0) at ../migration/rdma.c:3272
 #1  rdma_accept_incoming_migration (opaque=0x0) at     ../migration/rdma.c:3986
 #2  0x0000563c9e51f02a in aio_dispatch_handler
     (ctx=ctx@entry=0x563ca0606010, node=0x563ca12b2150) at ../util/aio-posix.c:329
 #3  0x0000563c9e51f752 in aio_dispatch_handlers (ctx=0x563ca0606010) at      ../util/aio-posix.c:372
 #4  aio_dispatch (ctx=0x563ca0606010) at ../util/aio-posix.c:382
 #5  0x0000563c9e4f4d9e in aio_ctx_dispatch (source=<optimized out>,      callback=<optimized out>, user_data=<optimized out>)    at ../util/async.c:306
 #6  0x00007fe96ef3fa9f in g_main_context_dispatch () at      /lib64/libglib-2.0.so.0
 #7  0x0000563c9e4ffeb8 in glib_pollfds_poll () at     ../util/main-loop.c:231
 #8  os_host_main_loop_wait (timeout=12188789) at     ../util/main-loop.c:254
 #9  main_loop_wait (nonblocking=nonblocking@entry=0) at     ../util/main-loop.c:530
 #10 0x0000563c9e3c7211 in qemu_main_loop () at     ../softmmu/runstate.c:725
 #11 0x0000563c9dfd46fe in main (argc=<optimized out>, argv=<optimized     out>, envp=<optimized out>) at ../softmmu/main.c:50

The rdma return path will not be created when qemu incoming is starting
since migrate_copy() is false at that moment, then a  NULL return path
rdma was referenced if the user enabled postcopy later.

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Message-Id: <20210525080552.28259-3-lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-26 18:39:32 +01:00
Li Zhijian
f53b450ada migration/rdma: Fix rdma_addrinfo res leaks
rdma_freeaddrinfo() is the reverse operation of rdma_getaddrinfo()

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210525080552.28259-2-lizhijian@cn.fujitsu.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-26 18:39:32 +01:00
Li Zhijian
4e812d2338 migration/rdma: cleanup rdma in rdma_start_incoming_migration error path
the error path after calling qemu_rdma_dest_init() should do rdma cleanup

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Message-Id: <20210520081148.17001-1-lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-26 18:39:32 +01:00
Li Zhijian
efb208dc9c migration/rdma: Fix cm_event used before being initialized
A segmentation fault was triggered when i try to abort a postcopy + rdma
migration.

since rdma_ack_cm_event releases a uninitialized cm_event in these case.

like below:
2496     ret = rdma_get_cm_event(rdma->channel, &cm_event);
2497     if (ret) {
2498         perror("rdma_get_cm_event after rdma_connect");
2499         ERROR(errp, "connecting to destination!");
2500         rdma_ack_cm_event(cm_event); <<<< cause segmentation fault
2501         goto err_rdma_source_connect;
2502     }

Refer to the rdma_get_cm_event() code, cm_event will be
updated/changed only if rdma_get_cm_event() returns 0. So it's okey to
remove the ack in error patch.

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>

Message-Id: <20210519064740.10828-1-lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-26 18:39:32 +01:00
Paolo Bonzini
b02629550d replication: move include out of root directory
The replication.h file is included from migration/colo.c and tests/unit/test-replication.c,
so it should be in include/.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-26 14:49:46 +02:00
Peter Maydell
9b1e81d1c2 * Replace YAML anchors by extends in the gitlab-CI yaml files
* Many small qtest fixes (e.g. to fix issues discovered by Coverity)
 * Poison more config switches in common code
 * Fix the failing Travis-CI and Cirrus-CI tasks
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmCeXFMRHHRodXRoQHJl
 ZGhhdC5jb20ACgkQLtnXdP5wLbUV7g//VRxN74v2pO6zSsSNavYTkMa3jZE1Io3l
 w9lsOr3DM0HIMhyEMwSJsiygj0s8TNanUtFBHfDfo1k1gmZYd5FiM1sB7ZB/sB/S
 9Q9wjmeV2NMG7pcr9zLmiJaEM2LGfGbso44/m0c+gpSyxzg2TeH7sF+38AUZI9/R
 J6/gOzIs4WWdSV4f8kp+YPPQCyOtxZsfxDuk1z1fKsBgXE5iEBUqYyyZUm4gYJkN
 4ALmqc/NVeLszt08qkPHxwXQc488nCJP31psxx6MQ7toKfCAamT07Pp3oi70cakC
 y+HVfsIHc7SxaZyj8sbJCU3LJHwDd8N82ydJUrv216qDffb38fNb+m6y9mcYy2MH
 C25uGe9mucTuTP1WIttC6nCKg/MCgi4PWIzqEhkeKj3TxpTUJJP+BUIfSubV38Gc
 T+XUCNkWWW8sTeRiyE3m9pEZ+gz1MFubaIr/Owephch1SjRYn4zUwJFHHi3I4PY4
 7XjEo8y2M9tKckW3pCfYi+aIDzC1DcRtzvXUGtdemX5xVjTAGlKkFLWl/brdCn2U
 y+JrwrL8cQ3Fy7bXzmK6m9mmhcIW6PcSOBE7RnSESyKzqM5VBSBgnjMqnHcP3FrV
 FYzihTCDcFKy3ELUe3Kbc1TgG/07stQs9xInGXN7ZFsIwadfHgoNVQ6JpOqq1oQa
 fvmLPBhq+a4=
 =QZuG
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/thuth-gitlab/tags/pull-request-2021-05-14' into staging

* Replace YAML anchors by extends in the gitlab-CI yaml files
* Many small qtest fixes (e.g. to fix issues discovered by Coverity)
* Poison more config switches in common code
* Fix the failing Travis-CI and Cirrus-CI tasks

# gpg: Signature made Fri 14 May 2021 12:17:39 BST
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* remotes/thuth-gitlab/tags/pull-request-2021-05-14:
  cirrus.yml: Fix the MSYS2 task
  pc-bios/s390-ccw: Fix inline assembly for older versions of Clang
  tests/qtest/migration-test: Use g_autofree to avoid leaks on error paths
  configure: Poison all current target-specific #defines
  migration: Move populate_vfio_info() into a separate file
  include/sysemu: Poison all accelerator CONFIG switches in common code
  tests: Avoid side effects inside g_assert() arguments
  tests/qtest/rtc-test: Remove pointless NULL check
  tests/qtest/tpm-util.c: Free memory with correct free function
  tests/migration-test: Fix "true" vs true
  tests/qtest/npcm7xx_pwm-test.c: Avoid g_assert_true() for non-test assertions
  tests/qtest/ahci-test.c: Calculate iso_size with 64-bit arithmetic
  util/compatfd.c: Replaced a malloc call with g_malloc.
  libqtest: refuse QTEST_QEMU_BINARY=qemu-kvm
  docs/devel/qgraph: add troubleshooting information
  libqos/qgraph: fix "UNAVAILBLE" typo
  gitlab-ci: Replace YAML anchors by extends (native_test_job)
  gitlab-ci: Replace YAML anchors by extends (native_build_job)
  gitlab-ci: Replace YAML anchors by extends (container_job)
  tests/docker/dockerfiles: Add ccache to containers where it was missing

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-14 19:33:23 +01:00
Thomas Huth
43bd0bf30f migration: Move populate_vfio_info() into a separate file
The CONFIG_VFIO switch only works in target specific code. Since
migration/migration.c is common code, the #ifdef does not have
the intended behavior here. Move the related code to a separate
file now which gets compiled via specific_ss instead.

Fixes: 3710586caa ("qapi: Add VFIO devices migration stats in Migration stats")
Message-Id: <20210414112004.943383-3-thuth@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-05-14 12:31:51 +02:00
David Hildenbrand
542147f4e5 migration/ram: Use offset_in_ramblock() in range checks
We never read or write beyond the used_length of memory blocks when
migrating. Make this clearer by using offset_in_ramblock() consistently.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210429112708.12291-11-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13 18:21:14 +01:00
David Hildenbrand
c1668bde5c migration/multifd: Print used_length of memory block
We actually want to print the used_length, against which we check.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210429112708.12291-10-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13 18:21:14 +01:00
David Hildenbrand
898ba906cc migration/ram: Handle RAM block resizes during postcopy
Resizing while migrating is dangerous and does not work as expected.
The whole migration code works with the usable_length of a ram block and
does not expect this value to change at random points in time.

In the case of postcopy, relying on used_length is racy as soon as the
guest is running. Also, when used_length changes we might leave the
uffd handler registered for some memory regions, reject valid pages
when migrating and fail when sending the recv bitmap to the source.

Resizing can be trigger *after* (but not during) a reset in
ACPI code by the guest
- hw/arm/virt-acpi-build.c:acpi_ram_update()
- hw/i386/acpi-build.c:acpi_ram_update()

Let's remember the original used_length in a separate variable and
use it in relevant postcopy code. Make sure to update it when we resize
during precopy, when synchronizing the RAM block sizes with the source.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210429112708.12291-9-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13 18:21:14 +01:00
David Hildenbrand
6a23f6399a migration/ram: Simplify host page handling in ram_load_postcopy()
Add two new helper functions. This will come in come handy once we want to
handle ram block resizes while postcopy is active.

Note that ram_block_from_stream() will already print proper errors.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210429112708.12291-8-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  dgilbert: Added brackets in host_page_from_ram_block_offset
     to cause uintptr_t to cast the sum, to fix armhf-cross build
2021-05-13 18:21:13 +01:00
David Hildenbrand
cc61c703b6 migration/ram: Discard RAM when growing RAM blocks after ram_postcopy_incoming_init()
In case we grow our RAM after ram_postcopy_incoming_init() (e.g., when
synchronizing the RAM block state with the migration source), the resized
part would not get discarded. Let's perform that when being notified
about a resize while postcopy has been advised, but is not listening
yet. With precopy, the process is as following:

1. VM created
- RAM blocks are created
2. Incomming migration started
- Postcopy is advised
- All pages in RAM blocks are discarded
3. Precopy starts
- RAM blocks are resized to match the size on the migration source.
- RAM pages from precopy stream are loaded
- Uffd handler is registered, postcopy starts listening
4. Guest started, postcopy running
- Pagefaults get resolved, pages get placed

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210429112708.12291-7-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13 18:21:13 +01:00
David Hildenbrand
c7c0e72408 migration/ram: Handle RAM block resizes during precopy
Resizing while migrating is dangerous and does not work as expected.
The whole migration code works on the usable_length of ram blocks and does
not expect this to change at random points in time.

In the case of precopy, the ram block size must not change on the source,
after syncing the RAM block list in ram_save_setup(), so as long as the
guest is still running on the source.

Resizing can be trigger *after* (but not during) a reset in
ACPI code by the guest
- hw/arm/virt-acpi-build.c:acpi_ram_update()
- hw/i386/acpi-build.c:acpi_ram_update()

Use the ram block notifier to get notified about resizes. Let's simply
cancel migration and indicate the reason. We'll continue running on the
source. No harm done.

Update the documentation. Postcopy will be handled separately.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210429112708.12291-5-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Manual merge
2021-05-13 18:21:13 +01:00
Markus Armbruster
372043f389 migration: Drop redundant query-migrate result @blocked
Result @blocked is redundant.  Unfortunately, we realized this too
close to the release to risk dropping it, so we deprecated it
instead, in commit e11ce6c06.

Since it was deprecated from the start, we can delete it without
the customary grace period.  Do so.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210429140424.2802929-1-armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13 18:21:13 +01:00
Kunkun Jiang
ba1b7c812c migration/ram: Optimize ram_save_host_page()
Starting from pss->page, ram_save_host_page() will check every page
and send the dirty pages up to the end of the current host page or
the boundary of used_length of the block. If the host page size is
a huge page, the step "check" will take a lot of time.

It will improve performance to use migration_bitmap_find_dirty().

Tested on Kunpeng 920; VM parameters: 1U 4G (page size 1G)
The time of ram_save_host_page() in the last round of ram saving:
before optimize: 9250us		after optimize: 34us

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210316125716.1243-3-jiangkunkun@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13 18:21:13 +01:00
Kunkun Jiang
23feba906e migration/ram: Reduce unnecessary rate limiting
When the host page is a huge page and something is sent in the
current iteration, migration_rate_limit() should be executed.
If not, it can be omitted.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210316125716.1243-2-jiangkunkun@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13 18:21:13 +01:00
David Hildenbrand
1a37352277 migrate/ram: remove "ram_bulk_stage" and "fpo_enabled"
The bulk stage is kind of weird: migration_bitmap_find_dirty() will
indicate a dirty page, however, ram_save_host_page() will never save it, as
migration_bitmap_clear_dirty() detects that it is not dirty.

We already fill the bitmap in ram_list_init_bitmaps() with ones, marking
everything dirty - it didn't used to be that way, which is why we needed
an explicit first bulk stage.

Let's simplify: make the bitmap the single source of thuth. Explicitly
handle the "xbzrle_enabled after first round" case.

Regarding XBZRLE (implicitly handled via "ram_bulk_stage = false" right
now), there is now a slight change in behavior:
- Colo: When starting, it will be disabled (was implicitly enabled)
  until the first round actually finishes.
- Free page hinting: When starting, XBZRLE will be disabled (was implicitly
  enabled) until the first round actually finished.
- Snapshots: When starting, XBZRLE will be disabled. We essentially only
  do a single run, so I guess it will never actually get disabled.

Postcopy seems to indirectly disable it in ram_save_page(), so there
shouldn't be really any change.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210216105039.40680-1-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13 18:21:13 +01:00
Thomas Huth
2068cabd3f Do not include cpu.h if it's not really necessary
Stop including cpu.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-4-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:51 +02:00
Thomas Huth
4c386f8064 Do not include sysemu/sysemu.h if it's not really necessary
Stop including sysemu/sysemu.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-2-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:50 +02:00
Andrey Gruzdev
82ea3e3b99 migration: Rename 'bs' to 'block' in background snapshot code
Rename 'bs' to commonly used 'block' in migration/ram.c background
snapshot code.

Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Reported-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210401092226.102804-5-andrey.gruzdev@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-04-07 18:37:56 +01:00
Andrey Gruzdev
eeccb99c9d migration: Pre-fault memory before starting background snasphot
This commit solves the issue with userfault_fd WP feature that
background snapshot is based on. For any never poluated or discarded
memory page, the UFFDIO_WRITEPROTECT ioctl() would skip updating
PTE for that page, thereby loosing WP setting for it.

So we need to pre-fault pages for each RAM block to be protected
before making a userfault_fd wr-protect ioctl().

Fixes: 278e2f551a (migration: support
  UFFD write fault processing in ram_save_iterate())
Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Reported-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210401092226.102804-4-andrey.gruzdev@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
dgilbert:
  Bodged ifdef __linux__ on ram_write_tracking_prepare, should really
      go in a stub
2021-04-07 18:37:28 +01:00
Andrey Gruzdev
1a8e44a89f migration: Inhibit virtio-balloon for the duration of background snapshot
The same thing as for incoming postcopy - we cannot deal with concurrent
RAM discards when using background snapshot feature in outgoing migration.

Fixes: 8518278a6a (migration: implementation
  of background snapshot thread)
Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Reported-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210401092226.102804-3-andrey.gruzdev@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-04-06 18:56:01 +01:00
Andrey Gruzdev
ecb23efea0 migration: Fix missing qemu_fflush() on buffer file in bg_migration_thread
Added missing qemu_fflush() on buffer file holding precopy device state.
Increased initial QIOChannelBuffer allocation to 512KB to avoid reallocs.
Typical configurations often require >200KB for device state and VMDESC.

Fixes: 8518278a6a (migration: implementation
  of background snapshot thread)
Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Message-Id: <20210401092226.102804-2-andrey.gruzdev@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-04-06 18:56:01 +01:00
Peter Maydell
415fa2fe91 For 6.0 misc patches under my radar.
V2:
  - "tests: Add tests for yank with the chardev-change case" updated
  - drop the readthedoc theme patch
 -----BEGIN PGP SIGNATURE-----
 
 iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAmBltIwcHG1hcmNhbmRy
 ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5QzCEACOWNUlT5mTylY52sB4
 RXxt+vFFby6aj/M5/tXv7T4EHShWkycV5kEnGjUKiWQXfHfHRfOur6PXkbTMG5zY
 UBEuVAMWW50O2VQZR51W+kohZxsxNnimK2gnCTNGjDWOiofTFAcDf7Ycfxbg1TYU
 fsO3m/dl9cy1fBgCsm64+61T60DC5W0JRsxoRCR1qr4vbJtXjoYe9i21GMWOr548
 EVZo3XQDe5WYeTRyTpf1lHU0dLPrJqZuKmF6M3IQWXG7+ns7iMA0v/STmwsBwqSr
 W6vygj2vPKAi1b1X1z/t/IGXP7mOtTZMUZWxhdOcxqEgYyP4rZji02U33CCd0fCi
 wbD8VOmwvtqPeEHXu/b/dhpacgHis1w8jyJspAcW0MIpFJ+1mn+xtWnmMUlA2cOS
 Vmgirinycsim9TKA+jS3vTwT+/wwzqtWUY267m09tVhJwxvGOXQH1i+mlRRLoNcs
 2vf5iWanRbZgFJme8UYtqYB96pWIJjMa1FkMexJgK3VXgMA+Rjkr4MqIyuPoquyp
 /3PgoUU1LUmGh8F+mi8m88tpdgad6iM+UWXeRALsP7UFvP1Psjz8f6Fhh8uBeE7E
 wsdBsdTwwZ3zgLD4DxjpcZdLM+G7PT0nbeodnPWRuwebsYt3FymoCdmkS1CEn9ZT
 kbQxdeJhTa7QoacZUmQSAoXO6g==
 =UwXe
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/marcandre/tags/for-6.0-pull-request' into staging

For 6.0 misc patches under my radar.

V2:
 - "tests: Add tests for yank with the chardev-change case" updated
 - drop the readthedoc theme patch

# gpg: Signature made Thu 01 Apr 2021 12:54:52 BST
# gpg:                using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg:                issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* remotes/marcandre/tags/for-6.0-pull-request:
  tests: Add tests for yank with the chardev-change case
  chardev: Fix yank with the chardev-change case
  chardev/char.c: Always pass id to chardev_new
  chardev/char.c: Move object_property_try_add_child out of chardev_new
  yank: Always link full yank code
  yank: Remove dependency on qiochannel
  docs: simplify each section title
  dbus-vmstate: Increase the size of input stream buffer used during load
  util: fix use-after-free in module_load_one

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-04-01 17:08:48 +01:00
Lukas Straub
1a92d6d500 yank: Remove dependency on qiochannel
Remove dependency on qiochannel by removing yank_generic_iochannel and
letting migration and chardev use their own yank function for
iochannel.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20ff143fc2db23e27cd41d38043e481376c9cec1.1616521341.git.lukasstraub2@web.de>
2021-04-01 15:27:44 +04:00
Jessica Clarke
76f67bac79 meson: Propagate gnutls dependency to migration
Commit 3eacf70bb5 neglected to fix this
for softmmu configs, which pull in migration's use of gnutls.

This fixes the following compilation failure on Arm-based Macs:

  In file included from migration/multifd.c:23:
  In file included from migration/tls.h:25:
  In file included from include/io/channel-tls.h:26:
  In file included from include/crypto/tlssession.h:24:
  include/crypto/tlscreds.h:28:10: fatal error: 'gnutls/gnutls.h' file not found
  #include <gnutls/gnutls.h>
           ^~~~~~~~~~~~~~~~~
  1 error generated.

(as well as for channel.c and tls.c)

Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20210320171221.37437-1-jrtc27@jrtc27.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-01 09:40:45 +02:00
Vladimir Sementsov-Ogievskiy
4290b4834c migration/block-dirty-bitmap: make incoming disabled bitmaps busy
Incoming enabled bitmaps are busy, because we do
bdrv_dirty_bitmap_create_successor() for them. But disabled bitmaps
being migrated are not marked busy, and user can remove them during the
incoming migration. Then we may crash in cancel_incoming_locked() when
try to remove the bitmap that was already removed by user, like this:

 #0  qemu_mutex_lock_impl (mutex=0x5593d88c50d1, file=0x559680554b20
   "../block/dirty-bitmap.c", line=64) at ../util/qemu-thread-posix.c:77
 #1  bdrv_dirty_bitmaps_lock (bs=0x5593d88c0ee9)
   at ../block/dirty-bitmap.c:64
 #2  bdrv_release_dirty_bitmap (bitmap=0x5596810e9570)
   at ../block/dirty-bitmap.c:362
 #3  cancel_incoming_locked (s=0x559680be8208 <dbm_state+40>)
   at ../migration/block-dirty-bitmap.c:918
 #4  dirty_bitmap_load (f=0x559681d02b10, opaque=0x559680be81e0
   <dbm_state>, version_id=1) at ../migration/block-dirty-bitmap.c:1194
 #5  vmstate_load (f=0x559681d02b10, se=0x559680fb5810)
   at ../migration/savevm.c:908
 #6  qemu_loadvm_section_part_end (f=0x559681d02b10,
   mis=0x559680fb4a30) at ../migration/savevm.c:2473
 #7  qemu_loadvm_state_main (f=0x559681d02b10, mis=0x559680fb4a30)
   at ../migration/savevm.c:2626
 #8  postcopy_ram_listen_thread (opaque=0x0)
   at ../migration/savevm.c:1871
 #9  qemu_thread_start (args=0x5596817ccd10)
   at ../util/qemu-thread-posix.c:521
 #10 start_thread () at /lib64/libpthread.so.0
 #11 clone () at /lib64/libc.so.6

Note bs pointer taken from bitmap: it's definitely bad aligned. That's
because we are in use after free, bitmap is already freed.

So, let's make disabled bitmaps (being migrated) busy during incoming
migration.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210322094906.5079-2-vsementsov@virtuozzo.com>
2021-03-24 13:41:19 +00:00
Daniel P. Berrangé
cbde7be900 migrate: remove QMP/HMP commands for speed, downtime and cache size
The generic 'migrate_set_parameters' command handle all types of param.

Only the QMP commands were documented in the deprecations page, but the
rationale for deprecating applies equally to HMP, and the replacements
exist. Furthermore the HMP commands are just shims to the QMP commands,
so removing the latter breaks the former unless they get re-implemented.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-03-18 09:22:55 +00:00
Mahmoud Mandour
373969507a migration: Replaced qemu_mutex_lock calls with QEMU_LOCK_GUARD
Replaced various qemu_mutex_lock calls and their respective
qemu_mutex_unlock calls with QEMU_LOCK_GUARD macro. This simplifies
the code by eliminating the respective qemu_mutex_unlock calls.

Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com>
Message-Id: <20210311031538.5325-7-ma.mandourr@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-03-15 20:01:55 +00:00
Hao Wang
fca676429c migration/tls: add error handling in multifd_tls_handshake_thread
If any error happens during multifd send thread creating (e.g. channel broke
because new domain is destroyed by the dst), multifd_tls_handshake_thread
may exit silently, leaving main migration thread hanging (ram_save_setup ->
multifd_send_sync_main -> qemu_sem_wait(&p->sem_sync)).
Fix that by adding error handling in multifd_tls_handshake_thread.

Signed-off-by: Hao Wang <wanghao232@huawei.com>
Message-Id: <20210209104237.2250941-3-wanghao232@huawei.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Chuan Zheng <zhengchuan@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-03-15 20:01:55 +00:00
Hao Wang
a339149afa migration/tls: fix inverted semantics in multifd_channel_connect
Function multifd_channel_connect() return "true" to indicate failure,
which is rather confusing. Fix that.

Signed-off-by: Hao Wang <wanghao232@huawei.com>
Message-Id: <20210209104237.2250941-2-wanghao232@huawei.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Chuan Zheng <zhengchuan@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-03-15 20:01:55 +00:00
Peter Krempa
6e9f21a2aa migration: dirty-bitmap: Allow control of bitmap persistence
Bitmap's source persistence is transported over the migration stream and
the destination mirrors it. In some cases the destination might want to
persist bitmaps which are not persistent on the source (e.g. the result
of merging bitmaps from a number of layers on the source when migrating
into a squashed image) but currently it would need to create another set
of persistent bitmaps and merge them.

This patch adds a 'transform' property to the alias map which allows
overriding the persistence of migrated bitmaps both on the source and
destination sides.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Message-Id: <b20afb675917b86f6359ac3591166ac6d4233573.1613150869.git.pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks, drop dead conditional]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-12 15:24:36 -06:00
Peter Krempa
0d1e450c7b migration: dirty-bitmap: Use struct for alias map inner members
Currently the alias mapping hash stores just strings of the target
objects internally. In further patches we'll be adding another member
which will need to be stored in the map so pass a copy of the whole
BitmapMigrationBitmapAlias QAPI struct into the map.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Message-Id: <fc5f27e1fe16cb75e08a248c2d938de3997b9bfb.1613150869.git.pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: adjust long lines]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-12 14:50:55 -06:00
Stefan Reiter
e846b74650 migration: only check page size match if RAM postcopy is enabled
Postcopy may also be advised for dirty-bitmap migration only, in which
case the remote page size will not be available and we'll instead read
bogus data, blocking migration with a mismatch error if the VM uses
hugepages.

Fixes: 58110f0acb ("migration: split common postcopy out of ram postcopy")
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Message-Id: <20210204163522.13291-1-s.reiter@proxmox.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:52 +00:00
Daniel P. Berrangé
0f0d83a456 migration: introduce snapshot-{save, load, delete} QMP commands
savevm, loadvm and delvm are some of the few HMP commands that have never
been converted to use QMP. The reasons for the lack of conversion are
that they blocked execution of the event thread, and the semantics
around choice of disks were ill-defined.

Despite this downside, however, libvirt and applications using libvirt
have used these commands for as long as QMP has existed, via the
"human-monitor-command" passthrough command. IOW, while it is clearly
desirable to be able to fix the problems, they are not a blocker to
all real world usage.

Meanwhile there is a need for other features which involve adding new
parameters to the commands. This is possible with HMP passthrough, but
it provides no reliable way for apps to introspect features, so using
QAPI modelling is highly desirable.

This patch thus introduces new snapshot-{load,save,delete} commands to
QMP that are intended to replace the old HMP counterparts. The new
commands are given different names, because they will be using the new
QEMU job framework and thus will have diverging behaviour from the HMP
originals. It would thus be misleading to keep the same name.

While this design uses the generic job framework, the current impl is
still blocking. The intention that the blocking problem is fixed later.
None the less applications using these new commands should assume that
they are asynchronous and thus wait for the job status change event to
indicate completion.

In addition to using the job framework, the new commands require the
caller to be explicit about all the block device nodes used in the
snapshot operations, with no built-in default heuristics in use.

Note that the existing "query-named-block-nodes" can be used to query
what snapshots currently exist for block nodes.

Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-13-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  dgilbert: removed tests for now, the output ordering isn't
deterministic
2021-02-08 11:19:52 +00:00
Daniel P. Berrangé
bef7e9e2c7 migration: introduce a delete_snapshot wrapper
Make snapshot deletion consistent with the snapshot save
and load commands by using a wrapper around the blockdev
layer. The main difference is that we get upfront validation
of the passed in device list (if any).

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-10-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
f1a9fcdd01 migration: wire up support for snapshot device selection
Modify load_snapshot/save_snapshot to accept the device list and vmstate
node name parameters previously added to the block layer.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-9-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
f781f84189 migration: control whether snapshots are ovewritten
The traditional HMP "savevm" command will overwrite an existing snapshot
if it already exists with the requested name. This new flag allows this
to be controlled allowing for safer behaviour with a future QMP command.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-8-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
3d3e9b1f66 block: rename and alter bdrv_all_find_snapshot semantics
Currently bdrv_all_find_snapshot() will return 0 if it finds
a snapshot, -1 if an error occurs, or if it fails to find a
snapshot. New callers to be added want to distinguish between
the error scenario and failing to find a snapshot.

Rename it to bdrv_all_has_snapshot and make it return -1 on
error, 0 if no snapshot is found and 1 if snapshot is found.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-7-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
c22d644ca7 block: allow specifying name of block device for vmstate storage
Currently the vmstate will be stored in the first block device that
supports snapshots. Historically this would have usually been the
root device, but with UEFI it might be the variable store. There
needs to be a way to override the choice of block device to store
the state in.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-6-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
cf3a74c94f block: add ability to specify list of blockdevs during snapshot
When running snapshot operations, there are various rules for which
blockdevs are included/excluded. While this provides reasonable default
behaviour, there are scenarios that are not well handled by the default
logic. Some of the conditions do not have a single correct answer.

Thus there needs to be a way for the mgmt app to provide an explicit
list of blockdevs to perform snapshots across. This can be achieved
by passing a list of node names that should be used.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-5-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
f61fe11aa6 migration: stop returning errno from load_snapshot()
None of the callers care about the errno value since there is a full
Error object populated. This gives consistency with save_snapshot()
which already just returns a boolean value.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
[PMD: Return false/true instead of -1/0, document function]
Acked-by: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210204124834.774401-4-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Philippe Mathieu-Daudé
7ea14df230 migration: Make save_snapshot() return bool, not 0/-1
Just for consistency, following the example documented since
commit e3fe3988d7 ("error: Document Error API usage rules"),
return a boolean value indicating an error is set or not.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210204124834.774401-3-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
e26f98e209 block: push error reporting into bdrv_all_*_snapshot functions
The bdrv_all_*_snapshot functions return a BlockDriverState pointer
for the invalid backend, which the callers then use to report an
error message. In some cases multiple callers are reporting the
same error message, but with slightly different text. In the future
there will be more error scenarios for some of these methods, which
will benefit from fine grained error message reporting. So it is
helpful to push error reporting down a level.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
[PMD: Initialize variables]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210204124834.774401-2-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Dr. David Alan Gilbert
3af8554bd0 migration: Add blocker information
Modify query-migrate so that it has a flag indicating if outbound
migration is blocked, and if it is a list of reasons.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210202135522.127380-2-dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Markus Armbruster
54270c450a migration: Fix a few absurdly defective error messages
migrate_params_check() has a number of error messages of the form

    Parameter 'NAME' expects is invalid, it should be ...

Fix them to something like

    Parameter 'NAME' expects a ...

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210202141734.2488076-5-armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Markus Armbruster
7bfc47936e migration: Fix cache_init()'s "Failed to allocate" error messages
cache_init() attempts to handle allocation failure.  The two error
messages are garbage, as untested error messages commonly are:

    Parameter 'cache size' expects Failed to allocate cache
    Parameter 'cache size' expects Failed to allocate page cache

Fix them to just

    Failed to allocate cache
    Failed to allocate page cache

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210202141734.2488076-4-armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Markus Armbruster
8b9407a09f migration: Clean up signed vs. unsigned XBZRLE cache-size
73af8dd8d7 "migration: Make xbzrle_cache_size a migration
parameter" (v2.11.0) made the new parameter unsigned (QAPI type
'size', uint64_t in C).  It neglected to update existing code, which
continues to use int64_t.

migrate_xbzrle_cache_size() returns the new parameter.  Adjust its
return type.

QMP query-migrate-cache-size returns migrate_xbzrle_cache_size().
Adjust its return type.

migrate-set-parameters passes the new parameter to
xbzrle_cache_resize().  Adjust its parameter type.

xbzrle_cache_resize() passes it on to cache_init().  Adjust its
parameter type.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210202141734.2488076-3-armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Andrey Gruzdev
8518278a6a migration: implementation of background snapshot thread
Introducing implementation of 'background' snapshot thread
which in overall follows the logic of precopy migration
while internally utilizes completely different mechanism
to 'freeze' vmstate at the start of snapshot creation.

This mechanism is based on userfault_fd with wr-protection
support and is Linux-specific.

Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210129101407.103458-5-andrey.gruzdev@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Andrey Gruzdev
278e2f551a migration: support UFFD write fault processing in ram_save_iterate()
In this particular implementation the same single migration
thread is responsible for both normal linear dirty page
migration and procesing UFFD page fault events.

Processing write faults includes reading UFFD file descriptor,
finding respective RAM block and saving faulting page to
the migration stream. After page has been saved, write protection
can be removed. Since asynchronous version of qemu_put_buffer()
is expected to be used to save pages, we also have to flush
migraion stream prior to un-protecting saved memory range.

Write protection is being removed for any previously protected
memory chunk that has hit the migration stream. That's valid
for pages from linear page scan along with write fault pages.

Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210129101407.103458-4-andrey.gruzdev@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  fixup pagefault.address cast for 32bit
2021-02-08 11:19:51 +00:00
Andrey Gruzdev
6e8c25b4c6 migration: introduce 'background-snapshot' migration capability
Add new capability to 'qapi/migration.json' schema.
Update migrate_caps_check() to validate enabled capability set
against introduced one. Perform checks for required kernel features
and compatibility with guest memory backends.

Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210129101407.103458-2-andrey.gruzdev@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Wainer dos Santos Moschetta
1dfafcbd39 migration/qemu-file: Fix maybe uninitialized on qemu_get_buffer_in_place()
Fixed error when compiling migration/qemu-file.c with -Werror=maybe-uninitialized
as shown here:

../migration/qemu-file.c: In function 'qemu_get_buffer_in_place':
../migration/qemu-file.c:604:18: error: 'src' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  604 |             *buf = src;
      |             ~~~~~^~~~~
cc1: all warnings being treated as errors

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Message-Id: <20210128130625.569900-1-wainersm@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Jinhao Gao
39f633d429 savevm: Fix memory leak of vmstate_configuration
When VM migrate VMState of configuration, the fields(name and capabilities)
of configuration having a flag of VMS_ALLOC need to allocate memory. If the
src doesn't free memory of capabilities in SaveState after save VMState of
configuration, or the dst doesn't free memory of name and capabilities in post
load of configuration, it may result in memory leak of name and capabilities.
We free memory in configuration_post_save and configuration_post_load func,
which prevents memory leak.

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Jinhao Gao <gaojinhao@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20201231061020.828-3-gaojinhao@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Eric Blake
95b3a8c8a8 qapi: More complex uses of QAPI_LIST_APPEND
These cases require a bit more thought to review; in each case, the
code was appending to a list, but not with a FOOList **tail variable.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210113221013.390592-6-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Flawed change to qmp_guest_network_get_interfaces() dropped]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-01-28 08:08:45 +01:00
Lukas Straub
b5eea99ec2 migration: Add yank feature
Register yank functions on sockets to shut them down.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <484c6a14cc2506bebedd5a237259b91363ff8f88.1609167865.git.lukasstraub2@web.de>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-01-13 10:21:17 +01:00
Peter Maydell
729cc68373 Remove superfluous timer_del() calls
This commit is the result of running the timer-del-timer-free.cocci
script on the whole source tree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20201215154107.3255-4-peter.maydell@linaro.org
2021-01-08 15:13:38 +00:00
Paolo Bonzini
b1def33d19 zstd: convert to meson
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-01-06 10:21:20 +01:00
Peter Maydell
41192db338 Machine queue, 2020-12-23
Cleanup:
 * qdev code cleanup (Eduardo Habkost)
 
 Bug fix:
 * hostmem: Free host_nodes list right after visited (Keqian Zhu)
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEWjIv1avE09usz9GqKAeTb5hNxaYFAl/jteYUHGVoYWJrb3N0
 QHJlZGhhdC5jb20ACgkQKAeTb5hNxaZUHw//c40nRlYdGSV5j6w3ZCSlmZRFxZTU
 UiLK51Z3hI9Q9kyLcoIQitEYlQTIbgp0qlIJ6evDd/HvQvZ+P4P0Lzm1UGOZhD0h
 nJk5+bBkP/mzMh0P9oiN20DSLk6a3Wvdiu/bQR8gm/WdLvTM1Zjek1ns5tL06ZvA
 MziG6gIypgScu2FeNxD0zC8sDO16oVrzKq7mjZcQe6XYFRsJmYjZw84v+uu/Bdf7
 MBxolkA8vYwwBJNdVsAf7I0Gw3BeArgPUOwbWyt8/tuGIOZxYjdKIj55S7j2fuju
 524sg8Di+YzxmLZaNAGksEBMj9uY39nwdHGhNElMtWCM9oOPumlps9eyLtpTagfM
 wmiVrMGWVlXV6c4kZo8R2NSF8hcDr02S7eyrUpITrh09p4nT6fBGG2ufEYiCyNao
 o9ZqMf7NUO5J60zM5EOfdGxpaN2O0M5pXCCN48NtmqvO0wIAfTc9l/OkCrrfVbEO
 Q/X1jqbj6ZcilSIl9OeLAPi7Xjx26jMeeLPUQtoZnkqDvpk/Vz6Ka1RgGG86QA5z
 2W/KCAoVrg6dO4f9vY3x84rf0Ta5kJtp2LezPgG8d++4bMSf2jN00wYvAQuCyqqW
 zbm8f57YST3vm8XMHPlmtnlKfiLI4wbVUmrDYu3rNI+JgdvhdXseGoErt15ejAcL
 B5IH2SK4AwMpSsk=
 =bnjc
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost-gl/tags/machine-next-pull-request' into staging

Machine queue, 2020-12-23

Cleanup:
* qdev code cleanup (Eduardo Habkost)

Bug fix:
* hostmem: Free host_nodes list right after visited (Keqian Zhu)

# gpg: Signature made Wed 23 Dec 2020 21:25:58 GMT
# gpg:                using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6
# gpg:                issuer "ehabkost@redhat.com"
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full]
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost-gl/tags/machine-next-pull-request:
  bugfix: hostmem: Free host_nodes list right after visited
  qdev: Avoid unnecessary DeviceState* variable at set_prop_arraylen()
  qdev: Rename qdev_get_prop_ptr() to object_field_prop_ptr()
  qdev: Move qdev_prop_tpm declaration to tpm_prop.h
  qdev: Make qdev_class_add_property() more flexible
  qdev: Make PropertyInfo.create return ObjectProperty*
  qdev: Move dev->realized check to qdev_property_set()
  qdev: Wrap getters and setters in separate helpers
  qdev: Add name argument to PropertyInfo.create method
  qdev: Add name parameter to qdev_class_add_property()
  qdev: Avoid using prop->name unnecessarily
  qdev: Get just property name at error_set_from_qdev_prop_error()
  sparc: Use DEFINE_PROP for nwindows property
  qdev: Reuse DEFINE_PROP in all DEFINE_PROP_* macros
  qdev: Move softmmu properties to qdev-properties-system.h

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-01-01 22:57:15 +00:00
Peter Maydell
1f7c02797f QAPI patches patches for 2020-12-19
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAl/dynUSHGFybWJydUBy
 ZWRoYXQuY29tAAoJEDhwtADrkYZT3igP/3bWwsKR5vKVsDUTmMfrhcgaFvQiaYoG
 F29Bond8Xy0Zd0gl7OWh/5jKL0vGlrEVPrKfYLUjMnfkeRec/pOkIB2oOmIxpnPs
 9zi4kh2hQ3dEoRBuvSnnZzedetYPTuCpWMIjlztkgfxgcimqm8TPNVSxRaSApjC3
 Y8108wGwBWVf2C0rhKO9E2xA51uo6khy05i1psUtqUlC+PuDQ/OwzQHM2dnWdDB6
 kUwBDK17nhL6WwsYqCyKLSiDModReYfDiY8GS5MDLo74dzwXiatEefCR7+sbM4xq
 eX/SBoqoeS1jLPNuCryNeGNKvNA2KAbEJTnbQA2NxBXHgZ9/1SxVZFxuPp4nDMSQ
 N7BDuDI8YtJE479RjT/ZzRG65xadGBSe/HXkXM9mZwh1zitop8SVZ9fArFBHvNzw
 Y5zAv3fQd54+87psffg4dYFK0wGmqTabLEEuVzM8KIVqcAdYA2yC2b2EHy+vsxuq
 GMkr0WaA6Sq2gthXmzdTjmUPuHdan/NIhuV6d66SbPNH2oH31piptFxuznyFWSKV
 isciFFdUrkg5QrF8DSt2nmdwMFf8QGbszqP8QIGMzhJCCS9GXIiGG8f149++q8X8
 HO1lFAdLQJdrDwCYmfx36tOvi2rS/rcoTGgvg66UX3xKko1ruoxR1ZWcS54obJN6
 vEQDZ+PxubDg
 =vGLy
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2020-12-19' into staging

QAPI patches patches for 2020-12-19

# gpg: Signature made Sat 19 Dec 2020 09:40:05 GMT
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qapi-2020-12-19: (33 commits)
  qobject: Make QString immutable
  block: Use GString instead of QString to build filenames
  keyval: Use GString to accumulate value strings
  json: Use GString instead of QString to accumulate strings
  migration: Replace migration's JSON writer by the general one
  qobject: Factor JSON writer out of qobject_to_json()
  qobject: Factor quoted_str() out of to_json()
  qobject: Drop qstring_get_try_str()
  qobject: Drop qobject_get_try_str()
  Revert "qobject: let object_property_get_str() use new API"
  block: Avoid qobject_get_try_str()
  qmp: Fix tracing of non-string command IDs
  qobject: Move internals to qobject-internal.h
  hw/rdma: Replace QList by GQueue
  Revert "qstring: add qstring_free()"
  qobject: Change qobject_to_json()'s value to GString
  qobject: Use GString instead of QString to accumulate JSON
  qobject: Make qobject_to_json_pretty() take a pretty argument
  monitor: Use GString instead of QString for output buffer
  hmp: Simplify how qmp_human_monitor_command() gets output
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-01-01 14:33:03 +00:00
Markus Armbruster
3ddba9a9e9 migration: Replace migration's JSON writer by the general one
Commit 8118f0950f "migration: Append JSON description of migration
stream" needs a JSON writer.  The existing qobject_to_json() wasn't a
good fit, because it requires building a QObject to convert.  Instead,
migration got its very own JSON writer, in commit 190c882ce2 "QJSON:
Add JSON writer".  It tacitly limits numbers to int64_t, and strings
contents to characters that don't need escaping, unlike
qobject_to_json().

The previous commit factored the JSON writer out of qobject_to_json().
Replace migration's JSON writer by it.

Cc: Juan Quintela <quintela@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20201211171152.146877-17-armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-12-19 10:39:16 +01:00
Eric Blake
54aa3de72e qapi: Use QAPI_LIST_PREPEND() where possible
Anywhere we create a list of just one item or by prepending items
(typically because order doesn't matter), we can use
QAPI_LIST_PREPEND().  But places where we must keep the list in order
by appending remain open-coded until later patches.

Note that as a side effect, this also performs a cleanup of two minor
issues in qga/commands-posix.c: the old code was performing
 new = g_malloc0(sizeof(*ret));
which 1) is confusing because you have to verify whether 'new' and
'ret' are variables with the same type, and 2) would conflict with C++
compilation (not an actual problem for this file, but makes
copy-and-paste harder).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201113011340.463563-5-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
[Straightforward conflicts due to commit a8aa94b5f8 "qga: update
schema for guest-get-disks 'dependents' field" and commit a10b453a52
"target/mips: Move mips_cpu_add_definition() from helper.c to cpu.c"
resolved.  Commit message tweaked.]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-12-19 10:20:14 +01:00
Eric Blake
eaedde5255 migration: Refactor migrate_cap_add
Instead of taking a list parameter and returning a new head at a
distance, just return the new item for the caller to insert into a
list via QAPI_LIST_PREPEND.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201113011340.463563-4-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-12-19 10:15:08 +01:00
Eduardo Habkost
ce35e2295e qdev: Move softmmu properties to qdev-properties-system.h
Move the property types and property macros implemented in
qdev-properties-system.c to a new qdev-properties-system.h
header.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20201211220529.2290218-16-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-12-18 15:20:17 -05:00
Tuguoyi
36d0fe6516 migration: Don't allow migration if vm is in POSTMIGRATE
The following steps will cause qemu assertion failure:
- pause vm by executing 'virsh suspend'
- create external snapshot of memory and disk using 'virsh snapshot-create-as'
- doing the above operation again will cause qemu crash

The backtrace looks like:
    at /build/qemu-5.0/migration/savevm.c:1401
    at /build/qemu-5.0/migration/savevm.c:1453

When the first migration completes, bs->open_flags will set BDRV_O_INACTIVE
flag by bdrv_inactivate_all(), and during the second migration the
bdrv_inactivate_recurse assert that the bs->open_flags is already
BDRV_O_INACTIVE enabled which cause crash.

As Vladimir suggested, this patch makes migrate_prepare check the state of vm and
return error if it is in RUN_STATE_POSTMIGRATE state.

Signed-off-by: Tuguoyi <tu.guoyi@h3c.com>
Message-Id: <6b704294ad2e405781c38fb38d68c744@h3c.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reported-by: Li Zhang <li.zhang@cloud.ionos.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-12-18 10:08:25 +00:00
Tuguoyi
2a909dc430 savevm: Delete snapshots just created in case of error
bdrv_all_create_snapshot() can fails with some snapshots created,
so it's better to delete those snapshots before returns to the caller

Signed-off-by: Tuguoyi <tu.guoyi@h3c.com>
Message-Id: <1607410416-13563-3-git-send-email-tu.guoyi@h3c.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-12-18 10:08:24 +00:00
Tuguoyi
80ef0586d3 savevm: Remove dead code in save_snapshot()
The snapshot in each bs is deleted at the beginning, so there is no need
to find the snapshot again.

Signed-off-by: Tuguoyi <tu.guoyi@h3c.com>
Message-Id: <1607410416-13563-2-git-send-email-tu.guoyi@h3c.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-12-18 10:08:24 +00:00
Paolo Bonzini
e69d50d621 migration, vl: start migration via qmp_migrate_incoming
Make qemu_start_incoming_migration local to migration/migration.c.
By using the runstate instead of a separate flag, vl need not do
anything to setup deferred incoming migration.

qmp_migrate_incoming also does not need the deferred_incoming flag
anymore, because "-incoming PROTOCOL" will clear the "once" flag
before the main loop starts.  Therefore, later invocations of
the migrate-incoming command will fail with the existing
"The incoming migration has already been started" error message.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-10 12:15:14 -05:00
Paolo Bonzini
e0d17dfd22 vl: move various initialization routines out of qemu_init
Some very simple initialization routines can be nested in existing
subsystem-level functions, do that to simplify qemu_init.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-10 12:15:11 -05:00
Chetan Pant
ef19b50d93 migration: Fix Lesser GPL version number
There is no "version 2" of the "Lesser" General Public License.
It is either "GPL version 2.0" or "Lesser GPL version 2.1".
This patch replaces all occurrences of "Lesser GPL version 2" with
"Lesser GPL version 2.1" in comment section.

Signed-off-by: Chetan Pant <chetan4windows@gmail.com>
Message-Id: <20201023123130.19656-1-chetan4windows@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2020-11-15 16:43:28 +01:00
Longpeng(Mike)
6ba11211bd migration: handle CANCELLING state in migration_completion()
The following sequence may cause the VM abort during migration:

1. RUN_STATE_RUNNING,MIGRATION_STATUS_ACTIVE

2. before call migration_completion(), we send migrate_cancel
   QMP command, the state machine is changed to:
     RUN_STATE_RUNNING,MIGRATION_STATUS_CANCELLING

3. call migration_completion(), and the state machine is
   switch to: RUN_STATE_RUNNING,MIGRATION_STATUS_COMPLETED

4. call migration_iteration_finish(), because the migration
   status is COMPLETED, so it will try to set the runstate
   to POSTMIGRATE, but RUNNING-->POSTMIGRATE is an invalid
   transition, so abort().

The migration_completion() should not change the migration state
to COMPLETED if it is already changed to CANCELLING.

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Message-Id: <20201105091726.148-1-longpeng2@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-11-12 15:52:20 +00:00
Chuan Zheng
9e8424088c multifd/tls: fix memoryleak of the QIOChannelSocket object when cancelling migration
When creating new tls client, the tioc->master will be referenced which results in socket
leaking after multifd_save_cleanup if we cancel migration.
Fix it by do object_unref() after tls client creation.

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
Message-Id: <1605104763-118687-1-git-send-email-zhengchuan@huawei.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-11-12 15:52:20 +00:00
Chuan Zheng
a18ed79b19 migration/dirtyrate: simplify includes in dirtyrate.c
Remove redundant blank line which is left by Commit 662770af7c,
also take this opportunity to remove redundant includes in dirtyrate.c.

Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
Message-Id: <1604030281-112946-1-git-send-email-zhengchuan@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-11-12 15:52:14 +00:00
Chen Qun
a24292830b migration: fix uninitialized variable warning in migrate_send_rp_req_pages()
After the WITH_QEMU_LOCK_GUARD macro is added, the compiler cannot identify
 that the statements in the macro must be executed. As a result, some variables
 assignment statements in the macro may be considered as unexecuted by the compiler.

When the -Wmaybe-uninitialized capability is enabled on GCC9,the compiler showed warning:
migration/migration.c: In function ‘migrate_send_rp_req_pages’:
migration/migration.c:384:8: warning: ‘received’ may be used uninitialized in this function [-Wmaybe-uninitialized]
 384 |     if (received) {
     |        ^

Add a default value for 'received' to prevented the warning.

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20201111142203.2359370-6-kuhn.chenqun@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-11-12 14:49:16 +00:00
Chuan Zheng
a1af605bd5 migration/multifd: fix hangup with TLS-Multifd due to blocking handshake
The qemu main loop could hang up forever when we enable TLS+Multifd.
The Src multifd_send_0 invokes tls handshake, it sends hello to sever
and wait response.
However, the Dst main qemu loop has been waiting recvmsg() for multifd_recv_1.
Both of Src and Dst main qemu loop are blocking and waiting for reponse which
results in hanging up forever.

Src: (multifd_send_0)                                              Dst: (multifd_recv_1)
multifd_channel_connect                                            migration_channel_process_incoming
  multifd_tls_channel_connect                                        migration_tls_channel_process_incoming
    multifd_tls_channel_connect                                        qio_channel_tls_handshake_task
       qio_channel_tls_handshake                                         gnutls_handshake
          qio_channel_tls_handshake_task                                       ...
            qcrypto_tls_session_handshake                                      ...
              gnutls_handshake                                                 ...
                   ...                                                         ...
                recvmsg (Blocking I/O waiting for response)                recvmsg (Blocking I/O waiting for response)

Fix this by offloadinig handshake work to a background thread.

Reported-by: Yan Jin <jinyan12@huawei.com>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
Message-Id: <1604643893-8223-1-git-send-email-zhengchuan@huawei.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-11-12 14:35:29 +00:00
Philippe Mathieu-Daudé
af3bbbe984 migration/ram: Fix hexadecimal format string specifier
The '%u' conversion specifier is for decimal notation.
When prefixing a format with '0x', we want the hexadecimal
specifier ('%x').

Inspired-by: Dov Murik <dovmurik@linux.vnet.ibm.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20201103112558.2554390-5-philmd@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-11-12 14:02:41 +00:00
Rao, Lei
b70cb3b485 Reduce the time of checkpoint for COLO
we should set ram_bulk_stage to false after ram_state_init,
otherwise the bitmap will be unused in migration_bitmap_find_dirty.
all pages in ram cache will be flushed to the ram of secondary guest
for each checkpoint.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Signed-off-by: Derek Su <dereksu@qnap.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2020-11-11 16:52:23 +08:00
Peter Xu
5e77343113 migration: Postpone the kick of the fault thread after recover
The new migrate_send_rp_req_pages_pending() call should greatly improve
destination responsiveness because it will resync faulted address after
postcopy recovery.  However it is also the 1st place to initiate the page
request from the main thread.

One thing is overlooked on that migrate_send_rp_message_req_pages() is not
designed to be thread-safe.  So if we wake the fault thread before syncing all
the faulted pages in the main thread, it means they can race.

Postpone the wake up operation after the sync of faulted addresses.

Fixes: 0c26781c09 ("migration: Sync requested pages after postcopy recovery")
Tested-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20201102153010.11979-3-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-11-02 18:25:48 +00:00
Peter Xu
cc5ab87200 migration: Unify reset of last_rb on destination node when recover
When postcopy recover happens, we need to reset last_rb after each return of
postcopy_pause_fault_thread() because that means we just got the postcopy
migration continued.

Unify this reset to the place right before we want to kick the fault thread
again, when we get the command MIG_CMD_POSTCOPY_RESUME from source.

This is actually more than that - because the main thread on destination will
now be able to call migrate_send_rp_req_pages_pending() too, so the fault
thread is not the only user of last_rb now.  Move the reset earlier will allow
the first call to migrate_send_rp_req_pages_pending() to use the reset value
even if called from the main thread.

(NOTE: this is not a real fix to 0c26781c09 mentioned below, however it is just
 a mark that when picking up 0c26781c09 we'd better have this one too; the real
 fix will come later)

Fixes: 0c26781c09 ("migration: Sync requested pages after postcopy recovery")
Tested-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20201102153010.11979-2-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-11-02 18:25:39 +00:00
Kirti Wankhede
3710586caa qapi: Add VFIO devices migration stats in Migration stats
Added amount of bytes transferred to the VM at destination by all VFIO
devices

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-01 12:30:51 -07:00
Peter Maydell
d55450df99 migration pull: 2020-10-26
Another go at Peter's postcopy fixes
 
 Cleanups from Bihong Yu and Peter Maydell.
 
 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAl+W9n8ACgkQBRYzHrxb
 /ef2uRAAqWTFLXuBF8+evEd1mMq2SM3ZYTuc7QKTY3MzAH6J/OMvJbZ112itqWOb
 iZ5NuuWH4PvzOhlR/PNNf1Yv3hTfv36HinG+OCh6s+6aqVx9yHOAfdBgmJIdYAeg
 Sk1jx43dvCyN2FwPs31ir3L6mwsrtfkRsS+2FeyrvRoEl4WE9mOoypCft3vdd9Dw
 zZea0Pw7vIs454D4n1vpJiQtq6B4eSAlQKpTLfQbglpTm4MgqLERzGvpT6hbQXJR
 eQyTOqRe08viIOZ+oN0B/+RVO6T9jc4Y1bEl2NSak1v4Tf7NNfDkFpLAjFm07V/1
 tIhL/NOOsHdzfHQtrZpzKQgwaceb1N5qo0PfxD6/tRf9HlXY54iw6yY75+5c5Y89
 UK8VSIYKnM2yXeVDLShxixIr3A1Z+zA41XydDwaLZczjeV7+nwrAXAjO8a+j6Dox
 zj4IyN2g5elEOmarC8qkvbDZ+TVvA2tookhWVwoz+D8ChYkcRDKP9eoYomfRwg+e
 NKRFuLBkyVPb0eEhyOV6HqJbMfTLpHneTM94v6HGz8tiK8IlMZfTTnC2Mr5gTXuS
 /cgOVhsY7+l+pKpxpGJmU3aUCYRk1CuK6MhXgjYEFMh5Siba8s0ZPZVaEm/BUyO1
 rD+tVup87xMiJq3xnmLX+opblYE9G+b67hH1KuPc5vZXiSwuTkQ=
 =OL0Q
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20201026a' into staging

migration pull: 2020-10-26

Another go at Peter's postcopy fixes

Cleanups from Bihong Yu and Peter Maydell.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

# gpg: Signature made Mon 26 Oct 2020 16:17:03 GMT
# gpg:                using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20201026a:
  migration-test: Only hide error if !QTEST_LOG
  migration/postcopy: Release fd before going into 'postcopy-pause'
  migration: Sync requested pages after postcopy recovery
  migration: Maintain postcopy faulted addresses
  migration: Introduce migrate_send_rp_message_req_pages()
  migration: Pass incoming state into qemu_ufd_copy_ioctl()
  migration: using trace_ to replace DPRINTF
  migration: Delete redundant spaces
  migration: Open brace '{' following function declarations go on the next line
  migration: Do not initialise statics and globals to 0 or NULL
  migration: Add braces {} for if statement
  migration: Open brace '{' following struct go on the same line
  migration: Add spaces around operator
  migration: Don't use '#' flag of printf format
  migration: Do not use C99 // comments
  migration: Drop unused VMSTATE_FLOAT64 support

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-10-27 10:25:42 +00:00
Peter Maydell
091e3e3dbc bitmaps patches for 2020-10-26
- fix infloop on large bitmap granularity
 - silence compiler warning
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAl+WuYYACgkQp6FrSiUn
 Q2owKwgAhXf5jJEoUzSMFIIczxST9eaokL4Jweb65bp2y4Ii014/kzAnwyU+7B1Y
 DMniSrpySi1F74zng2eOKijgNDgQ1INm1v5P/qeoMa4UpPwL3eaNQl3FwMUCJMdN
 WVo3jl1wxzqcRDVUWDWiAMD0C03nzRKI8i2uw4USUDXiW12bLC0YQPzOCofZagld
 xC1HAnqRMCnK10lj4aoY5jOHQL9R05TLtsEcQCdI73q3c87p+sxI1eOlqJcZw1KC
 vgb6uMOXQrkTPKfVsUK5bJ+aJh97LHhWb/o/Jj65f5IMbBQzaebRby3m38Pgph2W
 +AVuxe2FBAo7o+heucbOrrFwAUcofw==
 =ArcV
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ericb/tags/pull-bitmaps-2020-10-26' into staging

bitmaps patches for 2020-10-26

- fix infloop on large bitmap granularity
- silence compiler warning

# gpg: Signature made Mon 26 Oct 2020 11:56:54 GMT
# gpg:                using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full]
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full]
# gpg:                 aka "[jpeg image of size 6874]" [full]
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-bitmaps-2020-10-26:
  migration/block-dirty-bitmap: fix uninitialized variable warning
  migration/block-dirty-bitmap: fix larger granularity bitmaps

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-10-26 22:36:35 +00:00
Peter Xu
d246ea5039 migration/postcopy: Release fd before going into 'postcopy-pause'
Logically below race could trigger with the old code:

          test program                        migration thread
          ------------                        ----------------
       wait_until('postcopy-pause')
                                          postcopy_pause()
                                            set_state('postcopy-pause')
       do_postcopy_recover()
         arm s->to_dst_file with new fd
                                            release s->to_dst_file [1]

Here [1] could have released the just-installed recoverying channel.  Then the
migration could hang without really resuming.

Instead, it should be very safe to release the fd before setting the state into
'postcopy-pause', because there's no reason for any other thread to touch it
during 'postcopy-active'.

Dave reported a very rare postcopy recovery hang that the migration-test
program waited for the migration to complete in migrate_postcopy_complete().
We do suspect it's the same thing that we're gonna fix here.  Hard to tell.
However since we've noticed this, fix this irrelevant of the hang report.

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20201021212721.440373-6-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 16:15:04 +00:00
Peter Xu
0c26781c09 migration: Sync requested pages after postcopy recovery
We synchronize the requested pages right after a postcopy recovery happens.
This helps to synchronize the prioritized pages on source so that the faulted
threads can be served faster.

Reported-by: Xiaohui Li <xiaohli@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20201021212721.440373-5-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 16:15:04 +00:00
Peter Xu
8f8bfffcf1 migration: Maintain postcopy faulted addresses
Maintain a list of faulted addresses on the destination host for which we're
waiting on.  This is implemented using a GTree rather than a real list to make
sure even there're plenty of vCPUs/threads that are faulting, the lookup will
still be fast with O(log(N)) (because we'll do that after placing each page).
It should bring a slight overhead, but ideally that shouldn't be a big problem
simply because in most cases the requested page list will be short.

Actually we did similar things for postcopy blocktime measurements.  This patch
didn't use that simply because:

  (1) blocktime measurement is towards vcpu threads only, but here we need to
      record all faulted addresses, including main thread and external
      thread (like, DPDK via vhost-user).

  (2) blocktime measurement will require UFFD_FEATURE_THREAD_ID, but here we
      don't want to add that extra dependency on the kernel version since not
      necessary.  E.g., we don't need to know which thread faulted on which
      page, we also don't care about multiple threads faulting on the same
      page.  But we only care about what addresses are faulted so waiting for a
      page copying from src.

  (3) blocktime measurement is not enabled by default.  However we need this by
      default especially for postcopy recover.

Another thing to mention is that this patch introduced a new mutex to serialize
the receivedmap and the page_requested tree, however that serialization does
not cover other procedures like UFFDIO_COPY.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20201021212721.440373-4-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 16:15:04 +00:00
Peter Xu
7a267fc49b migration: Introduce migrate_send_rp_message_req_pages()
This is another layer wrapper for sending a page request to the source VM.  The
new migrate_send_rp_message_req_pages() will be used elsewhere in coming
patches.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20201021212721.440373-3-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 16:15:04 +00:00
Peter Xu
eef621c4e6 migration: Pass incoming state into qemu_ufd_copy_ioctl()
It'll be used in follow up patches to access more fields out of it.  Meanwhile
fetch the userfaultfd inside the function.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20201021212721.440373-2-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 16:15:04 +00:00
Bihong Yu
fe80c0241d migration: using trace_ to replace DPRINTF
Signed-off-by: Bihong Yu <yubihong@huawei.com>
Message-Id: <1603179176-5360-1-git-send-email-yubihong@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 16:15:04 +00:00
Bihong Yu
0bcae62333 migration: Delete redundant spaces
Signed-off-by: Bihong Yu <yubihong@huawei.com>
Reviewed-by: Chuan Zheng <zhengchuan@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1603163448-27122-9-git-send-email-yubihong@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 16:15:04 +00:00
Bihong Yu
cbfc71b52b migration: Open brace '{' following function declarations go on the next line
Signed-off-by: Bihong Yu <yubihong@huawei.com>
Reviewed-by: Chuan Zheng <zhengchuan@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1603163448-27122-8-git-send-email-yubihong@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 16:15:04 +00:00
Bihong Yu
49324e939c migration: Do not initialise statics and globals to 0 or NULL
Signed-off-by: Bihong Yu <yubihong@huawei.com>
Reviewed-by: Chuan Zheng <zhengchuan@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1603163448-27122-7-git-send-email-yubihong@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 16:15:04 +00:00
Bihong Yu
f4c51a6bfd migration: Add braces {} for if statement
Signed-off-by: Bihong Yu <yubihong@huawei.com>
Reviewed-by: Chuan Zheng <zhengchuan@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1603163448-27122-6-git-send-email-yubihong@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 16:15:04 +00:00
Bihong Yu
f16aee44b4 migration: Open brace '{' following struct go on the same line
Signed-off-by: Bihong Yu <yubihong@huawei.com>
Reviewed-by: Chuan Zheng <zhengchuan@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1603163448-27122-5-git-send-email-yubihong@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 16:15:04 +00:00