qemu/migration
Li Zhang 077fbb5942 multifd: Shut down the QIO channels to avoid blocking the send threads when they are terminated.
When doing live migration with 8, 16, or more multifd channels,
the guest hangs in the presence of network errors such as missing TCP ACKs.

On the sender's side:
When the network problem occurs, one send thread fails in
qio_channel_write_all, so migrate_fd_cleanup is called. The other send
threads are still blocked in sendmsg and cannot be terminated, so the main
thread blocks in qemu_thread_join, waiting for them to exit.

(gdb) bt
0  0x00007f30c8dcffc0 in __pthread_clockjoin_ex () at /lib64/libpthread.so.0
1  0x000055cbb716084b in qemu_thread_join (thread=0x55cbb881f418) at ../util/qemu-thread-posix.c:627
2  0x000055cbb6b54e40 in multifd_save_cleanup () at ../migration/multifd.c:542
3  0x000055cbb6b4de06 in migrate_fd_cleanup (s=0x55cbb8024000) at ../migration/migration.c:1808
4  0x000055cbb6b4dfb4 in migrate_fd_cleanup_bh (opaque=0x55cbb8024000) at ../migration/migration.c:1850
5  0x000055cbb7173ac1 in aio_bh_call (bh=0x55cbb7eb98e0) at ../util/async.c:141
6  0x000055cbb7173bcb in aio_bh_poll (ctx=0x55cbb7ebba80) at ../util/async.c:169
7  0x000055cbb715ba4b in aio_dispatch (ctx=0x55cbb7ebba80) at ../util/aio-posix.c:381
8  0x000055cbb7173ffe in aio_ctx_dispatch (source=0x55cbb7ebba80, callback=0x0, user_data=0x0) at ../util/async.c:311
9  0x00007f30c9c8cdf4 in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0
10 0x000055cbb71851a2 in glib_pollfds_poll () at ../util/main-loop.c:232
11 0x000055cbb718521c in os_host_main_loop_wait (timeout=42251070366) at ../util/main-loop.c:255
12 0x000055cbb7185321 in main_loop_wait (nonblocking=0) at ../util/main-loop.c:531
13 0x000055cbb6e6ba27 in qemu_main_loop () at ../softmmu/runstate.c:726
14 0x000055cbb6ad6fd7 in main (argc=68, argv=0x7ffc0c578888, envp=0x7ffc0c578ab0) at ../softmmu/main.c:50

To make sure that the send threads can be terminated, their IO channels
should be shut down so that the threads do not keep waiting on IO.
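
The effect can be illustrated outside QEMU with a minimal sketch. It is not
the code from this commit: it uses plain POSIX sockets and pthreads on Linux
instead of QIO channels, and the names `struct sender`, `send_thread` and
`shutdown_unblocks_sender` are hypothetical. It assumes that shutdown() on a
socket wakes a thread blocked in send() on that socket, which is the Linux
behaviour this fix relies on.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for the state of one multifd send thread. */
struct sender {
    int fd;         /* socket the thread writes to */
    int last_errno; /* errno of the send() call that failed */
};

/*
 * Write until send() fails. The peer never reads, so once the socket
 * buffer is full send() blocks, like a send thread stuck in sendmsg.
 */
static void *send_thread(void *opaque)
{
    struct sender *s = opaque;
    char buf[4096];

    memset(buf, 0, sizeof(buf));
    for (;;) {
        /* MSG_NOSIGNAL: report an error instead of raising SIGPIPE. */
        if (send(s->fd, buf, sizeof(buf), MSG_NOSIGNAL) < 0) {
            s->last_errno = errno;
            return NULL;
        }
    }
}

/* Returns 0 if shutting the socket down let the sender thread exit. */
int shutdown_unblocks_sender(void)
{
    int fds[2];
    pthread_t tid;
    struct sender s = { .fd = -1, .last_errno = 0 };

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
        return -1;
    }
    assert(fds[0] >= 0 && fds[1] >= 0);
    s.fd = fds[0];

    if (pthread_create(&tid, NULL, send_thread, &s) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Sleep about 200 ms so the sender has time to fill the buffer. */
    poll(NULL, 0, 200);

    /*
     * Shut the socket down first, then join. Joining without the
     * shutdown would wait forever, as in the backtrace above.
     */
    shutdown(fds[0], SHUT_RDWR);
    pthread_join(tid, NULL);

    close(fds[0]);
    close(fds[1]);
    return s.last_errno != 0 ? 0 : -1;
}
```

Calling `shutdown_unblocks_sender()` from a small `main()` should return
promptly with 0. With the shutdown() call removed, pthread_join() would be
expected to hang in the same way as qemu_thread_join does above.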

Signed-off-by: Li Zhang <lizhang@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-12-15 10:31:42 +01:00
..
block-dirty-bitmap.c migration: block-dirty-bitmap: add missing qemu_mutex_lock_iothread 2021-10-05 13:10:29 +02:00
block.c migration: using trace_ to replace DPRINTF 2020-10-26 16:15:04 +00:00
block.h migration: disable auto-converge during bulk block migration 2017-09-27 11:27:14 +01:00
channel.c migration: Introduce migration_ioc_[un]register_yank() 2021-07-26 12:44:54 +01:00
channel.h migration: Route errors down through migration_channel_connect 2018-02-06 10:55:12 +00:00
colo-failover.c qemu/atomic.h: rename atomic_ to qatomic_ 2020-09-23 16:07:44 +01:00
colo.c migration/colo: Optimize COLO primary node start code path 2021-12-15 10:31:42 +01:00
dirtyrate.c migration/dirtyrate: implement dirty-bitmap dirtyrate calculation 2021-11-01 22:56:44 +01:00
dirtyrate.h migration/dirtyrate: introduce struct and adjust DirtyRateStat 2021-11-01 22:56:43 +01:00
exec.c migration: unify incoming processing 2018-07-10 12:48:53 +01:00
exec.h migration: Export exec.c functions in its own file 2017-06-01 18:49:22 +02:00
fd.c monitor: Use getter/setter functions for cur_mon 2020-10-09 07:08:19 +02:00
fd.h migration: Fix fd protocol for incoming defer 2019-06-05 12:43:55 +02:00
global_state.c migration: Silence compiler warning in global_state_store_running() 2020-10-02 12:28:48 +01:00
meson.build migration: Move populate_vfio_info() into a separate file 2021-05-14 12:31:51 +02:00
migration.c migration: Never call twice qemu_target_page_size() 2021-12-15 10:31:42 +01:00
migration.h migration: provide an error message to migration_cancel() 2021-11-03 09:38:53 +01:00
multifd-zlib.c multifd: remove used parameter from send_recv_pages() method 2021-12-15 10:31:42 +01:00
multifd-zstd.c multifd: remove used parameter from send_recv_pages() method 2021-12-15 10:31:42 +01:00
multifd.c multifd: Shut down the QIO channels to avoid blocking the send threads when they are terminated. 2021-12-15 10:31:42 +01:00
multifd.h multifd: remove used parameter from send_recv_pages() method 2021-12-15 10:31:42 +01:00
page_cache.c migration: Fix cache_init()'s "Failed to allocate" error messages 2021-02-08 11:19:51 +00:00
page_cache.h migration: Clean up signed vs. unsigned XBZRLE cache-size 2021-02-08 11:19:51 +00:00
postcopy-ram.c migration: Check that postcopy fd's are not NULL 2021-11-06 12:35:29 +01:00
postcopy-ram.h migration/: fix some comment spelling errors 2020-09-17 20:36:32 +02:00
qemu-file-channel.c migration: Move the yank unregister of channel_close out 2021-07-26 12:45:03 +01:00
qemu-file-channel.h migration: Export qemu-file-channel.c functions in its own file 2017-05-18 19:20:50 +02:00
qemu-file.c migration: Teach QEMUFile to be QIOChannel-aware 2021-07-26 12:44:59 +01:00
qemu-file.h migration: Teach QEMUFile to be QIOChannel-aware 2021-07-26 12:44:59 +01:00
ram.c migration: Remove is_zero_range() 2021-12-15 10:31:42 +01:00
ram.h Reset the auto-converge counter at every checkpoint. 2021-11-09 08:48:36 +01:00
rdma.c migration/rdma: Fix out of order wrid 2021-11-01 12:49:29 +01:00
rdma.h migration: Export rdma.c functions in its own file 2017-06-01 18:49:23 +02:00
savevm.c migration: Never call twice qemu_target_page_size() 2021-12-15 10:31:42 +01:00
savevm.h migration: Add blocker information 2021-02-08 11:19:51 +00:00
socket.c migration/socket: Close the listener at the end 2021-06-08 19:36:19 +01:00
socket.h migration: unify the framework of socket-type channel 2020-08-28 13:34:52 +01:00
target.c migration: Move populate_vfio_info() into a separate file 2021-05-14 12:31:51 +02:00
tls.c migration/tls: Use qcrypto_tls_creds_check_endpoint() 2021-06-29 18:30:20 +01:00
tls.h migration: Fix Lesser GPL version number 2020-11-15 16:43:28 +01:00
trace-events migration/dirtyrate: implement dirty-ring dirtyrate calculation 2021-11-01 22:56:43 +01:00
trace.h trace: switch position of headers to what Meson requires 2020-08-21 06:18:24 -04:00
vmstate-types.c migration: Replace migration's JSON writer by the general one 2020-12-19 10:39:16 +01:00
vmstate.c migration: Replace migration's JSON writer by the general one 2020-12-19 10:39:16 +01:00
xbzrle.c migration: Create migration/xbzrle.h 2017-05-18 18:04:54 +02:00
xbzrle.h migration: Create migration/xbzrle.h 2017-05-18 18:04:54 +02:00
yank_functions.c migration: Move the yank unregister of channel_close out 2021-07-26 12:45:03 +01:00
yank_functions.h migration: Move the yank unregister of channel_close out 2021-07-26 12:45:03 +01:00