qemu/block
Sergio Lopez 0abf258171 block/backup-top: Don't acquire context while dropping top
All paths that lead to bdrv_backup_top_drop(), except for the call
from backup_clean(), imply that the BDS AioContext has already been
acquired, so doing it there too can potentially lead to QEMU hanging
on AIO_WAIT_WHILE().

An easy way to trigger this situation is to issue a two-action
transaction containing a valid and a bogus blockdev-backup, so that the
second one triggers a rollback. This results in a hang with a stack
trace like this one:

 #0  0x00007fb680c75016 in __GI_ppoll (fds=0x55e74580f7c0, nfds=1, timeout=<optimized out>,
     timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
 #1  0x000055e743386e09 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>)
     at /usr/include/bits/poll2.h:77
 #2  0x000055e743386e09 in qemu_poll_ns
     (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:336
 #3  0x000055e743388dc4 in aio_poll (ctx=0x55e7458925d0, blocking=blocking@entry=true)
     at util/aio-posix.c:669
 #4  0x000055e743305dea in bdrv_flush (bs=bs@entry=0x55e74593c0d0) at block/io.c:2878
 #5  0x000055e7432be58e in bdrv_close (bs=0x55e74593c0d0) at block.c:4017
 #6  0x000055e7432be58e in bdrv_delete (bs=<optimized out>) at block.c:4262
 #7  0x000055e7432be58e in bdrv_unref (bs=bs@entry=0x55e74593c0d0) at block.c:5644
 #8  0x000055e743316b9b in bdrv_backup_top_drop (bs=bs@entry=0x55e74593c0d0) at block/backup-top.c:273
 #9  0x000055e74331461f in backup_job_create
     (job_id=0x0, bs=bs@entry=0x55e7458d5820, target=target@entry=0x55e74589f640, speed=0, sync_mode=MIRROR_SYNC_MODE_FULL, sync_bitmap=sync_bitmap@entry=0x0, bitmap_mode=BITMAP_SYNC_MODE_ON_SUCCESS, compress=false, filter_node_name=0x0, on_source_error=BLOCKDEV_ON_ERROR_REPORT, on_target_error=BLOCKDEV_ON_ERROR_REPORT, creation_flags=0, cb=0x0, opaque=0x0, txn=0x0, errp=0x7ffddfd1efb0) at block/backup.c:478
 #10 0x000055e74315bc52 in do_backup_common
     (backup=backup@entry=0x55e746c066d0, bs=bs@entry=0x55e7458d5820, target_bs=target_bs@entry=0x55e74589f640, aio_context=aio_context@entry=0x55e7458a91e0, txn=txn@entry=0x0, errp=errp@entry=0x7ffddfd1efb0)
     at blockdev.c:3580
 #11 0x000055e74315c37c in do_blockdev_backup
     (backup=backup@entry=0x55e746c066d0, txn=0x0, errp=errp@entry=0x7ffddfd1efb0)
     at /usr/src/debug/qemu-kvm-4.2.0-2.module+el8.2.0+5135+ed3b2489.x86_64/./qapi/qapi-types-block-core.h:1492
 #12 0x000055e74315c449 in blockdev_backup_prepare (common=0x55e746a8de90, errp=0x7ffddfd1f018)
     at blockdev.c:1885
 #13 0x000055e743160152 in qmp_transaction
     (dev_list=<optimized out>, has_props=<optimized out>, props=0x55e7467fe2c0, errp=errp@entry=0x7ffddfd1f088) at blockdev.c:2340
 #14 0x000055e743287ff5 in qmp_marshal_transaction
     (args=<optimized out>, ret=<optimized out>, errp=0x7ffddfd1f0f8)
     at qapi/qapi-commands-transaction.c:44
 #15 0x000055e74333de6c in do_qmp_dispatch
     (errp=0x7ffddfd1f0f0, allow_oob=<optimized out>, request=<optimized out>, cmds=0x55e743c28d60 <qmp_commands>) at qapi/qmp-dispatch.c:132
 #16 0x000055e74333de6c in qmp_dispatch
     (cmds=0x55e743c28d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>)
     at qapi/qmp-dispatch.c:175
 #17 0x000055e74325c061 in monitor_qmp_dispatch (mon=0x55e745908030, req=<optimized out>)
     at monitor/qmp.c:145
 #18 0x000055e74325c6fa in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234
 #19 0x000055e743385866 in aio_bh_call (bh=0x55e745807ae0) at util/async.c:117
 #20 0x000055e743385866 in aio_bh_poll (ctx=ctx@entry=0x55e7458067a0) at util/async.c:117
 #21 0x000055e743388c54 in aio_dispatch (ctx=0x55e7458067a0) at util/aio-posix.c:459
 #22 0x000055e743385742 in aio_ctx_dispatch
     (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
 #23 0x00007fb68543e67d in g_main_dispatch (context=0x55e745893a40) at gmain.c:3176
 #24 0x00007fb68543e67d in g_main_context_dispatch (context=context@entry=0x55e745893a40) at gmain.c:3829
 #25 0x000055e743387d08 in glib_pollfds_poll () at util/main-loop.c:219
 #26 0x000055e743387d08 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
 #27 0x000055e743387d08 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
 #28 0x000055e74316a3c1 in main_loop () at vl.c:1828
 #29 0x000055e743016a72 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
     at vl.c:4504

Fix this by not acquiring the AioContext there, and by ensuring that
every path leading to it, backup_clean() included, has already
acquired it.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782111
Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-01-27 17:19:53 +01:00
accounting.c block: add empty account cookie type 2019-10-10 10:56:18 +02:00
aio_task.c block: introduce aio task pool 2019-10-10 10:56:17 +02:00
backup-top.c block/backup-top: Don't acquire context while dropping top 2020-01-27 17:19:53 +01:00
backup-top.h block: introduce backup-top filter driver 2019-10-10 10:56:18 +02:00
backup.c block/backup-top: Don't acquire context while dropping top 2020-01-27 17:19:53 +01:00
blkdebug.c blkdebug: Allow taking/unsharing permissions 2020-01-06 13:43:06 +01:00
blklogwrites.c Include qemu/module.h where needed, drop it from qemu-common.h 2019-06-12 13:18:33 +02:00
blkreplay.c block: implement bdrv_snapshot_goto for blkreplay 2019-10-14 17:12:48 +02:00
blkverify.c Include qemu/module.h where needed, drop it from qemu-common.h 2019-06-12 13:18:33 +02:00
block-backend.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
block-copy.c block/block-copy: fix s->copy_size for compressed cluster 2019-11-04 09:21:45 +01:00
bochs.c block: Use QEMU_IS_ALIGNED 2019-09-16 14:48:30 +02:00
cloop.c block: Use QEMU_IS_ALIGNED 2019-09-16 14:48:30 +02:00
commit.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
copy-on-read.c block/cor: Drop cor_co_truncate() 2019-10-28 11:59:51 +01:00
create.c block/create: Do not abort if a block driver is not available 2019-09-13 12:18:37 +02:00
crypto.c block: Let format drivers pass @exact 2019-10-28 12:05:30 +01:00
crypto.h Clean up ill-advised or unusual header guards 2019-05-13 08:58:55 +02:00
curl.c curl: Check curl_multi_add_handle()'s return code 2019-09-16 15:31:12 +02:00
dirty-bitmap.c bitmap: Enforce maximum bitmap name length 2019-11-18 16:01:34 -06:00
dmg-bz2.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
dmg-lzfse.c block: adding lzfse decompressing support as a module. 2018-12-14 11:52:40 +01:00
dmg.c block: Use QEMU_IS_ALIGNED 2019-09-16 14:48:30 +02:00
dmg.h Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
file-posix.c block/file-posix: Fix laio_init() error handling crash bug 2019-12-02 16:14:41 +01:00
file-win32.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
filter-compress.c block: introduce compress filter driver 2020-01-06 13:43:07 +01:00
gluster.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
io.c block: Add bdrv_co_get_self_request() 2019-11-04 09:32:51 +01:00
iscsi-opts.c Include qemu/module.h where needed, drop it from qemu-common.h 2019-06-12 13:18:33 +02:00
iscsi.c block: Evaluate @exact in protocol drivers 2019-10-28 12:05:24 +01:00
linux-aio.c linux-aio: increasing MAX_EVENTS to a larger hardcoded value 2020-01-13 16:41:45 +00:00
Makefile.objs block: introduce compress filter driver 2020-01-06 13:43:07 +01:00
mirror.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
nbd.c nbd: assert that Error** is not NULL in nbd_iter_channel_error 2019-12-18 08:43:19 +01:00
nfs.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
null.c replay: add BH oneshot event for block layer 2019-10-14 17:12:48 +02:00
nvme.c block/nvme: add support for discard 2019-10-28 11:34:35 +01:00
parallels.c Block patches for softfreeze: 2019-10-28 14:40:01 +00:00
parallels.h Clean up includes 2018-02-09 05:05:11 +01:00
qapi.c qapi: query-blockstat: add driver specific file-posix stats 2019-10-10 10:56:18 +02:00
qcow2-bitmap.c qcow2-bitmaps: fix qcow2_can_store_new_dirty_bitmap 2020-01-06 13:43:06 +01:00
qcow2-cache.c core: replace getpagesize() with qemu_real_host_page_size 2019-10-26 15:38:06 +02:00
qcow2-cluster.c qcow2: Limit total allocation range to INT_MAX 2019-10-14 17:12:48 +02:00
qcow2-refcount.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
qcow2-snapshot.c qcow2: Fix v3 snapshot table entry compliancy 2019-10-28 11:54:09 +01:00
qcow2-threads.c block/qcow2: refactor encryption code 2019-09-16 15:36:22 +02:00
qcow2.c qcow2: Allow writing compressed data of multiple clusters 2020-01-06 13:43:07 +01:00
qcow2.h qcow2: Fix QCOW2_COMPRESSED_SECTOR_MASK 2019-11-07 14:37:46 +01:00
qcow.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
qed-check.c block/qed: add missed coroutine_fn markers 2019-04-30 15:29:00 +02:00
qed-cluster.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed-l2-cache.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed-table.c block/qed: add missed coroutine_fn markers 2019-04-30 15:29:00 +02:00
qed.c block: Pass truncate exact=true where reasonable 2019-10-28 12:08:45 +01:00
qed.h block/qed: add missed coroutine_fn markers 2019-04-30 15:29:00 +02:00
quorum.c Include qemu/module.h where needed, drop it from qemu-common.h 2019-06-12 13:18:33 +02:00
raw-format.c block: Let format drivers pass @exact 2019-10-28 12:05:30 +01:00
rbd.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
replication.c block/backup: use backup-top instead of write notifiers 2019-10-10 10:56:18 +02:00
sheepdog.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
snapshot.c block/snapshot: rename Error ** parameter to more common errp 2019-12-18 08:43:19 +01:00
ssh.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
stream.c job: drop job_drain 2019-09-10 08:58:43 +02:00
throttle-groups.c throttle-groups: fix memory leak in throttle_group_set_limit: 2020-01-06 13:43:06 +01:00
throttle.c Include qemu/module.h where needed, drop it from qemu-common.h 2019-06-12 13:18:33 +02:00
trace-events block/nvme: add support for discard 2019-10-28 11:34:35 +01:00
vdi.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
vhdx-endian.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
vhdx-log.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
vhdx.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
vhdx.h block/vhdx: Use IEC binary prefixes for size constants 2019-04-30 15:29:00 +02:00
vmdk.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
vpc.c block: Add @exact parameter to bdrv_co_truncate() 2019-10-28 12:00:07 +01:00
vvfat.c block: Use QEMU_IS_ALIGNED 2019-09-16 14:48:30 +02:00
vxhs.c replay: add BH oneshot event for block layer 2019-10-14 17:12:48 +02:00
win32-aio.c Include qemu/module.h where needed, drop it from qemu-common.h 2019-06-12 13:18:33 +02:00
write-threshold.c qapi: Drop qapi_event_send_FOO()'s Error ** argument 2018-08-28 18:21:38 +02:00