qemu/block
Greg Kurz 1a6d3bd229 block: End quiescent sections when a BDS is deleted
If a BDS gets deleted during blk_drain_all(), it might miss a
call to bdrv_do_drained_end(). This means missing a call to
aio_enable_external() and the AIO context remains disabled for
ever. This can cause a device to become irresponsive and to
disrupt the guest execution, ie. hang, loop forever or worse.

This scenario is quite easy to encounter with virtio-scsi
on POWER when punching multiple blockdev-create QMP commands
while the guest is booting and it is still running the SLOF
firmware. This happens because SLOF disables/re-enables PCI
devices multiple times via IO/MEM/MASTER bits of PCI_COMMAND
register after the initial probe/feature negotiation, as it
tends to work with a single device at a time at various stages
like probing and running block/network bootloaders without
doing a full reset in-between. This naturally generates many
dataplane stops and starts, and thus many drain sections that
can race with blockdev_create_run(). In the end, SLOF bails
out.

It is somehow reproducible on x86 but it requires to generate
articial dataplane start/stop activity with stop/cont QMP
commands. In this case, seabios ends up looping for ever,
waiting for the virtio-scsi device to send a response to
a command it never received.

Add a helper that pairs all previously called bdrv_do_drained_begin()
with a bdrv_do_drained_end() and call it from bdrv_close().
While at it, update the "/bdrv-drain/graph-change/drain_all"
test in test-bdrv-drain so that it can catch the issue.

BugId: https://bugzilla.redhat.com/show_bug.cgi?id=1874441
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <160346526998.272601.9045392804399803158.stgit@bahia.lan>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-10-27 15:26:20 +01:00
..
export block/export: add vhost-user-blk multi-queue support 2020-10-23 13:42:16 +01:00
monitor block/export: Add block-export-del 2020-10-02 15:46:40 +02:00
accounting.c block: add empty account cookie type 2019-10-10 10:56:18 +02:00
aio_task.c block: introduce aio task pool 2019-10-10 10:56:17 +02:00
amend.c block/amend: Check whether the node exists 2020-07-27 12:37:25 +02:00
backup-top.c block: Inline bdrv_co_block_status_from_*() 2020-09-07 12:31:31 +02:00
backup-top.h block: introduce backup-top filter driver 2019-10-10 10:56:18 +02:00
backup.c backup: Deal with filters 2020-09-07 12:31:31 +02:00
blkdebug.c block/blkdebug: fix memory leak 2020-10-13 13:33:46 +02:00
blklogwrites.c block: Inline bdrv_co_block_status_from_*() 2020-09-07 12:31:31 +02:00
blkreplay.c block: Use bdrv_default_perms() 2020-05-18 19:05:25 +02:00
blkverify.c error: Eliminate error_propagate() with Coccinelle, part 2 2020-07-10 15:18:08 +02:00
block-backend.c qemu/atomic.h: rename atomic_ to qatomic_ 2020-09-23 16:07:44 +01:00
block-copy.c block-copy: Use CAF to find sync=top base 2020-09-07 12:31:31 +02:00
block-gen.h scripts: add block-coroutine-wrapper.py 2020-10-05 10:59:06 +01:00
bochs.c block: Use bdrv_default_perms() 2020-05-18 19:05:25 +02:00
cloop.c block: Use bdrv_default_perms() 2020-05-18 19:05:25 +02:00
commit.c block: Inline bdrv_co_block_status_from_*() 2020-09-07 12:31:31 +02:00
copy-on-read.c block: Inline bdrv_co_block_status_from_*() 2020-09-07 12:31:31 +02:00
coroutines.h block/io: bdrv_common_block_status_above: support include_base 2020-10-23 13:42:16 +01:00
create.c block/create: Do not abort if a block driver is not available 2019-09-13 12:18:37 +02:00
crypto.c block/crypto: disallow write sharing by default 2020-07-21 10:49:02 +02:00
crypto.h block/crypto: implement the encryption key management 2020-07-06 08:49:28 +02:00
curl.c error: Eliminate error_propagate() with Coccinelle, part 1 2020-07-10 15:18:08 +02:00
dirty-bitmap.c block/dirty-bitmap: add bdrv_has_named_bitmaps helper 2020-05-28 13:15:22 -05:00
dmg-bz2.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
dmg-lzfse.c block: adding lzfse decompressing support as a module. 2018-12-14 11:52:40 +01:00
dmg.c block: Use bdrv_default_perms() 2020-05-18 19:05:25 +02:00
dmg.h Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
file-posix.c block/file: switch to use qemu_open/qemu_create for improved errors 2020-09-16 10:33:48 +01:00
file-win32.c block/file: switch to use qemu_open/qemu_create for improved errors 2020-09-16 10:33:48 +01:00
filter-compress.c block: Inline bdrv_co_block_status_from_*() 2020-09-07 12:31:31 +02:00
gluster.c error: Reduce unnecessary error propagation 2020-07-10 15:18:08 +02:00
io_uring.c io_uring: use io_uring_cq_ready() to check for ready cqes 2020-06-05 09:54:48 +01:00
io.c block: End quiescent sections when a BDS is deleted 2020-10-27 15:26:20 +01:00
iscsi-opts.c Include qemu/module.h where needed, drop it from qemu-common.h 2019-06-12 13:18:33 +02:00
iscsi.c qapi: Restrict query-uuid command to machine code 2020-09-29 15:41:35 +02:00
linux-aio.c misc: Replace zero-length arrays with flexible array member (automatic) 2020-03-16 22:07:42 +01:00
meson.build block/export: convert vhost-user-blk server to block export API 2020-10-23 13:42:16 +01:00
mirror.c block: Inline bdrv_co_block_status_from_*() 2020-09-07 12:31:31 +02:00
nbd.c block/nbd: nbd_co_reconnect_loop(): don't connect if drained 2020-10-09 15:04:32 -05:00
nfs.c qemu/atomic.h: rename atomic_ to qatomic_ 2020-09-23 16:07:44 +01:00
null.c block/null: Implement bdrv_get_allocated_file_size 2020-09-07 12:31:31 +02:00
nvme.c block/nvme: Add driver statistics for access alignment and hw errors 2020-10-23 13:42:16 +01:00
parallels.c error: Avoid error_propagate() after migrate_add_blocker() 2020-07-10 15:18:08 +02:00
parallels.h Clean up includes 2018-02-09 05:05:11 +01:00
qapi-sysemu.c block: Move system emulator QMP commands to block/qapi-sysemu.c 2020-03-06 17:15:38 +01:00
qapi.c migration: introduce icount field for snapshots 2020-10-06 08:34:49 +02:00
qcow2-bitmap.c qcow2: Use macros for the L1, refcount and bitmap table entry sizes 2020-09-15 11:05:12 +02:00
qcow2-cache.c core: replace getpagesize() with qemu_real_host_page_size 2019-10-26 15:38:06 +02:00
qcow2-cluster.c qcow2: Use L1E_SIZE in qcow2_write_l1_entry() 2020-10-02 15:46:40 +02:00
qcow2-refcount.c qcow2: Make qcow2_free_any_clusters() free only one cluster 2020-09-15 11:05:13 +02:00
qcow2-snapshot.c migration: introduce icount field for snapshots 2020-10-06 08:34:49 +02:00
qcow2-threads.c qcow2: add zstd cluster compression 2020-05-13 14:20:31 +02:00
qcow2.c qcow2: Skip copy-on-write when allocating a zero cluster 2020-10-27 15:26:20 +01:00
qcow2.h qcow2: introduce icount field for snapshots 2020-10-06 08:34:49 +02:00
qcow.c block/qcow: remove runtime opts 2020-09-15 11:05:13 +02:00
qed-check.c block/qed: add missed coroutine_fn markers 2019-04-30 15:29:00 +02:00
qed-cluster.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed-l2-cache.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed-table.c block/qed: add missed coroutine_fn markers 2019-04-30 15:29:00 +02:00
qed.c qapi: Smooth another visitor error checking pattern 2020-07-10 15:18:08 +02:00
qed.h qed: Simplify backing reads 2020-07-06 10:34:14 +02:00
quorum.c block/quorum.c: stable children names 2020-09-15 11:05:12 +02:00
raw-format.c error: Eliminate error_propagate() with Coccinelle, part 2 2020-07-10 15:18:08 +02:00
rbd.c block/rbd: add 'namespace' to qemu_rbd_strong_runtime_opts[] 2020-09-15 11:31:10 +02:00
replication.c error: Reduce unnecessary error propagation 2020-07-10 15:18:08 +02:00
sheepdog.c block: deprecate the sheepdog block driver 2020-10-15 16:06:28 +02:00
snapshot.c block/snapshot: Fix fallback 2020-09-07 12:31:31 +02:00
ssh.c qapi: Smooth another visitor error checking pattern 2020-07-10 15:18:08 +02:00
stream.c stream: Deal with filters 2020-09-07 12:31:31 +02:00
throttle-groups.c qemu/atomic.h: rename atomic_ to qatomic_ 2020-09-23 16:07:44 +01:00
throttle.c qemu/atomic.h: rename atomic_ to qatomic_ 2020-09-23 16:07:44 +01:00
trace-events trace-events: Fix attribution of trace points to source 2020-09-09 17:17:58 +01:00
trace.h trace: switch position of headers to what Meson requires 2020-08-21 06:18:24 -04:00
vdi.c error: Avoid error_propagate() after migrate_add_blocker() 2020-07-10 15:18:08 +02:00
vhdx-endian.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
vhdx-log.c block: Add flags to bdrv(_co)_truncate() 2020-04-30 17:51:07 +02:00
vhdx.c block/vhdx: Support vhdx image only with 512 bytes logical sector size 2020-09-15 11:05:13 +02:00
vhdx.h block/vhdx: Use IEC binary prefixes for size constants 2019-04-30 15:29:00 +02:00
vmdk.c vmdk: fix maybe uninitialized warnings 2020-10-13 13:33:45 +02:00
vpc.c error: Avoid error_propagate() after migrate_add_blocker() 2020-07-10 15:18:08 +02:00
vvfat.c util: rename qemu_open() to qemu_open_old() 2020-09-16 10:33:48 +01:00
win32-aio.c Include qemu/module.h where needed, drop it from qemu-common.h 2019-06-12 13:18:33 +02:00
write-threshold.c qapi: Drop qapi_event_send_FOO()'s Error ** argument 2018-08-28 18:21:38 +02:00