qemu/util at 73fb4f1de3271c6407d4b110eb136805642498da - qemu

Fam Zheng b37548fcd1 aio: Do aio_notify_accept only during blocking aio_poll

An aio_notify() pairs with an aio_notify_accept(). The former should
happen in the main thread or a vCPU thread, and the latter should be
done in the IOThread.

In one rare case, the main thread or a vCPU thread may "steal" the
aio_notify() event that it has just raised itself, in
bdrv_set_aio_context() [1]. The sequence is:

    main thread                     IO Thread
    ===============================================================
    bdrv_drained_begin()
      aio_disable_external(ctx)
                                    aio_poll(ctx, true)
                                      ctx->notify_me += 2
    ...
    bdrv_drained_end()
      ...
        aio_notify()
    ...
    bdrv_set_aio_context()
      aio_poll(ctx, false)
[1]     aio_notify_accept(ctx)
                                      ppoll() /* Hang! */

[1] is problematic. It clears the ctx->notifier event, so the ppoll()
blocked in the IOThread never returns.

(For the curious, this bug was noticed when booting a number of VMs
simultaneously in RHV.  One or two of the VMs would hit this race
condition, making the VIRTIO device unresponsive to I/O commands. When
it hangs, SeaBIOS is busy-waiting for a read request to complete (read
MBR), right after initializing the virtio-blk-pci device, using 100%
guest CPU. See also https://bugzilla.redhat.com/show_bug.cgi?id=1562750
for the original bug analysis.)

aio_notify() only injects an event when ctx->notify_me is set;
correspondingly, aio_notify_accept() is only useful when ctx->notify_me
_was_ set. Move the call to it into the "blocking" branch. This
effectively skips [1] and fixes the hang.
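
The race and the fix can be illustrated with a small single-threaded
model. This is only a sketch, not the QEMU code: ToyCtx, the toy_*
functions and the accept_always flag are hypothetical names invented
here, and the ctx->notifier event is reduced to a boolean. It replays
the interleaving from the diagram above and reports whether the
IOThread's ppoll() would still see the event.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for AioContext. Only notify_me comes from the commit
 * message; everything else is a hypothetical simplification. */
typedef struct ToyCtx {
    int notify_me;      /* nonzero while a poller intends to block */
    bool notifier_set;  /* models the ctx->notifier event */
} ToyCtx;

/* Models aio_notify(): inject an event only if someone may block. */
static void toy_notify(ToyCtx *ctx)
{
    if (ctx->notify_me) {
        ctx->notifier_set = true;
    }
}

/* Models aio_notify_accept(): consume the pending event. */
static void toy_notify_accept(ToyCtx *ctx)
{
    ctx->notifier_set = false;
}

/* IOThread, start of aio_poll(ctx, true): announce intent to block. */
static void toy_poll_prepare_blocking(ToyCtx *ctx)
{
    ctx->notify_me += 2;
}

/* IOThread, end of a blocking poll: this is the only place where the
 * event is accepted once the call lives in the "blocking" branch. */
static void toy_poll_finish_blocking(ToyCtx *ctx)
{
    assert(ctx->notify_me >= 2);
    ctx->notify_me -= 2;
    toy_notify_accept(ctx);
}

/* Main thread, aio_poll(ctx, false). accept_always == true models the
 * behavior before the change, false models the behavior after it. */
static void toy_poll_nonblocking(ToyCtx *ctx, bool accept_always)
{
    if (accept_always) {
        toy_notify_accept(ctx);     /* step [1]: steals the event */
    }
}

/* Replay the diagram; return true if the IOThread's ppoll() wakes. */
static bool toy_race_wakes_iothread(bool accept_always)
{
    ToyCtx ctx = { 0, false };

    toy_poll_prepare_blocking(&ctx);            /* IOThread */
    toy_notify(&ctx);                           /* bdrv_drained_end() */
    toy_poll_nonblocking(&ctx, accept_always);  /* bdrv_set_aio_context() */

    bool woke = ctx.notifier_set;               /* what ppoll() sees */
    if (woke) {
        toy_poll_finish_blocking(&ctx);
    }
    return woke;
}
```

With accept_always set, the model loses the event and the IOThread stays
blocked. With it cleared, the event survives until the blocking poller
accepts it.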

Furthermore, a blocking aio_poll() is only allowed in the home thread
(in_aio_context_home_thread), because otherwise two blocking
aio_poll() calls can steal each other's ctx->notifier event and cause
a hang just as described above.
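
Why two blocking pollers on one context are unsafe can be shown with
the same kind of toy model. Again this is a sketch under assumptions:
SharedCtx and the functions below are hypothetical, and the code runs
one possible interleaving sequentially rather than with real threads.
Both pollers announce that they will block, a single notification
arrives, and whichever poller accepts it first leaves nothing for the
other.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified context shared by two would-be pollers. */
typedef struct SharedCtx {
    int notify_me;      /* counts pollers that intend to block (x2) */
    bool notifier_set;  /* models the ctx->notifier event */
} SharedCtx;

/* Models aio_notify(): one event, regardless of how many pollers wait. */
static void shared_notify(SharedCtx *ctx)
{
    if (ctx->notify_me) {
        ctx->notifier_set = true;
    }
}

/* A poller announces that it is about to block. */
static void poller_prepare(SharedCtx *ctx)
{
    ctx->notify_me += 2;
}

/* A poller enters its wait. Returns true if it wakes up, in which case
 * it also accepts (clears) the event; false means it would stay blocked. */
static bool poller_try_wake(SharedCtx *ctx)
{
    if (!ctx->notifier_set) {
        return false;
    }
    assert(ctx->notify_me >= 2);
    ctx->notify_me -= 2;
    ctx->notifier_set = false;  /* models aio_notify_accept() */
    return true;
}

/* One interleaving: both pollers prepare, one notify arrives, poller A
 * wakes and accepts before poller B reaches its wait. */
static int pollers_woken_by_one_notify(void)
{
    SharedCtx ctx = { 0, false };
    int woken = 0;

    poller_prepare(&ctx);               /* thread A */
    poller_prepare(&ctx);               /* thread B */
    shared_notify(&ctx);                /* single aio_notify() */
    woken += poller_try_wake(&ctx);     /* A wakes, clears the event */
    woken += poller_try_wake(&ctx);     /* B finds nothing and blocks */
    return woken;
}
```

In this interleaving only one of the two pollers wakes, which is the
kind of hang that restricting blocking polls to the home thread rules
out.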

Cc: qemu-stable@nongnu.org
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180809132259.18402-3-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

2018-08-15 10:12:35 +08:00

acl.c

…

aio-posix.c

aio: Do aio_notify_accept only during blocking aio_poll

2018-08-15 10:12:35 +08:00

aio-wait.c

…

aio-win32.c

aio: Do aio_notify_accept only during blocking aio_poll

2018-08-15 10:12:35 +08:00

aiocb.c

…

async.c

linux-aio: properly bubble up errors from initialization

2018-06-27 13:06:34 +01:00

base64.c

…

bitmap.c

…

bitops.c

…

buffer.c

…

bufferiszero.c

…

cacheinfo.c

…

compatfd.c

…

coroutine-sigaltstack.c

…

coroutine-ucontext.c

…

coroutine-win32.c

…

crc32c.c

…

cutils.c

cutils: Provide strchrnul

2018-06-29 12:32:10 +02:00

envlist.c

…

error.c

…

event_notifier-posix.c

…

event_notifier-win32.c

…

fifo8.c

…

getauxval.c

…

hbitmap.c

hbitmap: Add @advance param to hbitmap_iter_next()

2018-06-18 17:04:55 +02:00

hexdump.c

…

host-utils.c

…

id.c

…

iohandler.c

…

iov.c

…

iova-tree.c

util: remove redundant include of glib.h and add osdep.h

2018-06-29 12:22:28 +01:00

keyval.c

…

lockcnt.c

…

log.c

…

main-loop.c

main-loop: drop spin_counter

2018-06-01 16:01:29 +01:00

Makefile.objs

util: implement simple iova tree

2018-05-23 17:33:58 +03:00

memfd.c

memfd: Avoid Coverity warning about integer overflow

2018-06-01 15:13:46 +02:00

mmap-alloc.c

…

module.c

…

notify.c

…

osdep.c

glib: bump min required glib library version to 2.40

2018-06-29 12:22:28 +01:00

oslib-posix.c

…

oslib-win32.c

…

pagesize.c

…

path.c

…

qdist.c

…

qemu-config.c

block: Add block-specific QDict header

2018-06-15 14:49:44 +02:00

qemu-coroutine-io.c

…

qemu-coroutine-lock.c

…

qemu-coroutine-sleep.c

…

qemu-coroutine.c

…

qemu-error.c

…

qemu-openpty.c

…

qemu-option.c

opts: remove redundant check for NULL parameter

2018-07-17 16:24:50 +02:00

qemu-progress.c

…

qemu-sockets.c

…

qemu-thread-common.h

QemuMutex: support --enable-debug-mutex

2018-06-28 19:05:32 +02:00

qemu-thread-posix.c

qemu-thread: introduce qemu-thread-common.h

2018-06-28 19:05:31 +02:00

qemu-thread-win32.c

qemu-thread: introduce qemu-thread-common.h

2018-06-28 19:05:31 +02:00

qemu-timer-common.c

…

qemu-timer.c

timer: remove replay clock probe in deadline calculation

2018-07-30 14:00:11 +02:00

qht.c

qht: return existing entry when qht_insert fails

2018-06-15 07:42:55 -10:00

range.c

…

rcu.c

…

readline.c

…

stats64.c

…

sys_membarrier.c

…

systemd.c

…

thread-pool.c

…

throttle.c

…

timed-average.c

…

trace-events

…

unicode.c

…

uri.c

cutils: Provide strchrnul

2018-06-29 12:32:10 +02:00

uuid.c

…

vfio-helpers.c

replace functions which are only available in glib-2.24

2018-05-20 08:55:01 +03:00