qemu/util
Roman Pen 528f449f59 coroutine-lock: do not touch coroutine after another one has been entered
Submission of requests on linux aio is a bit tricky and can lead to
requests completions on submission path:

44713c9e85 ("linux-aio: Handle io_submit() failure gracefully")
0ed93d84ed ("linux-aio: process completions from ioq_submit()")

That means that any coroutine which has been yielded in order to wait
for completion can be resumed from submission path and be eventually
terminated (freed).

The following use-after-free crash was observed when IO throttling
was enabled:

 Program received signal SIGSEGV, Segmentation fault.
 [Switching to Thread 0x7f5813dff700 (LWP 56417)]
 virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252
 (gdb) bt
 #0  virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252
                              ^^^^^^^^^^^^^^
                              remember the address

 #1  virtqueue_fill (vq=0x5598b20d21b0, elem=0x7f5804009a30, len=1, idx=0) at virtio.c:282
 #2  virtqueue_push (vq=0x5598b20d21b0, elem=elem@entry=0x7f5804009a30, len=<optimized out>) at virtio.c:308
 #3  virtio_blk_req_complete (req=req@entry=0x7f5804009a30, status=status@entry=0 '\000') at virtio-blk.c:61
 #4  virtio_blk_rw_complete (opaque=<optimized out>, ret=0) at virtio-blk.c:126
 #5  blk_aio_complete (acb=0x7f58040068d0) at block-backend.c:923
 #6  coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:78

 (gdb) p * elem
 $8 = {index = 77, out_num = 2, in_num = 1,
       in_addr = 0x7f5804009ad8, out_addr = 0x7f5804009ae0,
       in_sg = 0x0, out_sg = 0x7f5804009a50}
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       'in_sg' and 'out_sg' are invalid.
       e.g. it is impossible that 'in_sg' is zero,
       instead its value must be equal to:

       (gdb) p/x 0x7f5804009ad8 + sizeof(elem->in_addr[0]) + 2 * sizeof(elem->out_addr[0])
       $26 = 0x7f5804009af0

Seems 'elem' was corrupted.  Meanwhile another thread raised an abort:

 Thread 12 (Thread 0x7f57f2ffd700 (LWP 56426)):
 #0  raise () from /lib/x86_64-linux-gnu/libc.so.6
 #1  abort () from /lib/x86_64-linux-gnu/libc.so.6
 #2  qemu_coroutine_enter (co=0x7f5804009af0) at qemu-coroutine.c:113
 #3  qemu_co_queue_run_restart (co=0x7f5804009a30) at qemu-coroutine-lock.c:60
 #4  qemu_coroutine_enter (co=0x7f5804009a30) at qemu-coroutine.c:119
                           ^^^^^^^^^^^^^^^^^^
                           WTF?? this is equal to elem from crashed thread

 #5  qemu_co_queue_run_restart (co=0x7f57e7f16ae0) at qemu-coroutine-lock.c:60
 #6  qemu_coroutine_enter (co=0x7f57e7f16ae0) at qemu-coroutine.c:119
 #7  qemu_co_queue_run_restart (co=0x7f5807e112a0) at qemu-coroutine-lock.c:60
 #8  qemu_coroutine_enter (co=0x7f5807e112a0) at qemu-coroutine.c:119
 #9  qemu_co_queue_run_restart (co=0x7f5807f17820) at qemu-coroutine-lock.c:60
 #10 qemu_coroutine_enter (co=0x7f5807f17820) at qemu-coroutine.c:119
 #11 qemu_co_queue_run_restart (co=0x7f57e7f18e10) at qemu-coroutine-lock.c:60
 #12 qemu_coroutine_enter (co=0x7f57e7f18e10) at qemu-coroutine.c:119
 #13 qemu_co_enter_next (queue=queue@entry=0x5598b1e742d0) at qemu-coroutine-lock.c:106
 #14 timer_cb (blk=0x5598b1e74280, is_write=<optimized out>) at throttle-groups.c:419

Crash can be explained by access of 'co' object from the loop inside
qemu_co_queue_run_restart():

  while ((next = QSIMPLEQ_FIRST(&co->co_queue_wakeup))) {
      QSIMPLEQ_REMOVE_HEAD(&co->co_queue_wakeup, co_queue_next);
                           ^^^^^^^^^^^^^^^^^^^^
                           on each iteration 'co' is accessed,
                           but 'co' can be already freed

      qemu_coroutine_enter(next);
  }

When 'next' coroutine is resumed (entered) it can in its turn resume
'co', and eventually free it.  That's why we see 'co' (which was freed)
has the same address as 'elem' from the first backtrace.

The fix is obvious: use temporary queue and do not touch coroutine after
first qemu_coroutine_enter() is invoked.

The issue is quite rare and happens every ~12 hours on very high IO
and CPU load (building linux kernel with -j512 inside guest) when IO
throttling is enabled.  With the fix applied guest is running ~35 hours
and is still alive so far.

Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170601160847.23720-1-roman.penyaev@profitbricks.com
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-07 14:39:00 +01:00
..
acl.c Drop Emacs local variables lists redundant with .dir-locals.el 2016-07-12 16:19:16 +02:00
aio-posix.c async: remove unnecessary inc/dec pairs 2017-02-21 11:39:40 +00:00
aio-win32.c async: remove unnecessary inc/dec pairs 2017-02-21 11:39:40 +00:00
aiocb.c block: move AioContext, QEMUTimer, main-loop to libqemuutil 2017-02-21 11:14:07 +00:00
async.c async: Introduce aio_co_enter 2017-04-11 20:07:15 +08:00
base64.c include/qemu/osdep.h: Don't include qapi/error.h 2016-03-22 22:20:15 +01:00
bitmap.c bitmap: add bitmap_copy_and_clear_atomic 2017-04-24 10:12:28 +02:00
bitops.c util: Clean up includes 2016-02-04 17:01:04 +00:00
buffer.c qemu-common: stop including qemu/host-utils.h from qemu-common.h 2016-05-19 16:42:28 +02:00
bufferiszero.c cutils: Rewrite x86 buffer zero checking 2016-09-14 12:25:14 +02:00
compatfd.c cpus: remove ugly cast on sigbus_handler 2017-03-03 16:40:02 +01:00
coroutine-sigaltstack.c coroutine-sigaltstack: use helper for allocating stack memory 2016-09-29 14:13:39 +02:00
coroutine-ucontext.c coroutine-ucontext: use helper for allocating stack memory 2016-09-29 14:13:39 +02:00
coroutine-win32.c coroutine: add a macro for the coroutine stack size 2016-09-29 14:13:39 +02:00
crc32c.c util: Clean up includes 2016-02-04 17:01:04 +00:00
cutils.c utils: provide size_to_str() 2017-05-17 17:30:45 +01:00
envlist.c util: Use g_malloc/g_free in envlist.c 2017-05-07 09:57:51 +03:00
error.c util/error: Fix leak in error_vprepend() 2017-04-24 09:12:59 +02:00
event_notifier-posix.c Remove/replace sysemu/char.h inclusion 2017-06-02 11:33:52 +04:00
event_notifier-win32.c event_notifier: prevent accidental use after close 2017-03-29 02:35:23 +03:00
fifo8.c migration: consolidate VMStateField.start 2017-02-13 17:27:13 +00:00
getauxval.c util: Clean up includes 2016-02-04 17:01:04 +00:00
hbitmap.c hbitmap: Add hbitmap_is_serializable() 2017-01-26 10:25:01 +08:00
hexdump.c util: Improved qemu_hexmap() to include an ascii dump of the buffer 2016-04-06 09:52:07 +08:00
host-utils.c host-utils: Implement unsigned quadword left/right shift and unit tests 2017-01-31 10:10:14 +11:00
id.c util: move declarations out of qemu-common.h 2016-03-22 22:20:17 +01:00
iohandler.c block: move AioContext, QEMUTimer, main-loop to libqemuutil 2017-02-21 11:14:07 +00:00
iov.c util: Fix assertion in iov_copy() upon zero 'bytes' and non-zero 'offset' 2016-08-03 18:44:57 +02:00
keyval.c qapi: Reject alternates that can't work with keyval_parse() 2017-05-31 16:04:09 +02:00
lockcnt.c qemu-thread: optimize QemuLockCnt with futexes on Linux 2017-01-16 13:25:18 +00:00
log.c util/cutils: Rename qemu_strtoll(), qemu_strtoull() 2017-02-23 20:35:35 +01:00
main-loop.c main-loop: Acquire main_context lock around os_host_main_loop_wait. 2017-04-03 19:13:12 +02:00
Makefile.objs qemu-ga: obey LISTEN_PID when using systemd socket activation 2017-03-19 11:12:12 +01:00
memfd.c os-posix: include sys/mman.h 2016-06-16 18:39:03 +02:00
mmap-alloc.c exec, kvm, target-ppc: Move getrampagesize() to common code 2017-03-03 11:30:59 +11:00
module.c module: Don't load the same module if requested multiple times 2016-10-07 14:14:06 +02:00
notify.c util: Clean up includes 2016-02-04 17:01:04 +00:00
osdep.c osdep: Fall back to posix lock when OFD lock is unavailable 2017-05-11 11:15:32 +02:00
oslib-posix.c oslib: strip trailing '\n' from error_setg() string argument 2017-06-07 14:38:44 +01:00
oslib-win32.c mem-prealloc: reduce large guest start-up and migration time. 2017-03-14 13:26:36 +01:00
path.c util: Removed unneeded header from path.c 2017-03-14 13:26:37 +01:00
qdist.c qdist: return "(empty)" instead of NULL when printing an empty dist 2016-08-03 18:44:56 +02:00
qemu-config.c util/qemu-config: Add loadparm to qemu machine_opts 2017-05-02 15:08:54 +02:00
qemu-coroutine-io.c coroutine: move entry argument to qemu_coroutine_create 2016-07-13 13:26:02 +02:00
qemu-coroutine-lock.c coroutine-lock: do not touch coroutine after another one has been entered 2017-06-07 14:39:00 +01:00
qemu-coroutine-sleep.c block: explicitly acquire aiocontext in timers that need it 2017-02-21 11:14:08 +00:00
qemu-coroutine.c coroutine-lock: do not touch coroutine after another one has been entered 2017-06-07 14:39:00 +01:00
qemu-error.c qemu-error: remove dependency of stubs on monitor 2016-11-01 16:06:57 +01:00
qemu-openpty.c util: Clean up includes 2016-02-04 17:01:04 +00:00
qemu-option.c QemuOpts: Simplify qemu_opts_to_qdict() 2017-05-09 09:14:39 +02:00
qemu-progress.c progress: Show current progress on SIGINFO 2017-04-28 18:48:11 +02:00
qemu-sockets.c sockets: Plug memory leak in socket_address_flatten() 2017-05-23 13:28:17 +02:00
qemu-thread-posix.c trace: add qemu mutex lock and unlock trace events 2017-05-05 12:09:59 +02:00
qemu-thread-win32.c trace: add qemu mutex lock and unlock trace events 2017-05-05 12:09:59 +02:00
qemu-timer-common.c util: Clean up includes 2016-02-04 17:01:04 +00:00
qemu-timer.c icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread 2017-03-14 13:51:34 +01:00
qht.c qht: fix unlock-after-free segfault upon resizing 2016-10-06 18:04:13 +02:00
range.c range: Replace internal representation of Range 2016-07-04 16:49:33 +03:00
rcu.c rcu: simplify memory barriers 2016-10-24 11:30:56 +02:00
readline.c util: move declarations out of qemu-common.h 2016-03-22 22:20:17 +01:00
systemd.c qemu-ga: obey LISTEN_PID when using systemd socket activation 2017-03-19 11:12:12 +01:00
thread-pool.c thread-pool: add missing qemu_bh_cancel in completion function 2017-03-17 12:54:21 +01:00
throttle.c throttle: make throttle_config(throttle_get_config()) symmetric 2017-04-21 10:36:12 +01:00
timed-average.c Fix some typos found by codespell 2016-05-18 15:04:27 +03:00
trace-events trace: add qemu mutex lock and unlock trace events 2017-05-05 12:09:59 +02:00
unicode.c util: move declarations out of qemu-common.h 2016-03-22 22:20:17 +01:00
uri.c Fix documentation and some comments (article, grammar) 2017-01-24 23:26:52 +03:00
uuid.c uuid: Tighten uuid parse 2016-09-23 11:42:52 +08:00