qemu/block
Changlong Xie 9bc9732fae nbd: Use CoQueue for free_sema instead of CoMutex
NBD is using the CoMutex in a way that wasn't anticipated. For example, if there are
N (N=26, MAX_NBD_REQUESTS=16) nbd write requests, we will invoke nbd_client_co_pwritev
N times.
----------------------------------------------------------------------------------------
time request Actions
1    1       in_flight=1, Coroutine=C1
2    2       in_flight=2, Coroutine=C2
...
15   15      in_flight=15, Coroutine=C15
16   16      in_flight=16, Coroutine=C16, free_sema->holder=C16, mutex->locked=true
17   17      in_flight=16, Coroutine=C17, queue C17 into free_sema->queue
18   18      in_flight=16, Coroutine=C18, queue C18 into free_sema->queue
...
26   N       in_flight=16, Coroutine=C26, queue C26 into free_sema->queue
----------------------------------------------------------------------------------------

Once the nbd client receives the reply to request No.16, we re-enter C16. This is fine,
because C16 is equal to 'free_sema->holder'.
----------------------------------------------------------------------------------------
time request Actions
27   16      in_flight=15, Coroutine=C16, free_sema->holder=C16, mutex->locked=false
----------------------------------------------------------------------------------------

Then nbd_coroutine_end invokes qemu_co_mutex_unlock, which pops a coroutine from the head
of free_sema->queue and enters C17. Moreover, free_sema->holder is now C17.
----------------------------------------------------------------------------------------
time request Actions
28   17      in_flight=16, Coroutine=C17, free_sema->holder=C17, mutex->locked=true
----------------------------------------------------------------------------------------

In the above scenario, we have only received the reply to request No.16. As time goes by,
the nbd client will most likely receive replies to requests 1 to 15 before the reply to
request 17, which owns C17. In this case, the assertion "mutex->holder == self" fails, as of
Kevin's commit 0e438cdc "coroutine: Let CoMutex remember who holds it". For example, if the
nbd client receives the reply to request No.15, qemu will stop unexpectedly:
----------------------------------------------------------------------------------------
time request       Actions
29   15(most case) in_flight=15, Coroutine=C15, free_sema->holder=C17, mutex->locked=false
----------------------------------------------------------------------------------------
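
The timeline above can be replayed with a small single-threaded model. This is only a
sketch under stated assumptions, not QEMU code: ToyOwnerMutex, ToyClient and the toy_*
functions are hypothetical names, requests are identified by plain integers instead of
coroutines, and MAX_WAITERS is an arbitrary bound chosen for the toy queue. The model
records which request "holds" free_sema and reports the point where an owner check such as
"mutex->holder == self" would fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REQUESTS 16   /* mirrors MAX_NBD_REQUESTS in the message above */
#define MAX_WAITERS  64   /* hypothetical bound for this toy queue */
#define NO_HOLDER    (-1)

/* Toy stand-in for an owner-tracking mutex; not QEMU's CoMutex. */
typedef struct {
    bool locked;
    int holder;                 /* id of the request that owns the lock */
    int queue[MAX_WAITERS];     /* FIFO of parked request ids */
    size_t head, tail;
} ToyOwnerMutex;

typedef struct {
    int in_flight;
    ToyOwnerMutex free_sema;
} ToyClient;

static void toy_client_init(ToyClient *c)
{
    c->in_flight = 0;
    c->free_sema.locked = false;
    c->free_sema.holder = NO_HOLDER;
    c->free_sema.head = c->free_sema.tail = 0;
}

/* Start request `id`. Returns false if it had to park on the mutex. */
static bool toy_request_start(ToyClient *c, int id)
{
    ToyOwnerMutex *m = &c->free_sema;

    if (c->in_flight >= MAX_REQUESTS - 1) {
        if (m->locked) {
            assert(m->tail < MAX_WAITERS);
            m->queue[m->tail++] = id;   /* times 17..26 in the table */
            return false;
        }
        m->locked = true;               /* time 16 in the table */
        m->holder = id;
    }
    c->in_flight++;
    return true;
}

/*
 * Handle the reply for request `id`. Returns false at the point where an
 * owner-checking mutex would abort, i.e. when the request that unlocks is
 * not the recorded holder.
 */
static bool toy_request_end(ToyClient *c, int id)
{
    ToyOwnerMutex *m = &c->free_sema;

    if (c->in_flight-- == MAX_REQUESTS) {
        if (m->holder != id) {
            return false;               /* "mutex->holder == self" fails */
        }
        if (m->head < m->tail) {
            m->holder = m->queue[m->head++];  /* hand over to next waiter */
            c->in_flight++;                   /* the waiter takes the slot */
        } else {
            m->locked = false;
            m->holder = NO_HOLDER;
        }
    }
    return true;
}
```

Starting requests 1 to 26 and then calling toy_request_end() for request 16 succeeds and
makes request 17 the holder. A following toy_request_end() for request 15 returns false,
which corresponds to the row at time 29.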

Per Paolo's suggestion "The simplest fix is to change it to CoQueue, which is like a condition
variable", this patch replaces CoMutex with CoQueue.
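
To illustrate why a plain wait queue avoids the problem, here is a minimal single-threaded
sketch of an owner-free slot limiter. It is an assumption-laden toy, not the patch itself:
ToyQueueClient and the toyq_* functions are hypothetical names, request ids stand in for
coroutines, and TOYQ_MAX_WAITERS is an arbitrary bound. The point it shows is that a wait
queue has no holder, so the reply to any request may free a slot and wake the oldest waiter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOYQ_MAX_REQUESTS 16  /* mirrors MAX_NBD_REQUESTS in the message above */
#define TOYQ_MAX_WAITERS  64  /* hypothetical bound for this toy queue */

/* Toy stand-in for "in_flight counter + wait queue"; not QEMU's CoQueue. */
typedef struct {
    int in_flight;
    int waiters[TOYQ_MAX_WAITERS];  /* FIFO of parked request ids */
    size_t head, tail;
} ToyQueueClient;

static void toyq_init(ToyQueueClient *c)
{
    c->in_flight = 0;
    c->head = c->tail = 0;
}

/* Start request `id`. Returns false if all slots are taken and it parks. */
static bool toyq_request_start(ToyQueueClient *c, int id)
{
    if (c->in_flight == TOYQ_MAX_REQUESTS) {
        assert(c->tail < TOYQ_MAX_WAITERS);
        c->waiters[c->tail++] = id;
        return false;
    }
    c->in_flight++;
    return true;
}

/*
 * Handle a reply for any in-flight request. There is no ownership check:
 * whoever finishes frees a slot. Returns the id of the waiter that was
 * woken and took the freed slot, or -1 if nobody was waiting.
 */
static int toyq_request_end(ToyQueueClient *c)
{
    int woken = -1;

    assert(c->in_flight > 0);
    if (c->in_flight-- == TOYQ_MAX_REQUESTS && c->head < c->tail) {
        woken = c->waiters[c->head++];
        c->in_flight++;             /* the woken request takes the slot */
    }
    return woken;
}
```

With 26 requests started, the reply to request 16 wakes request 17, and a following reply to
request 15 simply wakes request 18. No request has to be the "holder" for that to work.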

Cc: Wen Congyang <wency@cn.fujitsu.com>
Reported-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Message-Id: <1476267508-19499-1-git-send-email-xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-01 16:06:57 +01:00
..
accounting.c
archipelago.c
backup.c block: Use block_job_add_bdrv() in backup_start() 2016-10-31 16:52:38 +01:00
blkdebug.c
blkreplay.c
blkverify.c
block-backend.c block: introduce BDRV_POLL_WHILE 2016-10-28 21:50:18 +08:00
bochs.c
cloop.c
commit.c block: Block all nodes involved in the block-commit operation 2016-10-31 16:52:38 +01:00
crypto.c
curl.c
dirty-bitmap.c
dmg-bz2.c
dmg.c
dmg.h
gluster.c
io.c block: Add bdrv_drain_all_{begin,end}() 2016-10-31 16:51:14 +01:00
iscsi.c
linux-aio.c
Makefile.objs
mirror.c block: Block all intermediate nodes in commit_active_start() 2016-10-31 16:52:38 +01:00
nbd-client.c nbd: Use CoQueue for free_sema instead of CoMutex 2016-11-01 16:06:57 +01:00
nbd-client.h nbd: Use CoQueue for free_sema instead of CoMutex 2016-11-01 16:06:57 +01:00
nbd.c Merge qio 2016/10/27 v1 2016-10-28 15:30:55 +01:00
nfs.c block/nfs: Introduce runtime_opts in NFS 2016-10-31 16:52:39 +01:00
null.c
parallels.c
qapi.c qapi: rename QmpOutputVisitor to QObjectOutputVisitor 2016-10-25 16:25:54 +02:00
qcow2-cache.c
qcow2-cluster.c
qcow2-refcount.c
qcow2-snapshot.c
qcow2.c
qcow2.h
qcow.c
qed-check.c
qed-cluster.c
qed-gencb.c
qed-l2-cache.c
qed-table.c block: introduce BDRV_POLL_WHILE 2016-10-28 21:50:18 +08:00
qed.c qed: Implement .bdrv_drain 2016-10-28 21:50:18 +08:00
qed.h
quorum.c
raw_bsd.c raw_bsd: add offset and size options 2016-10-31 16:52:39 +01:00
raw-posix.c raw-posix: Don't use bdrv_ioctl() 2016-10-27 19:05:23 +02:00
raw-win32.c
rbd.c
replication.c block: prepare bdrv_reopen_multiple to release AioContext 2016-10-28 21:50:18 +08:00
sheepdog.c block: only call aio_poll on the current thread's AioContext 2016-10-28 21:50:18 +08:00
snapshot.c
ssh.c block/ssh: Use InetSocketAddress options 2016-10-31 16:49:13 +01:00
stream.c block: Support streaming to an intermediate layer 2016-10-31 16:52:38 +01:00
throttle-groups.c
trace-events block: Remove bdrv_aio_pdiscard() 2016-10-27 19:05:22 +02:00
vdi.c
vhdx-endian.c
vhdx-log.c
vhdx.c
vhdx.h
vmdk.c
vpc.c
vvfat.c
win32-aio.c
write-threshold.c