From 6fccbb475bc6effc313ee9481726a1748b6dae57 Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi
Date: Wed, 4 Jul 2018 15:54:10 +0100
Subject: [PATCH] throttle-groups: fix hang when group member leaves

Throttle groups consist of members sharing one throttling state
(including bps/iops limits).  Round-robin scheduling is used to ensure
fairness.  If a group member already has a timer pending then other
group members do not schedule their own timers.  The next group
member will have its turn when the existing timer expires.

A hang may occur when a group member leaves while it had a timer
scheduled.  Although the code carefully removes the group member from
the round-robin list, it does not schedule the next member.  Therefore
remaining members continue to wait for the removed member's timer to
expire.

This patch schedules the next request if a timer is pending.
Unfortunately the actual bug is a race condition that I've been unable
to capture in a test case.

Sometimes drive2 hangs when drive1 is removed from the throttling group:

  $ qemu ... -drive if=none,id=drive1,cache=none,format=qcow2,file=data1.qcow2,iops=100,group=foo \
     -device virtio-blk-pci,id=virtio-blk-pci0,drive=drive1 \
     -drive if=none,id=drive2,cache=none,format=qcow2,file=data2.qcow2,iops=10,group=foo \
     -device virtio-blk-pci,id=virtio-blk-pci1,drive=drive2
  (guest-console1)# fio -filename /dev/vda 4k-seq-read.job
  (guest-console2)# fio -filename /dev/vdb 4k-seq-read.job
  (qmp) {"execute": "block_set_io_throttle", "arguments": {"device": "drive1","bps": 0,"bps_rd": 0,"bps_wr": 0,"iops": 0,"iops_rd": 0,"iops_wr": 0}}

Reported-by: Nini Gu
Signed-off-by: Stefan Hajnoczi
Message-id: 20180704145410.794-1-stefanha@redhat.com
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1535914
Cc: Alberto Garcia
Signed-off-by: Stefan Hajnoczi
---
 block/throttle-groups.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index 36cc0430c3..e297b04e17 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -564,6 +564,10 @@ void throttle_group_unregister_tgm(ThrottleGroupMember *tgm)
 
     qemu_mutex_lock(&tg->lock);
     for (i = 0; i < 2; i++) {
+        if (timer_pending(tgm->throttle_timers.timers[i])) {
+            tg->any_timer_armed[i] = false;
+            schedule_next_request(tgm, i);
+        }
         if (tg->tokens[i] == tgm) {
             token = throttle_group_next_tgm(tgm);
             /* Take care of the case where this is the last tgm in the group */
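
The hang described in the commit message can be illustrated with a small
standalone model. This is only a sketch under simplifying assumptions, not
QEMU code: ToyGroup and every toy_* function are hypothetical names invented
for the illustration, a single direction is modelled instead of separate
read/write state, and timers are reduced to boolean flags. It shows that when
the member owning the armed timer leaves without clearing the group-wide
flag, no remaining member ever arms a timer of its own.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_MEMBERS 4

/* Hypothetical stand-in for a throttle group; not QEMU's ThrottleGroup. */
typedef struct {
    bool present[TOY_MAX_MEMBERS];       /* member is still in the group */
    bool timer_pending[TOY_MAX_MEMBERS]; /* member owns the armed timer */
    bool has_queued_io[TOY_MAX_MEMBERS]; /* member has throttled requests */
    bool any_timer_armed;                /* group-wide "a timer is armed" flag */
} ToyGroup;

/* Arm a timer for the first remaining member with queued I/O,
 * unless the group believes a timer is already armed. */
static void toy_schedule_next(ToyGroup *g)
{
    if (g->any_timer_armed) {
        return;
    }
    for (size_t i = 0; i < TOY_MAX_MEMBERS; i++) {
        if (g->present[i] && g->has_queued_io[i]) {
            g->timer_pending[i] = true;
            g->any_timer_armed = true;
            return;
        }
    }
}

/* Remove member m. With with_fix set, mirror the idea of the patch:
 * if the leaving member had a timer pending, clear the group flag and
 * schedule the next member. */
static void toy_unregister(ToyGroup *g, size_t m, bool with_fix)
{
    bool had_timer = g->timer_pending[m];

    g->present[m] = false;
    g->has_queued_io[m] = false;
    g->timer_pending[m] = false; /* the member's timer goes away with it */

    if (with_fix && had_timer) {
        g->any_timer_armed = false;
        toy_schedule_next(g);
    }
}

/* True if a remaining member has queued I/O but no remaining member
 * has a timer pending, so nothing will ever wake it. */
static bool toy_is_hung(const ToyGroup *g)
{
    bool waiting = false;
    bool timer = false;

    for (size_t i = 0; i < TOY_MAX_MEMBERS; i++) {
        if (g->present[i]) {
            waiting = waiting || g->has_queued_io[i];
            timer = timer || g->timer_pending[i];
        }
    }
    return waiting && !timer;
}

/* Scenario: member 0 owns the timer, member 1 waits, member 0 leaves. */
static bool toy_hangs_after_removal(bool with_fix)
{
    ToyGroup g = {0};

    g.present[0] = g.present[1] = true;
    g.has_queued_io[0] = g.has_queued_io[1] = true;
    toy_schedule_next(&g);           /* member 0 arms the only timer */
    toy_unregister(&g, 0, with_fix); /* member 0 leaves the group */
    return toy_is_hung(&g);
}
```

In this model, toy_hangs_after_removal(false) reports a hang because
any_timer_armed stays set after its owner is gone, while
toy_hangs_after_removal(true) does not, since the flag is cleared and
member 1 gets a timer. The real code additionally has to deal with locking
and the race mentioned above, which the model ignores.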