Now requests are submitted as a batch, so it is natural
to notify guest as a batch too.
This may suppress interrupt notification to VM a lot:
- in my test, decreased by ~13K/sec
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The callback has to be saved and reset in virtio_blk_data_plane_start(),
otherwise dataplane's requests will be completed in qemu aio context.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The memory allocation between hw/block/virtio-blk.c,
hw/block/dataplane/virtio-blk.c, and hw/virtio/dataplane/vring.c is
messy. Structs are allocated in different files than they are freed in.
This is risky and makes memory leaks easier.
Embed VirtQueueElement in VirtIOBlockReq to reduce the amount of memory
allocation we need to juggle. This also makes vring.c and virtio.c
slightly more similar.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
VirtIOBlockReq is freed later by virtio_blk_free_request() in
hw/block/virtio-blk.c. Remove this extraneous g_slice_free().
This patch fixes the following segfault:
0x00005555556373af in virtio_blk_rw_complete (opaque=0x5555565ff5e0, ret=0) at hw/block/virtio-blk.c:99
99 bdrv_acct_done(req->dev->bs, &req->acct);
(gdb) print req
$1 = (VirtIOBlockReq *) 0x5555565ff5e0
(gdb) print req->dev
$2 = (VirtIOBlock *) 0x0
(gdb) bt
#0 0x00005555556373af in virtio_blk_rw_complete (opaque=0x5555565ff5e0, ret=0) at hw/block/virtio-blk.c:99
#1 0x0000555555840ebe in bdrv_co_em_bh (opaque=0x5555566152d0) at block.c:4675
#2 0x000055555583de77 in aio_bh_poll (ctx=ctx@entry=0x5555563a8150) at async.c:81
#3 0x000055555584b7a7 in aio_poll (ctx=0x5555563a8150, blocking=blocking@entry=true) at aio-posix.c:188
#4 0x00005555556e520e in iothread_run (opaque=0x5555563a7fd8) at iothread.c:41
#5 0x00007ffff42ba124 in start_thread () from /usr/lib/libpthread.so.0
#6 0x00007ffff16d14bd in clone () from /usr/lib/libc.so.6
Reported-by: Max Reitz <mreitz@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Before commit 580b6b2aa2(dataplane: use the QEMU block
layer for I/O), dataplane for virtio-blk submits block
I/O as a batch.
This commit 580b6b2aa2 replaces the custom linux AIO
implementation(including submit I/O as a batch) with QEMU
block layer, but this commit causes ~40% throughput regression
on virtio-blk performance, and removing submitting I/O
as a batch is one of the causes.
This patch applies the newly introduced bdrv_io_plug() and
bdrv_io_unplug() interfaces to support submitting I/O
at batch for Qemu block layer, and in my test, the change
can improve throughput by ~30% with 'aio=native'.
Following my fio test script:
[global]
direct=1
size=4G
bsrange=4k-4k
timeout=40
numjobs=4
ioengine=libaio
iodepth=64
filename=/dev/vdc
group_reporting=1
[f]
rw=randread
Result on one of my small machine(host: x86_64, 2cores, 4thread, guest: 4cores):
- qemu master: 65K IOPS
- qemu master with these patches: 92K IOPS
- 2.0.0 release(dataplane using custom linux aio): 104K IOPS
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The x-data-plane=on|off option is no longer useful because the
iothread=<iothread> option conveys the same information plus which
IOThread to use.
Do not delete x-data-plane=on|off yet as a convenience to people using
this legacy experimental option. We will drop it in QEMU 2.2.
Instead, turn on data-plane when either x-data-plane=on or
iothread=<iothread> are used. The following command-line uses
data-plane:
qemu -device virtio-blk-pci,iothread=foo,drive=drive0
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If the virtio transport does not support notifiers (like s390-virtio),
we can't use dataplane. Bail out early and let the user know what is
wrong.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The old name is misleading in its new usage, so rename it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This drops request handling code from dataplane, and uses code from
hw/block/virtio-blk.c.
It starts to use multiwrite as non-dataplane does.
Dataplane sets VirtIOBlock.complete_request to vring version, and calls
into non-dataplane's process handling. In complete_request_early,
qiov.size is added to vring push length, because it's also called in rw
completion now.
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
VirtIOBlockReq is allocated in process_request, and freed in command
functions.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In current virtio spec, inhdr is a single byte, and is unlikely to
change for both functionality and compatibility considerations.
Non-dataplane uses .in, and we are on the way to converge them. So
let's unify it to get cleaner code.
Remove .inhdr and use .in.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Field "inhdr" is added temporarily for a more mechanical change, and
will be dropped in the next commit.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since it's set but not used.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The block layer will handle the unaligned request.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Dataplane now uses block layer. Protect bdrv_set_enable_write_cache with
aio_context_acquire and aio_context_release, so we can enable config-wce
to allow guest to modify the write cache online.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stop using the raw-posix file descriptor for synchronous
qemu_fdatasync(). Use bdrv_aio_flush() instead and drop the
VirtIOBlockDataPlane->fd field.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stop using a custom Linux AIO request queue from ioq.h and instead use
the QEMU block layer for I/O.
This patch adjusts the VirtIOBlockRequest struct with fields needed for
bdrv_aio_readv()/bdrv_aio_writev(). ioq.h used struct iovec and struct
iocb, which we don't need directly anymore.
Modify dataplane start/stop to set the AioContext on the
BlockDriverState. We also no longer need to get the raw-posix file
descriptor. This means image formats are now supported with dataplane!
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This drops BlockDriverState.in_use with op_blockers:
- Call bdrv_op_block_all in place of bdrv_set_in_use(bs, 1).
- Call bdrv_op_unblock_all in place of bdrv_set_in_use(bs, 0).
- Check bdrv_op_is_blocked() in place of bdrv_in_use(bs).
The specific types are used, e.g. in place of starting block backup,
bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP, ...).
There is one exception in block_job_create, where
bdrv_op_blocker_is_empty() is used, because we don't know the operation
type here. This doesn't matter because in a few commits away we will drop
the check and move it to callers that _do_ know the type.
- Check bdrv_op_blocker_is_empty() in place of assert(!bs->in_use).
Note: there is only bdrv_op_block_all and bdrv_op_unblock_all callers at
this moment. So although the checks are specific to op types, this
changes can still be seen as identical logic with previously with
in_use. The difference is error message are improved because of blocker
error info.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Before IOThread was its own object, each virtio-blk device would create
its own internal thread. We need to preserve this behavior for
backwards compatibility when users do not specify -device
virtio-blk-pci,iothread=<id>.
This patch changes how the internal IOThread object is created.
Previously we used the monitor object_add() function, which is really a
layering violation. The problem is that this needs to assign a name but
we don't have a name for this internal object.
Generating names for internal objects is a pain but even worse is that
they may collide with user-defined names.
Paolo Bonzini <pbonzini@redhat.com> suggested that the internal IOThread
object should not be named. This way the conflict cannot happen and we
no longer need object_add().
One gotcha is that internal IOThread objects will not be listed by the
query-iothreads command since they are not named. This is okay though
because query-iothreads is new and the internal IOThread is just for
backwards compatibility. New users should explicitly define IOThread
objects.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When creating an IOThread implicitly (the user did not specify
x-iothread=<id>) remember that iothread_find() does not return the
object with an incremented refcount.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Today virtio-blk dataplane uses a 1:1 device-per-thread model. Now that
IOThreads have been introduced we can generalize this to N:M devices per
threads.
This patch drops thread code from dataplane in favor of running inside
an IOThread AioContext.
As a bonus we solve the case where a guest keeps submitting I/O requests
while dataplane is trying to stop. Previously the dataplane thread
would continue to process requests until the request gave it a break.
Now we can shut down in bounded time thanks to
aio_context_acquire/release.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If enabled, set the thread name at creation (on GNU systems with
pthread_set_np)
Fix up all the callers with a thread name
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Return an Error so that it can be propagated later.
Tested-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
[AF: Rebased]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoid trying to setup dataplane again if dataplane setup is already in
progress. This may happen if an eventfd is triggered during setup.
I saw this occasionally with an experimental s390 irqfd implementation:
virtio_blk_handle_output
-> virtio_blk_data_plane_start
-> virtio_ccw_set_host_notifier
...
-> virtio_queue_set_host_notifier_fd_handler
-> virtio_queue_host_notifier_read
-> virtio_queue_notify_vq
-> virtio_blk_handle_output
-> virtio_blk_data_plane_start
-> vring_setup
-> hostmem_init
-> memory_listener_register
-> BOOM
As virtio-ccw tries to follow what virtio-pci does, it might be triggerable
for other platforms as well.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The .io_flush() handler no longer exists and has no users. Drop the
io_flush argument to aio_set_fd_handler() and related functions.
The AioFlushEventNotifierHandler and AioFlushHandler typedefs are no
longer used and are dropped too.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Check exit conditions before entering blocking aio_poll(). This is
mainly for consistency since it's unlikely that we are stopping in the
first event loop iteration.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Dataplane must check whether a block device is in use before launching
the dataplane thread. This is necessary since the thread does not
synchronize with the main loop and I/O requests could cause corruption.
One example is when a drive is added and a block job is started before
hotplugging the virtio-blk-pci adapter. In this case we must not use
dataplane mode.
Cc: qemu-stable@nongnu.org
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Although the dataplane thread does not cooperate with dirty memory
logging yet it's fairly easy to temporarily disable dataplane during
live migration. This way virtio-blk can live migrate when
x-data-plane=on.
The dataplane thread will restart after migration is cancelled or if the
guest resuming virtio-blk operation after migration completes.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Load the virtio.c state into vring.c when we start dataplane mode and
vice versa when stopping dataplane mode. This patch makes it possible
to start and stop dataplane any time while the guest is running.
This will eventually allow us to go back to QEMU main loop for
bdrv_drain_all() and live migration. In the meantime, this patch makes
the dataplane lifecycle more robust but should make no visible
difference. It may be useful in the virtio-net dataplane effort.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This remove virtio-bindings, and use class instead.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1366791683-5350-6-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>