In addition to handling IO in vcpu thread and in io thread, dataplane
introduces yet another mode: handling it by AioContext.
This reuses the same handler as previous modes, which triggers races as
these were not designed to be reentrant. Use a separate handler just
for aio, and disable regular handlers when dataplane is active.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We must not call virtio_blk_data_plane_notify if dataplane is
disabled: we would hit a segmentation fault in notify_guest_bh as
s->guest_notifier has not been setup and is NULL.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
In disabled mode, virtio-blk dataplane seems to be enabled, but flow
actually goes through the normal virtio path. This patch simplifies a bit
the handling of disabled mode. In disabled mode, virtio_blk_handle_output
might be called even if s->dataplane is not NULL.
This is a bit tricky, because the current check for s->dataplane will
always trigger, causing a continuous stream of calls to
virtio_blk_data_plane_start. Unfortunately, these calls will not
do anything. To fix this, set the "started" flag even in disabled
mode, and skip virtio_blk_data_plane_start if the started flag is true.
The resulting changes also prepare the code for the next patch, were
virtio-blk dataplane will reuse the same virtio_blk_handle_output function
as "regular" virtio-blk.
Because struct VirtIOBlockDataPlane is opaque in virtio-blk.c, we have
to move s->dataplane->started inside struct VirtIOBlock.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
The return code of virtqueue_pop/vring_pop is unused except to check for
errors or 0. We can thus easily move allocation inside the functions
and just return a pointer to the VirtQueueElement.
The advantage is that we will be able to allocate only the space that
is needed for the actual size of the s/g list instead of the full
VIRTQUEUE_MAX_SIZE items. Currently VirtQueueElement takes about 48K
of memory, and this kind of allocation puts a lot of stress on malloc.
By cutting the size by two or three orders of magnitude, malloc can
use much more efficient algorithms.
The patch is pretty large, but changes to each device are testable
more or less independently. Splitting it would mostly add churn.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
The next patch will make virtqueue_pop/vring_pop allocate memory for
the VirtQueueElement. In some cases (blk, scsi, gpu) the device wants
to extend VirtQueueElement with device-specific fields and, until now,
the place of the VirtQueueElement within the containing struct didn't
matter. When allocating the entire block in virtqueue_pop/vring_pop,
however, the containing struct must basically be a "subclass" of
VirtQueueElement, with the VirtQueueElement as the first field. Make
that the case for blk and scsi; gpu is already doing it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
The official way of enabling dataplane is through the "iothread"
property that references an iothread object created by "-object
iothread". Since the old "x-data-plane=on" way now even crashes, it's
probably easier to just drop it:
$ qemu-system-x86_64 -drive file=null-co://,id=d0,if=none \
-device virtio-blk-pci,drive=d0,x-data-plane=on
ERROR:/home/fam/work/qemu/qom/object.c:1515:
object_get_canonical_path_component: assertion failed: (obj->parent != NULL)
Aborted
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1449485967-19240-1-git-send-email-famz@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
After qemu_iovec_destroy, the QEMUIOVector's size is zeroed and
the zero size ultimately is used to compute virtqueue_push's len
argument. Therefore, reads from virtio-blk devices did not
migrate their results correctly. (Writes were okay).
Save the size in virtio_blk_handle_request, and use it when the request
is completed.
Based on a patch by Wen Congyang.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Message-id: 1427997044-392-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Drop duplicated code. Minor codechanges were required
as geometry is a sub-structure now.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
this adds a knob to disable request merging for debugging or benchmarks if dedired.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
this patch finally introduces multiread support to virtio-blk. While
multiwrite support was there for a long time, read support was missing.
The complete merge logic is moved into virtio-blk.c which has
been the only user of request merging ever since. This is required
to be able to merge chunks of requests and immediately invoke callbacks
for those requests. Secondly, this is required to switch to
direct invocation of coroutines which is planned at a later stage.
The following benchmarks show the performance of running fio with
4 worker threads on a local ram disk. The numbers show the average
of 10 test runs after 1 run as warmup phase.
| 4k | 64k | 4k
MB/s | rd seq | rd rand | rd seq | rd rand | wr seq | wr rand
--------------+--------+---------+--------+---------+--------+--------
master | 1221 | 1187 | 4178 | 4114 | 1745 | 1213
multiread | 1829 | 1189 | 4639 | 4110 | 1894 | 1216
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
As it was not obvious (at least for me) where the 32 comes from;
add a constant for it.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In preparation for calling blk_aio_ioctl. Also make the function static
as no other files need it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Device models should access their block backends only through the
block-backend.h API. Convert them, and drop direct includes of
inappropriate headers.
Just four uses of BlockDriverState are left:
* The Xen paravirtual block device backend (xen_disk.c) opens images
itself when set up via xenbus, bypassing blockdev.c. I figure it
should go through qmp_blockdev_add() instead.
* Device model "usb-storage" prompts for keys. No other device model
does, and this one probably shouldn't do it, either.
* ide_issue_trim_cb() uses bdrv_aio_discard() instead of
blk_aio_discard() because it fishes its backend out of a BlockAIOCB,
which has only the BlockDriverState.
* PC87312State has an unused BlockDriverState[] member.
The next two commits take care of the latter two.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is consistent with how VirtIOFOOConf variables are named
elsewhere, and makes blk available for BlockBackend variables.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 12c5674 turned it into a pointer to member blk.conf.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Dataplane doesn't depend on linux-aio any more, so we don't need the
compiling condition now.
Configure options are kept but just print a message.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1410329871-28885-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The plan is to add new accounting metrics (latency, invalid requests, failed
requests, queue depth) and block.c is overpopulated so it will be better to work
in a separate module.
Moreover the long term plan is to have statistics in each of the BDS of the graph
for metrology purpose; this means that the device model statistics must move from
the topmost BDS to the device model.
So we need to decouple the statistic code from BlockDriverState.
This is another argument for the extraction of the code in a separate module.
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Benoit Canet <benoit@irqsave.net>
CC: Fam Zheng <famz@redhat.com>
CC: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The memory allocation between hw/block/virtio-blk.c,
hw/block/dataplane/virtio-blk.c, and hw/virtio/dataplane/vring.c is
messy. Structs are allocated in different files than they are freed in.
This is risky and makes memory leaks easier.
Embed VirtQueueElement in VirtIOBlockReq to reduce the amount of memory
allocation we need to juggle. This also makes vring.c and virtio.c
slightly more similar.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The two common virtio features can be defined per bus, so move all
into bus class device to make code more clean.
As discussed with cornelia, s390-virtio-blk doesn't support
the two features at all, so keep s390-virtio as it.
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> #for s390 ccw
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
MST: rebase and resolve conflicts
There is no need to make DEFINE_VIRTIO_BLK_PROPERTIES() public. Inline
it into virtio-blk.c so it cannot be used by mistake from other source
files.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
This function is no longer used since parent objects now use child
aliases to set the VirtIOBlkConf directly.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Move the x-data-plane property. Originally it was outside since not
every transport may wish to support dataplane. But that makes little
sense when we have a dedicated CONFIG_VIRTIO_BLK_DATA_PLANE ifdef
already.
This move makes it easier to switch to property aliases in the next
patch.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
It becomes unwiedly to duplicate all virtio-blk qdev property
definitions due to an #ifdef. The C preprocessor syntax makes it a
little hard to resolve this cleanly but we can extract the #ifdef and
call a macro it defines later.
Avoiding duplication is important since it will only get worse when we
move the x-data-plane qdev property here too. We'd have a combinatorial
explosion since x-data-plane has its own #ifdef.
Suggested-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
So that dataplane can use virtio_blk_handle_request and
virtio_submit_multiwrite.
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio_blk_req_complete will call VirtIOBlock.complete_request() to push
data and notify guest. No functional change.
Later, this will allow dataplane to provide it's own (vring_) version.
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The virtio code currently assumes that the outhdr is in its own iovec.
This is not guaranteed by the spec, so we should relax this assumption.
Convert the VirtIOBlockReq.out field to structrue so that we can use
iov_to_buf and then discard the header from the beginning of iovec.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In current virtio spec, inhdr is a single byte, and is unlikely to
change for both functionality and compatibility considerations.
Non-dataplane uses .in, and we are on the way to converge them. So
let's unify it to get cleaner code.
Remove .inhdr and use .in.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Field "inhdr" is added temporarily for a more mechanical change, and
will be dropped in the next commit.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This will make converging with dataplane code easier.
Add virtio_blk_free_request to handle the freeing of request internal
fields.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
For later reusing by dataplane code.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The common logic to process a scsi request in a VirtQueueElement is
extracted to a function to share with dataplane.
This makes VirtIOBlockReq.scsi unused, so drop it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Today virtio-blk dataplane uses a 1:1 device-per-thread model. Now that
IOThreads have been introduced we can generalize this to N:M devices per
threads.
This patch drops thread code from dataplane in favor of running inside
an IOThread AioContext.
As a bonus we solve the case where a guest keeps submitting I/O requests
while dataplane is trying to stop. Previously the dataplane thread
would continue to process requests until the request gave it a break.
Now we can shut down in bounded time thanks to
aio_context_acquire/release.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The following sequence happens:
- the SeaBIOS virtio-blk driver does not support the WCE feature, which
causes QEMU to disable writeback caching
- the Linux virtio-blk driver resets the device, finds WCE is available
but writeback caching is disabled; tells block layer to not send cache
flush commands
- the Linux virtio-blk driver sets the DRIVER_OK bit, which causes
writeback caching to be re-enabled, but the Linux virtio-blk driver does
not know of this side effect and cache flushes remain disabled
The bug is at the third step. If the guest does know about CONFIG_WCE,
QEMU should ignore the WCE feature's state. The guest will control the
cache mode solely using configuration space. This change makes Linux
do flushes correctly, but Linux will keep SeaBIOS's writethrough mode.
Hence, whenever the guest is reset, the cache mode of the disk should
be reset to whatever was specified in the "-drive" option. With this
change, the Linux virtio-blk driver finds that writeback caching is
enabled, and tells the block layer to send cache flush commands
appropriately.
Reported-by: Rusty Russell <rusty@au1.ibm.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Although the dataplane thread does not cooperate with dirty memory
logging yet it's fairly easy to temporarily disable dataplane during
live migration. This way virtio-blk can live migrate when
x-data-plane=on.
The dataplane thread will restart after migration is cancelled or if the
guest resuming virtio-blk operation after migration completes.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This fix the broken aliases, by renaming the devices.
So: * virtio-blk => virtio-blk-device.
* virtio-balloon => virtio-balloon-device.
* virtio-scsi => virtio-scsi-device.
All virtio-*-pci, virtio-*-s390, virtio-*-ccw didn't change.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1365501888-14602-1-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Many of these should be cleaned up with proper qdev-/QOM-ification.
Right now there are many catch-all headers in include/hw/ARCH depending
on cpu.h, and this makes it necessary to compile these files per-target.
However, fixing this does not belong in these patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>