For VIRTIO_SCSI_T_TMF_ABORT_TASK and VIRTIO_SCSI_T_TMF_ABORT_TASK_SET,
use scsi_req_cancel_async to start the cancellation.
Because each tmf command may cancel multiple requests, we need to use a
counter to track the number of remaining requests we still need to wait
for.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Devices will call this function to start an asynchronous cancellation. The
bus->info->cancel will be called after the request is canceled.
Devices will probably need to track a separate TMF request that triggers this
cancellation, and wait until the cancellation is done before completing it. So
we store a notifier list in SCSIRequest and in scsi_req_cancel_complete we
notify them.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let the aio cb do the clean up and notification job after scsi_req_cancel, in
preparation for asynchronous cancellation.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The only two implementations are identical to each other, with nothing specific
to device: they only call bdrv_aio_cancel with the SCSIRequest.aiocb.
Let's move it to scsi-bus.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before, scsi_req_cancel will take ownership of the canceled request and unref
it. We did this because we didn't know whether AIO CB will be called or not
during the cancelling, so we set the io_canceled flag before calling it, and
skip unref in the potentially called callbacks, which is not very nice.
Now, bdrv_aio_cancel has a stricter contract that the completion callbacks are
always called, so we can remove the checks of req->io_canceled and just unref
it in callbacks.
It will also make implementing asynchronous cancellation easier.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that we always called the cb in bdrv_aio_cancel, let's make scsi-generic
callbacks check io_canceled flag similarly to scsi-disk.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The only user of this function is spapr_vscsi.c. We can convert to
scsi_req_cancel plus adding a check in vscsi_request_cancelled.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
[Drop prototype. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Queue the popped requests while calling
virtio_scsi_handle_cmd_req_prepare(), then submit them after all
prepared.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Mechanical change, in preparation for bdrv_io_plug/bdrv_io_unplug.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Similar to virtio-blk-dataplane, we stop the iothread while migration
starts and restart it when migration finishes.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This enables the virtio-scsi-dataplane code by setting the iothread
in virtio-scsi device, and makes any function that is called by
back from dataplane to cooperate with the caller: they need to be
vring/iothread aware when handling the requests and using scsi devices
on the bus.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This implements the core part of dataplane feature of virtio-scsi.
A few fields are added in VirtIOSCSICommon to maintain the dataplane
status. These fields are managed by a new source file:
virtio-scsi-dataplane.c.
Most code in this file will run on an iothread, unless otherwise
commented as in a global mutex context, such as those functions to
start, stop and setting the iothread property.
Upon start, we set up guest/host event notifiers, in a same way as
virtio-blk does. The handlers then pop request from vring and call into
virtio-scsi.c functions to process it. So we need to make sure make all
those called functions work with iothread, too.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move VirtIOSCSIReq to header and add one field "vring" as a wrapper
structure of Vring, VirtIOSCSIVring.
This is necessary for coming dataplane code that runs uses vring on
iothread.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Similar to this property in virtio-blk for dataplane, add it as a QOM
link in virtio-scsi and an alias in virtio-scsi-pci and virtio-scsi-ccw,
in order to assign an iothread to the device.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For better code sharing, add a helper function that handles
reference counting of the virtio backend for virtio proxy devices.
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is
dropped again when the property is deleted.
The upshot of this is that we always have a refcount >= 1. Upon
unplug the virtio-9p child is not finalized!
Drop our reference after the child property has been added to the
parent.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
virtio-9p-pci all duplicate the qdev properties of their
V9fsState child. This approach does not work well with
string or pointer properties since we must be careful
about leaking or double-freeing them.
Use the QOM alias property to forward property accesses to the
V9fsState child. This way no duplication is necessary.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.
The upshot of this is that we always have a refcount >= 1. Upon hot
unplug the virtio-balloon child is not finalized!
Drop our reference after the child property has been added to the
parent.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.
The upshot of this is that we always have a refcount >= 1. Upon hot
unplug the virtio-rng child is not finalized!
Drop our reference after the child property has been added to the
parent.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
virtio-rng-{pci, s390, ccw} all duplicate the
qdev properties of their VirtIORNG child.
This approach does not work well with string or pointer
properties since we must be careful about leaking or
double-freeing them.
Use the QOM alias property to forward property accesses to the
VirtIORNG child. This way no duplication is necessary.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.
The upshot of this is that we always have a refcount >= 1. Upon hot
unplug the virtio-serial child is not finalized!
Drop our reference after the child property has been added to the
parent.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
virtio-serial-{pci, s390, ccw} all duplicate the
qdev properties of their VirtIOSerial child.
This approach does not work well with string or pointer
properties since we must be careful about leaking or
double-freeing them.
Use the QOM alias property to forward property accesses to the
VirtIOSerial child. This way no duplication is necessary.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.
The upshot of this is that we always have a refcount >= 1. Upon hot
unplug the virtio-scsi/vhost-scsi child is not finalized!
Drop our reference after the child property has been added to the
parent.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
{virtio, vhost}-scsi-{pci, s390, ccw} all duplicate the
qdev properties of their VirtIOSCSI/VHostSCSI child.
This approach does not work well with string or pointer
properties since we must be careful about leaking or
double-freeing them.
Use the QOM alias property to forward property accesses to the
VirtIOSCSI/VHostSCSI child. This way no duplication is necessary.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.
The upshot of this is that we always have a refcount >= 1. Upon hot
unplug the virtio-net child is not finalized!
Drop our reference after the child property has been added to the
parent.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
virtio-net-pci, virtio-net-s390, and virtio-net-ccw all duplicate the
qdev properties of their VirtIONet child. This approach does not work
well with string or pointer properties since we must be careful about
leaking or double-freeing them.
Use the QOM alias property to forward property accesses to the
VirtIONet child. This way no duplication is necessary.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is the "common part" to handle one cmd request. Refactor out for
later usage of dataplane iothread code.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The command direction according to the guest-passed buffers
is already stored in the VirtIOSCSIReq. We can use it instead
of computing it again from req->elem.
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The VirtQueueElement is a very big structure (>48k!), since it will be
initialzed by virtqueue_pop, we can save the expensive zeroing here.
This saves a few microseconds per request in my test:
[fio-test] rw bs iodepth jobs bw iops latency
--------------------------------------------------------------------------------------------
Before read 4k 1 1 110 28269 34
After read 4k 1 1 131 33745 28
Whereas,
virtio-blk read 4k 1 1 217 55673 16
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Zeroing sense buffer for each scsi request is not efficient, we can just
leave it uninitialized because sense_len is set to 0.
Move the implicitly zeroed fields to the end of the structure and use a
partial memset.
The explicitly initialized fields (by scsi_req_alloc or scsi_req_new)
are moved to the beginning of the structure, before sense buffer, to
skip the memset.
Also change g_malloc0 to g_slice_alloc.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Dataplane doesn't depend on linux-aio any more, so we don't need the
compiling condition now.
Configure options are kept but just print a message.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1410329871-28885-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
To be more consistent inside this function.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1410329871-28885-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This header has no further dependencies. It only has some stable data
types and primitive functions, so we can copy it to include/hw/virtio in
order to allow vring code (and its user virtio-blk dataplane) to be
built unconditionally, even for cross compiling.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1410329871-28885-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
On a system with a low limit of open files the initialization
of the event notifier could fail and QEMU exits without printing any
error information to the user.
The problem can be easily reproduced by enforcing a low limit of open
files and start QEMU with enough I/O threads to hit this limit.
The same problem raises, without the creation of I/O threads, while
QEMU initializes the main event loop by enforcing an even lower limit of
open files.
This commit adds an error message on failure:
# qemu [...] -object iothread,id=iothread0 -object iothread,id=iothread1
qemu: Failed to initialize event notifier: Too many open files in system
Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In vhdx_create_metadata(), we allocate 40 bytes to entry_buffer for
the various metadata table entries. However, we write out 64kB from
that buffer into the new file. Only write out the correct 40 bytes.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch removes support for the cow file format.
Normally we do not break backwards compatibility but in this case there
is no impact and it is the most logical option. Extraordinary claims
require extraordinary evidence so I will show why removing the cow block
driver is the right thing to do.
The cow file format is the disk image format for Usermode Linux, a way
of running a Linux system in userspace. The performance of UML was
never great and it was hacky, but it enjoyed some popularity before
hardware virtualization support became mainstream.
QEMU's block/cow.c is supposed to read this image file format.
Unfortunately the file format was underspecified:
1. Earlier Linux versions used the MAXPATHLEN constant for the backing
filename field. The value of MAXPATHLEN can change, so Linux
switched to a 4096 literal but QEMU has a 1024 literal.
2. Padding was not used on the header struct (both in the Linux kernel
and in QEMU) so the struct layout varied across architectures. In
particular, i386 and x86_64 were different due to int64_t alignment
differences. Linux now uses __attribute__((packed)), QEMU does not.
Therefore:
1. QEMU cow images do not conform to the Linux cow image file format.
2. cow images cannot be shared between different host architectures.
This means QEMU cow images are useless and QEMU has not had bug reports
from users actually hitting these issues.
Let's get rid of this thing, it serves no purpose and no one will be
affected.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1410877464-20481-1-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Utilizing all of the bring-up code in pci_enable and hba_enable,
this test issues a simple IDENTIFY command via the HBA and retrieves
the response via the PIO receive mechanisms of the HBA.
Bugs: The DPS interrupt (Descriptor Processed Status) does not
currently get set. This will need to be adjusted in a future
patch series when the AHCI DMA pathways are reworked to allow
the feature, which may be utilized by OSX guests.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1408643079-30675-9-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This test engages the HBA functionality and initializes
values to sane defaults to allow for minimal HBA functionality.
Buffers are allocated and pointers are updated to allow minimal
I/O commands to complete as expected. Error registers and responses
are sanity checked for specification adherence.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1408643079-30675-8-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add a test routine that checks the boot-up values of the HBA
configuration memory space against the AHCI 1.3 specification
and Intel ICH9 data sheet (for Q35 machines) for adherence and
sane values.
The HBA is not yet engaged or put into the idle state.
[Replaced g_assert_false(...) with g_assert(!...) for glib <2.38
compatibility, reported by Peter Maydell <peter.maydell@linaro.org>.
--Stefan]
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1408643079-30675-7-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In a real AHCI device, several S/ATA registers are mirrored or shadowed
within the AHCI register set. These registers are not updated
synchronously for each read access, but are instead updated after a
Device-to-Host Register FIS packet is received. The D2H FIS contains
the values from these registers on the device.
In QEMU, by reaching directly into the device to grab these bits before
they are "sent," we may introduce race conditions where unexpected
values are present "before they are sent" which could cause issues for
some guests, particularly if an attempt is made to read the PxTFD
register prior to enabling the port, where incorrect values will be read.
This patch also addresses the boot-time values for the PxTFD and PxSIG
registers to bring them in line with the AHCI 1.3 specification.
Lastly, several fields (PxTFD, PxSIG and PxSACT) are read-only,
and any attempts to write to them should be ignored.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1408643079-30675-6-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>