If a migration happens just after the guest has kicked
off an ATAPI command and kicked off DMA, we lose the atapi_dma
flag, and the destination tries to complete the command as PIO
rather than DMA. This upsets Linux; modern libata based kernels
stumble and recover OK, older kernels end up passing bad data
to userspace.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The other callers to blk_set_enable_write_cache() in this file
already check for s->blk == NULL.
Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1416259239-13281-1-git-send-email-dslutz@verizon.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This impacts both BMDMA and AHCI HBA interfaces for IDE.
Currently, we confuse the difference between a PRDT having
"0 bytes" and a PRDT having "0 complete sectors."
When we receive an incomplete sector, inconsistent error checking
leads to an infinite loop wherein the call succeeds, but it
didn't give us enough bytes -- leading us to re-call the
DMA chain over and over again. This leads to, in the BMDMA case,
leaked memory for short PRDTs, and infinite loops and resource
usage in the AHCI case.
The .prepare_buf() callback is reworked to return the number of
bytes that it successfully prepared. 0 is a valid, non-error
answer that means the table was empty and described no bytes.
-1 indicates an error.
Our current implementation uses the io_buffer in IDEState to
ultimately describe the size of a prepared scatter-gather list.
Even though the AHCI PRDT/SGList can be as large as 256GiB, the
AHCI command header limits transactions to just 4GiB. ATA8-ACS3,
however, defines the largest transaction to be an LBA48 command
that transfers 65,536 sectors. With a 512 byte sector size, this
is just 32MiB.
Since our current state structures use the int type to describe
the size of the buffer, and this state is migrated as int32, we
are limited to describing 2GiB buffer sizes unless we change the
migration protocol.
For this reason, this patch begins to unify the assertions in the
IDE pathways that the scatter-gather list provided by either the
AHCI PRDT or the PCI BMDMA PRDs can only describe, at a maximum,
2GiB. This should be resilient enough unless we need a sector
size that exceeds 32KiB.
Further, the likelihood of any guest operating system actually
attempting to transfer this much data in a single operation is
very slim.
To this end, the IDEState variables have been updated to more
explicitly clarify our maximum supported size. Callers to the
prepare_buf callback have been reworked to understand the new
return code, and all versions of the prepare_buf callback have
been adjusted accordingly.
Lastly, the ahci_populate_sglist helper, relied upon by the
AHCI implementation of .prepare_buf() as well as the PCI
implementation of the callback have had overflow assertions
added to help make clear the reasonings behind the various
type changes.
[Added %d -> %"PRId64" fix John sent because off_pos changed from int to
int64_t.
--Stefan]
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1414785819-26209-4-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently, for emulated PIO transfers through the AHCI device,
any attempt made to request more than a single sector's worth
of data will result in the same sector being transferred over
and over.
For example, if we request 8 sectors via PIO READ SECTORS, the
AHCI device will give us the same sector eight times.
This patch adds offset tracking into the PIO pathways so that
we can fulfill these requests appropriately.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1414785819-26209-2-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently, DMA read/write operations neglect to update
the byte count after a successful transfer like ATAPI
DMA read or PIO read/write operations do.
We correct this oversight by adding another callback into
the IDEDMAOps structure. The commit callback is called
whenever we are cleaning up a scatter-gather list.
AHCI can register this callback in order to update post-
transfer information such as byte count updates.
We use this callback in AHCI to consolidate where we delete
the SGlist as generated from the PRDT, as well as update the
byte count after the transfer is complete.
The QEMUSGList structure has an init flag added to it in order
to make qemu_sglist_destroy a nop if it is called when
there is no sglist, which simplifies cleanup and error paths.
This patch fixes several AHCI problems, notably Non-NCQ modes
of operation for Windows 7 as well as Hibernate support for Windows 7.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1412204151-18117-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add a BlockBackend member to TrimAIOCB, so ide_issue_trim_cb() can use
blk_aio_discard() instead of bdrv_aio_discard().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Device models should access their block backends only through the
block-backend.h API. Convert them, and drop direct includes of
inappropriate headers.
Just four uses of BlockDriverState are left:
* The Xen paravirtual block device backend (xen_disk.c) opens images
itself when set up via xenbus, bypassing blockdev.c. I figure it
should go through qmp_blockdev_add() instead.
* Device model "usb-storage" prompts for keys. No other device model
does, and this one probably shouldn't do it, either.
* ide_issue_trim_cb() uses bdrv_aio_discard() instead of
blk_aio_discard() because it fishes its backend out of a BlockAIOCB,
which has only the BlockDriverState.
* PC87312State has an unused BlockDriverState[] member.
The next two commits take care of the latter two.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
I'll use it with block backends shortly, and the name is going to fit
badly there. It's a block layer thing anyway, not just a block driver
thing.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
I'll use BlockDriverAIOCB with block backends shortly, and the name is
going to fit badly there. It's a block layer thing anyway, not just a
block driver thing.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of duplicating the logic for the if_ide
(bus,unit) mappings, rely on the blockdev layer
for managing those mappings for us, and use the
drive_get_by_index call instead.
This allows ide_drive_get to work for AHCI HBAs
as well, and can be used in the Q35 initialization.
Lastly, change the nature of the argument to
ide_drive_get so that represents the number of
total drives we can support, and not the total
number of buses. This will prevent array overflows
if the units-per-default-bus property ever needs
to be adjusted for compatibility reasons.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1412187569-23452-5-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We know that either bh is scheduled or ide_issue_trim_cb will be called
again, so we just set i, j and ret to the right values. In both cases,
ide_trim_bh_cb will be called.
Also forward the cancellation to the iocb->aiocb which we get from
bdrv_aio_discard.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Before, bdrv_aio_cancel will either complete the request (like normal)
and call CB with an actual return code, or skip calling the request (for
example when the IO req is not submitted by thread pool yet).
We will change bdrv_aio_cancel to do it differently: always call CB
before return, with either [1] a normal req completion ret code, or [2]
ret == -ECANCELED. So the callers' callback must accept both cases. The
existing logic works with case [1], but not [2].
The simplest transition of callback code is do nothing in case [2], just
as if the CB is not called by the bdrv_aio_cancel() call.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is the next step for decoupling block accounting functions from
BlockDriverState.
In a future commit the BlockAcctStats structure will be moved from
BlockDriverState to the device models structures.
Note that bdrv_get_stats was introduced so device models can retrieve the
BlockAcctStats structure of a BlockDriverState without being aware of it's
layout.
This function should go away when BlockAcctStats will be embedded in the device
models structures.
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Keith Busch <keith.busch@intel.com>
CC: Anthony Liguori <aliguori@amazon.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Michael Tokarev <mjt@tls.msk.ru>
CC: John Snow <jsnow@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Alexander Graf <agraf@suse.de>
CC: Max Reitz <mreitz@redhat.com>
Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The middle term goal is to move the BlockAcctStats structure in the device models.
(Capturing I/O accounting statistics in the device models is good for billing)
This patch make a small step in this direction by removing a reference to BDRV.
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Keith Busch <keith.busch@intel.com>
CC: Anthony Liguori <aliguori@amazon.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: John Snow <jsnow@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Markus Armbruster <armbru@redhat.com>
CC: Alexander Graf <agraf@suse.de>i
Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently, if the block device backing the IDE drive is resized,
the information about the device as cached inside of the IDEState
structure is not updated, thus when a guest OS re-queries the drive,
it is unable to see the expanded size.
This patch adds a resize callback that updates the IDENTIFY data
buffer in order to correct this.
Lastly, a Linux guest as-is cannot resize a libata drive while in-use,
but it can see the expanded size as part of a bus rescan event.
This patch also allows guests such as Linux to see the new drive size
after a soft reboot event, without having to exit the QEMU process.
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
IDE-HD, IDE-ATAPI and IDE-CFATA all fill the
identify buffer in slightly different ways,
this is a relatively minor patch to make them
uniform, to emphasize that:
(1) We build the s->identify_data cache first, then
(2) We copy it to s->io_buffer to fulfill the request.
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Although it is possible to specify the wwn
property for cdrom devices on the command line,
the underlying driver fails to relay this information
to the guest operating system via IDENTIFY.
This is a simple patch to correct that.
See ATA8-ACS, Table 22 parts 5, 6, and 9.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Commit 58ac321135 introduced a check to ide dma processing which
constrains all requests to drive size. However, apparently, some
valid requests (like TRIM) does not fit in this constraint, and
fails in 2.1. So check the range only for reads and writes.
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
AHCI has code to fill in the D2H FIS trigger the IRQ all over the place.
Centralize this in a single cmd_done callback by generalizing the existing
async_cmd_done callback.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This will provide a hook for sending the result of the command via the
FIS receive area.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
It is now called only after the set_inactive callback. Put the two together.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Make it optional and prepare for the next patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Drop the unused return value and make the callback optional.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Drop the unused return value and make the callback optional.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Drop the unused return value and make the callback optional.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This ensures that operations are completed after a reset
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The block layer fails such reads and writes just fine. However, they
then get treated like valid operations that fail: the error action
gets executed. Unwanted; reporting the error to the guest is the only
sensible action.
Reject them before passing them to the block layer. This bypasses the
error action and I/O accounting. Not quite correct for DMA, because
DMA can fail after some success, and when that happens, the part that
succeeded isn't counted. Tolerable, because I/O accounting is an
inconsistent mess anyway.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In order to let event defines use existing types later, instead of
redefine new ones, some old type defines for spice and vnc are changed,
and BlockErrorAction is moved from block.h to qapi schema. Note that
BlockErrorAction is not merged with BlockdevOnError.
At this point, VncInfo is not made a child of VncBasicInfo, because
VncBasicInfo has mandatory fields where VncInfo makes them optional.
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
After previous Peter patch, they are redundant. This way we don't
assign them except when needed. Once there, there were lots of case
where the ".fields" indentation was wrong:
.fields = (VMStateField []) {
and
.fields = (VMStateField []) {
Change all the combinations to:
.fields = (VMStateField[]){
The biggest problem (appart from aesthetics) was that checkpatch complained
when we copy&pasted the code from one place to another.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
The SMART self test counter was incorrectly being reset to zero,
not 1. This had the effect that on every 21st SMART EXECUTE OFFLINE:
* We would write off the beginning of a dynamically allocated buffer
* We forgot the SMART history
Fix this.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Message-id: 1397336390-24664-1-git-send-email-benoit.canet@irqsave.net
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Acked-by: Kevin Wolf <kwolf@redhat.com>
[PMM: tweaked commit message as per suggestions from Markus]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The alignment field is now set to the value that is promised to the
guest, rather than required by the host. The next patches will make
QEMU aware of the host-provided values, so make this clear.
The alignment is also not about memory buffers, but about the sectors on
the disk, change the documentation of the field.
At this point, the field is set by the device emulation, but completely
ignored by the block layer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
This notably fix IDE CD probing on the Plan 9 operating system,
which rely on the error register set by the Execute Device
Diagnostic command to detect drive configurations.
Thanks to Rémi Pommarel for reporting this issue.
Signed-off-by: David du Colombier <0intro@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is an autogenerated patch using scripts/switch-timer-api.
Switch the entire code base to using the new timer API.
Note this patch may introduce some line length issues.
Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
AHCI couldn't cope with asynchronous commands that aren't doing DMA, it
simply wouldn't complete them. Due to the bug fixed in commit f68ec837,
FLUSH commands would seem to have completed immediately even if they
were still running on the host. After the commit, they would simply hang
and never unset the BSY bit, rendering AHCI unusable on any OS sending
flushes.
This patch adds another callback for the completion of asynchronous
commands. This is what AHCI really wants to use for its command
completion logic rather than an DMA completion callback.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
All commands are now converted to ide_cmd_table handlers, so it can be
unconditional now and the old switch block can go.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>