We automatically delete blockdev host parts on unplug of the guest
device. Too much magic, but we can't change that now.
The delete happens early in the guest device teardown, before the
connection to the host part is severed. Thus, the guest part's
pointer to the host part dangles for a brief time. No actual harm
comes from this, but we'll catch such dangling pointers a few commits
down the road. Clean up the dangling pointers by delaying the
automatic deletion until the guest part's pointer is gone.
Device usb-storage deliberately makes two qdev properties refer to the
same drive, because it automatically creates a second device. Again,
too much magic we can't change now. Multiple references worked okay
before, but now free_drive() dies for the second one. Zap the extra
reference.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
That's where they belong semantically (block device host part), even
though the actions are actually executed by guest device code.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Anything that moves hundreds of lines out of vl.c can't be all bad.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Show the actual default value instead of <null> when the property has
not been set.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It needs to be a qdev property, because it belongs to the drive's
guest part.
Bonus: info qtree now shows the serial number.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Once the I/O completion callback returned, aiocb will be released by the
controller. So we have to clear the reference not only in
scsi_write_complete, but also in scsi_read_complete. Otherwise we risk
inconsistencies when a reset hits us before the related request is
released.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Ensure that pending requests of an SCSI disk are purged on system reset
and also restore max_lba. The latter is no only present in the reset
handler as that one is called after init as well.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In case s->version is shorter than 4 bytes we overflow the memcpy src
buffer. Fix it by clearing the target buffer, then copy only the
amount of bytes we actually have.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Don't rely on CDROM hint for read_only attribute
Signed-off-by: Naphtali Sprei <nsprei@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add a logical block size attribute as various guest side tools only
increase the filesystem sector size based on it, not the advisory
physical block size.
For scsi we already have support for a different logical block size
in place for CDROMs that we can built upon. Only my recent block
device characteristics VPD page needs some fixups. Note that we
leave the logial block size for CDROMs hardcoded as the 2k value
is expected for it in general.
For virtio-blk we already have a feature flag claiming to support
a variable logical block size that was added for the s390 kuli
hypervisor. Interestingly it does not actually change the units
in which the protocol works, which is still fixed at 512 bytes,
but only communicates a different minimum I/O granularity. So
all we need to do in virtio is to add a trap for unaligned I/O
and round down the device size to the next multiple of the logical
block size.
IDE does not support any other logical block size than 512 bytes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
error_report() terminates the message with a newline. Strip it it
from its arguments.
This fixes a few error messages lacking a newline:
net_handle_fd_param()'s "No file descriptor named %s found", and
tap_open()'s "vnet_hdr=1 requested, but no kernel support for
IFF_VNET_HDR available" (all three versions).
There's one place that passes arguments without newlines
intentionally: load_vmstate(). Fix it up.
You're supposed to use scsi-generic for that. Which rejects anything
but /dev/sg*.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The next commit will move the STOP event into do_vm_stop(), to
have the expected event sequence we need to emit the I/O error
event before calling vm_stop().
The expected sequence is:
{ "event": "BLOCK_IO_ERROR" [...] }
{ "event": "STOP" }
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Export the physical block size in the READ CAPACITY (16) command,
and add the new block limits VPD page to export the minimum and
optiomal I/O sizes.
Note that we also need to bump the scsi revision level to SPC-2
as that is the minimum requirement by at least the Linux kernel
to try READ CAPACITY (16) first and look at the block limits VPD
page.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add three new qdev properties to export block topology information to
the guest. This is needed to get optimal I/O alignment for RAID arrays
or SSDs.
The options are:
- physical_block_size to specify the physical block size of the device,
this is going to increase from 512 bytes to 4096 kilobytes for many
modern storage devices
- min_io_size to specify the minimal I/O size without performance impact,
this is typically set to the RAID chunk size for arrays.
- opt_io_size to specify the optimal sustained I/O size, this is
typically the RAID stripe width for arrays.
I decided to not auto-probe these values from blkid which might easily
be possible as I don't know how to deal with these issues on migration.
Note that we specificly only set the physical_block_size, and not the
logial one which is the unit all I/O is described in. The reason for
that is that IDE does not support increasing the logical block size and
at last for now I want to stick to one meachnisms in queue and allow
for easy switching of transports for a given backing image which would
not be possible if scsi and virtio use real 4k sectors, while ide only
uses the physical block exponent.
To make this more common for the different block drivers introduce a
new BlockConf structure holding all common block properties and a
DEFINE_BLOCK_PROPERTIES macro to add them all together, mirroring
what is done for network drivers. Also switch over all block drivers
to use it, except for the floppy driver which has weird driveA/driveB
properties and probably won't require any advanced block options ever.
Example usage for a virtio device with 4k physical block size and
8k optimal I/O size:
-drive file=scratch.img,media=disk,cache=none,id=scratch \
-device virtio-blk-pci,drive=scratch,physical_block_size=4096,opt_io_size=8192
aliguori: updated patch to take into account BLOCK events
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Just call bdrv_mon_event() in the right place.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Win32 suffers from a very big memory leak when dealing with SCSI devices.
Each read/write request allocates memory with qemu_memalign (ie
VirtualAlloc) but frees it with qemu_free (ie free).
Pair all qemu_memalign() calls with qemu_vfree() to prevent such leaks.
Signed-off-by: Herve Poussineau <hpoussin@reactos.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds a new property named 'ver' to scsi-disk which allows to
specify the version which the virtual disk/cdrom should report to the
guest. By default this is the qemu version (i.e. 0.12). usage:
-drive if=none,id=disk,file=...
-device lsi
-device scsi-disk,drive=disk,bus=scsi.0,unit=0,ver=42
You can also switch the version for all scsi drives using:
-global scsi-disk.ver=42
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
According to the SCSI-2 specification,
http://ldkelley.com/SCSI2/SCSI2/SCSI2/SCSI2-08.html#8.2.5 ,
"if the allocation length of the command descriptor block (CDB) is too
small to transfer all of the parameters, the additional length shall
not be adjusted to reflect the truncation."
The 36 mandatory bytes of response are written to outbuf, and then
only the length requested in CDB is transferred.
Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Vendor identification, product identification and product revision level
should be padded with spaces without a terminating NULL character, see
SCSI-2 standard, 8.2.5.1 Standard INQUIRY data.
Signed-off-by: Laszlo Ast <laszlo.ast@siemens-enterprise.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Either rename variables and functions to refer to write errors (which is what
they actually do) or introduce a parameter to distinguish reads and writes.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add READ_16 + friends to scsi-defs.h, scsi_command_name() and the
request parsing helper functions.
Use them in scsi-disk.c too.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move VERIFY emulation from scsi_send_command() to
scsi_disk_emulate_command().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move REPORT_LUNS emulation from scsi_send_command() to
scsi_disk_emulate_command().
Also add REPORT_LUNS to scsi-defs.h and scsi_command_name().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move SERVICE_ACTION_IN emulation from scsi_send_command() to
scsi_disk_emulate_command().
Also add SERVICE_ACTION_IN to scsi-defs.h and scsi_command_name().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move GET_CONFIGURATION emulation from scsi_send_command() to
scsi_disk_emulate_command().
Also add GET_CONFIGURATION to scsi-defs.h and scsi_command_name().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move READ_TOC emulation from scsi_send_command() to
scsi_disk_emulate_command(). Add scsi_disk_emulate_read_toc() function
which holds the longisch READ_TOC emulation code.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move SYNCHRONIZE_CACHE emulation from scsi_send_command() to
scsi_disk_emulate_command().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move READ_CAPACITY emulation from scsi_send_command() to
scsi_disk_emulate_command().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move ALLOW_MEDIUM_REMOVAL emulation from scsi_send_command() to
scsi_disk_emulate_command().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move START_STOP emulation from scsi_send_command() to
scsi_disk_emulate_command().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move MODE_SENSE emulation from scsi_send_command() to
scsi_disk_emulate_command(). Create two helper functions:
mode_sense_page() which writes the actual mode pages and
scsi_disk_emulate_mode_sense() which holds the longish MODE_SENSE
emulation code, calling into mode_sense_page() as needed.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move RESERVE+RELEASE emulation from scsi_send_command() to
scsi_disk_emulate_command().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move INQUIRY emulation from scsi_send_command() to
scsi_disk_emulate_command(). Also split the longish INQUITY emulation
code into the new scsi_disk_emulate_inquiry() function. Serial number
handling is slightly changed, we don't copy it any more but look it up
directly in DriveInfo which we have at hand anyway.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move REQUEST_SENSE emulation from scsi_send_command() to
scsi_disk_emulate_command().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add new scsi_disk_emulate_command() function, which will -- when
finished -- handle all scsi disk command emulation except actual I/O
(READ+WRITE commands) which goes to the block layer. The function
builds on top of the new SCSIRequest struct.
SCSI command emulation code is moved over from scsi_send_command() in
steps to ease review and make it easier to pin down regressions (if any)
using bisect. This patch moves over TEST_UNIT_READY only.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Also add and use the scsi_req_complete() helper function for calling the
completion callback.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Largely based on <scsi/scsi.h> from linux. Added into the tree so we
can use the defines everywhere, not just in scsi-generic.c (which is
linux-specific).
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Create generic functions to allocate, find and release SCSIRequest
structs. Make scsi-disk and scsi-generic use them.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Changes:
* Move from open-coded lists to QTAILQ macros.
* Move the struct elements to the common data structures
(SCSIDevice + SCSIRequest).
* Drop free request pools.
* Fix request cleanup in the destroy callback.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Rename the SCSIRequest structs in scsi-disk.c and scsi-generic.c to
SCSIDiskReq and SCSIGenericReq. Create a SCSIRequest struct and move
the common elements over.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The SCSI-2 documentation suggests, that although the block
descriptor is optional for an arbitrary SCSI-2 device (chapter 8.2.10,
http://ldkelley.com/SCSI2/SCSI2/SCSI2/SCSI2/SCSI2-08.html )
it is mandatory for a disk: chapters 9.1.2, 9.3.3
( http://ldkelley.com/SCSI2/SCSI2/SCSI2/SCSI2-09.html ) don't say
"optional" any more, just "The block descriptor in the MODE SENSE
data describes the block lengths that are used on the medium."
v2: limit the number of sectors reported in the block descriptor to 24 bits.
Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Changes:
* drive_uninit() wants a DriveInfo now.
* drive_uninit() also calls bdrv_delete(),
so callers don't need to do that.
* drive_uninit() calls are moved over to the ->exit()
callbacks, destroy_bdrvs() is zapped.
* setting bdrv->private is not needed any more as the
only user (destroy_bdrvs) is gone.
* usb-storage needs no drive_uninit, scsi-disk will
handle that.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add a enable_write_cache flag in the block driver state, and use it to
decide if we claim to have a volatile write cache that needs controlled
flushing from the guest. The flag is off if cache=writethrough is
defined because O_DSYNC guarantees that every write goes to stable
storage, and it is on for cache=none and cache=writeback.
Both scsi-disk and ide now use the new flage, changing from their
defaults of always off (ide) or always on (scsi-disk).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When a VM state change handler changes VM state, other VM state change
handlers can see the state transitions out of order.
bmdma_map(), scsi_disk_init() and virtio_blk_init() install VM state
change handlers to restart DMA. These handlers can vm_stop() by
running into a write error on a drive with werror=stop. This throws
the VM state change handler callback into disarray. Here's an example
case I observed:
0. The virtual IDE drive goes south. All future writes return errors.
1. Something encounters a write error, and duly stops the VM with
vm_stop().
2. vm_stop() calls vm_state_notify(0).
3. vm_state_notify() runs the callbacks in list vm_change_state_head.
It contains ide_dma_restart_cb() installed by bmdma_map(). It also
contains audio_vm_change_state_handler() installed by audio_init().
4. audio_vm_change_state_handler() stops audio stuff.
5. User continues VM with monitor command "c". This runs vm_start().
6. vm_start() calls vm_state_notify(1).
7. vm_state_notify() runs the callbacks in vm_change_state_head.
8. ide_dma_restart_cb() happens to come first. It does its work, runs
into a write error, and duly stops the VM with vm_stop().
9. vm_stop() runs vm_state_notify(0).
10. vm_state_notify() runs the callbacks in vm_change_state_head.
11. audio_vm_change_state_handler() stops audio stuff. Which isn't
running.
12. vm_stop() finishes, ide_dma_restart_cb() finishes, step 7's
vm_state_notify() resumes running handlers.
13. audio_vm_change_state_handler() starts audio stuff. Oopsie.
Fix this by moving the actual write from each VM state change handler
into a new bottom half (suggested by Gleb Natapov).
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
I used the following command to enable debugging:
perl -p -i -e 's/^\/\/#define DEBUG/#define DEBUG/g' * */* */*/*
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Always use the vectored APIs to reduce code churn once we switch the BlockDriver
API to be vectored.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7019 c046a42c-6fe2-441c-8c8c-71466251a162
Implement Test Unit Ready command (return NOT READY as above
if !bdrv_is_inserted(...))
Signed-off-by: Juergen Lock <nox@jelal.kn-bremen.de>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6954 c046a42c-6fe2-441c-8c8c-71466251a162
Add asc 0x3a, ascq 0: Medium not present to NOT READY sense
(needed to keep some guests from retrying causing long sleeps in the
kernel)
Signed-off-by: Juergen Lock <nox@jelal.kn-bremen.de>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6953 c046a42c-6fe2-441c-8c8c-71466251a162
The bdrv layer uses a signed offset. Furthermore, block-raw-posix
only seeks when that offset is positive. Passing a negative offset
to block-raw-posix can result in data being written at the current
seek cursor's position.
It may be possible to exploit this to seek to the end of the disk
and extend the virtual disk by writing data to a negative sector
offset. After a reboot, this could lead to the guest having a
larger disk than it had before.
Close the hole by sanity checking the lba against the size of the
disk.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6475 c046a42c-6fe2-441c-8c8c-71466251a162
Paul Brook pointed out that the number of sectors reported
by the SCSI read capacity commands needs to be divided by
s->cluster_size, because bdrv_get_geometry reports the number
of 512 byte sectors, while emulated CDROMs report 2048 byte
sectors back to the guest.
This has no consequences for emulated hard disks, which use
a cluster size of 1.
aliguori: fixed typo
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6469 c046a42c-6fe2-441c-8c8c-71466251a162
Implement SCSI READ(16), WRITE(16) and SAI READ CAPACITY(16) commands,
so SCSI disks larger than 2TB can work with guests that support these
newer SCSI commands.
The cast to (uint64_t) is needed because otherwise gcc will use a
signed int, which gets sign extended into uint64_t lba, resulting
in bad block numbers for READ 10 and READ 16 with block numbers
larger than 2^31.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6468 c046a42c-6fe2-441c-8c8c-71466251a162
Sector numbers can overflow on a virtual scsi disk of over 1TB
in size. Qemu's bdrv_read expects an int64_t, so fix the overflow
by going to that data type.
On large disks, we clip the capacity to 2TB instead of returning
"capacity modulo 2TB".
Turn sector_count into an unsigned to prevent a signed/unsigned
overflow with SCSI transfers larger than 2TB. We're unlikely to
ever hit this bug, but fixing it is just one line.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6467 c046a42c-6fe2-441c-8c8c-71466251a162
Windows calculates HW "uniqueness" based on a hard drive serial number
among other things. The patch allows to specify drive serial number
from a command line.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6214 c046a42c-6fe2-441c-8c8c-71466251a162
Openserver 5.0.5 sends an Inquiry command to the emulated SCSI disk
expecting a response length of 40 bytes. Currently the response to an
Inquiry command is hardcoded to 36 bytes. When receiving a response of
length 36 instead of 40 Openserver panics.
Modifications to original patch based on feedback from Ryan Harper and Paul
Brook. Thanks guys.
Signed-off-by: Justin Chevrier <address@hidden>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5903 c046a42c-6fe2-441c-8c8c-71466251a162
taken from Xen 17267:f4a92f0db20f, original patch by Samuel Thibault.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5385 c046a42c-6fe2-441c-8c8c-71466251a162