This will let allow requests to be dispatched through different callbacks,
either common or per-device.
This patch adjusts the API, the next one will move members to SCSIReqOps.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
With this patch, sense data is stored in the generic data structures
for SCSI devices and requests. The SCSI layer takes care of storing
sense data in the SCSIDevice for the subsequent REQUEST SENSE command.
At the same time, get_sense is removed and scsi_req_get_sense can use
an entirely generic implementation.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
A small improvement in the SCSI request API. Pass the status
at the time the request is completed, so that we can assert that
no request is completed twice. This would have detected the
problem fixed in the previous patch.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In fact, if the HBA's transfer_data callback goes on with scsi_req_continue
the request will be completed successfully instead of showing a failure.
It can even cause a segmentation fault.
An easy way to trigger it is "eject -f cd" during installation (during media
test if the installer does something like that).
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Instead of using its own definitions scsi-disk should
be using the device type of the parent device.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Correct typos of "licenced" to "licensed".
Reviewed-by: Stefan Weil <weil@mail.berlios.de>
Reviewed-by: Andreas F=E4rber <andreas.faerber@web.de>
Signed-off-by: Matthew Fernandez <matthew.fernandez@gmail.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
If the serial number is not set we should mask it out in the
list of supported VPD pages and mark it as not supported.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A debugging statement wasn't converted to the new interface.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
'tag' is just an abstraction to identify the command
from the driver. So we should make that explicit by
replacing 'tag' with a driver-defined pointer 'hba_private'.
This saves the lookup for driver handling several commands
in parallel.
'tag' is still being kept for tracing purposes.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The LUN field in the CDB is a historical relic. Ignore it as reserved,
which is what modern SCSI specifications actually say.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
scsi_req_parse() already provides for a data direction setting,
so we should be using it to check for correct direction.
And we should return the sense code 'INVALID FIELD IN CDB'
in these cases.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
The get_sense callback copies existing sense information into
the provided buffer. This is required if sense information
should be transferred together with the command response.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Move the common part of scsi-disk.c and scsi-generic.c to the SCSI layer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
The SCSI spec has a quite detailed list of sense codes available.
It even mandates the use of specific ones for some failure cases.
The current implementation just has one type of generic error
which is actually a violation of the spec in certain cases.
This patch introduces various predefined sense codes to have the
sense code reporting more in line with the spec.
On top of Hannes's patch I fixed the reply to REQUEST SENSE commands
with DESC=0 and a small (<18) length.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
This is for when the request must be dropped in the void,
but still memory should be freed. To this end, the devices
register a second callback in SCSIBusOps.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
The code for canceling requests upon reset is already the same. Clean
it up and move it to scsi-bus.c.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Currently the SCSIRequest structure is abstracted away and cannot accessed
directly from the driver. This requires the handler to do a lookup on
an abstract 'tag' which identifies the SCSIRequest structure.
With this patch the SCSIRequest structure is exposed to the driver. This
allows use to use it directly as an argument to the SCSIDeviceInfo
callback functions and remove the lookup.
A new callback function 'alloc_req' is introduced matching 'free
req'; unref'ing to free up resources after use is moved into the
scsi_command_complete callbacks.
This temporarily introduces a leak of requests that are cancelled,
when they are removed from the queue and not from the driver. This
is fixed later by introducing scsi_req_cancel. That patch in turn
depends on this one, because the argument to scsi_req_cancel is a
SCSIRequest.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
With the next patch, a device may hold SCSIRequest for an indefinite
time. Split a rather big patch, and protect against access errors,
by reference counting them.
There is some ugliness in scsi_send_command implementation due to
the need to unref the request when it fails. This will go away
with the next patches, which move the unref'ing to the devices.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
This abstracts calling the command_complete callback, reducing churn
in the following patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
DriveInfo is closely tied to -drive, and like -drive, it mixes
information about host and guest part of the block device. Unlike
DriveInfo, BlockDriverState should be about the host part only.
One of the remaining guest bits there is the "type hint". -drive
option media sets it, and qdevs "ide-drive", "scsi-disk" and non-qdev
IF_XEN devices check it to pick HD vs. CD.
Communicate -drive option media via new DriveInfo member media_cd
instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A "scsi-disk" is either a hard disk or a CD-ROM, depending on the
associated BlockDriverState's type hint. Unclean; disk vs. CD belongs
to the guest part, not the host part.
Have separate qdevs "scsi-hd" and "scsi-cd" to model disk vs. CD in
the guest part.
Keep scsi-disk for backward compatibility.
Don't copy scsi-disk property removable to scsi-cd. It's not used and
always zero(!) there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Define and use dedicated constants for vm_stop reasons, they actually
have nothing to do with the EXCP_* defines used so far. At this chance,
specify more detailed reasons so that VM state change handlers can
evaluate them.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Provide the "removable" qdev property bit to override the SCSI INQUIRY
removable (RMB) bit for non-CDROM devices. This will be used by USB
Mass Storage Devices, which sometimes have this guest-visible bit set
and sometimes do not. They therefore requires a means for user
configuration.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Support discards via the WRITE SAME command with the unmap bit set, and
tell the initiator about the support for it via the block limit and the
new thin provisioning EVPD pages. Also fix the comment which incorrectly
describedthe block limits EVPD page.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If bootindex is specified on command line a string that describes device
in firmware readable way is added into sorted list. Later this list will
be passed into firmware to control boot order.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Add "fw_name" to DeviceInfo to use in device path building. In
contrast to "name" "fw_name" should refer to functionality device
provides instead of particular device model like "name" does.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
We parse the CDB twice, which is completely unnecessary.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The current sense handling in scsi-bus is only used by the
scsi-disk driver; the scsi-generic driver is using its own.
So we should move the current sense handling into the
scsi-disk driver.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We should announce and support the block device characterics page
only on block devices, not on CDROMs. And the VPD page 0x83 has
an off-by-one error.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
SCSI read/write requests should not be re-issued before the current
fragment of I/O completes. There are asserts in scsi-disk.c that guard
this constraint but they trigger on SPARC Linux 2.4. It turns out that
the asserts are too early in the code path and don't allow for read
requests to terminate.
Only the read assert needs to be moved but move the write assert too for
consistency.
Reported-by: Nigel Horne <njh@bandsman.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fix scsi-disk to use the usual completion paths that involve rerror/werror
handling instead of directly completing the requests in cases where
bdrv_aio_readv/writev returns NULL.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This pulls the request completion for error cases from the caller to
scsi_disk_emulate_command. This should not change semantics, but allows to
reuse scsi_handle_write_error() for flushes in the next patch.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
This implements the rerror option for SCSI disks.
It also includes minor changes to the write path where the same code is used
that was criticized in the review for the changes to the read path required for
rerror support.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Use qemu_blockalign for all allocations in the block layer. This allows
increasing the required alignment, which is need to support O_DIRECT on
devices with large block sizes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
I use a legacy OS which depends on some optional SCSI commands.
In fact this implementation does nothing special, but provides minimum
support for the following commands:
REZERO UNIT
WRITE AND VERIFY(10)
WRITE AND VERIFY(12)
WRITE AND VERIFY(16)
MODE SELECT(6)
MODE SELECT(10)
SEEK(6)
SEEK(10)
Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The DBD bit does not work as expected.
SCSI-Spec:
http://ldkelley.com/SCSI2/SCSI2/SCSI2-08.html#8.2.10
"A disable block descriptors (DBD) bit of zero indicates that the target
may return zero or more block descriptors in the returned MODE SENSE
data (see 8.3.3), at the target's discretion. A DBD bit of one
specifies that the target shall not return any block descriptors in the
returned MODE SENSE data."
Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
SCSI-Spec:
http://ldkelley.com/SCSI2/SCSI2/SCSI2-08.html#8.2.10
"An initiator may request any one or all of the supported mode pages
from a target. If an initiator issues a MODE SENSE command with a
page code value not implemented by the target, the target shall return
CHECK CONDITION status and shall set the sense key to ILLEGAL REQUEST
and the additional sense code to INVALID FIELD IN CDB."
Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The block descriptor contains the number of blocks, not the highest LBA.
Real hard disks return 0 if the number of blocks exceed the maximum 0xFFFFFF.
SCSI-Spec:
http://ldkelley.com/SCSI2/SCSI2/SCSI2-08.html#8.3.3
"The number of blocks field specifies the number of logical blocks on the
medium to which the density code and block length fields apply. A value
of zero indicates that all of the remaining logical blocks of the logical
unit shall have the medium characteristics specified."
Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The page control (PC) field defines the type of mode parameter values
to be returned in the mode pages:
PC=0 : Current values
PC=1 : Changeable values
PC=2 : Default values
PC=3 : Saved values
The current implementation always returns the same type of parameters.
This is OK for Current and Default values as we don't support changes
to be done by the MODE SELECT command.
For Saved values the following applies (implemented by this patch):
"A PC field value of 3h requests that the target return the saved
values of the mode parameters. Implementation of saved page parameters
is optional. Mode parameters not supported by the target shall be set
to zero. If saved values are not implemented, the command shall be
terminated with CHECK CONDITION status, the sense key set to
ILLEGAL REQUEST and the additional sense code set to
SAVING PARAMETERS NOT SUPPORTED."
For Changeable values the following applies (implemented by this patch):
"A PC field value of 1h requests that the target return a mask denoting
those mode parameters that are changeable. In the mask, the fields of
the mode parameters that are changeable shall be set to all one bits and
the fields of the mode parameters that are non-changeable (i.e. defined
by the target) shall be set to all zero bits."
In newer versions of the SCSI-2 spec the following clause was added.
"If the logical unit does not implement changeable parameters mode pages
and the device server receives a MODE SENSE command with 01b in the PC
field, then the command shall be terminated with CHECK CONDITION status,
with the sense key set to ILLEGAL REQUEST, and the additional sense code
set to INVALID FIELD IN CDB."
This was not yet included in the SCSI-2 Working Drafts from 1986-1993.
I assume that the variant to return CHECK CONDITION for PC=1 is not
widely implemented by real devices. I have a legacy OS which fails,
if MODE_SENSE returns non GOOD for PC=1. So for highest compatibility I
implemented the former variant with this patch.
The last Working Draft X3T9.2 Rev. 10L 7-SEP-93 can be found here:
http://ldkelley.com/SCSI2/SCSI2/SCSI2-08.html#8.2.10
In mode_sense_page() this patch also avoids multiple hard coded
definitions of the same mode page length. Instead I use the varable
p[1]. In fact the returned length of the mode pages 4 and 5 were wrong
(2 bytes less).
Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The header for the MODE SENSE(10) command is 8 bytes long.
Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The MODE DATA LENGTH field indicates the length in bytes of the following
data that is available to be transferred. The mode data length does not include
the number of bytes in the MODE DATA LENGTH field.
Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Changing block.h or blockdev.h resulted in recompiling most objects.
Move DriveInfo typedef and BlockInterfaceType enum definitions
to qemu-common.h and rearrange blockdev.h use to decrease churn.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Disks without media make no sense. For SCSI, a Linux guest kernel
complains during boot. I didn't try other combinations.
scsi-generic doesn't need the additional check, because it already
requires bdrv_is_sg(), which fails without media.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
drive_init() doesn't permit rerror for if=scsi, but that's worthless:
we get it via if=none and -device.
Moreover, scsi-generic doesn't support werror. Since drive_init()
doesn't catch that, option werror was silently ignored even with
if=scsi.
Wart: unlike drive_init(), we don't reject the default action when
it's explicitly specified. That's because we can't distinguish "no
rerror option" from "rerror=report", or "no werror" from
"rerror=enospc". Left for another day.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
BlockDriverState member removable controls whether virtual media
change (monitor commands change, eject) is allowed. It is set when
the "type hint" is BDRV_TYPE_CDROM or BDRV_TYPE_FLOPPY.
The type hint is only set by drive_init(). It sets BDRV_TYPE_FLOPPY
for if=floppy. It sets BDRV_TYPE_CDROM for media=cdrom and if=ide,
scsi, xen, or none.
if=ide and if=scsi work, because the type hint makes it a CD-ROM.
if=xen likewise, I think.
For the same reason, if=none works when it's used by ide-drive or
scsi-disk. For other guest devices, there are problems:
* fdc: you can't change virtual media
$ qemu [...] -drive if=none,id=foo,... -global isa-fdc.driveA=foo
QEMU 0.12.50 monitor - type 'help' for more information
(qemu) eject foo
Device 'foo' is not removable
unless you add media=cdrom, but that makes it readonly.
* virtio: if you add media=cdrom, you can change virtual media. If
you eject, the guest gets I/O errors. If you change, the guest sees
the drive's contents suddenly change.
* scsi-generic: if you add media=cdrom, you can change virtual media.
I didn't test what that does to the guest or the physical device,
but it can't be pretty.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make the property point to BlockDriverState, cutting out the DriveInfo
middleman. This prepares the ground for block devices that don't have
a DriveInfo.
Currently all user-defined ones have a DriveInfo, because the only way
to define one is -drive & friends (they go through drive_init()).
DriveInfo is closely tied to -drive, and like -drive, it mixes
information about host and guest part of the block device. I'm
working towards a new way to define block devices, with clean
host/guest separation, and I need to get DriveInfo out of the way for
that.
Fortunately, the device models are perfectly happy with
BlockDriverState, except for two places: ide_drive_initfn() and
scsi_disk_initfn() need to check the DriveInfo for a serial number set
with legacy -drive serial=... Use drive_get_by_blockdev() there.
Device model code should now use DriveInfo only when explicitly
dealing with drives defined the old way, i.e. without -device.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We automatically delete blockdev host parts on unplug of the guest
device. Too much magic, but we can't change that now.
The delete happens early in the guest device teardown, before the
connection to the host part is severed. Thus, the guest part's
pointer to the host part dangles for a brief time. No actual harm
comes from this, but we'll catch such dangling pointers a few commits
down the road. Clean up the dangling pointers by delaying the
automatic deletion until the guest part's pointer is gone.
Device usb-storage deliberately makes two qdev properties refer to the
same drive, because it automatically creates a second device. Again,
too much magic we can't change now. Multiple references worked okay
before, but now free_drive() dies for the second one. Zap the extra
reference.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
That's where they belong semantically (block device host part), even
though the actions are actually executed by guest device code.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Anything that moves hundreds of lines out of vl.c can't be all bad.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Show the actual default value instead of <null> when the property has
not been set.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It needs to be a qdev property, because it belongs to the drive's
guest part.
Bonus: info qtree now shows the serial number.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Once the I/O completion callback returned, aiocb will be released by the
controller. So we have to clear the reference not only in
scsi_write_complete, but also in scsi_read_complete. Otherwise we risk
inconsistencies when a reset hits us before the related request is
released.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Ensure that pending requests of an SCSI disk are purged on system reset
and also restore max_lba. The latter is no only present in the reset
handler as that one is called after init as well.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In case s->version is shorter than 4 bytes we overflow the memcpy src
buffer. Fix it by clearing the target buffer, then copy only the
amount of bytes we actually have.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Don't rely on CDROM hint for read_only attribute
Signed-off-by: Naphtali Sprei <nsprei@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add a logical block size attribute as various guest side tools only
increase the filesystem sector size based on it, not the advisory
physical block size.
For scsi we already have support for a different logical block size
in place for CDROMs that we can built upon. Only my recent block
device characteristics VPD page needs some fixups. Note that we
leave the logial block size for CDROMs hardcoded as the 2k value
is expected for it in general.
For virtio-blk we already have a feature flag claiming to support
a variable logical block size that was added for the s390 kuli
hypervisor. Interestingly it does not actually change the units
in which the protocol works, which is still fixed at 512 bytes,
but only communicates a different minimum I/O granularity. So
all we need to do in virtio is to add a trap for unaligned I/O
and round down the device size to the next multiple of the logical
block size.
IDE does not support any other logical block size than 512 bytes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
error_report() terminates the message with a newline. Strip it it
from its arguments.
This fixes a few error messages lacking a newline:
net_handle_fd_param()'s "No file descriptor named %s found", and
tap_open()'s "vnet_hdr=1 requested, but no kernel support for
IFF_VNET_HDR available" (all three versions).
There's one place that passes arguments without newlines
intentionally: load_vmstate(). Fix it up.
You're supposed to use scsi-generic for that. Which rejects anything
but /dev/sg*.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The next commit will move the STOP event into do_vm_stop(), to
have the expected event sequence we need to emit the I/O error
event before calling vm_stop().
The expected sequence is:
{ "event": "BLOCK_IO_ERROR" [...] }
{ "event": "STOP" }
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Export the physical block size in the READ CAPACITY (16) command,
and add the new block limits VPD page to export the minimum and
optiomal I/O sizes.
Note that we also need to bump the scsi revision level to SPC-2
as that is the minimum requirement by at least the Linux kernel
to try READ CAPACITY (16) first and look at the block limits VPD
page.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add three new qdev properties to export block topology information to
the guest. This is needed to get optimal I/O alignment for RAID arrays
or SSDs.
The options are:
- physical_block_size to specify the physical block size of the device,
this is going to increase from 512 bytes to 4096 kilobytes for many
modern storage devices
- min_io_size to specify the minimal I/O size without performance impact,
this is typically set to the RAID chunk size for arrays.
- opt_io_size to specify the optimal sustained I/O size, this is
typically the RAID stripe width for arrays.
I decided to not auto-probe these values from blkid which might easily
be possible as I don't know how to deal with these issues on migration.
Note that we specificly only set the physical_block_size, and not the
logial one which is the unit all I/O is described in. The reason for
that is that IDE does not support increasing the logical block size and
at last for now I want to stick to one meachnisms in queue and allow
for easy switching of transports for a given backing image which would
not be possible if scsi and virtio use real 4k sectors, while ide only
uses the physical block exponent.
To make this more common for the different block drivers introduce a
new BlockConf structure holding all common block properties and a
DEFINE_BLOCK_PROPERTIES macro to add them all together, mirroring
what is done for network drivers. Also switch over all block drivers
to use it, except for the floppy driver which has weird driveA/driveB
properties and probably won't require any advanced block options ever.
Example usage for a virtio device with 4k physical block size and
8k optimal I/O size:
-drive file=scratch.img,media=disk,cache=none,id=scratch \
-device virtio-blk-pci,drive=scratch,physical_block_size=4096,opt_io_size=8192
aliguori: updated patch to take into account BLOCK events
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Just call bdrv_mon_event() in the right place.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Win32 suffers from a very big memory leak when dealing with SCSI devices.
Each read/write request allocates memory with qemu_memalign (ie
VirtualAlloc) but frees it with qemu_free (ie free).
Pair all qemu_memalign() calls with qemu_vfree() to prevent such leaks.
Signed-off-by: Herve Poussineau <hpoussin@reactos.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds a new property named 'ver' to scsi-disk which allows to
specify the version which the virtual disk/cdrom should report to the
guest. By default this is the qemu version (i.e. 0.12). usage:
-drive if=none,id=disk,file=...
-device lsi
-device scsi-disk,drive=disk,bus=scsi.0,unit=0,ver=42
You can also switch the version for all scsi drives using:
-global scsi-disk.ver=42
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
According to the SCSI-2 specification,
http://ldkelley.com/SCSI2/SCSI2/SCSI2/SCSI2-08.html#8.2.5 ,
"if the allocation length of the command descriptor block (CDB) is too
small to transfer all of the parameters, the additional length shall
not be adjusted to reflect the truncation."
The 36 mandatory bytes of response are written to outbuf, and then
only the length requested in CDB is transferred.
Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Vendor identification, product identification and product revision level
should be padded with spaces without a terminating NULL character, see
SCSI-2 standard, 8.2.5.1 Standard INQUIRY data.
Signed-off-by: Laszlo Ast <laszlo.ast@siemens-enterprise.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Either rename variables and functions to refer to write errors (which is what
they actually do) or introduce a parameter to distinguish reads and writes.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add READ_16 + friends to scsi-defs.h, scsi_command_name() and the
request parsing helper functions.
Use them in scsi-disk.c too.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move VERIFY emulation from scsi_send_command() to
scsi_disk_emulate_command().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move REPORT_LUNS emulation from scsi_send_command() to
scsi_disk_emulate_command().
Also add REPORT_LUNS to scsi-defs.h and scsi_command_name().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move SERVICE_ACTION_IN emulation from scsi_send_command() to
scsi_disk_emulate_command().
Also add SERVICE_ACTION_IN to scsi-defs.h and scsi_command_name().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move GET_CONFIGURATION emulation from scsi_send_command() to
scsi_disk_emulate_command().
Also add GET_CONFIGURATION to scsi-defs.h and scsi_command_name().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move READ_TOC emulation from scsi_send_command() to
scsi_disk_emulate_command(). Add scsi_disk_emulate_read_toc() function
which holds the longisch READ_TOC emulation code.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move SYNCHRONIZE_CACHE emulation from scsi_send_command() to
scsi_disk_emulate_command().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move READ_CAPACITY emulation from scsi_send_command() to
scsi_disk_emulate_command().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move ALLOW_MEDIUM_REMOVAL emulation from scsi_send_command() to
scsi_disk_emulate_command().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move START_STOP emulation from scsi_send_command() to
scsi_disk_emulate_command().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move MODE_SENSE emulation from scsi_send_command() to
scsi_disk_emulate_command(). Create two helper functions:
mode_sense_page() which writes the actual mode pages and
scsi_disk_emulate_mode_sense() which holds the longish MODE_SENSE
emulation code, calling into mode_sense_page() as needed.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move RESERVE+RELEASE emulation from scsi_send_command() to
scsi_disk_emulate_command().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move INQUIRY emulation from scsi_send_command() to
scsi_disk_emulate_command(). Also split the longish INQUITY emulation
code into the new scsi_disk_emulate_inquiry() function. Serial number
handling is slightly changed, we don't copy it any more but look it up
directly in DriveInfo which we have at hand anyway.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move REQUEST_SENSE emulation from scsi_send_command() to
scsi_disk_emulate_command().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add new scsi_disk_emulate_command() function, which will -- when
finished -- handle all scsi disk command emulation except actual I/O
(READ+WRITE commands) which goes to the block layer. The function
builds on top of the new SCSIRequest struct.
SCSI command emulation code is moved over from scsi_send_command() in
steps to ease review and make it easier to pin down regressions (if any)
using bisect. This patch moves over TEST_UNIT_READY only.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Also add and use the scsi_req_complete() helper function for calling the
completion callback.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Largely based on <scsi/scsi.h> from linux. Added into the tree so we
can use the defines everywhere, not just in scsi-generic.c (which is
linux-specific).
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Create generic functions to allocate, find and release SCSIRequest
structs. Make scsi-disk and scsi-generic use them.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Changes:
* Move from open-coded lists to QTAILQ macros.
* Move the struct elements to the common data structures
(SCSIDevice + SCSIRequest).
* Drop free request pools.
* Fix request cleanup in the destroy callback.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Rename the SCSIRequest structs in scsi-disk.c and scsi-generic.c to
SCSIDiskReq and SCSIGenericReq. Create a SCSIRequest struct and move
the common elements over.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The SCSI-2 documentation suggests, that although the block
descriptor is optional for an arbitrary SCSI-2 device (chapter 8.2.10,
http://ldkelley.com/SCSI2/SCSI2/SCSI2/SCSI2/SCSI2-08.html )
it is mandatory for a disk: chapters 9.1.2, 9.3.3
( http://ldkelley.com/SCSI2/SCSI2/SCSI2/SCSI2-09.html ) don't say
"optional" any more, just "The block descriptor in the MODE SENSE
data describes the block lengths that are used on the medium."
v2: limit the number of sectors reported in the block descriptor to 24 bits.
Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Changes:
* drive_uninit() wants a DriveInfo now.
* drive_uninit() also calls bdrv_delete(),
so callers don't need to do that.
* drive_uninit() calls are moved over to the ->exit()
callbacks, destroy_bdrvs() is zapped.
* setting bdrv->private is not needed any more as the
only user (destroy_bdrvs) is gone.
* usb-storage needs no drive_uninit, scsi-disk will
handle that.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add a enable_write_cache flag in the block driver state, and use it to
decide if we claim to have a volatile write cache that needs controlled
flushing from the guest. The flag is off if cache=writethrough is
defined because O_DSYNC guarantees that every write goes to stable
storage, and it is on for cache=none and cache=writeback.
Both scsi-disk and ide now use the new flage, changing from their
defaults of always off (ide) or always on (scsi-disk).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When a VM state change handler changes VM state, other VM state change
handlers can see the state transitions out of order.
bmdma_map(), scsi_disk_init() and virtio_blk_init() install VM state
change handlers to restart DMA. These handlers can vm_stop() by
running into a write error on a drive with werror=stop. This throws
the VM state change handler callback into disarray. Here's an example
case I observed:
0. The virtual IDE drive goes south. All future writes return errors.
1. Something encounters a write error, and duly stops the VM with
vm_stop().
2. vm_stop() calls vm_state_notify(0).
3. vm_state_notify() runs the callbacks in list vm_change_state_head.
It contains ide_dma_restart_cb() installed by bmdma_map(). It also
contains audio_vm_change_state_handler() installed by audio_init().
4. audio_vm_change_state_handler() stops audio stuff.
5. User continues VM with monitor command "c". This runs vm_start().
6. vm_start() calls vm_state_notify(1).
7. vm_state_notify() runs the callbacks in vm_change_state_head.
8. ide_dma_restart_cb() happens to come first. It does its work, runs
into a write error, and duly stops the VM with vm_stop().
9. vm_stop() runs vm_state_notify(0).
10. vm_state_notify() runs the callbacks in vm_change_state_head.
11. audio_vm_change_state_handler() stops audio stuff. Which isn't
running.
12. vm_stop() finishes, ide_dma_restart_cb() finishes, step 7's
vm_state_notify() resumes running handlers.
13. audio_vm_change_state_handler() starts audio stuff. Oopsie.
Fix this by moving the actual write from each VM state change handler
into a new bottom half (suggested by Gleb Natapov).
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
I used the following command to enable debugging:
perl -p -i -e 's/^\/\/#define DEBUG/#define DEBUG/g' * */* */*/*
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Always use the vectored APIs to reduce code churn once we switch the BlockDriver
API to be vectored.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7019 c046a42c-6fe2-441c-8c8c-71466251a162
Implement Test Unit Ready command (return NOT READY as above
if !bdrv_is_inserted(...))
Signed-off-by: Juergen Lock <nox@jelal.kn-bremen.de>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6954 c046a42c-6fe2-441c-8c8c-71466251a162
Add asc 0x3a, ascq 0: Medium not present to NOT READY sense
(needed to keep some guests from retrying causing long sleeps in the
kernel)
Signed-off-by: Juergen Lock <nox@jelal.kn-bremen.de>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6953 c046a42c-6fe2-441c-8c8c-71466251a162
The bdrv layer uses a signed offset. Furthermore, block-raw-posix
only seeks when that offset is positive. Passing a negative offset
to block-raw-posix can result in data being written at the current
seek cursor's position.
It may be possible to exploit this to seek to the end of the disk
and extend the virtual disk by writing data to a negative sector
offset. After a reboot, this could lead to the guest having a
larger disk than it had before.
Close the hole by sanity checking the lba against the size of the
disk.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6475 c046a42c-6fe2-441c-8c8c-71466251a162
Paul Brook pointed out that the number of sectors reported
by the SCSI read capacity commands needs to be divided by
s->cluster_size, because bdrv_get_geometry reports the number
of 512 byte sectors, while emulated CDROMs report 2048 byte
sectors back to the guest.
This has no consequences for emulated hard disks, which use
a cluster size of 1.
aliguori: fixed typo
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6469 c046a42c-6fe2-441c-8c8c-71466251a162
Implement SCSI READ(16), WRITE(16) and SAI READ CAPACITY(16) commands,
so SCSI disks larger than 2TB can work with guests that support these
newer SCSI commands.
The cast to (uint64_t) is needed because otherwise gcc will use a
signed int, which gets sign extended into uint64_t lba, resulting
in bad block numbers for READ 10 and READ 16 with block numbers
larger than 2^31.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6468 c046a42c-6fe2-441c-8c8c-71466251a162
Sector numbers can overflow on a virtual scsi disk of over 1TB
in size. Qemu's bdrv_read expects an int64_t, so fix the overflow
by going to that data type.
On large disks, we clip the capacity to 2TB instead of returning
"capacity modulo 2TB".
Turn sector_count into an unsigned to prevent a signed/unsigned
overflow with SCSI transfers larger than 2TB. We're unlikely to
ever hit this bug, but fixing it is just one line.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6467 c046a42c-6fe2-441c-8c8c-71466251a162
Windows calculates HW "uniqueness" based on a hard drive serial number
among other things. The patch allows to specify drive serial number
from a command line.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6214 c046a42c-6fe2-441c-8c8c-71466251a162
Openserver 5.0.5 sends an Inquiry command to the emulated SCSI disk
expecting a response length of 40 bytes. Currently the response to an
Inquiry command is hardcoded to 36 bytes. When receiving a response of
length 36 instead of 40 Openserver panics.
Modifications to original patch based on feedback from Ryan Harper and Paul
Brook. Thanks guys.
Signed-off-by: Justin Chevrier <address@hidden>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5903 c046a42c-6fe2-441c-8c8c-71466251a162
taken from Xen 17267:f4a92f0db20f, original patch by Samuel Thibault.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5385 c046a42c-6fe2-441c-8c8c-71466251a162