qemu/include/hw/scsi/scsi.h

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

252 lines
8.9 KiB
C
Raw Normal View History

#ifndef QEMU_HW_SCSI_H
#define QEMU_HW_SCSI_H
#include "block/aio.h"
#include "hw/block/block.h"
#include "hw/qdev-core.h"
#include "scsi/utils.h"
#include "qemu/notify.h"
#include "qom/object.h"
#define MAX_SCSI_DEVS 255
typedef struct SCSIBus SCSIBus;
typedef struct SCSIBusInfo SCSIBusInfo;
typedef struct SCSIDevice SCSIDevice;
typedef struct SCSIRequest SCSIRequest;
typedef struct SCSIReqOps SCSIReqOps;
#define SCSI_SENSE_BUF_SIZE_OLD 96
#define SCSI_SENSE_BUF_SIZE 252
#define DEFAULT_IO_TIMEOUT 30
struct SCSIRequest {
SCSIBus *bus;
SCSIDevice *dev;
const SCSIReqOps *ops;
uint32_t refcount;
uint32_t tag;
uint32_t lun;
int16_t status;
int16_t host_status;
void *hba_private;
uint64_t residual;
SCSICommand cmd;
NotifierList cancel_notifiers;
/* Note:
* - fields before sense are initialized by scsi_req_alloc;
* - sense[] is uninitialized;
* - fields after sense are memset to 0 by scsi_req_alloc.
* */
uint8_t sense[SCSI_SENSE_BUF_SIZE];
uint32_t sense_len;
bool enqueued;
bool io_canceled;
bool retry;
bool dma_started;
BlockAIOCB *aiocb;
QEMUSGList *sg;
QTAILQ_ENTRY(SCSIRequest) next;
};
#define TYPE_SCSI_DEVICE "scsi-device"
OBJECT_DECLARE_TYPE(SCSIDevice, SCSIDeviceClass, SCSI_DEVICE)
struct SCSIDeviceClass {
DeviceClass parent_class;
void (*realize)(SCSIDevice *dev, Error **errp);
qdev: Unrealize must not fail Devices may have component devices and buses. Device realization may fail. Realization is recursive: a device's realize() method realizes its components, and device_set_realized() realizes its buses (which should in turn realize the devices on that bus, except bus_set_realized() doesn't implement that, yet). When realization of a component or bus fails, we need to roll back: unrealize everything we realized so far. If any of these unrealizes failed, the device would be left in an inconsistent state. Must not happen. device_set_realized() lets it happen: it ignores errors in the roll back code starting at label child_realize_fail. Since realization is recursive, unrealization must be recursive, too. But how could a partly failed unrealize be rolled back? We'd have to re-realize, which can fail. This design is fundamentally broken. device_set_realized() does not roll back at all. Instead, it keeps unrealizing, ignoring further errors. It can screw up even for a device with no buses: if the lone dc->unrealize() fails, it still unregisters vmstate, and calls listeners' unrealize() callback. bus_set_realized() does not roll back either. Instead, it stops unrealizing. Fortunately, no unrealize method can fail, as we'll see below. To fix the design error, drop parameter @errp from all the unrealize methods. Any unrealize method that uses @errp now needs an update. This leads us to unrealize() methods that can fail. Merely passing it to another unrealize method cannot cause failure, though. Here are the ones that do other things with @errp: * virtio_serial_device_unrealize() Fails when qbus_set_hotplug_handler() fails, but still does all the other work. On failure, the device would stay realized with its resources completely gone. Oops. Can't happen, because qbus_set_hotplug_handler() can't actually fail here. Pass &error_abort to qbus_set_hotplug_handler() instead. * hw/ppc/spapr_drc.c's unrealize() Fails when object_property_del() fails, but all the other work is already done. On failure, the device would stay realized with its vmstate registration gone. Oops. Can't happen, because object_property_del() can't actually fail here. Pass &error_abort to object_property_del() instead. * spapr_phb_unrealize() Fails and bails out when remove_drcs() fails, but other work is already done. On failure, the device would stay realized with some of its resources gone. Oops. remove_drcs() fails only when chassis_from_bus()'s object_property_get_uint() fails, and it can't here. Pass &error_abort to remove_drcs() instead. Therefore, no unrealize method can fail before this patch. device_set_realized()'s recursive unrealization via bus uses object_property_set_bool(). Can't drop @errp there, so pass &error_abort. We similarly unrealize with object_property_set_bool() elsewhere, always ignoring errors. Pass &error_abort instead. Several unrealize methods no longer handle errors from other unrealize methods: virtio_9p_device_unrealize(), virtio_input_device_unrealize(), scsi_qdev_unrealize(), ... Much of the deleted error handling looks wrong anyway. One unrealize methods no longer ignore such errors: usb_ehci_pci_exit(). Several realize methods no longer ignore errors when rolling back: v9fs_device_realize_common(), pci_qdev_unrealize(), spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(), virtio_device_realize(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-17-armbru@redhat.com>
2020-05-05 18:29:24 +03:00
void (*unrealize)(SCSIDevice *dev);
int (*parse_cdb)(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf,
size_t buf_len, void *hba_private);
SCSIRequest *(*alloc_req)(SCSIDevice *s, uint32_t tag, uint32_t lun,
uint8_t *buf, void *hba_private);
void (*unit_attention_reported)(SCSIDevice *s);
};
struct SCSIDevice
{
DeviceState qdev;
VMChangeStateEntry *vmsentry;
QEMUBH *bh;
uint32_t id;
block: add topology qdev properties Add three new qdev properties to export block topology information to the guest. This is needed to get optimal I/O alignment for RAID arrays or SSDs. The options are: - physical_block_size to specify the physical block size of the device, this is going to increase from 512 bytes to 4096 kilobytes for many modern storage devices - min_io_size to specify the minimal I/O size without performance impact, this is typically set to the RAID chunk size for arrays. - opt_io_size to specify the optimal sustained I/O size, this is typically the RAID stripe width for arrays. I decided to not auto-probe these values from blkid which might easily be possible as I don't know how to deal with these issues on migration. Note that we specificly only set the physical_block_size, and not the logial one which is the unit all I/O is described in. The reason for that is that IDE does not support increasing the logical block size and at last for now I want to stick to one meachnisms in queue and allow for easy switching of transports for a given backing image which would not be possible if scsi and virtio use real 4k sectors, while ide only uses the physical block exponent. To make this more common for the different block drivers introduce a new BlockConf structure holding all common block properties and a DEFINE_BLOCK_PROPERTIES macro to add them all together, mirroring what is done for network drivers. Also switch over all block drivers to use it, except for the floppy driver which has weird driveA/driveB properties and probably won't require any advanced block options ever. Example usage for a virtio device with 4k physical block size and 8k optimal I/O size: -drive file=scratch.img,media=disk,cache=none,id=scratch \ -device virtio-blk-pci,drive=scratch,physical_block_size=4096,opt_io_size=8192 aliguori: updated patch to take into account BLOCK events Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-11 01:37:09 +03:00
BlockConf conf;
SCSISense unit_attention;
bool sense_is_ua;
uint8_t sense[SCSI_SENSE_BUF_SIZE];
uint32_t sense_len;
QTAILQ_HEAD(, SCSIRequest) requests;
uint32_t channel;
uint32_t lun;
int blocksize;
int type;
uint64_t max_lba;
uint64_t wwn;
uint64_t port_wwn;
int scsi_version;
int default_scsi_version;
uint32_t io_timeout;
hw/scsi: add VPD Block Limits emulation The VPD Block Limits Inquiry page is optional, allowing SCSI devices to not implement it. This is the case for devices like the MegaRAID SAS 9361-8i and Microsemi PM8069. In case of SCSI passthrough, the response of this request is used by the QEMU SCSI layer to set the max_io_sectors that the guest device will support, based on the value of the max_sectors_kb that the device has set in the host at that time. Without this response, the guest kernel is free to assume any value of max_io_sectors for the SCSI device. If this value is greater than the value from the host, SCSI Sense errors will occur because the guest will send read/write requests that are larger than the underlying host device is configured to support. An example of this behavior can be seen in [1]. A workaround is to set the max_sectors_kb host value back in the guest kernel (a process that can be automated using rc.local startup scripts and the like), but this has several drawbacks: - it can be troublesome if the guest has many passthrough devices that needs this tuning; - if a change in max_sectors_kb is made in the host side, manual change in the guests will also be required; - during an OS install it is difficult, and sometimes not possible, to go to a terminal and change the max_sectors_kb prior to the installation. This means that the disk can't be used during the install process. The easiest alternative here is to roll back to scsi-hd, install the guest and then go back to SCSI passthrough when the installation is done and max_sectors_kb can be set. An easier way would be to QEMU handle the absence of the Block Limits VPD device response, setting max_io_sectors accordingly and allowing the guest to use the device without the hassle. This patch adds emulation of the Block Limits VPD response for SCSI passthrough devices of type TYPE_DISK that doesn't support it. The following changes were made: - scsi_handle_inquiry_reply will now check the available VPD pages from the Inquiry EVPD reply. In case the device does not - a new function called scsi_generic_set_vpd_bl_emulation, that is called during device realize, was created to set a new flag 'needs_vpd_bl_emulation' of the device. This function retrieves the Inquiry EVPD response of the device to check for VPD BL support. - scsi_handle_inquiry_reply will now check the available VPD pages from the Inquiry EVPD reply in case the device needs VPD BL emulation, adding the Block Limits page (0xb0) to the list. This will make the guest kernel aware of the support that we're now providing by emulation. - a new function scsi_emulate_block_limits creates the emulated Block Limits response. This function is called inside scsi_read_complete in case the device requires Block Limits VPD emulation and we detected a SCSI Sense error in the VPD Block Limits reply that was issued from the guest kernel to the device. This error is expected: we're reporting support from our side, but the device isn't aware of it. With this patch, the guest now queries the Block Limits page during the device configuration because it is being advertised in the Supported Pages response. It will either receive the Block Limits page from the hardware, if it supports it, or will receive an emulated response from QEMU. At any rate, the guest now has the information to set the max_sectors_kb parameter accordingly, sparing the user of SCSI sense errors that would happen without the emulated response and in the absence of Block Limits support from the hardware. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1566195 Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1566195 Reported-by: Dac Nguyen <dacng@us.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20180627172432.11120-4-danielhb413@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-27 20:24:32 +03:00
bool needs_vpd_bl_emulation;
bool hba_supports_iothread;
};
extern const VMStateDescription vmstate_scsi_device;
#define VMSTATE_SCSI_DEVICE(_field, _state) { \
.name = (stringify(_field)), \
.size = sizeof(SCSIDevice), \
.vmsd = &vmstate_scsi_device, \
.flags = VMS_STRUCT, \
.offset = vmstate_offset_value(_state, _field, SCSIDevice), \
}
/* cdrom.c */
int cdrom_read_toc(int nb_sectors, uint8_t *buf, int msf, int start_track);
int cdrom_read_toc_raw(int nb_sectors, uint8_t *buf, int msf, int session_num);
/* scsi-bus.c */
struct SCSIReqOps {
size_t size;
void (*init_req)(SCSIRequest *req);
void (*free_req)(SCSIRequest *req);
int32_t (*send_command)(SCSIRequest *req, uint8_t *buf);
void (*read_data)(SCSIRequest *req);
void (*write_data)(SCSIRequest *req);
uint8_t *(*get_buf)(SCSIRequest *req);
void (*save_request)(QEMUFile *f, SCSIRequest *req);
void (*load_request)(QEMUFile *f, SCSIRequest *req);
};
struct SCSIBusInfo {
int tcq;
int max_channel, max_target, max_lun;
int (*parse_cdb)(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf,
size_t buf_len, void *hba_private);
void (*transfer_data)(SCSIRequest *req, uint32_t arg);
void (*fail)(SCSIRequest *req);
void (*complete)(SCSIRequest *req, size_t residual);
void (*cancel)(SCSIRequest *req);
void (*change)(SCSIBus *bus, SCSIDevice *dev, SCSISense sense);
QEMUSGList *(*get_sg_list)(SCSIRequest *req);
void (*save_request)(QEMUFile *f, SCSIRequest *req);
void *(*load_request)(QEMUFile *f, SCSIRequest *req);
void (*free_request)(SCSIBus *bus, void *priv);
/*
* Temporarily stop submitting new requests between drained_begin() and
* drained_end(). Called from the main loop thread with the BQL held.
*
* Implement these callbacks if request processing is triggered by a file
* descriptor like an EventNotifier. Otherwise set them to NULL.
*/
void (*drained_begin)(SCSIBus *bus);
void (*drained_end)(SCSIBus *bus);
};
#define TYPE_SCSI_BUS "SCSI"
OBJECT_DECLARE_SIMPLE_TYPE(SCSIBus, SCSI_BUS)
struct SCSIBus {
BusState qbus;
int busnr;
SCSISense unit_attention;
const SCSIBusInfo *info;
int drain_count; /* protected by BQL */
};
scsi: Replace scsi_bus_new() with scsi_bus_init(), scsi_bus_init_named() The function scsi_bus_new() creates a new SCSI bus; callers can either pass in a name argument to specify the name of the new bus, or they can pass in NULL to allow the bus to be given an automatically generated unique name. Almost all callers want to use the autogenerated name; the only exception is the virtio-scsi device. Taking a name argument that should almost always be NULL is an easy-to-misuse API design -- it encourages callers to think perhaps they should pass in some standard name like "scsi" or "scsi-bus". We don't do this anywhere for SCSI, but we do (incorrectly) do it for other bus types such as i2c. The function name also implies that it will return a newly allocated object, when it in fact does in-place allocation. We more commonly name such functions foo_init(), with foo_new() being the allocate-and-return variant. Replace all the scsi_bus_new() callsites with either: * scsi_bus_init() for the usual case where the caller wants an autogenerated bus name * scsi_bus_init_named() for the rare case where the caller needs to specify the bus name and document that for the _named() version it's then the caller's responsibility to think about uniqueness of bus names. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20210923121153.23754-2-peter.maydell@linaro.org
2021-09-23 15:11:48 +03:00
/**
* scsi_bus_init_named: Initialize a SCSI bus with the specified name
* @bus: SCSIBus object to initialize
* @bus_size: size of @bus object
* @host: Device which owns the bus (generally the SCSI controller)
* @info: structure defining callbacks etc for the controller
* @bus_name: Name to use for this bus
*
* This in-place initializes @bus as a new SCSI bus with a name
* provided by the caller. It is the caller's responsibility to make
* sure that name does not clash with the name of any other bus in the
* system. Unless you need the new bus to have a specific name, you
* should use scsi_bus_init() instead.
scsi: Replace scsi_bus_new() with scsi_bus_init(), scsi_bus_init_named() The function scsi_bus_new() creates a new SCSI bus; callers can either pass in a name argument to specify the name of the new bus, or they can pass in NULL to allow the bus to be given an automatically generated unique name. Almost all callers want to use the autogenerated name; the only exception is the virtio-scsi device. Taking a name argument that should almost always be NULL is an easy-to-misuse API design -- it encourages callers to think perhaps they should pass in some standard name like "scsi" or "scsi-bus". We don't do this anywhere for SCSI, but we do (incorrectly) do it for other bus types such as i2c. The function name also implies that it will return a newly allocated object, when it in fact does in-place allocation. We more commonly name such functions foo_init(), with foo_new() being the allocate-and-return variant. Replace all the scsi_bus_new() callsites with either: * scsi_bus_init() for the usual case where the caller wants an autogenerated bus name * scsi_bus_init_named() for the rare case where the caller needs to specify the bus name and document that for the _named() version it's then the caller's responsibility to think about uniqueness of bus names. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20210923121153.23754-2-peter.maydell@linaro.org
2021-09-23 15:11:48 +03:00
*/
void scsi_bus_init_named(SCSIBus *bus, size_t bus_size, DeviceState *host,
const SCSIBusInfo *info, const char *bus_name);
/**
* scsi_bus_init: Initialize a SCSI bus
*
* This in-place-initializes @bus as a new SCSI bus and gives it
* an automatically generated unique name.
*/
static inline void scsi_bus_init(SCSIBus *bus, size_t bus_size,
DeviceState *host, const SCSIBusInfo *info)
{
scsi_bus_init_named(bus, bus_size, host, info, NULL);
}
static inline SCSIBus *scsi_bus_from_device(SCSIDevice *d)
{
return DO_UPCAST(SCSIBus, qbus, d->qdev.parent_bus);
}
SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockBackend *blk,
int unit, bool removable, int bootindex,
bool share_rw,
BlockdevOnError rerror,
BlockdevOnError werror,
const char *serial, Error **errp);
void scsi_bus_set_ua(SCSIBus *bus, SCSISense sense);
void scsi_bus_legacy_handle_cmdline(SCSIBus *bus);
SCSIRequest *scsi_req_alloc(const SCSIReqOps *reqops, SCSIDevice *d,
uint32_t tag, uint32_t lun, void *hba_private);
SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun,
uint8_t *buf, size_t buf_len, void *hba_private);
int32_t scsi_req_enqueue(SCSIRequest *req);
SCSIRequest *scsi_req_ref(SCSIRequest *req);
void scsi_req_unref(SCSIRequest *req);
int scsi_bus_parse_cdb(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf,
size_t buf_len, void *hba_private);
int scsi_req_parse_cdb(SCSIDevice *dev, SCSICommand *cmd, uint8_t *buf,
size_t buf_len);
void scsi_req_build_sense(SCSIRequest *req, SCSISense sense);
void scsi_req_print(SCSIRequest *req);
void scsi_req_continue(SCSIRequest *req);
void scsi_req_data(SCSIRequest *req, int len);
void scsi_req_complete(SCSIRequest *req, int status);
void scsi_req_complete_failed(SCSIRequest *req, int host_status);
uint8_t *scsi_req_get_buf(SCSIRequest *req);
int scsi_req_get_sense(SCSIRequest *req, uint8_t *buf, int len);
void scsi_req_cancel_complete(SCSIRequest *req);
void scsi_req_cancel(SCSIRequest *req);
void scsi_req_cancel_async(SCSIRequest *req, Notifier *notifier);
void scsi_req_retry(SCSIRequest *req);
void scsi_device_drained_begin(SCSIDevice *sdev);
void scsi_device_drained_end(SCSIDevice *sdev);
void scsi_device_purge_requests(SCSIDevice *sdev, SCSISense sense);
void scsi_device_set_ua(SCSIDevice *sdev, SCSISense sense);
void scsi_device_report_change(SCSIDevice *dev, SCSISense sense);
void scsi_device_unit_attention_reported(SCSIDevice *dev);
hw/scsi: add VPD Block Limits emulation The VPD Block Limits Inquiry page is optional, allowing SCSI devices to not implement it. This is the case for devices like the MegaRAID SAS 9361-8i and Microsemi PM8069. In case of SCSI passthrough, the response of this request is used by the QEMU SCSI layer to set the max_io_sectors that the guest device will support, based on the value of the max_sectors_kb that the device has set in the host at that time. Without this response, the guest kernel is free to assume any value of max_io_sectors for the SCSI device. If this value is greater than the value from the host, SCSI Sense errors will occur because the guest will send read/write requests that are larger than the underlying host device is configured to support. An example of this behavior can be seen in [1]. A workaround is to set the max_sectors_kb host value back in the guest kernel (a process that can be automated using rc.local startup scripts and the like), but this has several drawbacks: - it can be troublesome if the guest has many passthrough devices that needs this tuning; - if a change in max_sectors_kb is made in the host side, manual change in the guests will also be required; - during an OS install it is difficult, and sometimes not possible, to go to a terminal and change the max_sectors_kb prior to the installation. This means that the disk can't be used during the install process. The easiest alternative here is to roll back to scsi-hd, install the guest and then go back to SCSI passthrough when the installation is done and max_sectors_kb can be set. An easier way would be to QEMU handle the absence of the Block Limits VPD device response, setting max_io_sectors accordingly and allowing the guest to use the device without the hassle. This patch adds emulation of the Block Limits VPD response for SCSI passthrough devices of type TYPE_DISK that doesn't support it. The following changes were made: - scsi_handle_inquiry_reply will now check the available VPD pages from the Inquiry EVPD reply. In case the device does not - a new function called scsi_generic_set_vpd_bl_emulation, that is called during device realize, was created to set a new flag 'needs_vpd_bl_emulation' of the device. This function retrieves the Inquiry EVPD response of the device to check for VPD BL support. - scsi_handle_inquiry_reply will now check the available VPD pages from the Inquiry EVPD reply in case the device needs VPD BL emulation, adding the Block Limits page (0xb0) to the list. This will make the guest kernel aware of the support that we're now providing by emulation. - a new function scsi_emulate_block_limits creates the emulated Block Limits response. This function is called inside scsi_read_complete in case the device requires Block Limits VPD emulation and we detected a SCSI Sense error in the VPD Block Limits reply that was issued from the guest kernel to the device. This error is expected: we're reporting support from our side, but the device isn't aware of it. With this patch, the guest now queries the Block Limits page during the device configuration because it is being advertised in the Supported Pages response. It will either receive the Block Limits page from the hardware, if it supports it, or will receive an emulated response from QEMU. At any rate, the guest now has the information to set the max_sectors_kb parameter accordingly, sparing the user of SCSI sense errors that would happen without the emulated response and in the absence of Block Limits support from the hardware. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1566195 Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1566195 Reported-by: Dac Nguyen <dacng@us.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20180627172432.11120-4-danielhb413@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-27 20:24:32 +03:00
void scsi_generic_read_device_inquiry(SCSIDevice *dev);
int scsi_device_get_sense(SCSIDevice *dev, uint8_t *buf, int len, bool fixed);
int scsi_SG_IO_FROM_DEV(BlockBackend *blk, uint8_t *cmd, uint8_t cmd_size,
uint8_t *buf, uint8_t buf_size, uint32_t timeout);
SCSIDevice *scsi_device_find(SCSIBus *bus, int channel, int target, int lun);
SCSIDevice *scsi_device_get(SCSIBus *bus, int channel, int target, int lun);
/* scsi-generic.c. */
extern const SCSIReqOps scsi_generic_req_ops;
/* scsi-disk.c */
#define SCSI_DISK_QUIRK_MODE_PAGE_APPLE_VENDOR 0
#define SCSI_DISK_QUIRK_MODE_SENSE_ROM_USE_DBD 1
#define SCSI_DISK_QUIRK_MODE_PAGE_VENDOR_SPECIFIC_APPLE 2
#define SCSI_DISK_QUIRK_MODE_PAGE_TRUNCATED 3
#endif