The virtio-scsi Host Bus Adapter provides access to devices on a SCSI
bus. Those SCSI devices typically have a BlockBackend. When the
BlockBackend enters a drained section, the SCSI device must temporarily
stop submitting new I/O requests.
Implement this behavior by temporarily stopping virtio-scsi virtqueue
processing when one of the SCSI devices enters a drained section. The
new scsi_device_drained_begin() API allows scsi-disk to message the
virtio-scsi HBA.
scsi_device_drained_begin() uses a drain counter so that multiple SCSI
devices can have overlapping drained sections. The HBA only sees one
pair of .drained_begin/end() calls.
After this commit, virtio-scsi no longer depends on hw/virtio's
ioeventfd aio_set_event_notifier(is_external=true). This commit is a
step towards removing the aio_disable_external() API.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-19-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
dma_blk_cb() only takes the AioContext lock around ->io_func(). That
means the rest of dma_blk_cb() is not protected. In particular, the
DMAAIOCB field accesses happen outside the lock.
There is a race when the main loop thread holds the AioContext lock and
invokes scsi_device_purge_requests() -> bdrv_aio_cancel() ->
dma_aio_cancel() while an IOThread executes dma_blk_cb(). The dbs->acb
field determines how cancellation proceeds. If dma_aio_cancel() sees
dbs->acb == NULL while dma_blk_cb() is still running, the request can be
completed twice (-ECANCELED and the actual return value).
The following assertion can occur with virtio-scsi when an IOThread is
used:
../hw/scsi/scsi-disk.c:368: scsi_dma_complete: Assertion `r->req.aiocb != NULL' failed.
Fix the race by holding the AioContext across dma_blk_cb(). Now
dma_aio_cancel() under the AioContext lock will not see
inconsistent/intermediate states.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230221212218.1378734-3-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If requests are being processed in the IOThread when a SCSIDevice is
unplugged, scsi_device_purge_requests() -> scsi_req_cancel_async() races
with I/O completion callbacks. Both threads load and store req->aiocb.
This can lead to assert(r->req.aiocb == NULL) failures and undefined
behavior.
Protect r->req.aiocb with the AioContext lock to prevent the race.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230221212218.1378734-2-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
BlockDriver->bdrv_getlength is categorized as IO callback, and it
currently doesn't run in a coroutine. We should let it take a graph
rdlock since the callback traverses the block nodes graph, which however
is only possible in a coroutine.
Therefore turn it into a co_wrapper to move the actual function into a
coroutine where the lock can be taken.
Because now this function creates a new coroutine and polls, we need to
take the AioContext lock where it is missing, for the only reason that
internally co_wrapper calls AIO_WAIT_WHILE and it expects to release the
AioContext lock.
This is especially messy when a co_wrapper creates a coroutine and polls
in bdrv_open_driver, because this function has so many callers in so
many context that it can easily lead to deadlocks. Therefore the new
rule for bdrv_open_driver is that the caller must always hold the
AioContext lock of the given bs (except if it is a coroutine), because
the function calls bdrv_refresh_total_sectors() which is now a
co_wrapper.
Once the rwlock is ultimated and placed in every place it needs to be,
we will poll using AIO_WAIT_WHILE_UNLOCKED and remove the AioContext
lock.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230113204212.359076-7-kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
SunOS expects CD-ROM devices to have a block size of 512, and will
fail to mount or install using QEMU's default block size of 2048.
When initializing the SCSI device, allow the `physical_block_size'
block device option to override the default block size.
Signed-off-by: John Millikin <john@john-millikin.com>
Message-Id: <20220804122950.1577012-1-john@john-millikin.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When a SCSI command is received from the guest, the CDB length implied
by the first byte might exceed the number of bytes the guest sent. In
this case scsi_req_new() will read uninitialized data, causing
unpredictable behavior.
Adds the buf_len parameter to scsi_req_new() and plumbs it through the
call stack.
Signed-off-by: John Millikin <john@john-millikin.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1127
Message-Id: <20220817053458.698416-1-john@john-millikin.com>
[Fill in correct length for adapters other than ESP. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The existing code assumes that the block size can be generated from p[1] << 8
in multiple places which ignores the top and bottom 8 bits. If the block size
is allowed to be set to an arbitrary value then this causes a mismatch
between the value written by the guest in the block descriptor and the value
subsequently read back using READ CAPACITY causing the guest to generate
requests that can crash QEMU.
For now restrict block size changes to bits 8-15 and also ignore requests to
set the block size to 0 which causes the SCSI emulation to crash in at least
one place with a divide by zero error.
Fixes: 356c4c441e ("scsi-disk: allow MODE SELECT block descriptor to set the block size")
Closes: https://gitlab.com/qemu-project/qemu/-/issues/1112
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220730122656.253448-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In scsi_disk_emulate_write_same() the number of host sectors to transfer is
calculated as (s->qdev.blocksize / BDRV_SECTOR_SIZE) which is then used to
copy data in block size chunks to the iov buffer.
Since the loop copying the data to the iov buffer uses a fixed increment of
s->qdev.blocksize then using a block size that isn't a multiple of
BDRV_SECTOR_SIZE introduces a rounding error in the iov buffer size calculation
such that the iov buffer copy overflows the space allocated.
Update the iov buffer copy for() loop so that it will use the smallest of either
the current block size or the remaining transfer count to prevent the overflow.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220730122656.253448-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The MODE SELECT command can contain an optional block descriptor that can be used
to set the device block size. If the block descriptor is present then update the
block size on the SCSI device accordingly.
This allows CDROMs to be used with A/UX which requires a CDROM drive which is
capable of switching from a 2048 byte sector size to a 512 byte sector size.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220622105314.802852-13-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A/UX sends a MODE_PAGE_R_W_ERROR command with the AWRE bit set to 0 when enumerating
CDROM drives. Since the bit is currently hardcoded to 1 then indicate that the AWRE
bit can be changed (even though we don't care about the value) so that
the MODE_PAGE_R_W_ERROR page can be set successfully.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220622105314.802852-12-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When A/UX configures the CDROM device it sends a truncated MODE SELECT request
for page 1 (MODE_PAGE_R_W_ERROR) which is only 6 bytes in length rather than
10. This seems to be due to bug in Apple's code which calculates the CDB message
length incorrectly.
The work at [1] suggests that this truncated request is accepted on real
hardware whereas in QEMU it generates an INVALID_PARAM_LEN sense code which
causes A/UX to get stuck in a loop retrying the command in an attempt to succeed.
Alter the mode page request length check so that truncated requests are allowed
if the SCSI_DISK_QUIRK_MODE_PAGE_TRUNCATED quirk is enabled, whilst also adding a
trace event to enable the condition to be detected.
[1] https://68kmla.org/bb/index.php?threads/scsi2sd-project-anyone-interested.29040/page-7#post-316444
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220622105314.802852-10-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When initialising a drive ready to install MacOS, Apple HD SC Setup first attempts
to format the drive. Add a simple FORMAT UNIT command which simply returns success
to allow the format to succeed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220622105314.802852-9-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Both MacOS and A/UX make use of vendor-specific MODE SELECT commands with PF=0
to identify SCSI devices:
- MacOS sends a MODE SELECT command with PF=0 for the MODE_PAGE_VENDOR_SPECIFIC
(0x0) mode page containing 2 bytes before initialising a disk
- A/UX (installed on disk) sends a MODE SELECT command with PF=0 during SCSI
bus enumeration, and gets stuck in an infinite loop if it fails
Add a new SCSI_DISK_QUIRK_MODE_PAGE_VENDOR_SPECIFIC_APPLE quirk to allow both
PF=0 MODE SELECT commands and implement a MODE_PAGE_VENDOR_SPECIFIC (0x0)
mode page which is compatible with MacOS.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220622105314.802852-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
During SCSI bus enumeration A/UX sends a MODE SENSE command to the CDROM with
the DBD bit unset and expects the response to include a block descriptor. As per
the latest SCSI documentation, QEMU currently force-disables the block
descriptor for CDROM devices but the A/UX driver expects the requested block
descriptor to be returned.
If the block descriptor is not returned in the response then A/UX becomes
confused, since the block descriptor returned in the MODE SENSE response is
used to generate a subsequent MODE SELECT command which is then invalid.
Add a new SCSI_DISK_QUIRK_MODE_SENSE_ROM_USE_DBD quirk to allow this behaviour
to be enabled as required. Note that an additional workaround is required for
the previous SCSI_DISK_QUIRK_MODE_PAGE_APPLE_VENDOR quirk which must never
return a block descriptor even though the DBD bit is left unset.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220622105314.802852-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
One of the mechanisms MacOS uses to identify CDROM drives compatible with MacOS
is to send a custom MODE SELECT command for page 0x30 to the drive. The
response to this is a hard-coded manufacturer string which must match in order
for the CDROM to be usable within MacOS.
Add an implementation of the MODE SELECT page 0x30 response guarded by a newly
defined SCSI_DISK_QUIRK_MODE_PAGE_APPLE_VENDOR quirk bit so that CDROM drives
attached to non-Apple machines function exactly as before.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220622105314.802852-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since the MacOS SCSI implementation is quite old (and Apple added some firmware
customisations to their drives for m68k Macs) there is need to add a mechanism
to correctly handle Apple-specific quirks.
Add a new quirks bitmap to SCSIDiskState that can be used to enable these
features as required.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220622105314.802852-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 1b7fd72955 ("block: rename buffer_alignment to
guest_block_size") noted:
At this point, the field is set by the device emulation, but completely
ignored by the block layer.
The last time the value of buffer_alignment/guest_block_size was
actually used was before commit 339064d506 ("block: Don't use guest
sector size for qemu_blockalign()").
This value has not been used since 2013. Get rid of it.
Cc: Xie Yongji <xieyongji@bytedance.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220518130945.2657905-1-stefanha@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Move the various memalign-related functions out of osdep.h and into
their own header, which we include only where they are used.
While we're doing this, add some brief documentation comments.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org
The "hardware version" machinery (qemu_set_hw_version(),
qemu_hw_version(), and the QEMU_HW_VERSION define) is used by fewer
than 10 files. Move it out from osdep.h into a new
qemu/hw-version.h.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220208200856.3558249-6-peter.maydell@linaro.org
The 'resid' field is slightly confusing and could be
interpreted as some ID. Rename it as 'residual' which
is clearer to review. No logical change.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20220111184309.28637-8-f4bug@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This avoids an off-by-one read of 'mode_sense_valid' buffer in
hw/scsi/scsi-disk.c:mode_sense_page().
Fixes: CVE-2021-3930
Cc: qemu-stable@nongnu.org
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: a8f4bbe290 ("scsi-disk: store valid mode pages in a table")
Fixes: #546
Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
check_lba_range expects sectors to be expressed in original qdev blocksize, but
scsi_unmap_complete_noio was translating them to 512 block sizes, which was
causing sense errors in the larger LBAs in devices using a 4k block size.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/345
Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Message-Id: <20210521142829.326217-1-kit.westneat@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 'scsi-hd' and 'scsi-cd' devices provide suitable alternatives.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
On Fedora 33, gcc 10.2.1 notes that scsi_cdb_length(buf) can set
len==-1, which in turn overflows g_malloc():
[5/5] Linking target qemu-system-x86_64
In function ‘scsi_disk_new_request_dump’,
inlined from ‘scsi_new_request’ at ../hw/scsi/scsi-disk.c:2608:9:
../hw/scsi/scsi-disk.c:2582:19: warning: argument 1 value ‘18446744073709551612’ exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
2582 | line_buffer = g_malloc(len * 5 + 1);
| ^
Silence it with a decent assertion, since we only convert a buffer to
bytes when we have a valid cdb length.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210209152350.207958-1-eblake@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Some SCSI drivers like virtio have an internal mapping for the
host_status. This patch moves the host_status translation into
the SCSI drivers to allow those drivers to set up the correct
values.
Signed-off-by: Hannes Reinecke <hare@suse.de>.
[Added default handling to avoid touching all drivers. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently sg_io_sense_from_errno() converts the two input parameters
'errno' and 'io_hdr' into sense code and SCSI status. Having
split the function off into scsi_sense_from_errno() and
scsi_sense_from_host_status(), both of which are available generically,
we now inline the logic in the callers so that scsi-disk and
scsi-generic will be able to pass host_status to the HBA.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Message-Id: <20201116184041.60465-7-hare@suse.de>
[Put together from "scsi-disk: Add sg_io callback to evaluate status"
and what remains of "scsi: split sg_io_sense_from_errno() in two functions",
with many other fixes. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Right now, recoverable sense values are only passed directly to the
guest only for rerror=report. However, when rerror/werror are 'stop'
we still don't want the host to be involved on every UNIT ATTENTION
(especially considered that the QMP event will not have enough information
to act on the report).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of fishing it from *r->status, just pass the SCSI status
as a positive value of the second parameter and an errno as a
negative value.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new function is an extension of the switch statement in scsi-disk.c
which also includes the errno cases only found in sg_io_sense_from_errno.
This allows us to consolidate the errno handling.
Extracted from a patch by Hannes Reinecke <hare@suse.de>.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When requested to ignore errors, just do nothing and let the
request complete normally. This means that the request will
be accounted correctly.
This is what commit 40dce4ee61 ("scsi-disk: fix rerror/werror=ignore",
2018-10-19) was supposed to do:
Fixes: 40dce4ee61 ("scsi-disk: fix rerror/werror=ignore", 2018-10-19)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add tracepoints for SG_IO commands to allow for debugging
of SG_IO commands.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Message-Id: <20201116183114.55703-4-hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The current code sets an infinite timeout on SG_IO requests,
causing the guest to stall if the host experiences a frame
loss.
This patch adds an 'io_timeout' parameter for SCSIDevice to
make the SG_IO timeout configurable, and also shortens the
default timeout to 30 seconds to avoid infinite stalls.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Message-Id: <20201116183114.55703-3-hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, blk_is_read_only() tells whether a given BlockBackend can
only be used in read-only mode because its root node is read-only. Some
callers actually try to answer a slightly different question: Is the
BlockBackend configured to be writable, by taking write permissions on
the root node?
This can differ, for example, for CD-ROM devices which don't take write
permissions, but may be backed by a writable image file. scsi-cd allows
write requests to the drive if blk_is_read_only() returns false.
However, the write request will immediately run into an assertion
failure because the write permission is missing.
This patch introduces separate functions for both questions.
blk_supports_write_perm() answers the question whether the block
node/image file can support writable devices, whereas blk_is_writable()
tells whether the BlockBackend is currently configured to be writable.
All calls of blk_is_read_only() are converted to one of the two new
functions.
Fixes: https://bugs.launchpad.net/bugs/1906693
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210118123448.307825-2-kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Move the property types and property macros implemented in
qdev-properties-system.c to a new qdev-properties-system.h
header.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20201211220529.2290218-16-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Linux has some OS-specific (and sometimes weird) mappings for various SCSI
statuses and sense codes. The most important is probably RESERVATION
CONFLICT. Add them so that they can be reported back to the guest
kernel.
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some typedefs and macros are defined after the type check macros.
This makes it difficult to automatically replace their
definitions with OBJECT_DECLARE_TYPE.
Patch generated using:
$ ./scripts/codeconverter/converter.py -i \
--pattern=QOMStructTypedefSplit $(git grep -l '' -- '*.[ch]')
which will split "typdef struct { ... } TypedefName"
declarations.
Followed by:
$ ./scripts/codeconverter/converter.py -i --pattern=MoveSymbols \
$(git grep -l '' -- '*.[ch]')
which will:
- move the typedefs and #defines above the type check macros
- add missing #include "qom/object.h" lines if necessary
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20200831210740.126168-9-ehabkost@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20200831210740.126168-10-ehabkost@redhat.com>
Message-Id: <20200831210740.126168-11-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Use self-explicit definitions instead of magic '512' value.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20200814082841.27000-8-f4bug@amsat.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Several block device properties related to blocksize configuration must
be in certain relationship WRT each other: physical block must be no
smaller than logical block; min_io_size, opt_io_size, and
discard_granularity must be a multiple of a logical block.
To ensure these requirements are met, add corresponding consistency
checks to blkconf_blocksizes, adjusting its signature to communicate
possible error to the caller. Also remove the now redundant consistency
checks from the specific devices.
Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20200528225516.1676602-3-rvkagan@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Devices may have component devices and buses.
Device realization may fail. Realization is recursive: a device's
realize() method realizes its components, and device_set_realized()
realizes its buses (which should in turn realize the devices on that
bus, except bus_set_realized() doesn't implement that, yet).
When realization of a component or bus fails, we need to roll back:
unrealize everything we realized so far. If any of these unrealizes
failed, the device would be left in an inconsistent state. Must not
happen.
device_set_realized() lets it happen: it ignores errors in the roll
back code starting at label child_realize_fail.
Since realization is recursive, unrealization must be recursive, too.
But how could a partly failed unrealize be rolled back? We'd have to
re-realize, which can fail. This design is fundamentally broken.
device_set_realized() does not roll back at all. Instead, it keeps
unrealizing, ignoring further errors.
It can screw up even for a device with no buses: if the lone
dc->unrealize() fails, it still unregisters vmstate, and calls
listeners' unrealize() callback.
bus_set_realized() does not roll back either. Instead, it stops
unrealizing.
Fortunately, no unrealize method can fail, as we'll see below.
To fix the design error, drop parameter @errp from all the unrealize
methods.
Any unrealize method that uses @errp now needs an update. This leads
us to unrealize() methods that can fail. Merely passing it to another
unrealize method cannot cause failure, though. Here are the ones that
do other things with @errp:
* virtio_serial_device_unrealize()
Fails when qbus_set_hotplug_handler() fails, but still does all the
other work. On failure, the device would stay realized with its
resources completely gone. Oops. Can't happen, because
qbus_set_hotplug_handler() can't actually fail here. Pass
&error_abort to qbus_set_hotplug_handler() instead.
* hw/ppc/spapr_drc.c's unrealize()
Fails when object_property_del() fails, but all the other work is
already done. On failure, the device would stay realized with its
vmstate registration gone. Oops. Can't happen, because
object_property_del() can't actually fail here. Pass &error_abort
to object_property_del() instead.
* spapr_phb_unrealize()
Fails and bails out when remove_drcs() fails, but other work is
already done. On failure, the device would stay realized with some
of its resources gone. Oops. remove_drcs() fails only when
chassis_from_bus()'s object_property_get_uint() fails, and it can't
here. Pass &error_abort to remove_drcs() instead.
Therefore, no unrealize method can fail before this patch.
device_set_realized()'s recursive unrealization via bus uses
object_property_set_bool(). Can't drop @errp there, so pass
&error_abort.
We similarly unrealize with object_property_set_bool() elsewhere,
always ignoring errors. Pass &error_abort instead.
Several unrealize methods no longer handle errors from other unrealize
methods: virtio_9p_device_unrealize(),
virtio_input_device_unrealize(), scsi_qdev_unrealize(), ...
Much of the deleted error handling looks wrong anyway.
One unrealize methods no longer ignore such errors:
usb_ehci_pci_exit().
Several realize methods no longer ignore errors when rolling back:
v9fs_device_realize_common(), pci_qdev_unrealize(),
spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(),
virtio_device_realize().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-17-armbru@redhat.com>
Fixes the following coccinelle warnings:
$ spatch --sp-file --verbose-parsing ... \
scripts/coccinelle/remove_local_err.cocci
...
SUSPICIOUS: a \ character appears outside of a #define at ./target/ppc/translate_init.inc.c:5213
SUSPICIOUS: a \ character appears outside of a #define at ./target/ppc/translate_init.inc.c:5261
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:166
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:167
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:169
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:170
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:171
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:172
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:173
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5787
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5789
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5800
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5801
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5802
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5804
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5805
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5806
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:6329
SUSPICIOUS: a \ character appears outside of a #define at ./hw/sd/sdhci.c:1133
SUSPICIOUS: a \ character appears outside of a #define at ./hw/scsi/scsi-disk.c:3081
SUSPICIOUS: a \ character appears outside of a #define at ./hw/net/virtio-net.c:1529
SUSPICIOUS: a \ character appears outside of a #define at ./hw/riscv/sifive_u.c:468
SUSPICIOUS: a \ character appears outside of a #define at ./dump/dump.c:1895
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2209
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2215
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2221
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2222
SUSPICIOUS: a \ character appears outside of a #define at ./block/replication.c:172
SUSPICIOUS: a \ character appears outside of a #define at ./block/replication.c:173
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200412223619.11284-2-f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
It's an old compatibility shim that just delegates to scsi-cd or scsi-hd.
Just like ide-drive, we don't need this.
Acked-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Relevant devices are:
* ide-hd (and ide-cd, ide-drive)
* scsi-hd (and scsi-cd, scsi-disk, scsi-block)
* virtio-blk-pci
We do not call del_boot_device_lchs() for ide-* since we don't need to -
IDE block devices do not support unplugging.
Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com>
Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
Signed-off-by: Sam Eiderman <sameid@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190923121737.83281-8-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This will help to account the operation in the following commit.
The difference is that we don't call scsi_disk_req_check_error() before
the 1st discard iteration anymore. That function also checks if
the request is cancelled, however it shouldn't get canceled until it
yields in blk_aio() functions anyway.
Same approach is already used for emulate_write_same.
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190923121737.83281-7-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
it allows to report it in the error handler
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Message-id: 20190923121737.83281-6-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>