SunOS expects CD-ROM devices to have a block size of 512, and will
fail to mount or install using QEMU's default block size of 2048.
When initializing the SCSI device, allow the `physical_block_size'
block device option to override the default block size.
Signed-off-by: John Millikin <john@john-millikin.com>
Message-Id: <20220804122950.1577012-1-john@john-millikin.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The `started` field is manipulated internally within the vhost code
except for one place, vhost-user-blk via f5b22d06fb (vhost: recheck
dev state in the vhost_migration_log routine). Mark that as a FIXME
because it introduces a potential race. I think the referenced fix
should be tracking its state locally.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220802095010.3330793-12-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwittz@nutanix.com>
In scsi_req_parse_cdb(), if the CDB length implied by the command type
exceeds the initialized portion of the command buffer, reject the request.
Rejected requests are recorded by the `scsi_req_parse_bad` trace event.
On example of a bug detected by this check is SunOS's use of interleaved
DMA and non-DMA commands. This guest behavior currently causes QEMU to
parse uninitialized memory as a SCSI command, with unpredictable
outcomes.
With the new check in place:
* QEMU consistently creates a trace event and rejects the request.
* SunOS retries the request(s) and is able to successfully boot from
disk.
Signed-off-by: John Millikin <john@john-millikin.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1127
Message-Id: <20220817053458.698416-2-john@john-millikin.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When a SCSI command is received from the guest, the CDB length implied
by the first byte might exceed the number of bytes the guest sent. In
this case scsi_req_new() will read uninitialized data, causing
unpredictable behavior.
Adds the buf_len parameter to scsi_req_new() and plumbs it through the
call stack.
Signed-off-by: John Millikin <john@john-millikin.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1127
Message-Id: <20220817053458.698416-1-john@john-millikin.com>
[Fill in correct length for adapters other than ESP. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Per investigation on the linked ticket, SunOS issues a SCSI bus reset
to the ESP as part of its boot sequence. If this ESP command doesn't
cause devices to assert sense flag UNIT ATTENTION, SunOS will consider
the CD-ROM device to be non-compliant with Common Command Set (CCS).
In this condition, the SunOS installer's early userspace doesn't set
the installation source location to sr0 and the miniroot copy fails.
Signed-off-by: John Millikin <john@john-millikin.com>
Suggested-by: Bill Paul <noisetube@gmail.com>
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1127
Message-Id: <20220817053846.699310-1-john@john-millikin.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commits 01ef8185b8 amd 24b36e9813 updated the way that the maximum
transfer length is calculated for patching block limits VPD page in an
INQUIRY response.
The same updates also need to be made for the case where the host device
does not support the block limits VPD page at all and we emulate the
whole page.
Without this fix, on host block devices a maximum transfer length of
(INT_MAX - sector_size) bytes is advertised to the guest, resulting in
I/O errors when a request that exceeds the host limits is made by the
guest. (Prior to commit 24b36e9813, this code path would use the
max_transfer value from the host instead of INT_MAX, but still miss the
fix from 01ef8185b8 where max_transfer is also capped to max_iov
host pages, so it would be less wrong, but still wrong.)
Cc: qemu-stable@nongnu.org
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2096251
Fixes: 01ef8185b8
Fixes: 24b36e9813
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220822125320.48257-1-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
As soon as virtio_scsi_data_plane_start() attaches host notifiers the
IOThread may start virtqueue processing. There is a race between
IOThread virtqueue processing and virtio_scsi_data_plane_start() because
it only assigns s->dataplane_started after attaching host notifiers.
When a virtqueue handler function in the IOThread calls
virtio_scsi_defer_to_dataplane() it may see !s->dataplane_started and
attempt to start dataplane even though we're already in the IOThread:
#0 0x00007f67b360857c __pthread_kill_implementation (libc.so.6 + 0xa257c)
#1 0x00007f67b35bbd56 raise (libc.so.6 + 0x55d56)
#2 0x00007f67b358e833 abort (libc.so.6 + 0x28833)
#3 0x00007f67b358e75b __assert_fail_base.cold (libc.so.6 + 0x2875b)
#4 0x00007f67b35b4cd6 __assert_fail (libc.so.6 + 0x4ecd6)
#5 0x000055ca87fd411b memory_region_transaction_commit (qemu-kvm + 0x67511b)
#6 0x000055ca87e17811 virtio_pci_ioeventfd_assign (qemu-kvm + 0x4b8811)
#7 0x000055ca87e14836 virtio_bus_set_host_notifier (qemu-kvm + 0x4b5836)
#8 0x000055ca87f8e14e virtio_scsi_set_host_notifier (qemu-kvm + 0x62f14e)
#9 0x000055ca87f8dd62 virtio_scsi_dataplane_start (qemu-kvm + 0x62ed62)
#10 0x000055ca87e14610 virtio_bus_start_ioeventfd (qemu-kvm + 0x4b5610)
#11 0x000055ca87f8c29a virtio_scsi_handle_ctrl (qemu-kvm + 0x62d29a)
#12 0x000055ca87fa5902 virtio_queue_host_notifier_read (qemu-kvm + 0x646902)
#13 0x000055ca882c099e aio_dispatch_handler (qemu-kvm + 0x96199e)
#14 0x000055ca882c1761 aio_poll (qemu-kvm + 0x962761)
#15 0x000055ca880e1052 iothread_run (qemu-kvm + 0x782052)
#16 0x000055ca882c562a qemu_thread_start (qemu-kvm + 0x96662a)
This patch assigns s->dataplane_started before attaching host notifiers
so that virtqueue handler functions that run in the IOThread before
virtio_scsi_data_plane_start() returns correctly identify that dataplane
does not need to be started. This fix is taken from the virtio-blk
dataplane code and it's worth adding a comment in virtio-blk as well to
explain why it works.
Note that s->dataplane_started does not need the AioContext lock because
it is set before attaching host notifiers and cleared after detaching
host notifiers. In other words, the IOThread always sees the value true
and the main loop thread does not modify it while the IOThread is
active.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2099541
Reported-by: Qing Wang <qinwang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220808162134.240405-1-stefanha@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The existing code assumes that the block size can be generated from p[1] << 8
in multiple places which ignores the top and bottom 8 bits. If the block size
is allowed to be set to an arbitrary value then this causes a mismatch
between the value written by the guest in the block descriptor and the value
subsequently read back using READ CAPACITY causing the guest to generate
requests that can crash QEMU.
For now restrict block size changes to bits 8-15 and also ignore requests to
set the block size to 0 which causes the SCSI emulation to crash in at least
one place with a divide by zero error.
Fixes: 356c4c441e ("scsi-disk: allow MODE SELECT block descriptor to set the block size")
Closes: https://gitlab.com/qemu-project/qemu/-/issues/1112
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220730122656.253448-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In scsi_disk_emulate_write_same() the number of host sectors to transfer is
calculated as (s->qdev.blocksize / BDRV_SECTOR_SIZE) which is then used to
copy data in block size chunks to the iov buffer.
Since the loop copying the data to the iov buffer uses a fixed increment of
s->qdev.blocksize then using a block size that isn't a multiple of
BDRV_SECTOR_SIZE introduces a rounding error in the iov buffer size calculation
such that the iov buffer copy overflows the space allocated.
Update the iov buffer copy for() loop so that it will use the smallest of either
the current block size or the remaining transfer count to prevent the overflow.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220730122656.253448-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The MODE SELECT command can contain an optional block descriptor that can be used
to set the device block size. If the block descriptor is present then update the
block size on the SCSI device accordingly.
This allows CDROMs to be used with A/UX which requires a CDROM drive which is
capable of switching from a 2048 byte sector size to a 512 byte sector size.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220622105314.802852-13-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A/UX sends a MODE_PAGE_R_W_ERROR command with the AWRE bit set to 0 when enumerating
CDROM drives. Since the bit is currently hardcoded to 1 then indicate that the AWRE
bit can be changed (even though we don't care about the value) so that
the MODE_PAGE_R_W_ERROR page can be set successfully.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220622105314.802852-12-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When A/UX configures the CDROM device it sends a truncated MODE SELECT request
for page 1 (MODE_PAGE_R_W_ERROR) which is only 6 bytes in length rather than
10. This seems to be due to bug in Apple's code which calculates the CDB message
length incorrectly.
The work at [1] suggests that this truncated request is accepted on real
hardware whereas in QEMU it generates an INVALID_PARAM_LEN sense code which
causes A/UX to get stuck in a loop retrying the command in an attempt to succeed.
Alter the mode page request length check so that truncated requests are allowed
if the SCSI_DISK_QUIRK_MODE_PAGE_TRUNCATED quirk is enabled, whilst also adding a
trace event to enable the condition to be detected.
[1] https://68kmla.org/bb/index.php?threads/scsi2sd-project-anyone-interested.29040/page-7#post-316444
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220622105314.802852-10-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When initialising a drive ready to install MacOS, Apple HD SC Setup first attempts
to format the drive. Add a simple FORMAT UNIT command which simply returns success
to allow the format to succeed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220622105314.802852-9-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Both MacOS and A/UX make use of vendor-specific MODE SELECT commands with PF=0
to identify SCSI devices:
- MacOS sends a MODE SELECT command with PF=0 for the MODE_PAGE_VENDOR_SPECIFIC
(0x0) mode page containing 2 bytes before initialising a disk
- A/UX (installed on disk) sends a MODE SELECT command with PF=0 during SCSI
bus enumeration, and gets stuck in an infinite loop if it fails
Add a new SCSI_DISK_QUIRK_MODE_PAGE_VENDOR_SPECIFIC_APPLE quirk to allow both
PF=0 MODE SELECT commands and implement a MODE_PAGE_VENDOR_SPECIFIC (0x0)
mode page which is compatible with MacOS.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220622105314.802852-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
During SCSI bus enumeration A/UX sends a MODE SENSE command to the CDROM with
the DBD bit unset and expects the response to include a block descriptor. As per
the latest SCSI documentation, QEMU currently force-disables the block
descriptor for CDROM devices but the A/UX driver expects the requested block
descriptor to be returned.
If the block descriptor is not returned in the response then A/UX becomes
confused, since the block descriptor returned in the MODE SENSE response is
used to generate a subsequent MODE SELECT command which is then invalid.
Add a new SCSI_DISK_QUIRK_MODE_SENSE_ROM_USE_DBD quirk to allow this behaviour
to be enabled as required. Note that an additional workaround is required for
the previous SCSI_DISK_QUIRK_MODE_PAGE_APPLE_VENDOR quirk which must never
return a block descriptor even though the DBD bit is left unset.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220622105314.802852-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
One of the mechanisms MacOS uses to identify CDROM drives compatible with MacOS
is to send a custom MODE SELECT command for page 0x30 to the drive. The
response to this is a hard-coded manufacturer string which must match in order
for the CDROM to be usable within MacOS.
Add an implementation of the MODE SELECT page 0x30 response guarded by a newly
defined SCSI_DISK_QUIRK_MODE_PAGE_APPLE_VENDOR quirk bit so that CDROM drives
attached to non-Apple machines function exactly as before.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220622105314.802852-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since the MacOS SCSI implementation is quite old (and Apple added some firmware
customisations to their drives for m68k Macs) there is need to add a mechanism
to correctly handle Apple-specific quirks.
Add a new quirks bitmap to SCSIDiskState that can be used to enable these
features as required.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220622105314.802852-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Set current_req to NULL, not current_req->req, to prevent reusing a free'd
buffer in case of repeated SCSI cancel requests. Also apply the fix to
CLEAR QUEUE and BUS DEVICE RESET messages as well, since they also cancel
the request.
Thanks to Alexander Bulekov for providing a reproducer.
Fixes: CVE-2022-0216
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/972
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20220711123316.421279-1-mcascell@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Set current_req->req to NULL to prevent reusing a free'd buffer in case of
repeated SCSI cancel requests. Thanks to Thomas Huth for suggesting the patch.
Fixes: CVE-2022-0216
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/972
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20220705200543.2366809-1-mcascell@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Inspired by Julia Lawall's fixing of Linux
kernel comments, I looked at qemu, although I did it manually.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Message-Id: <20220614104045.85728-2-dgilbert@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Commit 1b7fd72955 ("block: rename buffer_alignment to
guest_block_size") noted:
At this point, the field is set by the device emulation, but completely
ignored by the block layer.
The last time the value of buffer_alignment/guest_block_size was
actually used was before commit 339064d506 ("block: Don't use guest
sector size for qemu_blockalign()").
This value has not been used since 2013. Get rid of it.
Cc: Xie Yongji <xieyongji@bytedance.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220518130945.2657905-1-stefanha@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
more CXL patches
VIOT
Igor's huge AML rework
fixes, cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmKj4YcPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpkNcIAKTsMfUVueTjelC2RwIdegQkypycKhCweKzc
QxddaEr0w+N2164byT3IUy9h53hV3qAAmMuGE4d8B2r5rykf+SwDfIeNmHNqntnA
oLraXIxSSAf4/1cTsRCVL/BXo2E9P+WHI3huw37HClmPLdyMjQa1AtpTpKnIsbmO
sBZf7t5yHDJ2WGZwBQ1IbAxvsdGo1fa1TCt1jZ9g1dmnQSTteQG8DHkGoRnkwTi7
510jb0e8uQEgKytCdLTHqESHlfgjvoI73OFOAR2dzTKy6KelFmdLYSo2FtsIdtT5
1fZNaDjtl6zQ4b2iLBgPpHtikKch9BzzhDMbCsq7FpvasZ8u2FE=
=LXG0
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: fixes,cleanups,features
more CXL patches
VIOT
Igor's huge AML rework
fixes, cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmKj4YcPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpkNcIAKTsMfUVueTjelC2RwIdegQkypycKhCweKzc
# QxddaEr0w+N2164byT3IUy9h53hV3qAAmMuGE4d8B2r5rykf+SwDfIeNmHNqntnA
# oLraXIxSSAf4/1cTsRCVL/BXo2E9P+WHI3huw37HClmPLdyMjQa1AtpTpKnIsbmO
# sBZf7t5yHDJ2WGZwBQ1IbAxvsdGo1fa1TCt1jZ9g1dmnQSTteQG8DHkGoRnkwTi7
# 510jb0e8uQEgKytCdLTHqESHlfgjvoI73OFOAR2dzTKy6KelFmdLYSo2FtsIdtT5
# 1fZNaDjtl6zQ4b2iLBgPpHtikKch9BzzhDMbCsq7FpvasZ8u2FE=
# =LXG0
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 10 Jun 2022 05:27:51 PM PDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (53 commits)
hw/vhost-user-scsi|blk: set `supports_config` flag correctly
hw/virtio/vhost-user: don't use uninitialized variable
tests/acpi: virt: update golden masters for VIOT
hw/acpi/viot: sort VIOT ACPI table entries by PCI host bridge min_bus
tests/acpi: virt: allow VIOT acpi table changes
hw/acpi/viot: build array of PCI host bridges before generating VIOT ACPI table
hw/acpi/viot: move the individual PCI host bridge entry generation to a new function
hw/acpi/viot: rename build_pci_range_node() to enumerate_pci_host_bridges()
hw/cxl: Fix missing write mask for HDM decoder target list registers
pci: fix overflow in snprintf string formatting
hw/machine: Drop cxl_supported flag as no longer useful
hw/cxl: Move the CXLState from MachineState to machine type specific state.
tests/acpi: Update q35/CEDT.cxl for new memory addresses.
pci/pci_expander_bridge: For CXL HB delay the HB register memory region setup.
tests/acpi: Allow modification of q35 CXL CEDT table.
hw/cxl: Push linking of CXL targets into i386/pc rather than in machine.c
hw/acpi/cxl: Pass in the CXLState directly rather than MachineState
hw/cxl: Make the CXL fixed memory window setup a machine parameter.
x86: acpi-build: do not include hw/isa/isa.h directly
tests: acpi: update expected DSDT.tis.tpm2/DSDT.tis.tpm12 blobs
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Currently vhost-user-scsi driver doesn't allow to change
the configuration space of virtio_scsi, while vhost-user-blk
support that, so here we set the flag in vhost-user-blk driver
and unset it in vhost-user-scsi.
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Message-Id: <20220525125540.50979-2-changpeng.liu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
We have about 30 instances of the typo/variant spelling 'writeable',
and over 500 of the more common 'writable'. Standardize on the
latter.
Change produced with:
sed -i -e 's/\([Ww][Rr][Ii][Tt]\)[Ee]\([Aa][Bb][Ll][Ee]\)/\1\2/g' $(git grep -il writeable)
and then hand-undoing the instance in linux-headers/linux/kvm.h.
Most of these changes are in comments or documentation; the
exceptions are:
* a local variable in accel/hvf/hvf-accel-ops.c
* a local variable in accel/kvm/kvm-all.c
* the PMCR_WRITABLE_MASK macro in target/arm/internals.h
* the EPT_VIOLATION_GPA_WRITABLE macro in target/i386/hvf/vmcs.h
(which is never used anywhere)
* the AR_TYPE_WRITABLE_MASK macro in target/i386/hvf/vmx.h
(which is never used anywhere)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-id: 20220505095015.2714666-1-peter.maydell@linaro.org
This patch adds a get_vhost() callback function for VirtIODevices that
returns the device's corresponding vhost_dev structure, if the vhost
device is running. This patch also adds a vhost_started flag for
VirtIODevices.
Previously, a VirtIODevice wouldn't be able to tell if its corresponding
vhost device was active or not.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <1648819405-25696-3-git-send-email-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch drops the name parameter for the virtio_init function.
The pair between the numeric device ID and the string device ID
(name) of a virtio device already exists, but not in a way that
lets us map between them.
This patch lets us do this and removes the need for the name
parameter in the virtio_init function.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <1648819405-25696-2-git-send-email-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Previously we would silently suppress VHOST_USER_PROTOCOL_F_CONFIG
during the protocol negotiation if the QEMU stub hadn't implemented
the vhost_dev_config_notifier. However this isn't the only way we can
handle config messages, the existing vdc->get/set_config can do this
as well.
Lightly re-factor the code to check for both potential methods and
instead of silently squashing the feature error out. It is unlikely
that a vhost-user backend expecting to handle CONFIG messages will
behave correctly if they never get sent.
Fixes: 1c3e5a2617 ("vhost-user: back SET/GET_CONFIG requests with a protocol feature")
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220321153037.3622127-13-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There is no longer a need to expose the request and related APIs in
virtio-scsi.h since there are no callers outside virtio-scsi.c.
Note the block comment in VirtIOSCSIReq has been adjusted to meet the
coding style.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20220427143541.119567-7-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
virtio_scsi_handle_cmd_vq() is only called from hw/scsi/virtio-scsi.c
now and its return value is no longer used. Remove the function
prototype from virtio-scsi.h and drop the return value.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20220427143541.119567-6-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
virtio_scsi_handle_ctrl_vq() is only called from hw/scsi/virtio-scsi.c
now and its return value is no longer used. Remove the function
prototype from virtio-scsi.h and drop the return value.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20220427143541.119567-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
virtio_scsi_handle_event_vq() is only called from hw/scsi/virtio-scsi.c
now and its return value is no longer used. Remove the function
prototype from virtio-scsi.h and drop the return value.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20220427143541.119567-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The virtio-scsi event virtqueue is not emptied by its handler function.
This is typical for rx virtqueues where the device uses buffers when
some event occurs (e.g. a packet is received, an error condition
happens, etc).
Polling non-empty virtqueues wastes CPU cycles. We are not waiting for
new buffers to become available, we are waiting for an event to occur,
so it's a misuse of CPU resources to poll for buffers.
Introduce the new virtio_queue_aio_attach_host_notifier_no_poll() API,
which is identical to virtio_queue_aio_attach_host_notifier() except
that it does not poll the virtqueue.
Before this patch the following command-line consumed 100% CPU in the
IOThread polling and calling virtio_scsi_handle_event():
$ qemu-system-x86_64 -M accel=kvm -m 1G -cpu host \
--object iothread,id=iothread0 \
--device virtio-scsi-pci,iothread=iothread0 \
--blockdev file,filename=test.img,aio=native,cache.direct=on,node-name=drive0 \
--device scsi-hd,drive=drive0
After this patch CPU is no longer wasted.
Reported-by: Nir Soffer <nsoffer@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Nir Soffer <nsoffer@redhat.com>
Message-id: 20220427143541.119567-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Commit f34e8d8b8d ("virtio-scsi: prepare
virtio_scsi_handle_cmd for dataplane") prepared the virtio-scsi cmd
virtqueue handler function to be used in both the dataplane and
non-datpalane code paths.
It failed to convert the ctrl and event virtqueue handler functions,
which are not designed to be called from the dataplane code path but
will be since the ioeventfd is set up for those virtqueues when
dataplane starts.
Convert the ctrl and event virtqueue handler functions now so they
operate correctly when called from the dataplane code path. Avoid code
duplication by extracting this code into a helper function.
Fixes: f34e8d8b8d ("virtio-scsi: prepare virtio_scsi_handle_cmd for dataplane")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20220427143541.119567-2-stefanha@redhat.com
[Fixed s/by used/be used/ typo pointed out by Michael Tokarev
<mjt@tls.msk.ru>.
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
vhost-scsi and vhost-user-scsi are two devices of their own; it should
be possible to enable/disable them with --without-default-devices, not
--without-default-features. Compute their default value in Kconfig to
obtain the more intuitive behavior.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Replace the global variables with inlined helper functions. getpagesize() is very
likely annotated with a "const" function attribute (at least with glibc), and thus
optimization should apply even better.
This avoids the need for a constructor initialization too.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
"0x%u" format is very misleading, replace by "0x%x".
Found running:
$ git grep -E '0x%[0-9]*([lL]*|" ?PRI)[dDuU]' hw/
Inspired-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20220323114718.58714-3-philippe.mathieu.daude@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Since PDMA reads/writes are driven by the guest, it is possible that migration
can occur whilst a SCSIRequest is still active. Fortunately active SCSIRequests
are already included in the migration stream and restarted post migration but
this still leaves the reference in ESPState uninitialised.
Implement the SCSIBusInfo .load_request callback to obtain a reference to the
currently active SCSIRequest and use it to recreate ESPState current_req
after migration.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220305155530.9265-11-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
This involves (re)adding a PDMA-specific subsection to hold the reference to the
current PDMA callback.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220305155530.9265-10-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
This prepares for the inclusion of the current PDMA callback in the migration
stream since the callback is referenced by an integer instead of a function
pointer.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220305155530.9265-9-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
This function is to be used to execute the current PDMA callback rather than
dereferencing the ESPState pdma_cb function pointer directly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220305155530.9265-8-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
This function is to be used to set the current PDMA callback rather than
accessing the ESPState pdma_cb function pointer directly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220305155530.9265-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Move the various memalign-related functions out of osdep.h and into
their own header, which we include only where they are used.
While we're doing this, add some brief documentation comments.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org
The "hardware version" machinery (qemu_set_hw_version(),
qemu_hw_version(), and the QEMU_HW_VERSION define) is used by fewer
than 10 files. Move it out from osdep.h into a new
qemu/hw-version.h.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220208200856.3558249-6-peter.maydell@linaro.org
More than 1k of TypeInfo instances are already marked as const. Mark the
remaining ones, too.
This commit was created with:
git grep -z -l 'static TypeInfo' -- '*.c' | \
xargs -0 sed -i 's/static TypeInfo/static const TypeInfo/'
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Message-id: 20220117145805.173070-2-shentey@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Hi
This time I have disabled vmstate canary patches form Dave Gilbert.
Let's see if it works.
Later, Juan.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmH0NkEACgkQ9IfvGFhy
1yM4VQ/+MML5ugA9XA5hOFV+Stwv2ENtMR4r4raQsC7UKdKMaCNuoj1BdlXMRaki
E2TpoHYq99rfJX+AA0XihxHh84I1l9fpoiXrcr8pgNmhcj0qkBykY9Elzf95woMM
UMyinL2jhHfHjby29AaE7BDelUZIA0BgyzQ3TMq8rO+l/ZqFYA8U1SEgPlDYj7cn
gkDWFkPJx6IKgcI8M1obHw11azHgS7dmjjl9lXzxJ2/WfXnoZCuU0BtHd6a1rnAS
qcO3gwLfCo+3aTGKRseJie1Cljz6sIP+ke0Xgn5O+e7alWjCOtlVZrWwd2MqQ07K
2bf7uuTC2KQicLLH8DCnoH/BSvHmpyl/FglFrETRk/55KKg0bi+ZltXaTs9bC2uO
jzNbBSRf8UMcX6Bp3ukhPaFQ1vxqP7KxN9bM+7LYP9aX7Lt/NCJciYjw1jCTwcwi
nz0RS4d7cscMhoMEarPCKcaNJR6PJetdZY2VXavWjXv6er3407yTocvuei0Epdyb
WZtbFnpI2tfx1GEr/Bz6Mnk/qn7kwo7BFEUtJoweFE05g5wHa1PojsblrrsqeOuc
llpK8o8c8NFACxeiLa0z0VBkTjdOtao206eLhF+Se3ukubImayRQwZiOCEBBXwB3
+LmVcmwNDfNonSWI04AA2WAy9gAdM3Ko/gBfWsuOPR5oIs65wns=
=F/ek
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/quintela-gitlab/tags/migration-20220128-pull-request' into staging
Migration Pull request (Take 2)
Hi
This time I have disabled vmstate canary patches form Dave Gilbert.
Let's see if it works.
Later, Juan.
# gpg: Signature made Fri 28 Jan 2022 18:30:25 GMT
# gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full]
# gpg: aka "Juan Quintela <quintela@trasno.org>" [full]
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* remotes/quintela-gitlab/tags/migration-20220128-pull-request: (36 commits)
migration: Move temp page setup and cleanup into separate functions
migration: Simplify unqueue_page()
migration: Add postcopy_has_request()
migration: Enable UFFD_FEATURE_THREAD_ID even without blocktime feat
migration: No off-by-one for pss->page update in host page size
migration: Tally pre-copy, downtime and post-copy bytes independently
migration: Introduce ram_transferred_add()
migration: Don't return for postcopy_send_discard_bm_ram()
migration: Drop return code for disgard ram process
migration: Do chunk page in postcopy_each_ram_send_discard()
migration: Drop postcopy_chunk_hostpages()
migration: Don't return for postcopy_chunk_hostpages()
migration: Drop dead code of ram_debug_dump_bitmap()
migration/ram: clean up unused comment.
migration: Report the error returned when save_live_iterate fails
migration/migration.c: Remove the MIGRATION_STATUS_ACTIVE when migration finished
migration/migration.c: Avoid COLO boot in postcopy migration
migration/migration.c: Add missed default error handler for migration state
Remove unnecessary minimum_version_id_old fields
multifd: Rename pages_used to normal_pages
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The migration code will not look at a VMStateDescription's
minimum_version_id_old field unless that VMSD has set the
load_state_old field to something non-NULL. (The purpose of
minimum_version_id_old is to specify what migration version is needed
for the code in the function pointed to by load_state_old to be able
to handle it on incoming migration.)
We have exactly one VMSD which still has a load_state_old,
in the PPC CPU; every other VMSD which sets minimum_version_id_old
is doing so unnecessarily. Delete all the unnecessary ones.
Commit created with:
sed -i '/\.minimum_version_id_old/d' $(git grep -l '\.minimum_version_id_old')
with the one legitimate use then hand-edited back in.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
It missed vmstate_ppc_cpu.
softmmu/rtc.c defines two public functions: qemu_get_timedate() and
qemu_timedate_diff(). Currently we keep the prototypes for these in
qemu-common.h, but most files don't need them. Move them to their
own header, a new include/sysemu/rtc.h.
Since the C files using these two functions did not need to include
qemu-common.h for any other reason, we can remove those include lines
when we add the include of the new rtc.h.
The license for the .h file follows that of the softmmu/rtc.c
where both the functions are defined.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Since commit 292e13142d, dma_buf_rw() returns a MemTxResult type.
Do not discard it, return it to the caller. Pass the previously
returned value (the QEMUSGList residual size, which was rarely used)
as an optional argument.
With this new API, SCSIRequest::residual might now be accessed via
a pointer. Since the size_t type does not have the same size on
32 and 64-bit host architectures, convert it to a uint64_t, which
is big enough to hold the residual size, and the type is constant
on both 32/64-bit hosts.
Update the few dma_buf_read() / dma_buf_write() callers to the new
API.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220117125130.131828-1-f4bug@amsat.org>
Update the obvious places where dma_addr_t should be used
(instead of uint64_t, hwaddr, size_t, int32_t types).
This allows to have &dma_addr_t type portable on 32/64-bit
hosts.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220111184309.28637-11-f4bug@amsat.org>
The 'resid' field is slightly confusing and could be
interpreted as some ID. Rename it as 'residual' which
is clearer to review. No logical change.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20220111184309.28637-8-f4bug@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This code is easier to review using the load/store API.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211218111912.1499377-1-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Now that virtio-blk and virtio-scsi are ready, get rid of
the handle_aio_output() callback. It's no longer needed.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20211207132336.36627-7-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Prepare virtio_scsi_handle_cmd() to be used by both dataplane and
non-dataplane by making the condition for starting ioeventfd more
specific. This way it won't trigger when dataplane has already been
started.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20211207132336.36627-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The virtqueue host notifier API
virtio_queue_aio_set_host_notifier_handler() polls the virtqueue for new
buffers. AioContext previously required a bool progress return value
indicating whether an event was handled or not. This is no longer
necessary because the AioContext polling API has been split into a poll
check function and an event handler function. The event handler is only
run when we know there is work to do, so it doesn't return bool.
The VirtIOHandleAIOOutput function signature is now the same as
VirtIOHandleOutput. Get rid of the bool return value.
Further simplifications will be made for virtio-blk and virtio-scsi in
the next patch.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20211207132336.36627-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
vhost_dev_init calls vhost_dev_cleanup on error, which closes vhostfd,
don't double close it.
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Message-Id: <20211129132358.1110372-2-d-tatianin@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost_dev_init calls vhost_dev_cleanup in case of an error during
initialization, which zeroes out the entire vsc->dev as well as the
vsc->dev.vqs pointer. This prevents us from properly freeing it in free_vqs.
Keep a local copy of the pointer so we can free it later.
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Message-Id: <20211129132358.1110372-1-d-tatianin@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
ld*_dma() returns a MemTxResult type. Do not discard
it, return it to the caller.
Update the few callers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-24-philmd@redhat.com>
Let devices specify transaction attributes when calling ld*_pci_dma().
Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-22-philmd@redhat.com>
Let devices specify transaction attributes when calling st*_pci_dma().
Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-21-philmd@redhat.com>
Let devices specify transaction attributes when calling
dma_buf_read().
Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-13-philmd@redhat.com>
Let devices specify transaction attributes when calling
dma_buf_write().
Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-12-philmd@redhat.com>
Let devices specify transaction attributes when calling pci_dma_rw().
Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-10-philmd@redhat.com>
DMA operations are run on any kind of buffer, not arrays of
uint8_t. Convert dma_buf_read/dma_buf_write functions to take
a void pointer argument and save us pointless casts to uint8_t *.
Remove this pointless casts in the megasas device model.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-9-philmd@redhat.com>
While the reply queue values fit in 16-bit, they are accessed
as 32-bit:
661: s->reply_queue_head = ldl_le_pci_dma(pcid, s->producer_pa);
662: s->reply_queue_head %= MEGASAS_MAX_FRAMES;
663: s->reply_queue_tail = ldl_le_pci_dma(pcid, s->consumer_pa);
664: s->reply_queue_tail %= MEGASAS_MAX_FRAMES;
Having:
41:#define MEGASAS_MAX_FRAMES 2048 /* Firmware limit at 65535 */
In order to update the ld/st*_pci_dma() API to pass the address
of the value to access, it is simpler to have the head/tail declared
as 32-bit values. Replace the uint16_t by uint32_t, wasting 4 bytes in
the MegasasState structure.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-20-philmd@redhat.com>
If we detect an overflow on the SGL buffer, do not
keep processing the command: discard it. TARGET_FAILURE
sense code will be returned (MFI_STAT_SCSI_DONE_WITH_ERROR).
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/521
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20211119201141.532377-2-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If a reset command is sent after data has been transferred into the SCSI buffer
ensure that async_len is reset to 0. Otherwise a subsequent TI command assumes
the SCSI buffer contains data to be transferred to the device causing it to
dereference the stale async_buf pointer.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/724
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211118100327.29061-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This avoids an off-by-one read of 'mode_sense_valid' buffer in
hw/scsi/scsi-disk.c:mode_sense_page().
Fixes: CVE-2021-3930
Cc: qemu-stable@nongnu.org
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: a8f4bbe290 ("scsi-disk: store valid mode pages in a table")
Fixes: #546
Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is currently a check in esp_select() to cancel any in-flight SCSI requests
to ensure that issuing multiple select commands without continuing through the
rest of the ESP state machine ignores all but the last SCSI request. This is
also enforced through the addition of assert()s in esp_transfer_data() and
scsi_read_data().
The get_cmd() function does not call esp_select() when TC == 0 which means it is
possible for a fuzzer to trigger these assert()s by sending a select command when
TC == 0 immediately after a valid SCSI CDB has been submitted.
Since esp_select() is only called from get_cmd(), hoist the check to cancel
in-flight SCSI requests from esp_select() into get_cmd() to ensure it is always
called when executing a select command to initiate a new SCSI request.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Closes: https://gitlab.com/qemu-project/qemu/-/issues/662
Closes: https://gitlab.com/qemu-project/qemu/-/issues/663
Message-Id: <20211101183516.8455-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linux limits the size of iovecs to 1024 (UIO_MAXIOV in the kernel
sources, IOV_MAX in POSIX). Because of this, on some host adapters
requests with many iovecs are rejected with -EINVAL by the
io_submit() or readv()/writev() system calls.
In fact, the same limit applies to SG_IO as well. To fix both the
EINVAL and the possible performance issues from using fewer iovecs
than allowed by Linux (some HBAs have max_segments as low as 128),
introduce a separate entry in BlockLimits to hold the max_segments
value from sysfs. This new limit is used only for SG_IO and clamped
to bs->bl.max_iov anyway, just like max_hw_transfer is clamped to
bs->bl.max_transfer.
Reported-by: Halil Pasic <pasic@linux.ibm.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-block@nongnu.org
Cc: qemu-stable@nongnu.org
Fixes: 18473467d5 ("file-posix: try BLKSECTGET on block devices too, do not round to power of 2", 2021-06-25)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210923130436.1187591-1-pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Rename qbus_create_inplace() to qbus_init(); this is more in line
with our usual naming convention for functions that in-place
initialize objects.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20210923121153.23754-5-peter.maydell@linaro.org
The function scsi_bus_new() creates a new SCSI bus; callers can
either pass in a name argument to specify the name of the new bus, or
they can pass in NULL to allow the bus to be given an automatically
generated unique name. Almost all callers want to use the
autogenerated name; the only exception is the virtio-scsi device.
Taking a name argument that should almost always be NULL is an
easy-to-misuse API design -- it encourages callers to think perhaps
they should pass in some standard name like "scsi" or "scsi-bus". We
don't do this anywhere for SCSI, but we do (incorrectly) do it for
other bus types such as i2c.
The function name also implies that it will return a newly allocated
object, when it in fact does in-place allocation. We more commonly
name such functions foo_init(), with foo_new() being the
allocate-and-return variant.
Replace all the scsi_bus_new() callsites with either:
* scsi_bus_init() for the usual case where the caller wants
an autogenerated bus name
* scsi_bus_init_named() for the rare case where the caller
needs to specify the bus name
and document that for the _named() version it's then the caller's
responsibility to think about uniqueness of bus names.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210923121153.23754-2-peter.maydell@linaro.org
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210720125408.387910-6-armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
The device model batching its ioeventfds in a single MR transaction is
an optimization. Clarify this in virtio-scsi, virtio-blk and generic
virtio code. Also clarify that the transaction must commit before
closing ioeventfds so that no one is tempted to merge the loops
in the start functions error path and in the stop functions.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <162125799728.1394228.339855768563326832.stgit@bahia.lan>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This allows callers to return better error messages instead of making
one up while the real error ends up on stderr. Most callers can
immediately make use of this because they already have an Error
parameter themselves. The others just keep printing the error with
error_report_err().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210609154658.350308-2-kwolf@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
For block host devices, I/O can happen through either the kernel file
descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
or the SCSI passthrough ioctl SG_IO.
In the latter case, the size of each transfer can be limited by the
HBA, while for file descriptor I/O the kernel is able to split and
merge I/O in smaller pieces as needed. Applying the HBA limits to
file descriptor I/O results in more system calls and suboptimal
performance, so this patch splits the max_transfer limit in two:
max_transfer remains valid and is used in general, while max_hw_transfer
is limited to the maximum hardware size. max_hw_transfer can then be
included by the scsi-generic driver in the block limits page, to ensure
that the stricter hardware limit is used.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
I/O to a disk via read/write is not limited by the number of segments allowed
by the host adapter; the kernel can split requests if needed, and the limit
imposed by the host adapter can be very low (256k or so) to avoid that SG_IO
returns EINVAL if memory is heavily fragmented.
Since this value is only interesting for SG_IO-based I/O, do not include
it in the max_transfer and only take it into account when patching the
block limits VPD page in the scsi-generic device.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
The LUN is selected with an IDENTIFY message, and persists
until the next message out phase. Instead of passing it to
do_busid_cmd, store it in ESPState. Because do_cmd can simply
skip the message out phase if cmdfifo_cdb_offset is zero, it
can now be used for the S without ATN cases as well.
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 4e78f3bf35 "esp: defer command completion interrupt on incoming data
transfers" added a version check for use with VMSTATE_*_TEST macros to allow
migration from older QEMU versions. Unfortunately the version check fails to
work in its current form since if the VMStateDescription version_id is
incremented, the test returns false and so the fields are not included in the
outgoing migration stream.
Change the version check to use >= rather == to ensure that migration works
correctly when the ESPState VMStateDescription has version_id > 5.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 4e78f3bf35 ("esp: defer command completion interrupt on incoming data transfers")
Message-Id: <20210613102614.5438-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 799d90d818 "esp: transition to message out phase after SATN and stop
command" added logic to correctly handle extended messages for DMA requests
but not for PDMA requests.
Apply the same logic in esp_do_dma() to do_dma_pdma_cb() so that extended
messages terminated with a PDMA request are accumulated correctly. This allows
the ESP device to respond correctly to the SDTR negotiation initiated by the
NetBSD ESP driver without causing errors and timeouts on boot.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20210519100803.10293-6-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This commit from nearly 10 years ago is now broken due to the improvements
in esp emulation (or perhaps was never correct). It shows up as a bug
in detecting the CDROM drive under MacOS. The error is caused by the
MacOS CDROM driver sending this CDB with an "S without ATN" command and
without DMA:
0x12 0x00 0x00 0x00 0x05 0x00 (INQUIRY)
This is a valid INQUIRY command, however with this logic present the 3rd
byte (0x0) is copied over the 1st byte (0x12) which silently converts the
INQUIRY command to a TEST UNIT READY command before passing it to the
QEMU SCSI layer. Since the TEST UNIT READY command has a zero length
response the MacOS CDROM driver never receives a response and assumes
the CDROM is not present.
The logic was to ignore the IDENTIFY byte and copy the LUN over from
the CDB, which did store the LUN in bits 5-7 of the second byte in
olden times. This however is all obsolete, so just drop the code.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20210519100803.10293-5-mark.cave-ayland@ilande.co.uk>
[Tweaked commit message. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After each PDMA write transfer the MacOS CDROM driver waits until the FIFO is empty
(i.e. its contents have been written out to the SCSI bus) by polling the FIFO count
register until it reads 0. This doesn't work with the current PDMA write
implementation which waits until either the FIFO is full or the transfer is complete
before invoking the PDMA callback to process the FIFO contents.
Change the PDMA write transfer logic so that the PDMA callback is invoked after each
PDMA write to transfer the FIFO contents to the target buffer immediately, and hence
avoid getting stuck in the FIFO count register polling loop.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20210519100803.10293-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The initial implementation of non-DMA transfers was based upon analysis of traces
from the MacOS toolbox ROM for handling unaligned reads but missed one key
aspect - during a non-DMA transfer from the target, the bus service interrupt
should be raised for every single byte received from the bus and not just at either
the end of the transfer or when the FIFO is full.
Adjust the non-DMA code accordingly so that esp_do_nodma() is called for every byte
received from the target. This also includes special handling for managing the change
from DATA IN to STATUS phase as this needs to occur when the final byte is read out
from the FIFO, and not at the end of the transfer of the last byte into the FIFO.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20210519100803.10293-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The current implementation only resumes DMA transfers when incoming data is
received from the target device, but this is also required for non-DMA transfers
with the next set of non-DMA changes.
Rather than duplicate the DMA/non-DMA dispatch logic in the initial transfer
section, update the code so that the initial transfer section can just
fallthrough to the main DMA/non-DMA dispatch logic.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20210519100803.10293-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When processing a command to select a target and send a CDB, the ESP device
maintains a sequence step register so that if an error occurs the host can
determine which part of the selection/CDB submission sequence failed.
The old Linux 2.6 driver is really pedantic here: it checks the sequence step
register even if a command succeeds and complains loudly on the console if the
sequence step register doesn't match the expected bus phase and interrupt flags.
This reason this mismatch occurs is because the ESP emulation currently doesn't
update the bus phase until the next TI (Transfer Information) command and so the
cleared sequence step register is considered invalid for the stale bus phase.
Normally this isn't an issue as the host only checks the sequence step register
if an error occurs but the old Linux 2.6 driver does this in several places
causing a large stream of "esp0: STEP_ASEL for tgt 0" messages to appear on the
console during the boot process.
Fix this by not clearing the sequence step register when reading the interrupt
register and clearing the DMA status, so the guest sees a valid sequence step
and bus phase combination at the end of the command phase. No other change is
required since the sequence step register is correctly updated throughout the
selection/CDB submission sequence once one of the select commands is issued.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 1b9e48a5bd ("esp: implement non-DMA transfers in PDMA mode")
Message-Id: <20210518212511.21688-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The datasheet sequence tables confirm that when a target selection fails, only
the INTR_DC interrupt flag should be asserted.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: cf47a41e05 ("esp: latch individual bits in ESP_RINTR register")
Message-Id: <20210518212511.21688-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When the device doesn't support the VPD block limits page, we emulate it even
for SCSI passthrough.
As a part of the emulation we need to add it to the 'Supported VPD Pages'
The code that does this adds it to the page, but it doesn't increase the length
of the data to be copied to the guest, thus the guest never sees the VPD block
limits page as supported.
Bump the transfer size by 1 in this case.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20201217165612.942849-6-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit e50caf4a5c ("tracing: convert documentation to rST")
converted docs/devel/tracing.txt to docs/devel/tracing.rst.
We still have several references to the old file, so let's fix them
with the following command:
sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210517151702.109066-2-sgarzare@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
check_lba_range expects sectors to be expressed in original qdev blocksize, but
scsi_unmap_complete_noio was translating them to 512 block sizes, which was
causing sense errors in the larger LBAs in devices using a 4k block size.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/345
Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Message-Id: <20210521142829.326217-1-kit.westneat@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fixes all over the place. Faster boot for virtio. ioeventfd support for
mmio.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmCeiMEPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpqsIH/A49Av5Bv8huL75lf9GzCx3E1a/z2W9Fphik
OcQ1ahR+7CRDARub+vTG40MBmZBVefIWjLAj3BwBWzFGPX0DZq0zeI102VzlEVKY
OeUx8ixuiKOSLcS+QxE7ZXIBL2Pn7l+MFUi4nLMYKti7c/kola7zlB57qsmXh+VD
AOQ7Utj6NWoi6QocWJsMSCyHCh3Fk9QzcStLlr6/MkSJa1zqv8l22+8oWH07Fk2M
wZfhrm9k094on28iSejsFYL5e4ROeXUajbOdfyMIxWvAB7boC9Jxk/e0oAbuSB4y
2f71Gfk3mU6irS7PvrxcKbk6BVD2zxM2WumOchZJgxFAujDO6yg=
=fvkT
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,pci,virtio: bugfixes, improvements
Fixes all over the place. Faster boot for virtio. ioeventfd support for
mmio.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 14 May 2021 15:27:13 BST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
Fix build with 64 bits time_t
vhost-vdpa: Make vhost_vdpa_get_device_id() static
hw/virtio: enable ioeventfd configuring for mmio
hw/smbios: support for type 41 (onboard devices extended information)
checkpatch: Fix use of uninitialized value
virtio-scsi: Configure all host notifiers in a single MR transaction
virtio-scsi: Set host notifiers and callbacks separately
virtio-blk: Configure all host notifiers in a single MR transaction
virtio-blk: Fix rollback path in virtio_blk_data_plane_start()
pc-dimm: remove unnecessary get_vmstate_memory_region() method
amd_iommu: fix wrong MMIO operations
virtio-net: Constify VirtIOFeature feature_sizes[]
virtio-blk: Constify VirtIOFeature feature_sizes[]
hw/virtio: Pass virtio_feature_get_config_size() a const argument
x86: acpi: use offset instead of pointer when using build_header()
amd_iommu: Fix pte_override_page_mask()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# hw/arm/virt.c
This allows the virtio-scsi-pci device to batch the setup of all its
host notifiers. This significantly improves boot time of VMs with a
high number of vCPUs, e.g. from 6m5.563s down to 1m2.884s for a
pseries machine with 384 vCPUs.
Note that memory_region_transaction_commit() must be called before
virtio_bus_cleanup_host_notifier() because the latter might close
ioeventfds that the transaction still assumes to be around when it
commits.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210407143501.244343-5-groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Host notifiers are guaranteed to be idle until the callbacks are
hooked up with virtio_queue_aio_set_host_notifier_handler(). They
thus don't need to be set or unset with the AioContext lock held.
Do this outside the critical section, like virtio-blk already
does : basically downgrading virtio_scsi_vring_init() to only
setup the host notifier and set the callback in the caller.
This will allow to batch addition/deletion of ioeventds in
a single memory transaction, which is expected to greatly
improve initialization time.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210407143501.244343-4-groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Stop including cpu.h in files that don't need it.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-4-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
While processing SCSI i/o requests in mptsas_process_scsi_io_request(),
the Megaraid emulator appends new MPTSASRequest object 'req' to
the 's->pending' queue. In case of an error, this same object gets
dequeued in mptsas_free_request() only if SCSIRequest object
'req->sreq' is initialised. This may lead to a use-after-free issue.
Since s->pending is actually not used, simply remove it from
MPTSASState.
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
Message-id: 20210419134247.1467982-1-f4bug@amsat.org
Message-Id: <20210416102243.1293871-1-mjt@msgid.tls.msk.ru>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
BugLink: https://bugs.launchpad.net/qemu/+bug/1914236 (CVE-2021-3392)
Fixes: e351b82611 ("hw: Add support for LSI SAS1068 (mptsas) device")
[PMD: Reworded description, added more tags]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When a CDB has been received and is about to be submitted to the SCSI layer
via one of the ESP select commands, ensure that do_cmd is set to zero before
executing the command.
Otherwise a guest executing 2 valid CDBs in quick sequence can invoke the SCSI
.transfer_data callback again before do_cmd is set to zero by the callback
function triggering an assert at the start of esp_transfer_data().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210407195801.685-12-mark.cave-ayland@ilande.co.uk>
Instead let the SCSI layer invoke the .cancel callback itself to cancel and
reset the request state.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210407195801.685-11-mark.cave-ayland@ilande.co.uk>
If a guest transfers the message out/command phase data using DMA with a TC
that is larger than the cmdfifo size then the cmdfifo overflows triggering
an assert. Limit the size of the transfer to the free space available in
cmdfifo.
Buglink: https://bugs.launchpad.net/qemu/+bug/1919036
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-10-mark.cave-ayland@ilande.co.uk>
If the guest tries to read a CDB using DMA and cmdfifo is not empty then it is
possible to overflow cmdfifo.
Since this can only occur by issuing deliberately incorrect instruction
sequences, ensure that the maximum length of the CDB transferred to cmdfifo is
limited to the available free space within cmdfifo.
Buglink: https://bugs.launchpad.net/qemu/+bug/1909247
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-9-mark.cave-ayland@ilande.co.uk>
If the guest tries to execute a CDB when cmdfifo is not empty before the start
of the message out phase then clearing the message out phase data will cause
cmdfifo to underflow due to cmdfifo_cdb_offset being larger than the amount of
data within.
Since this can only occur by issuing deliberately incorrect instruction
sequences, ensure that the maximum length of esp_fifo_pop_buf() is limited to
the size of the data within cmdfifo.
Buglink: https://bugs.launchpad.net/qemu/+bug/1909247
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-8-mark.cave-ayland@ilande.co.uk>
When about to execute a SCSI command, ensure that cmdfifo is not empty and
current_dev is non-NULL. This can happen if the guest tries to execute a TI
(Transfer Information) command without issuing one of the select commands
first.
Buglink: https://bugs.launchpad.net/qemu/+bug/1910723
Buglink: https://bugs.launchpad.net/qemu/+bug/1909247
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-7-mark.cave-ayland@ilande.co.uk>
The const pointer returned by fifo8_pop_buf() lies directly within the array used
to model the FIFO. Building with address sanitizers enabled shows that if the
caller expects a minimum number of bytes present then if the FIFO is nearly full,
the caller may unexpectedly access past the end of the array.
Introduce esp_fifo_pop_buf() which takes a destination buffer and performs a
memcpy() in it to guarantee that the caller cannot overwrite the FIFO array and
update all callers to use it. Similarly add underflow protection similar to
esp_fifo_push() and esp_fifo_pop() so that instead of triggering an assert()
the operation becomes a no-op.
Buglink: https://bugs.launchpad.net/qemu/+bug/1909247
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210407195801.685-6-mark.cave-ayland@ilande.co.uk>
Each FIFO currently has its own pop functions with the only difference being
the capacity check. The original reason for this was that the fifo8
implementation doesn't have a formal API for retrieving the FIFO capacity,
however there are multiple examples within QEMU where the capacity field is
accessed directly.
Change esp_fifo_pop() to access the FIFO capacity directly and then consolidate
esp_cmdfifo_pop() into esp_fifo_pop().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-5-mark.cave-ayland@ilande.co.uk>
Each FIFO currently has its own push functions with the only difference being
the capacity check. The original reason for this was that the fifo8
implementation doesn't have a formal API for retrieving the FIFO capacity,
however there are multiple examples within QEMU where the capacity field is
accessed directly.
Change esp_fifo_push() to access the FIFO capacity directly and then consolidate
esp_cmdfifo_push() into esp_fifo_push().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-4-mark.cave-ayland@ilande.co.uk>
The code for write_response() has always used the FIFO to store the data for
the status/message in phases, even for DMA transactions. Switch to using a
separate buffer that can be used directly for DMA transactions and restrict
the FIFO use to the non-DMA case.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210407195801.685-3-mark.cave-ayland@ilande.co.uk>
After issuing a SCSI command the SCSI layer can call the SCSIBusInfo .cancel
callback which resets both current_req and current_dev to NULL. If any data
is left in the transfer buffer (async_len != 0) then the next TI (Transfer
Information) command will attempt to reference the NULL pointer causing a
segfault.
Buglink: https://bugs.launchpad.net/qemu/+bug/1910723
Buglink: https://bugs.launchpad.net/qemu/+bug/1909247
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-2-mark.cave-ayland@ilande.co.uk>
If QEMU is launched with the -S option then the ESPState mig_version_id property
is left unset due to the ordering of the VMState fields in the VMStateDescription
for sysbusespscsi and pciespscsi. If the VM is migrated and restored in this
stopped state, the version tests in the vmstate_esp VMStateDescription and
esp_post_load() become confused causing the migration to fail.
Fix the ordering problem by moving the setting of mig_version_id to a common
esp_pre_save() function which is invoked first by both sysbusespscsi and
pciespscsi rather than at the point where ESPState is itself serialised into the
migration stream.
Buglink: https://bugs.launchpad.net/qemu/+bug/1922611
Fixes: 0bd005be78 ("esp: add vmstate_esp version to embedded ESPState")
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210407124842.32695-1-mark.cave-ayland@ilande.co.uk>
The 'scsi-hd' and 'scsi-cd' devices provide suitable alternatives.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The 'running' argument from VMChangeStateHandler does not require
other value than 0 / 1. Make it a plain boolean.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210111152020.1422021-3-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
On Fedora 33, gcc 10.2.1 notes that scsi_cdb_length(buf) can set
len==-1, which in turn overflows g_malloc():
[5/5] Linking target qemu-system-x86_64
In function ‘scsi_disk_new_request_dump’,
inlined from ‘scsi_new_request’ at ../hw/scsi/scsi-disk.c:2608:9:
../hw/scsi/scsi-disk.c:2582:19: warning: argument 1 value ‘18446744073709551612’ exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
2582 | line_buffer = g_malloc(len * 5 + 1);
| ^
Silence it with a decent assertion, since we only convert a buffer to
bytes when we have a valid cdb length.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210209152350.207958-1-eblake@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCgA8FiEEzGIauY6CIA2RXMnEW8LFb64PMh8FAmBEwfEeHG1hcmsuY2F2
ZS1heWxhbmRAaWxhbmRlLmNvLnVrAAoJEFvCxW+uDzIfYZYH/AwOuqycV9XI1s3K
valsExtSmcj6oTrFTSyak0U7QL9G5pm40qEHwQkwFwYecaQXfFNMN1Mt4gYpsLai
JNhV29tzWrQtNFepCjvvuJihDY5IC/3NWyrWqYXRNZ5IGMvde3HUcg210bFOC3mf
iOtZ40ZjBspTwG4eJCOLBBDeXVAMkVUjdfqtsq66zGteHsJjNEpso0HLdMvlFkud
QLF4g06h8iFMxwahZYEwY+fKUHphqfZn8INr01ODQIy3nBGVgenKz9DKPw/pg/UD
nfxhuYwftDof65JpYEhClOWas3AztFJ1YyyP/cibwosLBQVphj3R05/kF6VuyuyP
gZjux0w=
=RPt3
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-20210307' into staging
qemu-sparc queue
# gpg: Signature made Sun 07 Mar 2021 12:07:13 GMT
# gpg: using RSA key CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg: issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [full]
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C C9C4 5BC2 C56F AE0F 321F
* remotes/mcayland/tags/qemu-sparc-20210307: (42 commits)
esp: add support for unaligned accesses
esp: implement non-DMA transfers in PDMA mode
esp: add trivial implementation of the ESP_RFLAGS register
esp: convert cmdbuf from array to Fifo8
esp: convert ti_buf from array to Fifo8
esp: transition to message out phase after SATN and stop command
esp: add maxlen parameter to get_cmd()
esp: raise interrupt after every non-DMA byte transferred to the FIFO
esp: remove old deferred command completion mechanism
esp: defer command completion interrupt on incoming data transfers
esp: latch individual bits in ESP_RINTR register
esp: implement FIFO flush command
esp: add 4 byte PDMA read and write transfers
esp: remove pdma_origin from ESPState
esp: use FIFO for PDMA transfers between initiator and device
esp: fix PDMA target selection
esp: rename get_cmd_cb() to esp_select()
esp: remove CMD pdma_origin
esp: use in-built TC to determine PDMA transfer length
esp: use ti_wptr/ti_rptr to manage the current FIFO position for PDMA
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When the MacOS toolbox ROM transfers data from a target device to an unaligned
memory address, the first/last byte of a 16-bit transfer needs to be handled
separately. This means that the first byte is preloaded into the FIFO before
the transfer, or the last byte remains in the FIFO after the transfer.
The result of this is that the PDMA routines must be updated so that the FIFO
is loaded/unloaded if the last 16-bit word is used (rather than the last byte)
and any remaining byte from a FIFO wraparound is handled correctly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-43-mark.cave-ayland@ilande.co.uk>
The MacOS toolbox ROM uses non-DMA TI commands to handle the first/last byte
of an unaligned 16-bit transfer to memory.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-42-mark.cave-ayland@ilande.co.uk>
The bottom 5 bits contain the number of bytes remaining in the FIFO which is
trivial to implement with Fifo8 (the remaining bits are unimplemented and left
as 0 for now).
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-41-mark.cave-ayland@ilande.co.uk>
Rename ESP_CMDBUF_SZ to ESP_CMDFIFO_SZ and cmdbuf_cdb_offset to cmdfifo_cdb_offset
to indicate that the command buffer type has changed from an array to a Fifo8.
This also enables us to remove the ESPState field cmdlen since the command length
is now simply the number of elements used in cmdfifo.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-40-mark.cave-ayland@ilande.co.uk>
Rename TI_BUFSZ to ESP_FIFO_SZ since this constant is really describing the size
of the FIFO and is not directly related to the TI size.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-39-mark.cave-ayland@ilande.co.uk>
The SCSI bus should remain in the message out phase after the SATN and stop
command rather than transitioning to the command phase. A new ESPState variable
cmdbuf_cdb_offset is added which stores the offset of the CDB from the start
of cmdbuf when accumulating extended message out phase data.
Currently any extended message out data is discarded in do_cmd() before the CDB
is processed in do_busid_cmd().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-38-mark.cave-ayland@ilande.co.uk>
Some guests use a mixture of DMA and non-DMA transfers in combination with the
SATN and stop command to transfer message out phase and command phase bytes to
the target. Prepare for the next commit by adding a maxlen parameter to
get_cmd() to allow partial transfers.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-37-mark.cave-ayland@ilande.co.uk>
This matches the description in the datasheet and is required as support for
non-DMA transfers is added.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-36-mark.cave-ayland@ilande.co.uk>
Commit ea84a44250 "scsi: esp: Defer command completion until previous interrupts
have been handled" provided a mechanism to delay the command completion interrupt
until ESP_RINTR is read after the command has completed.
With the previous fixes for latching the ESP_RINTR bits and deferring the setting
of the command completion interrupt for incoming data to the SCSI callback, this
workaround is no longer required and can be removed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-35-mark.cave-ayland@ilande.co.uk>
The MacOS toolbox ROM issues a command to the ESP controller as part of its
"FAST" SCSI routines and then proceeds to read the incoming data soon after
receiving the command completion interrupt.
Unfortunately due to SCSI block transfers being asynchronous the incoming data
may not yet be present causing an underflow error. Resolve this by waiting for
the SCSI subsystem transfer_data callback before raising the command completion
interrupt.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-34-mark.cave-ayland@ilande.co.uk>
Currently the ESP_RINTR register is set to a specific value as required within
the ESP state machine. In order to implement the upcoming deferred interrupt
functionality it is necessary to set individual bits within ESP_RINTR so that
a deferred interrupt will not overwrite the value of any other interrupt bits.
This also requires fixing up a few locations where the ESP_RINTR and ESP_RSEQ
registers are set/reset unexpectedly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-33-mark.cave-ayland@ilande.co.uk>
At this point it is now possible to properly implement the FIFO flush command
without causing guest errors.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-32-mark.cave-ayland@ilande.co.uk>
The MacOS toolbox ROM performs 4 byte reads/writes when transferring data to
and from the target. Since the SCSI bus is 16-bits wide, use the memory API
to split a 4 byte access into 2 x 2 byte accesses.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-31-mark.cave-ayland@ilande.co.uk>
Now that all data is transferred via the FIFO (ti_buf) there is no need to track
the source buffer being used for the data transfer. This also eliminates the
need for a separate subsection for PDMA state migration.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-30-mark.cave-ayland@ilande.co.uk>
PDMA as implemented on the Quadra 800 uses DREQ to load data into the FIFO
up to a maximum of 16 bytes at a time. The MacOS toolbox ROM requires this
because it mixes FIFO and PDMA transfers whilst checking the FIFO status
and counter registers to ensure success.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-29-mark.cave-ayland@ilande.co.uk>
Currently the target selection for PDMA is done after the SCSI command has been
delivered which is not correct. Perform target selection as part of the initial
get_cmd() call when the command is submitted: if no target is present, don't
raise DRQ.
If the target is present then switch to the command phase since the MacOS toolbox
ROM checks for this before attempting to submit the SCSI command.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-28-mark.cave-ayland@ilande.co.uk>
This better describes the purpose of the function.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-27-mark.cave-ayland@ilande.co.uk>
The cmdbuf is really just a copy of FIFO data (including extra message phase
bytes) so its pdma_origin is effectively TI. Fortunately we already know when
we are receiving a SCSI command since do_cmd == 1 which enables us to
distinguish between the two cases in esp_pdma_read()/esp_pdma_write().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-26-mark.cave-ayland@ilande.co.uk>
Real hardware simply counts down using the in-built TC to determine when the
the PDMA request is complete. Use the TC to determine the PDMA transfer length
which then enables us to remove the redundant pdma_len variable.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-25-mark.cave-ayland@ilande.co.uk>
This eliminates the last user of the PDMA-specific pdma_cur variable which can
now be removed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-24-mark.cave-ayland@ilande.co.uk>
Here the updates to async_len and ti_size are moved into the corresponding
esp_pdma_read()/esp_pdma_write() function to eliminate the reference to
pdma_cur in do_dma_pdma_cb().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-23-mark.cave-ayland@ilande.co.uk>
Now that PDMA SCSI commands are accumulated in cmdbuf in the same way as normal
commands, the existing logic for locating the start of the SCSI command in
cmdbuf via cmdlen can be used. This enables the PDMA-specific pdma_start and
also get_pdma_buf() to be removed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-22-mark.cave-ayland@ilande.co.uk>
Now that all SCSI commands are accumulated in cmdbuf, remove the buf and buflen
parameters from get_cmd() since these always reference cmdbuf and ESP_CMDBUF_SZ
respectively.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-21-mark.cave-ayland@ilande.co.uk>
Now that all SCSI commands are accumulated in cmdbuf, remove the buf parameter
from do_cmd() since this always points to cmdbuf.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-20-mark.cave-ayland@ilande.co.uk>
ESP SCSI commands are already accumulated in cmdbuf and so there is no need to
keep a separate pdma_buf buffer. Accumulate SCSI commands for PDMA transfers in
cmdbuf instead of pdma_buf so update cmdlen accordingly and change pdma_origin
for PDMA transfers to CMD which allows the PDMA origin to be removed.
This commit also removes a stray memcpy() from get_cmd() which is a no-op because
cmdlen is always zero at the start of a command.
Notionally the removal of pdma_buf from vmstate_esp_pdma also breaks migration
compatibility for the PDMA subsection until its complete removal by the end of
the series.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-19-mark.cave-ayland@ilande.co.uk>
This is the first step in removing get_pdma_buf() from esp.c.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-17-mark.cave-ayland@ilande.co.uk>
The limiting of DMA transfers to the maximum size of the available data is already
handled by esp_do_dma() and do_dma_pdma_cb().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-15-mark.cave-ayland@ilande.co.uk>
The ESP device already keeps track of the remaining bytes left to transfer via
its TC (transfer counter) register which is decremented for each byte that
is transferred across the SCSI bus.
Switch the transfer logic to use the value of TC instead of dma_left and then
remove dma_left completely, adding logic to the vmstate_esp post_load() function
to transfer the old dma_left value to the TC register during migration from
older versions.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-14-mark.cave-ayland@ilande.co.uk>
The value of dma_counter is set once at the start of the transfer and remains
the same until the transfer is complete. This allows the check in esp_transfer_data
to be simplified since dma_left will always be non-zero until the transfer is
completed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-13-mark.cave-ayland@ilande.co.uk>
Perform the length adjustment whereby a value of 0 in the STC represents
a transfer length of 0x10000 at the point where the TC is loaded at the
start of a DMA command rather than just when a TI (Transfer Information)
command is executed. This better matches the description as given in the
datasheet.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-12-mark.cave-ayland@ilande.co.uk>
This function simplifies reading the STC register value without having to manually
shift each individual 8-bit value.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-11-mark.cave-ayland@ilande.co.uk>
These functions simplify reading and writing the TC register value without having to
manually shift each individual 8-bit value.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-10-mark.cave-ayland@ilande.co.uk>
The transfer direction is currently determined by checking the sign of ti_size
but as this series progresses ti_size can be zero at the end of the transfer.
Use the SCSI phase to determine the transfer direction as used in other SCSI
controller implementations.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-9-mark.cave-ayland@ilande.co.uk>
This will become more useful later when trying to debug mixed FIFO and PDMA
requests.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-8-mark.cave-ayland@ilande.co.uk>
Move the trace event to the end of the function so that it correctly reports
the returned value if it doesn't come directly from the rregs array.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-7-mark.cave-ayland@ilande.co.uk>
This enables us to determine whether the command being issued is for a DMA or a
non-DMA transfer.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-6-mark.cave-ayland@ilande.co.uk>
The QOM object representing ESPState is currently embedded within both the
SYSBUS_ESP and PCI_ESP devices with migration state handled by embedding
vmstate_esp within each device using VMSTATE_STRUCT.
Since the vmstate_esp fields are embedded directly within the migration
stream, the incoming vmstate_esp version_id is lost. The only version information
available is that from vmstate_sysbus_esp_scsi and vmstate_esp_pci_scsi, but
those versions represent their respective devices and not that of the underlying
ESPState.
Resolve this by adding a new version-dependent field in vmstate_sysbus_esp_scsi
and vmstate_esp_pci_scsi which stores the vmstate_esp version_id field within
ESPState to be used to allow migration from older QEMU versions.
Finally bump the vmstate_esp version to 5 to cover the upcoming ESPState changes
within this patch series.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-5-mark.cave-ayland@ilande.co.uk>
Make this new QOM device state a child device of both the sysbus-esp and esp-pci
implementations.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-4-mark.cave-ayland@ilande.co.uk>
The existing ESP QOM type currently represents a sysbus device with an embedded
ESP state. Rename the type to SYSBUS_ESP accordingly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20210304221103.6369-3-mark.cave-ayland@ilande.co.uk>
Some SCSI drivers like virtio have an internal mapping for the
host_status. This patch moves the host_status translation into
the SCSI drivers to allow those drivers to set up the correct
values.
Signed-off-by: Hannes Reinecke <hare@suse.de>.
[Added default handling to avoid touching all drivers. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently sg_io_sense_from_errno() converts the two input parameters
'errno' and 'io_hdr' into sense code and SCSI status. Having
split the function off into scsi_sense_from_errno() and
scsi_sense_from_host_status(), both of which are available generically,
we now inline the logic in the callers so that scsi-disk and
scsi-generic will be able to pass host_status to the HBA.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Message-Id: <20201116184041.60465-7-hare@suse.de>
[Put together from "scsi-disk: Add sg_io callback to evaluate status"
and what remains of "scsi: split sg_io_sense_from_errno() in two functions",
with many other fixes. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If a READ CAPACITY command would fail, for example s->qdev.blocksize would be
set to zero and cause a division by zero on the next use.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
PCI_DEVICE has overwritten DeviceState::unrealize (pci_qdev_unrealize).
However, LSI53C895A, which is a subclass of PCI_DEVICE, overwrites it
again and doesn't save the parent's implementation so the PCI_DEVICE's
implementation of DeviceState::unrealize will never be called when
unrealize a LSI53C895A device. And it will lead to memory leak and
unplug failure.
For a PCI device, it's better to implement PCIDevice::exit instead of
DeviceState::unrealize. So let's change to use PCIDevice::exit.
Fixes: a8632434c7 ("lsi: implement I/O memory space for Memory Move instructions")
Cc: qemu-stable@nongnu.org
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Message-Id: <20210302133016.1221081-1-liangpeng10@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The command complete callback has a SCSIRequest as the first argument,
and the status field of that structure is identical to the 'status'
argument. So drop the argument from the callback.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Message-Id: <20201116184041.60465-3-hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Right now, recoverable sense values are only passed directly to the
guest only for rerror=report. However, when rerror/werror are 'stop'
we still don't want the host to be involved on every UNIT ATTENTION
(especially considered that the QMP event will not have enough information
to act on the report).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of fishing it from *r->status, just pass the SCSI status
as a positive value of the second parameter and an errno as a
negative value.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new function is an extension of the switch statement in scsi-disk.c
which also includes the errno cases only found in sg_io_sense_from_errno.
This allows us to consolidate the errno handling.
Extracted from a patch by Hannes Reinecke <hare@suse.de>.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When requested to ignore errors, just do nothing and let the
request complete normally. This means that the request will
be accounted correctly.
This is what commit 40dce4ee61 ("scsi-disk: fix rerror/werror=ignore",
2018-10-19) was supposed to do:
Fixes: 40dce4ee61 ("scsi-disk: fix rerror/werror=ignore", 2018-10-19)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If virtio_scsi_dataplane_start fails, there is a small window when it drops the
aio lock (in aio_wait_bh_oneshot) and the dataplane's AIO handler can
still run during that window.
This is done after the dataplane was marked as fenced, thus we use this flag
to avoid it doing any IO.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20201217150040.906961-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
'enospc' is the default for -drive, but qemu allows user to set
drive option werror. If werror of scsi-generic is set to 'report'
by user, qemu will not allow vm to start.
This patch allow user to set werror as 'report' for scsi-generic.
Signed-off-by: Zihao Chang <changzihao1@huawei.com>
Reviewed-by: Fam Zheng <fam@euphon.net>
Message-Id: <20201103061240.1364-1-changzihao1@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add tracepoints for SG_IO commands to allow for debugging
of SG_IO commands.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Message-Id: <20201116183114.55703-4-hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The current code sets an infinite timeout on SG_IO requests,
causing the guest to stall if the host experiences a frame
loss.
This patch adds an 'io_timeout' parameter for SCSIDevice to
make the SG_IO timeout configurable, and also shortens the
default timeout to 30 seconds to avoid infinite stalls.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Message-Id: <20201116183114.55703-3-hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Count number of queues that we initialized and only deinitialize these that we
initialized successfully.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20201217150040.906961-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, blk_is_read_only() tells whether a given BlockBackend can
only be used in read-only mode because its root node is read-only. Some
callers actually try to answer a slightly different question: Is the
BlockBackend configured to be writable, by taking write permissions on
the root node?
This can differ, for example, for CD-ROM devices which don't take write
permissions, but may be backed by a writable image file. scsi-cd allows
write requests to the drive if blk_is_read_only() returns false.
However, the write request will immediately run into an assertion
failure because the write permission is missing.
This patch introduces separate functions for both questions.
blk_supports_write_perm() answers the question whether the block
node/image file can support writable devices, whereas blk_is_writable()
tells whether the BlockBackend is currently configured to be writable.
All calls of blk_is_read_only() are converted to one of the two new
functions.
Fixes: https://bugs.launchpad.net/bugs/1906693
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210118123448.307825-2-kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The documentation for bdrv_set_aio_context_ignore() states this:
* The caller must own the AioContext lock for the old AioContext of bs, but it
* must not own the AioContext lock for new_context (unless new_context is the
* same as the current context of bs).
As blk_set_aio_context() makes use of this function, this rule also
applies to it.
Fix all occurrences where this rule wasn't honored.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20201214170519.223781-2-slp@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Add trace events for virtio command and response tracing.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Message-Id: <20201116183114.55703-2-hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 8118f0950f "migration: Append JSON description of migration
stream" needs a JSON writer. The existing qobject_to_json() wasn't a
good fit, because it requires building a QObject to convert. Instead,
migration got its very own JSON writer, in commit 190c882ce2 "QJSON:
Add JSON writer". It tacitly limits numbers to int64_t, and strings
contents to characters that don't need escaping, unlike
qobject_to_json().
The previous commit factored the JSON writer out of qobject_to_json().
Replace migration's JSON writer by it.
Cc: Juan Quintela <quintela@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20201211171152.146877-17-armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Move the property types and property macros implemented in
qdev-properties-system.c to a new qdev-properties-system.h
header.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20201211220529.2290218-16-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
There is (mostly theoretical) race between removal of a scsi device and
scsi_dma_restart_bh.
It used to be easier to hit this race prior to my / Paulo's patch series
that added rcu to scsi bus device handling code, but IMHO this race
should still be possible to hit, at least in theory.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1854811
Fix it anyway with a patch that was proposed by Paulo in the above bugzilla.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20201210125929.1136390-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linux has some OS-specific (and sometimes weird) mappings for various SCSI
statuses and sense codes. The most important is probably RESERVATION
CONFLICT. Add them so that they can be reported back to the guest
kernel.
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is no "version 2" of the "Lesser" General Public License.
It is either "GPL version 2.0" or "Lesser GPL version 2.1".
This patch replaces all occurrences of "Lesser GPL version 2" with
"Lesser GPL version 2.1" in comment section.
This patch contains all the files, whose maintainer I could not get
from ‘get_maintainer.pl’ script.
Signed-off-by: Chetan Pant <chetan4windows@gmail.com>
Message-Id: <20201023124424.20177-1-chetan4windows@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[thuth: Adapted exec.c and qdev-monitor.c to new location]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Currently scsi_target_emulate_report_luns iterates over the child device list
twice, and there is no guarantee that this list is the same in both iterations.
The reason for iterating twice is that the first iteration calculates
how much memory to allocate. However if we use a dynamic array we can
avoid iterating twice, and therefore we avoid this race.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1866707
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200913160259.32145-10-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20201006123904.610658-14-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will help us to avoid the scsi device disappearing
after we took a reference to it.
It doesn't by itself forbid case when we try to access
an unrealized device
Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200913160259.32145-9-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20201006123904.610658-13-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add scsi_device_get which finds the scsi device
and takes a reference to it.
Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200913160259.32145-8-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20201006123904.610658-12-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The device core first places a device on the bus and then realizes it.
Make scsi_device_find avoid returing such devices to avoid
races in drivers that use an iothread (currently virtio-scsi)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1812399
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200913160259.32145-7-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20201006123904.610658-11-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20201006123904.610658-6-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes the race between device emulation code that tries to find
a child device to dispatch the request to (e.g a scsi disk),
and hotplug of a new device to that bus.
Note that this doesn't convert all the readers of the list
but only these that might go over that list without BQL held.
This is a very small first step to make this code thread safe.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200913160259.32145-5-mlevitsk@redhat.com>
[Use RCU_READ_LOCK_GUARD in more places, adjust testcase now that
the delay in DEVICE_DELETED due to RCU is more consistent. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20201006123904.610658-9-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This change will allow us to convert the bus children list to RCU,
while not changing the logic of this function
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200913160259.32145-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
cur_mon really needs to be coroutine-local as soon as we move monitor
command handlers to coroutines and let them yield. As a first step, just
remove all direct accesses to cur_mon so that we can implement this in
the getter function later.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20201005155855.256490-4-kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Currently in 'megasas_map_sgl' when 'iov_count=0' will just return
success however the 'cmd' doens't contain any iov. This will cause
the assert in 'scsi_dma_complete' failed. This is because in
'dma_blk_cb' the 'dbs->sg_cur_index == dbs->sg->nsg' will be true
and just call 'dma_complete'. However now there is no aiocb returned.
This fixes the LP#1878263:
-->https://bugs.launchpad.net/qemu/+bug/1878263
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20200815141940.44025-3-liq3ea@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The caller of 'megasas_map_sgl' will only check if the return
is zero or not. If it return 0 it means success, as in the next
patch we will consider 'iov_count=0' is an error, so let's
return -1 to indicate a failure.
Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20200815141940.44025-2-liq3ea@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Several important steps during device scan depend on SCSI type of the
device. For example, max_transfer property is only determined and
assigned if the device has the type of TYPE_DISK.
Host-managed ZBC disks retain most of the properties of regular SCSI
drives, but they have their own SCSI device type, 0x14. This prevents
the proper assignment of max_transfer property for HM-zoned devices in
scsi-generic driver leading to I/O errors if the maximum i/o size
calculated at the guest exceeds the host value.
To fix this, define TYPE_ZBC to have the standard value from SCSI ZBC
standard spec. Several scan steps that were previously done only for
TYPE_DISK devices, are now performed for the SCSI devices having
TYPE_ZBC too.
Reported-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Message-Id: <20200811225122.17342-3-dmitry.fomichev@wdc.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Qemu will send GET_INFLIGHT_FD and SET_INFLIGH_FD to backend, and
the backend setup the inflight memory to track the io.
Change-Id: I805d6189996f7a1b44c65f0b12ef7473b1789510
Signed-off-by: Li Feng <fengli@smartx.com>
Message-Id: <20200909122021.1055174-1-fengli@smartx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When debugging QEMU it is often useful to put a breakpoint on the
error_setg_internal method impl.
Unfortunately the object_property_add / object_class_property_add
methods call object_property_find / object_class_property_find methods
to check if a property exists already before adding the new property.
As a result there are a huge number of calls to error_setg_internal
on startup of most QEMU commands, making it very painful to set a
breakpoint on this method.
Most callers of object_find_property and object_class_find_property,
however, pass in a NULL for the Error parameter. This simplifies the
methods to remove the Error parameter entirely, and then adds some
new wrapper methods that are able to raise an Error when needed.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200914135617.1493072-1-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Make the type checking macro name consistent with the TYPE_*
constant.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <20200902224311.1321159-40-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Some typedefs and macros are defined after the type check macros.
This makes it difficult to automatically replace their
definitions with OBJECT_DECLARE_TYPE.
Patch generated using:
$ ./scripts/codeconverter/converter.py -i \
--pattern=QOMStructTypedefSplit $(git grep -l '' -- '*.[ch]')
which will split "typdef struct { ... } TypedefName"
declarations.
Followed by:
$ ./scripts/codeconverter/converter.py -i --pattern=MoveSymbols \
$(git grep -l '' -- '*.[ch]')
which will:
- move the typedefs and #defines above the type check macros
- add missing #include "qom/object.h" lines if necessary
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20200831210740.126168-9-ehabkost@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20200831210740.126168-10-ehabkost@redhat.com>
Message-Id: <20200831210740.126168-11-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We do not implement hotplug in the vscsi bus, but we forgot to
tell qdev about it. The result is that users are able to hotplug
devices in the vscsi bus, the devices appear in qdev, but they
aren't usable by the guest OS unless the user reboots it first.
Setting qbus hotplug_handler to NULL will tell qdev-monitor, via
qbus_is_hotpluggable(), that we do not support hotplug operations
in spapr_vscsi.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1862059
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20200820190635.379657-1-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Use self-explicit definitions instead of magic '512' value.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20200814082841.27000-8-f4bug@amsat.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This will make future conversion to use OBJECT_DECLARE* easier.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-Id: <20200826184334.4120620-9-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This will make future conversion to OBJECT_DECLARE* easier.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-By: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20200825192110.3528606-41-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Rename the PVSCSI_DEVICE_CLASS() and PVSCSI_DEVICE_GET_CLASS()
macros to be consistent with the PVSCSI() instance cast macro.
This will allow us to register the type cast macros using
OBJECT_DECLARE_TYPE later.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-By: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20200825192110.3528606-4-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Rename the MEGASAS_DEVICE_CLASS() and MEGASAS_DEVICE_GET_CLASS()
macros to be consistent with the MEGASAS() instance cast macro.
This will allow us to register the type cast macros using
OBJECT_DECLARE_TYPE later.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-By: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20200825192110.3528606-3-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Automatically size the number of virtio-scsi-pci, vhost-scsi-pci, and
vhost-user-scsi-pci request virtqueues to match the number of vCPUs.
Other transports continue to default to 1 request virtqueue.
A 1:1 virtqueue:vCPU mapping ensures that completion interrupts are
handled on the same vCPU that submitted the request. No IPI is
necessary to complete an I/O request and performance is improved. The
maximum number of MSI-X vectors and virtqueues limit are respected.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200818143348.310613-6-stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The event and control virtqueues are always present, regardless of the
multi-queue configuration. Define a constant so that virtqueue number
calculations are easier to read.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20200818143348.310613-5-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Meson doesn't enjoy the same flexibility we have with Make in choosing
the include path. In particular the tracing headers are using
$(build_root)/$(<D).
In order to keep the include directives unchanged,
the simplest solution is to generate headers with patterns like
"trace/trace-audio.h" and place forwarding headers in the source tree
such that for example "audio/trace.h" includes "trace/trace-audio.h".
This patch is too ugly to be applied to the Makefiles now. It's only
a way to separate the changes to the tracing header files from the
Meson rewrite of the tracing logic.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When migrate_add_blocker(blocker, &errp) is followed by
error_propagate(errp, err), we can often just as well do
migrate_add_blocker(..., errp).
Do that with this Coccinelle script:
@@
expression blocker, err, errp;
expression ret;
@@
- ret = migrate_add_blocker(blocker, &err);
- if (err) {
+ ret = migrate_add_blocker(blocker, errp);
+ if (ret < 0) {
... when != err;
- error_propagate(errp, err);
...
}
@@
expression blocker, err, errp;
@@
- migrate_add_blocker(blocker, &err);
- if (err) {
+ if (migrate_add_blocker(blocker, errp) < 0) {
... when != err;
- error_propagate(errp, err);
...
}
Double-check @err is not used afterwards. Dereferencing it would be
use after free, but checking whether it's null would be legitimate.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-43-armbru@redhat.com>
When all we do with an Error we receive into a local variable is
propagating to somewhere else, we can just as well receive it there
right away. Convert
if (!foo(..., &err)) {
...
error_propagate(errp, err);
...
return ...
}
to
if (!foo(..., errp)) {
...
...
return ...
}
where nothing else needs @err. Coccinelle script:
@rule1 forall@
identifier fun, err, errp, lbl;
expression list args, args2;
binary operator op;
constant c1, c2;
symbol false;
@@
if (
(
- fun(args, &err, args2)
+ fun(args, errp, args2)
|
- !fun(args, &err, args2)
+ !fun(args, errp, args2)
|
- fun(args, &err, args2) op c1
+ fun(args, errp, args2) op c1
)
)
{
... when != err
when != lbl:
when strict
- error_propagate(errp, err);
... when != err
(
return;
|
return c2;
|
return false;
)
}
@rule2 forall@
identifier fun, err, errp, lbl;
expression list args, args2;
expression var;
binary operator op;
constant c1, c2;
symbol false;
@@
- var = fun(args, &err, args2);
+ var = fun(args, errp, args2);
... when != err
if (
(
var
|
!var
|
var op c1
)
)
{
... when != err
when != lbl:
when strict
- error_propagate(errp, err);
... when != err
(
return;
|
return c2;
|
return false;
|
return var;
)
}
@depends on rule1 || rule2@
identifier err;
@@
- Error *err = NULL;
... when != err
Not exactly elegant, I'm afraid.
The "when != lbl:" is necessary to avoid transforming
if (fun(args, &err)) {
goto out
}
...
out:
error_propagate(errp, err);
even though other paths to label out still need the error_propagate().
For an actual example, see sclp_realize().
Without the "when strict", Coccinelle transforms vfio_msix_setup(),
incorrectly. I don't know what exactly "when strict" does, only that
it helps here.
The match of return is narrower than what I want, but I can't figure
out how to express "return where the operand doesn't use @err". For
an example where it's too narrow, see vfio_intx_enable().
Silently fails to convert hw/arm/armsse.c, because Coccinelle gets
confused by ARMSSE being used both as typedef and function-like macro
there. Converted manually.
Line breaks tidied up manually. One nested declaration of @local_err
deleted manually. Preexisting unwanted blank line dropped in
hw/riscv/sifive_e.c.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-35-armbru@redhat.com>
The previous commit enables conversion of
qdev_prop_set_drive_err(..., &err);
if (err) {
...
}
to
if (!qdev_prop_set_drive_err(..., errp)) {
...
}
Coccinelle script:
@@
identifier fun = qdev_prop_set_drive_err;
expression list args;
typedef Error;
Error *err;
@@
- fun(args, &err);
- if (err)
+ if (!fun(args, &err))
{
...
}
One line break tidied up manually.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-33-armbru@redhat.com>
The previous commit enables conversion of
foo(..., &err);
if (err) {
...
}
to
if (!foo(..., errp)) {
...
}
for QOM functions that now return true / false on success / error.
Coccinelle script:
@@
identifier fun = {
object_apply_global_props, object_initialize_child_with_props,
object_initialize_child_with_propsv, object_property_get,
object_property_get_bool, object_property_parse, object_property_set,
object_property_set_bool, object_property_set_int,
object_property_set_link, object_property_set_qobject,
object_property_set_str, object_property_set_uint, object_set_props,
object_set_propv, user_creatable_add_dict,
user_creatable_complete, user_creatable_del
};
expression list args, args2;
typedef Error;
Error *err;
@@
- fun(args, &err, args2);
- if (err)
+ if (!fun(args, &err, args2))
{
...
}
Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by
ARMSSE being used both as typedef and function-like macro there.
Convert manually.
Line breaks tidied up manually.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-29-armbru@redhat.com>
The object_property_set_FOO() setters take property name and value in
an unusual order:
void object_property_set_FOO(Object *obj, FOO_TYPE value,
const char *name, Error **errp)
Having to pass value before name feels grating. Swap them.
Same for object_property_set(), object_property_get(), and
object_property_parse().
Convert callers with this Coccinelle script:
@@
identifier fun = {
object_property_get, object_property_parse, object_property_set_str,
object_property_set_link, object_property_set_bool,
object_property_set_int, object_property_set_uint, object_property_set,
object_property_set_qobject
};
expression obj, v, name, errp;
@@
- fun(obj, v, name, errp)
+ fun(obj, name, v, errp)
Chokes on hw/arm/musicpal.c's lcd_refresh() with the unhelpful error
message "no position information". Convert that one manually.
Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by
ARMSSE being used both as typedef and function-like macro there.
Convert manually.
Fails to convert hw/rx/rx-gdbsim.c, because Coccinelle gets confused
by RXCPU being used both as typedef and function-like macro there.
Convert manually. The other files using RXCPU that way don't need
conversion.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-27-armbru@redhat.com>
[Straightforwad conflict with commit 2336172d9b "audio: set default
value for pcspk.iobase property" resolved]
Convert
foo(..., &err);
if (err) {
...
}
to
if (!foo(..., &err)) {
...
}
for qdev_realize(), qdev_realize_and_unref(), qbus_realize() and their
wrappers isa_realize_and_unref(), pci_realize_and_unref(),
sysbus_realize(), sysbus_realize_and_unref(), usb_realize_and_unref().
Coccinelle script:
@@
identifier fun = {
isa_realize_and_unref, pci_realize_and_unref, qbus_realize,
qdev_realize, qdev_realize_and_unref, sysbus_realize,
sysbus_realize_and_unref, usb_realize_and_unref
};
expression list args, args2;
typedef Error;
Error *err;
@@
- fun(args, &err, args2);
- if (err)
+ if (!fun(args, &err, args2))
{
...
}
Chokes on hw/arm/musicpal.c's lcd_refresh() with the unhelpful error
message "no position information". Nothing to convert there; skipped.
Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by
ARMSSE being used both as typedef and function-like macro there.
Converted manually.
A few line breaks tidied up manually.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20200707160613.848843-5-armbru@redhat.com>
qbus_set_hotplug_handler() is a simple wrapper around
object_property_set_link().
object_property_set_link() fails when the property doesn't exist, is
not settable, or its .check() method fails. These are all programming
errors here, so passing &error_abort to qbus_set_hotplug_handler() is
appropriate.
Most of its callers do. Exceptions:
* pcie_cap_slot_init(), shpc_init(), spapr_phb_realize() pass NULL,
i.e. they ignore errors.
* spapr_machine_init() passes &error_fatal.
* s390_pcihost_realize(), virtio_serial_device_realize(),
s390_pcihost_plug() pass the error to their callers. The latter two
keep going after the error, which looks wrong.
Drop the @errp parameter, and instead pass &error_abort to
object_property_set_link().
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200630090351.1247703-15-armbru@redhat.com>
All callers pass &error_abort. Drop the parameter.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200630090351.1247703-14-armbru@redhat.com>
Some tracepoints in megasas.c use a guest-controlled value as an index
into the mfi_frame_desc[] array. Thus a malicious guest could cause an
out-of-bounds error here. Fortunately, the impact is very low since this
can only happen when the corresponding tracepoints have been enabled
before, but the problem should be fixed anyway with a proper check.
Buglink: https://bugs.launchpad.net/qemu/+bug/1882065
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20200615072629.32321-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qdev_prop_set_drive() can fail. None of the other qdev_prop_set_FOO()
can; they abort on error.
To clean up this inconsistency, rename qdev_prop_set_drive() to
qdev_prop_set_drive_err(), and create a qdev_prop_set_drive() that
aborts on error.
Coccinelle script to update callers:
@ depends on !(file in "hw/core/qdev-properties-system.c")@
expression dev, name, value;
symbol error_abort;
@@
- qdev_prop_set_drive(dev, name, value, &error_abort);
+ qdev_prop_set_drive(dev, name, value);
@@
expression dev, name, value, errp;
@@
- qdev_prop_set_drive(dev, name, value, errp);
+ qdev_prop_set_drive_err(dev, name, value, errp);
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200622094227.1271650-14-armbru@redhat.com>
Several block device properties related to blocksize configuration must
be in certain relationship WRT each other: physical block must be no
smaller than logical block; min_io_size, opt_io_size, and
discard_granularity must be a multiple of a logical block.
To ensure these requirements are met, add corresponding consistency
checks to blkconf_blocksizes, adjusting its signature to communicate
possible error to the caller. Also remove the now redundant consistency
checks from the specific devices.
Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20200528225516.1676602-3-rvkagan@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is the transformation explained in the commit before previous.
Takes care of just one pattern that needs conversion. More to come in
this series.
Coccinelle script:
@ depends on !(file in "hw/arm/highbank.c")@
expression bus, type_name, dev, expr;
@@
- dev = qdev_create(bus, type_name);
+ dev = qdev_new(type_name);
... when != dev = expr
- qdev_init_nofail(dev);
+ qdev_realize_and_unref(dev, bus, &error_fatal);
@@
expression bus, type_name, dev, expr;
identifier DOWN;
@@
- dev = DOWN(qdev_create(bus, type_name));
+ dev = DOWN(qdev_new(type_name));
... when != dev = expr
- qdev_init_nofail(DEVICE(dev));
+ qdev_realize_and_unref(DEVICE(dev), bus, &error_fatal);
@@
expression bus, type_name, expr;
identifier dev;
@@
- DeviceState *dev = qdev_create(bus, type_name);
+ DeviceState *dev = qdev_new(type_name);
... when != dev = expr
- qdev_init_nofail(dev);
+ qdev_realize_and_unref(dev, bus, &error_fatal);
@@
expression bus, type_name, dev, expr, errp;
symbol true;
@@
- dev = qdev_create(bus, type_name);
+ dev = qdev_new(type_name);
... when != dev = expr
- object_property_set_bool(OBJECT(dev), true, "realized", errp);
+ qdev_realize_and_unref(dev, bus, errp);
@@
expression bus, type_name, expr, errp;
identifier dev;
symbol true;
@@
- DeviceState *dev = qdev_create(bus, type_name);
+ DeviceState *dev = qdev_new(type_name);
... when != dev = expr
- object_property_set_bool(OBJECT(dev), true, "realized", errp);
+ qdev_realize_and_unref(dev, bus, errp);
The first rule exempts hw/arm/highbank.c, because it matches along two
control flow paths there, with different @type_name. Covered by the
next commit's manual conversions.
Missing #include "qapi/error.h" added manually.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200610053247.1583243-10-armbru@redhat.com>
[Conflicts in hw/misc/empty_slot.c and hw/sparc/leon3.c resolved]
Use unsigned type for the MegasasState fields which hold positive
numeric values.
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20200513192540.1583887-4-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While in megasas_handle_frame(), megasas_enqueue_frame() may
set a NULL frame into MegasasCmd object for a given 'frame_addr'
address. Add check to avoid a NULL pointer dereference issue.
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: https://bugs.launchpad.net/qemu/+bug/1878259
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20200513192540.1583887-3-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A guest user may set 'reply_queue_head' field of MegasasState to
a negative value. Later in 'megasas_lookup_frame' it is used to
index into s->frames[] array. Use unsigned type to avoid OOB
access issue.
Also check that 'index' value stays within s->frames[] bounds
through the while() loop in 'megasas_lookup_frame' to avoid OOB
access.
Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20200513192540.1583887-2-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We use the Object type all over the place.
Forward declare it in "qemu/typedefs.h".
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200504115656.6045-2-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Devices may have component devices and buses.
Device realization may fail. Realization is recursive: a device's
realize() method realizes its components, and device_set_realized()
realizes its buses (which should in turn realize the devices on that
bus, except bus_set_realized() doesn't implement that, yet).
When realization of a component or bus fails, we need to roll back:
unrealize everything we realized so far. If any of these unrealizes
failed, the device would be left in an inconsistent state. Must not
happen.
device_set_realized() lets it happen: it ignores errors in the roll
back code starting at label child_realize_fail.
Since realization is recursive, unrealization must be recursive, too.
But how could a partly failed unrealize be rolled back? We'd have to
re-realize, which can fail. This design is fundamentally broken.
device_set_realized() does not roll back at all. Instead, it keeps
unrealizing, ignoring further errors.
It can screw up even for a device with no buses: if the lone
dc->unrealize() fails, it still unregisters vmstate, and calls
listeners' unrealize() callback.
bus_set_realized() does not roll back either. Instead, it stops
unrealizing.
Fortunately, no unrealize method can fail, as we'll see below.
To fix the design error, drop parameter @errp from all the unrealize
methods.
Any unrealize method that uses @errp now needs an update. This leads
us to unrealize() methods that can fail. Merely passing it to another
unrealize method cannot cause failure, though. Here are the ones that
do other things with @errp:
* virtio_serial_device_unrealize()
Fails when qbus_set_hotplug_handler() fails, but still does all the
other work. On failure, the device would stay realized with its
resources completely gone. Oops. Can't happen, because
qbus_set_hotplug_handler() can't actually fail here. Pass
&error_abort to qbus_set_hotplug_handler() instead.
* hw/ppc/spapr_drc.c's unrealize()
Fails when object_property_del() fails, but all the other work is
already done. On failure, the device would stay realized with its
vmstate registration gone. Oops. Can't happen, because
object_property_del() can't actually fail here. Pass &error_abort
to object_property_del() instead.
* spapr_phb_unrealize()
Fails and bails out when remove_drcs() fails, but other work is
already done. On failure, the device would stay realized with some
of its resources gone. Oops. remove_drcs() fails only when
chassis_from_bus()'s object_property_get_uint() fails, and it can't
here. Pass &error_abort to remove_drcs() instead.
Therefore, no unrealize method can fail before this patch.
device_set_realized()'s recursive unrealization via bus uses
object_property_set_bool(). Can't drop @errp there, so pass
&error_abort.
We similarly unrealize with object_property_set_bool() elsewhere,
always ignoring errors. Pass &error_abort instead.
Several unrealize methods no longer handle errors from other unrealize
methods: virtio_9p_device_unrealize(),
virtio_input_device_unrealize(), scsi_qdev_unrealize(), ...
Much of the deleted error handling looks wrong anyway.
One unrealize methods no longer ignore such errors:
usb_ehci_pci_exit().
Several realize methods no longer ignore errors when rolling back:
v9fs_device_realize_common(), pci_qdev_unrealize(),
spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(),
virtio_device_realize().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-17-armbru@redhat.com>
Several functions can't fail anymore: ich9_pm_add_properties(),
device_add_bootindex_property(), ppc_compat_add_property(),
spapr_caps_add_properties(), PropertyInfo.create(). Drop their @errp
parameter.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-16-armbru@redhat.com>
The only way object_property_add() can fail is when a property with
the same name already exists. Since our property names are all
hardcoded, failure is a programming error, and the appropriate way to
handle it is passing &error_abort.
Same for its variants, except for object_property_add_child(), which
additionally fails when the child already has a parent. Parentage is
also under program control, so this is a programming error, too.
We have a bit over 500 callers. Almost half of them pass
&error_abort, slightly fewer ignore errors, one test case handles
errors, and the remaining few callers pass them to their own callers.
The previous few commits demonstrated once again that ignoring
programming errors is a bad idea.
Of the few ones that pass on errors, several violate the Error API.
The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL. Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call. ich9_pm_add_properties(), sparc32_ledma_realize(),
sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize()
are wrong that way.
When the one appropriate choice of argument is &error_abort, letting
users pick the argument is a bad idea.
Drop parameter @errp and assert the preconditions instead.
There's one exception to "duplicate property name is a programming
error": the way object_property_add() implements the magic (and
undocumented) "automatic arrayification". Don't drop @errp there.
Instead, rename object_property_add() to object_property_try_add(),
and add the obvious wrapper object_property_add().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-15-armbru@redhat.com>
[Two semantic rebase conflicts resolved]
Fixes the following coccinelle warnings:
$ spatch --sp-file --verbose-parsing ... \
scripts/coccinelle/remove_local_err.cocci
...
SUSPICIOUS: a \ character appears outside of a #define at ./target/ppc/translate_init.inc.c:5213
SUSPICIOUS: a \ character appears outside of a #define at ./target/ppc/translate_init.inc.c:5261
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:166
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:167
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:169
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:170
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:171
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:172
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:173
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5787
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5789
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5800
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5801
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5802
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5804
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5805
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5806
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:6329
SUSPICIOUS: a \ character appears outside of a #define at ./hw/sd/sdhci.c:1133
SUSPICIOUS: a \ character appears outside of a #define at ./hw/scsi/scsi-disk.c:3081
SUSPICIOUS: a \ character appears outside of a #define at ./hw/net/virtio-net.c:1529
SUSPICIOUS: a \ character appears outside of a #define at ./hw/riscv/sifive_u.c:468
SUSPICIOUS: a \ character appears outside of a #define at ./dump/dump.c:1895
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2209
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2215
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2221
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2222
SUSPICIOUS: a \ character appears outside of a #define at ./block/replication.c:172
SUSPICIOUS: a \ character appears outside of a #define at ./block/replication.c:173
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200412223619.11284-2-f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
When running Ubuntu 3.13.0-65-generic guest, QEMU sometimes crashes
during guest ACPI reset. It crashes on assert(s->rings_info_valid)
in pvscsi_process_io().
Analyzing the crash revealed that it happens when userspace issues
a sync during a reboot syscall.
Below are backtraces we gathered from the guests.
Guest backtrace when issuing PVSCSI_CMD_ADAPTER_RESET:
pci_device_shutdown
device_shutdown
init_pid_ns
init_pid_ns
kernel_power_off
SYSC_reboot
Guest backtrace when issuing PVSCSI_REG_OFFSET_KICK_RW_IO:
scsi_done
scsi_dispatch_cmd
blk_add_timer
scsi_request_fn
elv_rb_add
__blk_run_queue
queue_unplugged
blk_flush_plug_list
blk_finish_plug
ext4_writepages
set_next_entity
do_writepages
__filemap_fdatawrite_range
filemap_write_and_wait_range
ext4_sync_file
ext4_sync_file
do_fsync
sys_fsync
Since QEMU pvscsi should imitate VMware pvscsi device emulation,
we decided to imitate VMware's behavior in this case.
To check VMware behavior, we wrote a kernel module that issues
a reset to the pvscsi device and then issues a kick. We ran it on
VMware ESXi 6.5 and it seems that it simply ignores the kick.
Hence, we decided to ignore the kick as well.
Signed-off-by: Elazar Leibovich <elazar.leibovich@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20200315132634.113632-1-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200305121253.19078-8-philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Depending on the length of sense data, vscsi_send_rsp() can
overrun the buffer size.
Do not copy more than SRP_MAX_IU_DATA_LEN bytes, and assert
that vscsi_send_iu() is always called with a size in range.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200305121253.19078-7-philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The 'union srp_iu' is meant as a pointer to any SRP Information
Unit type, it is not related to the size of a VIO DMA buffer.
Use a plain buffer for the VIO DMA read/write calls.
We can remove the reserved buffer from the 'union srp_iu'.
This issue was noticed when replacing the zero-length arrays
from hw/scsi/srp.h with flexible array member,
'clang -fsanitize=undefined' reported:
hw/scsi/spapr_vscsi.c:69:29: error: field 'iu' with variable sized type 'union viosrp_iu' not at the end of a struct or class is a GNU extension [-Werror,-Wgnu-variable-sized-type-not-at-end]
union viosrp_iu iu;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200305121253.19078-6-philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introduce the req_iu() helper which returns a pointer to
the viosrp_iu union held in the vscsi_req structure.
This simplifies the next patch.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200305121253.19078-5-philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We already have a 'iu' pointer, use it
(this simplifies the next commit).
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200305121253.19078-4-philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Replace sizeof() flexible arrays union srp_iu/viosrp_iu by the
SRP_MAX_IU_LEN definition, which is what this code actually meant
to use.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200305121253.19078-3-philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This header use the srp_* structures declared in "hw/scsi/srp.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200305121253.19078-2-philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The goal is to reduce the amount of requests issued by a guest on
1M reads/writes. This rises the performance up to 4% on that kind of
disk access pattern.
The maximum chunk size to be used for the guest disk accessing is
limited with seg_max parameter, which represents the max amount of
pices in the scatter-geather list in one guest disk request.
Since seg_max is virqueue_size dependent, increasing the virtqueue
size increases seg_max, which, in turn, increases the maximum size
of data to be read/write from a guest disk.
More details in the original problem statment:
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03721.html
Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-id: 20200214074648.958-1-dplotnikov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This commit was produced with the included Coccinelle script
scripts/coccinelle/exec_rw_const.
Suggested-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Fixes: 74d71ea16b
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20200218094402.26625-8-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Provide a temporary device_legacy_reset function doing what
device_reset does to prepare for the transition with Resettable
API.
All occurrence of device_reset in the code tree are also replaced
by device_legacy_reset.
The new resettable API has different prototype and semantics
(resetting child buses as well as the specified device). Subsequent
commits will make the changeover for each call site individually; once
that is complete device_legacy_reset() will be removed.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200123132823.1117486-2-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use virtio_delete_queue to make it more clear.
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200117075547.60864-3-pannengyuan@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Command line parsing fixes (Michal, Peter, Xiaoyao)
* Cooperlake CPU model fixes (Xiaoyao)
* i386 gdb fix (mkdolata)
* IOEventHandler cleanup (Philippe)
* icount fix (Pavel)
* RR support for random number sources (Pavel)
* Kconfig fixes (Philippe)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJeFbG8AAoJEL/70l94x66DCpMIAKBwxBL+VegqI+ySKgmtIBQX
LtU+ardEeZ37VfWfvuWzTFe+zQ0hsFpz/e0LHE7Ae+LVLMNWXixlmMrTIm+Xs762
hJzxBjhUhkdrMioVYTY16Kqap4Nqaxu70gDQ32Ve2sY6xYGxYLSaJooBOU5bXVgb
HPspHFVpeP6ZshBd1n2LXsgURE6v3AjTwqcsPCkL/AESFdkdOsoHeXjyKWJG1oPy
W7btzlUEqVsauZI8/PhhW/8hZUvUsJVHonYLTZTyy8aklU7aOILSyT2uPXFBVUVQ
irkQjLtD4dWlogBKO4i/QHMuwV+Asa57WNPmqv3EcIWPUWmTY84H0g2AxRgcc2M=
=48jx
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Compat machines fix (Denis)
* Command line parsing fixes (Michal, Peter, Xiaoyao)
* Cooperlake CPU model fixes (Xiaoyao)
* i386 gdb fix (mkdolata)
* IOEventHandler cleanup (Philippe)
* icount fix (Pavel)
* RR support for random number sources (Pavel)
* Kconfig fixes (Philippe)
# gpg: Signature made Wed 08 Jan 2020 10:41:00 GMT
# gpg: using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (38 commits)
chardev: Use QEMUChrEvent enum in IOEventHandler typedef
chardev: use QEMUChrEvent instead of int
chardev/char: Explicit we ignore some QEMUChrEvent in IOEventHandler
monitor/hmp: Explicit we ignore a QEMUChrEvent in IOEventHandler
monitor/qmp: Explicit we ignore few QEMUChrEvent in IOEventHandler
virtio-console: Explicit we ignore some QEMUChrEvent in IOEventHandler
vhost-user-blk: Explicit we ignore few QEMUChrEvent in IOEventHandler
vhost-user-net: Explicit we ignore few QEMUChrEvent in IOEventHandler
vhost-user-crypto: Explicit we ignore some QEMUChrEvent in IOEventHandler
ccid-card-passthru: Explicit we ignore QEMUChrEvent in IOEventHandler
hw/usb/redirect: Explicit we ignore few QEMUChrEvent in IOEventHandler
hw/usb/dev-serial: Explicit we ignore few QEMUChrEvent in IOEventHandler
hw/char/terminal3270: Explicit ignored QEMUChrEvent in IOEventHandler
hw/ipmi: Explicit we ignore some QEMUChrEvent in IOEventHandler
hw/ipmi: Remove unnecessary declarations
target/i386: Add missed features to Cooperlake CPU model
target/i386: Add new bit definitions of MSR_IA32_ARCH_CAPABILITIES
target/i386: Fix handling of k_gs_base register in 32-bit mode in gdbstub
hw/rtc/mc146818: Add missing dependency on ISA Bus
hw/nvram/Kconfig: Restrict CHRP NVRAM to machines using OpenBIOS or SLOF
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The NMC93xx EEPROM is only used by few NIC cards and the
Am53C974 SCSI controller.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191231183216.6781-13-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before the patch, seg_max parameter was immutable and hardcoded
to 126 (128 - 2) without respect to queue size. This has two negative effects:
1. when queue size is < 128, we have Virtio 1.1 specfication violation:
(2.6.5.3.1 Driver Requirements) seq_max must be <= queue_size.
This violation affects the old Linux guests (ver < 4.14). These guests
crash on these queue_size setups.
2. when queue_size > 128, as was pointed out by Denis Lunev <den@virtuozzo.com>,
seg_max restrics guest's block request length which affects guests'
performance making them issues more block request than needed.
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03721.html
To mitigate this two effects, the patch adds the property adjusting seg_max
to queue size automaticaly. Since seg_max is a guest visible parameter,
the property is machine type managable and allows to choose between
old (seg_max = 126 always) and new (seg_max = queue_size - 2) behaviors.
Not to change the behavior of the older VMs, prevent setting the default
seg_max_adjust value for older machine types.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-Id: <20191220140905.1718-2-dplotnikov@virtuozzo.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If the vhost-user-scsi backend supports the VHOST_USER_F_RESET_DEVICE
protocol feature, then the device can be reset when requested.
If this feature is not supported, do not try a reset as this will send
a VHOST_USER_RESET_OWNER that the backend is not expecting,
potentially putting into an inoperable state.
Signed-off-by: David Vrabel <david.vrabel@nutanix.com>
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <1572385083-5254-3-git-send-email-raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>