While in megasas_handle_frame(), megasas_enqueue_frame() may
set a NULL frame into MegasasCmd object for a given 'frame_addr'
address. Add check to avoid a NULL pointer dereference issue.
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: https://bugs.launchpad.net/qemu/+bug/1878259
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20200513192540.1583887-3-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A guest user may set 'reply_queue_head' field of MegasasState to
a negative value. Later in 'megasas_lookup_frame' it is used to
index into s->frames[] array. Use unsigned type to avoid OOB
access issue.
Also check that 'index' value stays within s->frames[] bounds
through the while() loop in 'megasas_lookup_frame' to avoid OOB
access.
Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20200513192540.1583887-2-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We use the Object type all over the place.
Forward declare it in "qemu/typedefs.h".
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200504115656.6045-2-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Devices may have component devices and buses.
Device realization may fail. Realization is recursive: a device's
realize() method realizes its components, and device_set_realized()
realizes its buses (which should in turn realize the devices on that
bus, except bus_set_realized() doesn't implement that, yet).
When realization of a component or bus fails, we need to roll back:
unrealize everything we realized so far. If any of these unrealizes
failed, the device would be left in an inconsistent state. Must not
happen.
device_set_realized() lets it happen: it ignores errors in the roll
back code starting at label child_realize_fail.
Since realization is recursive, unrealization must be recursive, too.
But how could a partly failed unrealize be rolled back? We'd have to
re-realize, which can fail. This design is fundamentally broken.
device_set_realized() does not roll back at all. Instead, it keeps
unrealizing, ignoring further errors.
It can screw up even for a device with no buses: if the lone
dc->unrealize() fails, it still unregisters vmstate, and calls
listeners' unrealize() callback.
bus_set_realized() does not roll back either. Instead, it stops
unrealizing.
Fortunately, no unrealize method can fail, as we'll see below.
To fix the design error, drop parameter @errp from all the unrealize
methods.
Any unrealize method that uses @errp now needs an update. This leads
us to unrealize() methods that can fail. Merely passing it to another
unrealize method cannot cause failure, though. Here are the ones that
do other things with @errp:
* virtio_serial_device_unrealize()
Fails when qbus_set_hotplug_handler() fails, but still does all the
other work. On failure, the device would stay realized with its
resources completely gone. Oops. Can't happen, because
qbus_set_hotplug_handler() can't actually fail here. Pass
&error_abort to qbus_set_hotplug_handler() instead.
* hw/ppc/spapr_drc.c's unrealize()
Fails when object_property_del() fails, but all the other work is
already done. On failure, the device would stay realized with its
vmstate registration gone. Oops. Can't happen, because
object_property_del() can't actually fail here. Pass &error_abort
to object_property_del() instead.
* spapr_phb_unrealize()
Fails and bails out when remove_drcs() fails, but other work is
already done. On failure, the device would stay realized with some
of its resources gone. Oops. remove_drcs() fails only when
chassis_from_bus()'s object_property_get_uint() fails, and it can't
here. Pass &error_abort to remove_drcs() instead.
Therefore, no unrealize method can fail before this patch.
device_set_realized()'s recursive unrealization via bus uses
object_property_set_bool(). Can't drop @errp there, so pass
&error_abort.
We similarly unrealize with object_property_set_bool() elsewhere,
always ignoring errors. Pass &error_abort instead.
Several unrealize methods no longer handle errors from other unrealize
methods: virtio_9p_device_unrealize(),
virtio_input_device_unrealize(), scsi_qdev_unrealize(), ...
Much of the deleted error handling looks wrong anyway.
One unrealize methods no longer ignore such errors:
usb_ehci_pci_exit().
Several realize methods no longer ignore errors when rolling back:
v9fs_device_realize_common(), pci_qdev_unrealize(),
spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(),
virtio_device_realize().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-17-armbru@redhat.com>
Several functions can't fail anymore: ich9_pm_add_properties(),
device_add_bootindex_property(), ppc_compat_add_property(),
spapr_caps_add_properties(), PropertyInfo.create(). Drop their @errp
parameter.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-16-armbru@redhat.com>
The only way object_property_add() can fail is when a property with
the same name already exists. Since our property names are all
hardcoded, failure is a programming error, and the appropriate way to
handle it is passing &error_abort.
Same for its variants, except for object_property_add_child(), which
additionally fails when the child already has a parent. Parentage is
also under program control, so this is a programming error, too.
We have a bit over 500 callers. Almost half of them pass
&error_abort, slightly fewer ignore errors, one test case handles
errors, and the remaining few callers pass them to their own callers.
The previous few commits demonstrated once again that ignoring
programming errors is a bad idea.
Of the few ones that pass on errors, several violate the Error API.
The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL. Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call. ich9_pm_add_properties(), sparc32_ledma_realize(),
sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize()
are wrong that way.
When the one appropriate choice of argument is &error_abort, letting
users pick the argument is a bad idea.
Drop parameter @errp and assert the preconditions instead.
There's one exception to "duplicate property name is a programming
error": the way object_property_add() implements the magic (and
undocumented) "automatic arrayification". Don't drop @errp there.
Instead, rename object_property_add() to object_property_try_add(),
and add the obvious wrapper object_property_add().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-15-armbru@redhat.com>
[Two semantic rebase conflicts resolved]
Fixes the following coccinelle warnings:
$ spatch --sp-file --verbose-parsing ... \
scripts/coccinelle/remove_local_err.cocci
...
SUSPICIOUS: a \ character appears outside of a #define at ./target/ppc/translate_init.inc.c:5213
SUSPICIOUS: a \ character appears outside of a #define at ./target/ppc/translate_init.inc.c:5261
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:166
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:167
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:169
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:170
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:171
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:172
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:173
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5787
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5789
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5800
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5801
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5802
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5804
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5805
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5806
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:6329
SUSPICIOUS: a \ character appears outside of a #define at ./hw/sd/sdhci.c:1133
SUSPICIOUS: a \ character appears outside of a #define at ./hw/scsi/scsi-disk.c:3081
SUSPICIOUS: a \ character appears outside of a #define at ./hw/net/virtio-net.c:1529
SUSPICIOUS: a \ character appears outside of a #define at ./hw/riscv/sifive_u.c:468
SUSPICIOUS: a \ character appears outside of a #define at ./dump/dump.c:1895
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2209
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2215
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2221
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2222
SUSPICIOUS: a \ character appears outside of a #define at ./block/replication.c:172
SUSPICIOUS: a \ character appears outside of a #define at ./block/replication.c:173
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200412223619.11284-2-f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
When running Ubuntu 3.13.0-65-generic guest, QEMU sometimes crashes
during guest ACPI reset. It crashes on assert(s->rings_info_valid)
in pvscsi_process_io().
Analyzing the crash revealed that it happens when userspace issues
a sync during a reboot syscall.
Below are backtraces we gathered from the guests.
Guest backtrace when issuing PVSCSI_CMD_ADAPTER_RESET:
pci_device_shutdown
device_shutdown
init_pid_ns
init_pid_ns
kernel_power_off
SYSC_reboot
Guest backtrace when issuing PVSCSI_REG_OFFSET_KICK_RW_IO:
scsi_done
scsi_dispatch_cmd
blk_add_timer
scsi_request_fn
elv_rb_add
__blk_run_queue
queue_unplugged
blk_flush_plug_list
blk_finish_plug
ext4_writepages
set_next_entity
do_writepages
__filemap_fdatawrite_range
filemap_write_and_wait_range
ext4_sync_file
ext4_sync_file
do_fsync
sys_fsync
Since QEMU pvscsi should imitate VMware pvscsi device emulation,
we decided to imitate VMware's behavior in this case.
To check VMware behavior, we wrote a kernel module that issues
a reset to the pvscsi device and then issues a kick. We ran it on
VMware ESXi 6.5 and it seems that it simply ignores the kick.
Hence, we decided to ignore the kick as well.
Signed-off-by: Elazar Leibovich <elazar.leibovich@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20200315132634.113632-1-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200305121253.19078-8-philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Depending on the length of sense data, vscsi_send_rsp() can
overrun the buffer size.
Do not copy more than SRP_MAX_IU_DATA_LEN bytes, and assert
that vscsi_send_iu() is always called with a size in range.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200305121253.19078-7-philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The 'union srp_iu' is meant as a pointer to any SRP Information
Unit type, it is not related to the size of a VIO DMA buffer.
Use a plain buffer for the VIO DMA read/write calls.
We can remove the reserved buffer from the 'union srp_iu'.
This issue was noticed when replacing the zero-length arrays
from hw/scsi/srp.h with flexible array member,
'clang -fsanitize=undefined' reported:
hw/scsi/spapr_vscsi.c:69:29: error: field 'iu' with variable sized type 'union viosrp_iu' not at the end of a struct or class is a GNU extension [-Werror,-Wgnu-variable-sized-type-not-at-end]
union viosrp_iu iu;
^
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200305121253.19078-6-philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introduce the req_iu() helper which returns a pointer to
the viosrp_iu union held in the vscsi_req structure.
This simplifies the next patch.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200305121253.19078-5-philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We already have a 'iu' pointer, use it
(this simplifies the next commit).
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200305121253.19078-4-philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Replace sizeof() flexible arrays union srp_iu/viosrp_iu by the
SRP_MAX_IU_LEN definition, which is what this code actually meant
to use.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200305121253.19078-3-philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This header use the srp_* structures declared in "hw/scsi/srp.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200305121253.19078-2-philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The goal is to reduce the amount of requests issued by a guest on
1M reads/writes. This rises the performance up to 4% on that kind of
disk access pattern.
The maximum chunk size to be used for the guest disk accessing is
limited with seg_max parameter, which represents the max amount of
pices in the scatter-geather list in one guest disk request.
Since seg_max is virqueue_size dependent, increasing the virtqueue
size increases seg_max, which, in turn, increases the maximum size
of data to be read/write from a guest disk.
More details in the original problem statment:
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03721.html
Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-id: 20200214074648.958-1-dplotnikov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This commit was produced with the included Coccinelle script
scripts/coccinelle/exec_rw_const.
Suggested-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Fixes: 74d71ea16b
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20200218094402.26625-8-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Provide a temporary device_legacy_reset function doing what
device_reset does to prepare for the transition with Resettable
API.
All occurrence of device_reset in the code tree are also replaced
by device_legacy_reset.
The new resettable API has different prototype and semantics
(resetting child buses as well as the specified device). Subsequent
commits will make the changeover for each call site individually; once
that is complete device_legacy_reset() will be removed.
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200123132823.1117486-2-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use virtio_delete_queue to make it more clear.
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200117075547.60864-3-pannengyuan@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Command line parsing fixes (Michal, Peter, Xiaoyao)
* Cooperlake CPU model fixes (Xiaoyao)
* i386 gdb fix (mkdolata)
* IOEventHandler cleanup (Philippe)
* icount fix (Pavel)
* RR support for random number sources (Pavel)
* Kconfig fixes (Philippe)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJeFbG8AAoJEL/70l94x66DCpMIAKBwxBL+VegqI+ySKgmtIBQX
LtU+ardEeZ37VfWfvuWzTFe+zQ0hsFpz/e0LHE7Ae+LVLMNWXixlmMrTIm+Xs762
hJzxBjhUhkdrMioVYTY16Kqap4Nqaxu70gDQ32Ve2sY6xYGxYLSaJooBOU5bXVgb
HPspHFVpeP6ZshBd1n2LXsgURE6v3AjTwqcsPCkL/AESFdkdOsoHeXjyKWJG1oPy
W7btzlUEqVsauZI8/PhhW/8hZUvUsJVHonYLTZTyy8aklU7aOILSyT2uPXFBVUVQ
irkQjLtD4dWlogBKO4i/QHMuwV+Asa57WNPmqv3EcIWPUWmTY84H0g2AxRgcc2M=
=48jx
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Compat machines fix (Denis)
* Command line parsing fixes (Michal, Peter, Xiaoyao)
* Cooperlake CPU model fixes (Xiaoyao)
* i386 gdb fix (mkdolata)
* IOEventHandler cleanup (Philippe)
* icount fix (Pavel)
* RR support for random number sources (Pavel)
* Kconfig fixes (Philippe)
# gpg: Signature made Wed 08 Jan 2020 10:41:00 GMT
# gpg: using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (38 commits)
chardev: Use QEMUChrEvent enum in IOEventHandler typedef
chardev: use QEMUChrEvent instead of int
chardev/char: Explicit we ignore some QEMUChrEvent in IOEventHandler
monitor/hmp: Explicit we ignore a QEMUChrEvent in IOEventHandler
monitor/qmp: Explicit we ignore few QEMUChrEvent in IOEventHandler
virtio-console: Explicit we ignore some QEMUChrEvent in IOEventHandler
vhost-user-blk: Explicit we ignore few QEMUChrEvent in IOEventHandler
vhost-user-net: Explicit we ignore few QEMUChrEvent in IOEventHandler
vhost-user-crypto: Explicit we ignore some QEMUChrEvent in IOEventHandler
ccid-card-passthru: Explicit we ignore QEMUChrEvent in IOEventHandler
hw/usb/redirect: Explicit we ignore few QEMUChrEvent in IOEventHandler
hw/usb/dev-serial: Explicit we ignore few QEMUChrEvent in IOEventHandler
hw/char/terminal3270: Explicit ignored QEMUChrEvent in IOEventHandler
hw/ipmi: Explicit we ignore some QEMUChrEvent in IOEventHandler
hw/ipmi: Remove unnecessary declarations
target/i386: Add missed features to Cooperlake CPU model
target/i386: Add new bit definitions of MSR_IA32_ARCH_CAPABILITIES
target/i386: Fix handling of k_gs_base register in 32-bit mode in gdbstub
hw/rtc/mc146818: Add missing dependency on ISA Bus
hw/nvram/Kconfig: Restrict CHRP NVRAM to machines using OpenBIOS or SLOF
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The NMC93xx EEPROM is only used by few NIC cards and the
Am53C974 SCSI controller.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191231183216.6781-13-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before the patch, seg_max parameter was immutable and hardcoded
to 126 (128 - 2) without respect to queue size. This has two negative effects:
1. when queue size is < 128, we have Virtio 1.1 specfication violation:
(2.6.5.3.1 Driver Requirements) seq_max must be <= queue_size.
This violation affects the old Linux guests (ver < 4.14). These guests
crash on these queue_size setups.
2. when queue_size > 128, as was pointed out by Denis Lunev <den@virtuozzo.com>,
seg_max restrics guest's block request length which affects guests'
performance making them issues more block request than needed.
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03721.html
To mitigate this two effects, the patch adds the property adjusting seg_max
to queue size automaticaly. Since seg_max is a guest visible parameter,
the property is machine type managable and allows to choose between
old (seg_max = 126 always) and new (seg_max = queue_size - 2) behaviors.
Not to change the behavior of the older VMs, prevent setting the default
seg_max_adjust value for older machine types.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-Id: <20191220140905.1718-2-dplotnikov@virtuozzo.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If the vhost-user-scsi backend supports the VHOST_USER_F_RESET_DEVICE
protocol feature, then the device can be reset when requested.
If this feature is not supported, do not try a reset as this will send
a VHOST_USER_RESET_OWNER that the backend is not expecting,
potentially putting into an inoperable state.
Signed-off-by: David Vrabel <david.vrabel@nutanix.com>
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <1572385083-5254-3-git-send-email-raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Virtqueue notifications are not necessary during polling, so we disable
them. This allows the guest driver to avoid MMIO vmexits.
Unfortunately the virtio-blk and virtio-scsi handler functions re-enable
notifications, defeating this optimization.
Fix virtio-blk and virtio-scsi emulation so they leave notifications
disabled. The key thing to remember for correctness is that polling
always checks one last time after ending its loop, therefore it's safe
to lose the race when re-enabling notifications at the end of polling.
There is a measurable performance improvement of 5-10% with the null-co
block driver. Real-life storage configurations will see a smaller
improvement because the MMIO vmexit overhead contributes less to
latency.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20191209210957.65087-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It's an old compatibility shim that just delegates to scsi-cd or scsi-hd.
Just like ide-drive, we don't need this.
Acked-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Relevant devices are:
* ide-hd (and ide-cd, ide-drive)
* scsi-hd (and scsi-cd, scsi-disk, scsi-block)
* virtio-blk-pci
We do not call del_boot_device_lchs() for ide-* since we don't need to -
IDE block devices do not support unplugging.
Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com>
Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
Signed-off-by: Sam Eiderman <sameid@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
We will need to add LCHS removal logic to scsi-hd's unrealize() in the
next commit.
Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
Signed-off-by: Sam Eiderman <sameid@google.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
There is no DMA in Quadra 800, so the CPU reads/writes the data from the
PDMA register (offset 0x100, ESP_PDMA in hw/m68k/q800.c) and copies them
to/from the memory.
There is a nice assembly loop in the kernel to do that, see
linux/drivers/scsi/mac_esp.c:MAC_ESP_PDMA_LOOP().
The start of the transfer is triggered by the DREQ interrupt (see linux
mac_esp_send_pdma_cmd()), the CPU polls on the IRQ flag to start the
transfer after a SCSI command has been sent (in Quadra 800 it goes
through the VIA2, the via2-irq line and the vIFR register)
The Macintosh hardware includes hardware handshaking to prevent the CPU
from reading invalid data or writing data faster than the peripheral
device can accept it.
This is the "blind mode", and from the doc:
"Approximate maximum SCSI transfer rates within a blocks are 1.4 MB per
second for blind transfers in the Macintosh II"
Some references can be found in:
Apple Macintosh Family Hardware Reference, ISBN 0-201-19255-1
Guide to the Macintosh Family Hardware, ISBN-0-201-52405-8
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Co-developed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20191026164546.30020-4-laurent@vivier.eu>
This will be needed to implement pseudo-DMA
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20191026164546.30020-3-laurent@vivier.eu>
To prepare following patches move do_cmd and DMA special case
from handle_ti() to esp_do_dma().
This part of the code must be only executed with real DMA, not with
pseudo-DMA. And PDMA is detected in esp_do_dma(), so move this part
of the code in esp_do_dma(). We keep the code in handle_ti_cmd()
in the case no DMA is done.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20191026164546.30020-2-laurent@vivier.eu>
This patch implements basic support for the packed virtqueue. Compare
the split virtqueue which has three rings, packed virtqueue only have
one which is supposed to have better cache utilization and more
hardware friendly.
Please refer virtio specification for more information.
Signed-off-by: Wei Xu <wexu@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20191025083527.30803-6-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190923121737.83281-8-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This will help to account the operation in the following commit.
The difference is that we don't call scsi_disk_req_check_error() before
the 1st discard iteration anymore. That function also checks if
the request is cancelled, however it shouldn't get canceled until it
yields in blk_aio() functions anyway.
Same approach is already used for emulate_write_same.
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190923121737.83281-7-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
it allows to report it in the error handler
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Message-id: 20190923121737.83281-6-anton.nefedov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
While the tracing framework does not forbid trailing newline in
events format string, using them lead to confuse output.
It is the responsibility of the backend to properly end an event
line.
Some of our formats have trailing newlines, remove them.
[Fixed typo in commit description reported by Eric Blake
<eblake@redhat.com>
--Stefan]
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190916095121.29506-2-philmd@redhat.com
Message-Id: <20190916095121.29506-2-philmd@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Of the 3 virtqueues, seabios only sets cmd, leaving ctrl
and event without a physical address. This can cause
vhost_verify_ring_part_mapping to return ENOMEM, causing
the following logs:
qemu-system-x86_64: Unable to map available ring for ring 0
qemu-system-x86_64: Verify ring failure on region 0
The qemu commit e6cc11d64f
has already resolved the issue for vhost scsi devices but
the fix was never applied to vhost-user scsi devices.
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1560299717-177734-1-git-send-email-raphael.norwitz@nutanix.com
Message-Id: <1560299717-177734-1-git-send-email-raphael.norwitz@nutanix.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When executing script in lsi_execute_script(), the LSI scsi adapter
emulator advances 's->dsp' index to read next opcode. This can lead
to an infinite loop if the next opcode is empty. Move the existing
loop exit after 10k iterations so that it covers no-op opcodes as
well.
Reported-by: Bugs SysSec <bugs-syssec@rub.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
sysemu/sysemu.h is a rather unfocused dumping ground for stuff related
to the system-emulator. Evidence:
* It's included widely: in my "build everything" tree, changing
sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600
objects (not counting tests and objects that don't depend on
qemu/osdep.h, down from 5400 due to the previous two commits).
* It pulls in more than a dozen additional headers.
Split stuff related to run state management into its own header
sysemu/runstate.h.
Touching sysemu/sysemu.h now recompiles some 850 objects. qemu/uuid.h
also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400
to 4200. Touching new sysemu/runstate.h recompiles some 500 objects.
Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also
add qemu/main-loop.h.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-30-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
[Unbreak OS-X build]
In my "build everything" tree, changing sysemu/sysemu.h triggers a
recompile of some 1800 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h, down from 5400 due to the
previous commit).
Several headers include sysemu/sysemu.h just to get typedef
VMChangeStateEntry. Move it from sysemu/sysemu.h to qemu/typedefs.h.
Spell its structure tag the same while there. Drop the now
superfluous includes of sysemu/sysemu.h from headers.
Touching sysemu/sysemu.h now recompiles some 1100 objects.
qemu/uuid.h also drops from 1800 to 1100, and
qapi/qapi-types-run-state.h from 5000 to 4400.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-29-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
In my "build everything" tree, changing sysemu/sysemu.h triggers a
recompile of some 5400 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).
Almost a third of its inclusions are actually superfluous. Delete
them. Downgrade two more to qapi/qapi-types-run-state.h, and move one
from char/serial.h to char/serial.c.
hw/semihosting/config.c, monitor/monitor.c, qdev-monitor.c, and
stubs/semihost.c define variables declared in sysemu/sysemu.h without
including it. The compiler is cool with that, but include it anyway.
This doesn't reduce actual use much, as it's still included into
widely included headers. The next commit will tackle that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-27-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
In my "build everything" tree, changing hw/qdev-properties.h triggers
a recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).
Many places including hw/qdev-properties.h (directly or via hw/qdev.h)
actually need only hw/qdev-core.h. Include hw/qdev-core.h there
instead.
hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h
and hw/qdev-properties.h, which in turn includes hw/qdev-core.h.
Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h.
While there, delete a few superfluous inclusions of hw/qdev-core.h.
Touching hw/qdev-properties.h now recompiles some 1200 objects.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190812052359.30071-22-armbru@redhat.com>
In my "build everything" tree, changing qemu/main-loop.h triggers a
recompile of some 5600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h). It includes block/aio.h,
which in turn includes qemu/event_notifier.h, qemu/notify.h,
qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h,
qemu/thread.h, qemu/timer.h, and a few more.
Include qemu/main-loop.h only where it's needed. Touching it now
recompiles only some 1700 objects. For block/aio.h and
qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the
others, they shrink only slightly.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-21-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-20-armbru@redhat.com>
In my "build everything" tree, changing hw/hw.h triggers a recompile
of some 2600 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).
The previous commits have left only the declaration of hw_error() in
hw/hw.h. This permits dropping most of its inclusions. Touching it
now recompiles less than 200 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
In my "build everything" tree, changing migration/vmstate.h triggers a
recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).
hw/hw.h supposedly includes it for convenience. Several other headers
include it just to get VMStateDescription. The previous commit made
that unnecessary.
Include migration/vmstate.h only where it's still needed. Touching it
now recompiles only some 1600 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-16-armbru@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
In my "build everything" tree, changing hw/irq.h triggers a recompile
of some 5400 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).
hw/hw.h supposedly includes it for convenience. Several other headers
include it just to get qemu_irq and.or qemu_irq_handler.
Move the qemu_irq and qemu_irq_handler typedefs from hw/irq.h to
qemu/typedefs.h, and then include hw/irq.h only where it's still
needed. Touching it now recompiles only some 500 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-13-armbru@redhat.com>
In my "build everything" tree, changing migration/qemu-file-types.h
triggers a recompile of some 2600 out of 6600 objects (not counting
tests and objects that don't depend on qemu/osdep.h).
The culprit is again hw/hw.h, which supposedly includes it for
convenience.
Include migration/qemu-file-types.h only where it's needed. Touching
it now recompiles less than 200 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-10-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
scsi-disks decides whether it has a read-only device by looking at
whether the BlockBackend specified as drive=... is read-only. In the
case of an anonymous BlockBackend (with a node name specified in
drive=...), this is the read-only flag of the attached node. In the case
of an empty anonymous BlockBackend, it's always read-write because
nothing prevented it from being read-write.
This is a problem because scsi-cd would take write permissions on the
anonymous BlockBackend of an empty drive created without a drive=...
option. Using blockdev-insert-medium with a read-only node fails then
with the error message "Block node is read-only".
Fix scsi_realize() so that scsi-cd devices always take read-only
permissions on their BlockBackend instead.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1733920
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
commit a6f230c move blockbackend back to main AioContext on unplug. It set the AioContext of
SCSIDevice to the main AioContex, but s->ctx is still the iothread AioContex(if the scsi controller
is configure with iothread). So if there are having in-flight requests during unplug, a failing assertion
happend. The bt is below:
(gdb) bt
#0 0x0000ffff86aacbd0 in raise () from /lib64/libc.so.6
#1 0x0000ffff86aadf7c in abort () from /lib64/libc.so.6
#2 0x0000ffff86aa6124 in __assert_fail_base () from /lib64/libc.so.6
#3 0x0000ffff86aa61a4 in __assert_fail () from /lib64/libc.so.6
#4 0x0000000000529118 in virtio_scsi_ctx_check (d=<optimized out>, s=<optimized out>, s=<optimized out>) at /home/qemu-4.0.0/hw/scsi/virtio-scsi.c:246
#5 0x0000000000529ec4 in virtio_scsi_handle_cmd_req_prepare (s=0x2779ec00, req=0xffff740397d0) at /home/qemu-4.0.0/hw/scsi/virtio-scsi.c:559
#6 0x000000000052a228 in virtio_scsi_handle_cmd_vq (s=0x2779ec00, vq=0xffff7c6d7110) at /home/qemu-4.0.0/hw/scsi/virtio-scsi.c:603
#7 0x000000000052afa8 in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>, vq=0xffff7c6d7110) at /home/qemu-4.0.0/hw/scsi/virtio-scsi-dataplane.c:59
#8 0x000000000054d94c in virtio_queue_host_notifier_aio_poll (opaque=<optimized out>) at /home/qemu-4.0.0/hw/virtio/virtio.c:2452
assert(blk_get_aio_context(d->conf.blk) == s->ctx) failed.
To avoid assertion failed, moving the "if" after qdev_simple_device_unplug_cb.
In addition, to avoid another qemu crash below, add aio_disable_external before
qdev_simple_device_unplug_cb, which disable the further processing of external clients
when doing qdev_simple_device_unplug_cb.
(gdb) bt
#0 scsi_req_unref (req=0xffff6802c6f0) at hw/scsi/scsi-bus.c:1283
#1 0x00000000005294a4 in virtio_scsi_handle_cmd_req_submit (req=<optimized out>,
s=<optimized out>) at /home/qemu-4.0.0/hw/scsi/virtio-scsi.c:589
#2 0x000000000052a2a8 in virtio_scsi_handle_cmd_vq (s=s@entry=0x9c90e90,
vq=vq@entry=0xffff7c05f110) at /home/qemu-4.0.0/hw/scsi/virtio-scsi.c:625
#3 0x000000000052afd8 in virtio_scsi_data_plane_handle_cmd (vdev=<optimized out>,
vq=0xffff7c05f110) at /home/qemu-4.0.0/hw/scsi/virtio-scsi-dataplane.c:60
#4 0x000000000054d97c in virtio_queue_host_notifier_aio_poll (opaque=<optimized out>)
at /home/qemu-4.0.0/hw/virtio/virtio.c:2447
#5 0x00000000009b204c in run_poll_handlers_once (ctx=ctx@entry=0x6efea40,
timeout=timeout@entry=0xffff7d7f7308) at util/aio-posix.c:521
#6 0x00000000009b2b64 in run_poll_handlers (ctx=ctx@entry=0x6efea40,
max_ns=max_ns@entry=4000, timeout=timeout@entry=0xffff7d7f7308) at util/aio-posix.c:559
#7 0x00000000009b2ca0 in try_poll_mode (ctx=ctx@entry=0x6efea40, timeout=0xffff7d7f7308,
timeout@entry=0xffff7d7f7348) at util/aio-posix.c:594
#8 0x00000000009b31b8 in aio_poll (ctx=0x6efea40, blocking=blocking@entry=true)
at util/aio-posix.c:636
#9 0x00000000006973cc in iothread_run (opaque=0x6ebd800) at iothread.c:75
#10 0x00000000009b592c in qemu_thread_start (args=0x6efef60) at util/qemu-thread-posix.c:502
#11 0x0000ffff8057f8bc in start_thread () from /lib64/libpthread.so.0
#12 0x0000ffff804e5f8c in thread_start () from /lib64/libc.so.6
(gdb) p bus
$1 = (SCSIBus *) 0x0
Signed-off-by: Zhengui li <lizhengui@huawei.com>
Message-Id: <1563696502-7972-1-git-send-email-lizhengui@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1563829520-17525-1-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When READ CAPACITY command completes, scsi_read_complete() function
snoops the command result and updates SCSIDevice members blocksize and
max_lba . However, this update is executed even when READ CAPACITY
command indicates an error in sense data. This causes unexpected
blocksize update with zero value for SCSI devices without
READ CAPACITY(10) command support and eventually results in a divide
by zero. An emulated device by TCMU-runner is an example of a device
that doesn't support READ CAPACITY(10) command.
To avoid the unexpected update, add sense key check in
scsi_read_complete() function. The function already checks the sense key
for VPD Block Limits emulation. Do the scsi_parse_sense_buf() call for
all requests rather than just for VPD Block Limits emulation, so that
blocksize and max_lba are only updated if READ CAPACITY returns zero
sense key.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
[Extend the check to all requests, not just READ CAPACITY]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This avoids memory leak when device hotplug is failed.
Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Message-Id: <20190717004606.12444-2-xieyongji@baidu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This avoids memory leak when device hotplug is failed.
Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Message-Id: <20190717004606.12444-1-xieyongji@baidu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The argument is not used and passing it clutters error propagation in the
callers. So, get rid of it.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It's not really possible to fit all sense codes into errno codes,
especially in such a way that sense codes can be properly categorized as
either guest-recoverable or host-handled. Create a new function that
checks for guest recoverable sense, then scsi_sense_buf_to_errno only
needs to be called for host handled sense codes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When an error was passed down to the guest because it was recoverable,
the sense length was not copied from the SG_IO data. As a result,
the guest saw the CHECK CONDITION status but not the sense data.
Signed-off-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When the 'cont' command resumes guest execution the vm change state
handlers are invoked. Unfortunately there is no explicit ordering
between classic qemu_add_vm_change_state_handler() callbacks. When two
layers of code both use vm change state handlers, we don't control which
handler runs first.
virtio-scsi with iothreads hits a deadlock when a failed SCSI command is
restarted and completes before the iothread is re-initialized.
This patch uses the new qdev_add_vm_change_state_handler() API to
guarantee that virtio-scsi's virtio change state handler executes before
the SCSI bus children. This way DMA is restarted after the iothread has
re-initialized.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
No header includes qemu-common.h after this commit, as prescribed by
qemu-common.h's file comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-5-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
net/tap-bsd.c fixed up]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-3-armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
stricter rules for acpi tables: we now fail
on any difference that isn't whitelisted.
vhost-scsi migration.
some cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJc+B4YAAoJECgfDbjSjVRpq1EIAJR7tCxcpu9GggVlinmUA8G4
tmSAe06IryH7+nF3RsnINuGu7ius9qC2/E2y0uJUHhTqiU/RWOfWZ7PPM0EcYZaA
TLPaCe2NUF6/8afeqmvE9Usk7VspI5TDZRms+bonmZz2xP1lHIMN0qW4s7HHLWr8
sZKDtCJ+9cYII93VQwtlR0qiHgv5f0kzcuZeJaZHsAHH6XZGqRuQjI6txcFa4o53
lkdLCEwTnRuwu2wyL84eL5p+E8SzOgR/x1QI+nffrJfsvnmiT7lnOrkjnQlWAp5G
xqwqsUrUxUCuQ+zitwJqmv+H6nx79MwAM7fTHAETCWX703N5o9tZxAnHHqLoa8I=
=cQNg
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pci, pc: cleanups, features
stricter rules for acpi tables: we now fail
on any difference that isn't whitelisted.
vhost-scsi migration.
some cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Wed 05 Jun 2019 20:55:04 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
bios-tables-test: ignore identical binaries
tests: acpi: add simple arm/virt testcase
tests: add expected ACPI tables for arm/virt board
bios-tables-test: list all tables that differ
vhost-scsi: Allow user to enable migration
vhost-scsi: Add VMState descriptor
vhost-scsi: The vhost backend should be stopped when the VM is not running
bios-tables-test: add diff allowed list
vhost: fix memory leak in vhost_user_scsi_realize
vhost: fix incorrect print type
vhost: remove the dead code
docs: smbios: remove family=x from type2 entry description
pci: Fold pci_get_bus_devfn() into its sole caller
pci: Make is_bridge a bool
pcie: Simplify pci_adjust_config_limit()
acpi: pci: use build_append_foo() API to construct MCFG
hw/acpi: Consolidate build_mcfg to pci.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Rather than looking inside the definition of a BusState with "s->bus.qbus",
use the QOM prefered style: "BUS(&s->bus)".
This patch was generated using the following Coccinelle script:
// Use BUS() macros to access BusState.qbus
@use_bus_macro_to_access_qbus@
expression obj;
identifier bus;
@@
-&obj->bus.qbus
+BUS(&obj->bus)
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20190528164020.32250-3-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Since the BusState is accesible from the SCSIBus object,
it is pointless to use qbus_reset_all_fn.
Use qbus_reset_all() directly.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Message-Id: <20190528164020.32250-2-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This makes use of qdev_prop_drive_iothread for scsi-disk so that the
disk can be attached to a node that is already in the target AioContext.
We need to check that the HBA actually supports iothreads, otherwise
scsi-disk must make sure that the node is already in the main
AioContext.
This changes the error message for conflicting iothread settings.
Previously, virtio-scsi produced the error message, now it comes from
blk_set_aio_context(). Update a test case accordingly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds a new parameter to blk_new() which requires its callers to
declare from which AioContext this BlockBackend is going to be used (or
the locks of which AioContext need to be taken anyway).
The given context is only stored and kept up to date when changing
AioContexts. Actually applying the stored AioContext to the root node
is saved for another commit.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add an Error parameter to blk_set_aio_context() and use
bdrv_child_try_set_aio_context() internally to check whether all
involved nodes can actually support the AioContext switch.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In order to perform a valid migration of a vhost-scsi device,
the following requirements must be met:
(1) The virtio-scsi device state needs to be saved & loaded.
(2) The vhost backend must be stopped before virtio-scsi device state
is saved:
(2.1) Sync vhost backend state to virtio-scsi device state.
(2.2) No further I/O requests are made by vhost backend to target
SCSI device.
(2.3) No further guest memory access takes place after VM is stopped.
(3) Requests in-flight to target SCSI device are completed before
migration handover.
(4) Target SCSI device state needs to be saved & loaded into the
destination host target SCSI device.
Previous commit ("vhost-scsi: Add VMState descriptor")
add support to save & load the device state using VMState.
This meets requirement (1).
When VM is stopped by migration thread (On Pre-Copy complete), the
following code path is executed:
migration_completion() -> vm_stop_force_state() -> vm_stop() ->
do_vm_stop().
do_vm_stop() calls first pause_all_vcpus() which pause all guest
vCPUs and then call vm_state_notify().
In case of vhost-scsi device, this will lead to the following code path
to be executed:
vm_state_notify() -> virtio_vmstate_change() ->
virtio_set_status() -> vhost_scsi_set_status() -> vhost_scsi_stop().
vhost_scsi_stop() then calls vhost_scsi_clear_endpoint() and
vhost_scsi_common_stop().
vhost_scsi_clear_endpoint() sends VHOST_SCSI_CLEAR_ENDPOINT ioctl to
vhost backend which will reach kernel's vhost_scsi_clear_endpoint()
which process all pending I/O requests and wait for them to complete
(vhost_scsi_flush()). This meets requirement (3).
vhost_scsi_common_stop() will stop the vhost backend.
As part of this stop, dirty-bitmap is synced and vhost backend state is
synced with virtio-scsi device state. As at this point guest vCPUs are
already paused, this meets requirement (2).
At this point we are left with requirement (4) which is target SCSI
device specific and therefore cannot be done by QEMU. Which is the main
reason why vhost-scsi adds a migration blocker.
However, as this can be handled either by an external orchestrator or
by using shared-storage (i.e. iSCSI), there is no reason to limit the
orchestrator from being able to explictly specify it wish to enable
migration even when VM have a vhost-scsi device.
Considering all the above, this commit allows orchestrator to explictly
specify that it is responsbile for taking care of requirement (4) and
therefore vhost-scsi should not add a migration blocker.
Reviewed-by: Nir Weiner <nir.weiner@oracle.com>
Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190416125912.44001-4-liran.alon@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
As preparation of enabling migration of vhost-scsi device,
define it’s VMState. Note, we keep the convention of
verifying in the pre_save() method that the vhost backend
must be stopped before attempting to save the device
state. Similar to how it is done for vhost-vsock.
Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Nir Weiner <nir.weiner@oracle.com>
Message-Id: <20190416125912.44001-3-liran.alon@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
vhost-scsi doesn’t takes into account whether the VM is running or not in
order to decide if it should start/stop vhost backend.
This would lead to vhost backend still being active when VM's RunState
suddenly change to stopped.
An example of when this issue is encountered is when Live-Migration Pre-Copy
phase completes. As in this case, VM state will be changed to stopped (while
vhost backend is still active), which will result in
virtio_vmstate_change() -> virtio_set_status() -> vhost_scsi_set_status()
executed but vhost_scsi_set_status() will just return without stopping
vhost backend.
To handle this, change code to consider that vhost processing should be
stopped when VM is not running. Similar to how it is done in vhost-vsock
device at vhost_vsock_set_status().
Fixes: 5e9be92d77 ("vhost-scsi: new device supporting the tcm_vhost Linux kernel module”)
Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Nir Weiner <nir.weiner@oracle.com>
Message-Id: <20190416125912.44001-2-liran.alon@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
fix memory leak in vhost_user_scsi_realize
Signed-off-by: Jie Wang <wangjie88@huawei.com>
Message-Id: <1556608500-12183-1-git-send-email-wangjie88@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
the current value of 1024 bytes (16 * MFI_FRAME_SIZE) we map is not enough to hold
the maximum number of scatter gather elements we advertise. We actually need a
maximum of 2048 bytes. This is 128 max sg elements * 16 bytes (sizeof (union mfi_sgl)).
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <20190404121015.28634-1-pl@kamp.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We spell out sub/dir/ in sub/dir/trace-events' comments pointing to
source files. That's because when trace-events got split up, the
comments were moved verbatim.
Delete the sub/dir/ part from these comments. Gets rid of several
misspellings.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190314180929.27722-3-armbru@redhat.com
Message-Id: <20190314180929.27722-3-armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Take a VhostUserState* that can be pre-allocated, and initialize it
with the associated chardev.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Message-Id: <20190308140454.32437-4-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The qemu coding standard is to use CamelCase for type and structure names,
and the pseries code follows that... sort of. There are quite a lot of
places where we bend the rules in order to preserve the capitalization of
internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR".
That was a bad idea - it frequently leads to names ending up with hard to
read clusters of capital letters, and means they don't catch the eye as
type identifiers, which is kind of the point of the CamelCase convention in
the first place.
In short, keeping type identifiers look like CamelCase is more important
than preserving standard capitalization of internal "words". So, this
patch renames a heap of spapr internal type names to a more standard
CamelCase.
In addition to case changes, we also make some other identifier renames:
VIOsPAPR* -> SpaprVio*
The reverse word ordering was only ever used to mitigate the capital
cluster, so revert to the natural ordering.
VIOsPAPRVTYDevice -> SpaprVioVty
VIOsPAPRVLANDevice -> SpaprVioVlan
Brevity, since the "Device" didn't add useful information
sPAPRDRConnector -> SpaprDrc
sPAPRDRConnectorClass -> SpaprDrcClass
Brevity, and makes it clearer this is the same thing as a "DRC"
mentioned in many other places in the code
This is 100% a mechanical search-and-replace patch. It will, however,
conflict with essentially any and all outstanding patches touching the
spapr code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190218175529.11237-1-svens@stackframe.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Code was assigning DFIFO, but didn't return the value to users.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190305195519.24303-6-svens@stackframe.org>
This makes trace logs much easier to read, especially for
people who are not fluent in SCSI.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190305195519.24303-5-svens@stackframe.org>
This makes the code easier to read - no functional change.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190305195519.24303-4-svens@stackframe.org>
This makes the code easier to read - no functional change.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190305195519.24303-3-svens@stackframe.org>
Instead of using the open-coded versions, use the helper already
present as this makes the code easier to read and less error-prone.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190305195519.24303-2-svens@stackframe.org>
Qemu will crash with the assertion error that "assert(r->req.aiocb !=
NULL)" in scsi_read_complete if request is invaild or disk is no medium.
The error is below:
qemu-kvm: hw/scsi/scsi_disk.c:299: scsi_read_complete: Assertion
`r->req.aiocb != NULL' failed.
This patch add a funtion scsi_read_complete_noio to fix it.
Signed-off-by: Zhengui Li <lizhengui@huawei.com>
Message-Id: <1551949966-20092-1-git-send-email-lizhengui@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If SIGP is set, the 'Wait for Reselection' command should jump
immediately to the address stored in the second DWORD of the
instruction. This fixes spurious hangs in the HP-UX 11.11
installer when the SIGP bit gets set by the kernel before the
'Wait for Reselection' command is executed by SCRIPTS.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Helge Deller <deller@gmx.de>
Message-Id: <20190217113717.7077-1-svens@stackframe.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
HP-UX checks this register after sending data to the target. If there's no valid
information present, it assumes the client disconnected because the kernel sent
to much data. Implement at least some of the SBCL functionality that is possible
without having a real SCSI bus.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190215194021.20543-1-svens@stackframe.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Build fails with gcc 9:
CC ppc64-softmmu/hw/scsi/virtio-scsi.o
hw/scsi/virtio-scsi.c: In function ‘virtio_scsi_do_tmf’:
hw/scsi/virtio-scsi.c:265:39: error: taking address of packed member of ‘struct virtio_scsi_ctrl_tmf_req’ may result in an unaligned pointer value [-Werror=address-of-packed-member]
265 | virtio_tswap32s(VIRTIO_DEVICE(s), &req->req.tmf.subtype);
| ^~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
All the fields in struct virtio_scsi_ctrl_tmf_req are naturally aligned,
so we could in theory drop QEMU_PACKED. Unfortunately, the header file
is imported from linux which already has the packed attribute. Trying to
fix that in the update-linux-headers.sh script is likely to produce
ugliness. Turn the call to virtio_tswap32s() into an assignment instead.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155137678223.44753.5438092367451176318.stgit@bahia.lan>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190123065618.3520-42-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This automatically removes the SCSI subsystem from the
binary altogether if no controllers are selected.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20190123065618.3520-34-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of including the same list of devices for each target,
set CONFIG_PCI to true, and make the devices default to present
whenever PCI is available. However, s390x does not want all the
PCI devices, so there is a separate symbol to enable them.
Done mostly with the following script:
while read i; do
i=${i%=y}; i=${i#CONFIG_}
sed -i -e'/^config '$i'$/!b' -en \
-e'a\' -e' default y if PCI_DEVICES\' -e' depends on PCI' \
`grep -lw $i hw/*/Kconfig`
done < default-configs/pci.mak
followed by replacing a few "depends on" clauses with "select"
whenever the symbol is not really related to PCI.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190123065618.3520-31-yang.zhong@intel.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The make_device_config.sh script is replaced by minikconf, which
is modified to support the same command line as its predecessor.
The roots of the parsing are default-configs/*.mak, Kconfig.host and
hw/Kconfig. One difference with make_device_config.sh is that all symbols
have to be defined in a Kconfig file, including those coming from the
configure script. This is the reason for the Kconfig.host file introduced
in the previous patch. Whenever a file in default-configs/*.mak used
$(...) to refer to a config-host.mak symbol, this is replaced by a
Kconfig dependency; this part must be done already in this patch
for bisectability.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190123065618.3520-28-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The Kconfig files were generated mostly with this script:
for i in `grep -ho CONFIG_[A-Z0-9_]* default-configs/* | sort -u`; do
set fnord `git grep -lw $i -- 'hw/*/Makefile.objs' `
shift
if test $# = 1; then
cat >> $(dirname $1)/Kconfig << EOF
config ${i#CONFIG_}
bool
EOF
git add $(dirname $1)/Kconfig
else
echo $i $*
fi
done
sed -i '$d' hw/*/Kconfig
for i in hw/*; do
if test -d $i && ! test -f $i/Kconfig; then
touch $i/Kconfig
git add $i/Kconfig
fi
done
Whenever a symbol is referenced from multiple subdirectories, the
script prints the list of directories that reference the symbol.
These symbols have to be added manually to the Kconfig files.
Kconfig.host and hw/Kconfig were created manually.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20190123065618.3520-27-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Certain devices types, like memory/CPU, are now being handled using a
hotplug interface provided by a top-level MachineClass. Hotpluggable
host bridges are another such device where it makes sense to use a
machine-level hotplug handler. However, unlike those devices,
host-bridges have a parent bus (the main system bus), and devices with
a parent bus use a different mechanism for registering their hotplug
handlers: qbus_set_hotplug_handler(). This interface currently expects
a handler to be a subclass of DeviceClass, but this is not the case
for MachineClass, which derives directly from ObjectClass.
Internally, the interface only requires an ObjectClass, so expose that
in qbus_set_hotplug_handler().
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <154999589921.690774.3640149277362188566.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Whenever the allocation length of a SCSI request is shorter than the size of the
VPD page list, page_idx is used blindly to index into r->buf. Even though
the stores in the insertion sort are protected against overflows, the same is not
true of the reads and the final store of 0xb0.
This basically does the same thing as commit 57dbb58d80 ("scsi-generic: avoid
out-of-bounds access to VPD page list", 2018-11-06), except that here the
allocation length can be chosen by the guest. Note that according to the SCSI
standard, the contents of the PAGE LENGTH field are not altered based
on the allocation length.
The code was introduced by commit 6c219fc8a1 ("scsi-generic: keep VPD
page list sorted", 2018-11-06) but the overflow was already possible before.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Fixes: a71c775b24
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new device_id property specifies which value to use for the vendor
specific designator in the Device Identification VPD page.
In particular, this is necessary for libvirt to maintain guest ABI
compatibility when no serial number is given and a VM is switched from
-drive (where the BlockBackend name is used) to -blockdev (where the
vendor specific designator is left out by default).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
scsi-disk includes in the Device Identification VPD page, depending on
configuration amongst others, a vendor specific designator that consists
either of the serial number if given or the BlockBackend name (which is
a host detail that better shouldn't have been leaked to the guest, but
now we have to maintain it for compatibility).
With anonymous BlockBackends, i.e. scsi-disk devices constructed with
drive=<node-name>, and no serial number explicitly specified, this ends
up as an empty string. If this happens to more than one disk, we have
accidentally signalled to the OS that this is a multipath setup, which
is obviously not what was intended.
Instead of using an empty string for the vendor specific designator,
simply leave out that designator, which makes Linux detect such setups
as separate disks again.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
This patch forbids attaching a disk to a SCSI device if its using a
different AioContext. Test case included.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This fixes a crash when attaching two disks with the same blockdev to
a SCSI device that is using iothreads. Test case included.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This fixes a crash when attaching a disk to a SCSI device using
iothreads, then detaching it and reattaching it again. Test case
included.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Memset vhost_dev to zero in the vhost_dev_cleanup function.
This causes dev.vqs to be NULL, so that
vqs does not free up space when calling the g_free function.
This will result in a memory leak. But you can't release vqs
directly in the vhost_dev_cleanup function, because vhost_net
will also call this function, and vhost_net's vqs is assigned by array.
In order to solve this problem, we first save the pointer of vqs,
and release the space of vqs after vhost_dev_cleanup is called.
Signed-off-by: Jian Wang <wangjian161@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Most files that have TABs only contain a handful of them. Change
them to spaces so that we don't confuse people.
disas, standard-headers, linux-headers and libdecnumber are imported
from other projects and probably should be exempted from the check.
Outside those, after this patch the following files still contain both
8-space and TAB sequences at the beginning of the line. Many of them
have a majority of TABs, or were initially committed with all tabs.
bsd-user/i386/target_syscall.h
bsd-user/x86_64/target_syscall.h
crypto/aes.c
hw/audio/fmopl.c
hw/audio/fmopl.h
hw/block/tc58128.c
hw/display/cirrus_vga.c
hw/display/xenfb.c
hw/dma/etraxfs_dma.c
hw/intc/sh_intc.c
hw/misc/mst_fpga.c
hw/net/pcnet.c
hw/sh4/sh7750.c
hw/timer/m48t59.c
hw/timer/sh_timer.c
include/crypto/aes.h
include/disas/bfd.h
include/hw/sh4/sh.h
libdecnumber/decNumber.c
linux-headers/asm-generic/unistd.h
linux-headers/linux/kvm.h
linux-user/alpha/target_syscall.h
linux-user/arm/nwfpe/double_cpdo.c
linux-user/arm/nwfpe/fpa11_cpdt.c
linux-user/arm/nwfpe/fpa11_cprt.c
linux-user/arm/nwfpe/fpa11.h
linux-user/flat.h
linux-user/flatload.c
linux-user/i386/target_syscall.h
linux-user/ppc/target_syscall.h
linux-user/sparc/target_syscall.h
linux-user/syscall.c
linux-user/syscall_defs.h
linux-user/x86_64/target_syscall.h
slirp/cksum.c
slirp/if.c
slirp/ip.h
slirp/ip_icmp.c
slirp/ip_icmp.h
slirp/ip_input.c
slirp/ip_output.c
slirp/mbuf.c
slirp/misc.c
slirp/sbuf.c
slirp/socket.c
slirp/socket.h
slirp/tcp_input.c
slirp/tcpip.h
slirp/tcp_output.c
slirp/tcp_subr.c
slirp/tcp_timer.c
slirp/tftp.c
slirp/udp.c
slirp/udp.h
target/cris/cpu.h
target/cris/mmu.c
target/cris/op_helper.c
target/sh4/helper.c
target/sh4/op_helper.c
target/sh4/translate.c
tcg/sparc/tcg-target.inc.c
tests/tcg/cris/check_addo.c
tests/tcg/cris/check_moveq.c
tests/tcg/cris/check_swap.c
tests/tcg/multiarch/test-mmap.c
ui/vnc-enc-hextile-template.h
ui/vnc-enc-zywrle.h
util/envlist.c
util/readline.c
The following have only TABs:
bsd-user/i386/target_signal.h
bsd-user/sparc64/target_signal.h
bsd-user/sparc64/target_syscall.h
bsd-user/sparc/target_signal.h
bsd-user/sparc/target_syscall.h
bsd-user/x86_64/target_signal.h
crypto/desrfb.c
hw/audio/intel-hda-defs.h
hw/core/uboot_image.h
hw/sh4/sh7750_regnames.c
hw/sh4/sh7750_regs.h
include/hw/cris/etraxfs_dma.h
linux-user/alpha/termbits.h
linux-user/arm/nwfpe/fpopcode.h
linux-user/arm/nwfpe/fpsr.h
linux-user/arm/syscall_nr.h
linux-user/arm/target_signal.h
linux-user/cris/target_signal.h
linux-user/i386/target_signal.h
linux-user/linux_loop.h
linux-user/m68k/target_signal.h
linux-user/microblaze/target_signal.h
linux-user/mips64/target_signal.h
linux-user/mips/target_signal.h
linux-user/mips/target_syscall.h
linux-user/mips/termbits.h
linux-user/ppc/target_signal.h
linux-user/sh4/target_signal.h
linux-user/sh4/termbits.h
linux-user/sparc64/target_syscall.h
linux-user/sparc/target_signal.h
linux-user/x86_64/target_signal.h
linux-user/x86_64/termbits.h
pc-bios/optionrom/optionrom.h
slirp/mbuf.h
slirp/misc.h
slirp/sbuf.h
slirp/tcp.h
slirp/tcp_timer.h
slirp/tcp_var.h
target/i386/svm.h
target/sparc/asi.h
target/xtensa/core-dc232b/xtensa-modules.inc.c
target/xtensa/core-dc233c/xtensa-modules.inc.c
target/xtensa/core-de212/core-isa.h
target/xtensa/core-de212/xtensa-modules.inc.c
target/xtensa/core-fsf/xtensa-modules.inc.c
target/xtensa/core-sample_controller/core-isa.h
target/xtensa/core-sample_controller/xtensa-modules.inc.c
target/xtensa/core-test_kc705_be/core-isa.h
target/xtensa/core-test_kc705_be/xtensa-modules.inc.c
tests/tcg/cris/check_abs.c
tests/tcg/cris/check_addc.c
tests/tcg/cris/check_addcm.c
tests/tcg/cris/check_addoq.c
tests/tcg/cris/check_bound.c
tests/tcg/cris/check_ftag.c
tests/tcg/cris/check_int64.c
tests/tcg/cris/check_lz.c
tests/tcg/cris/check_openpf5.c
tests/tcg/cris/check_sigalrm.c
tests/tcg/cris/crisutils.h
tests/tcg/cris/sys.c
tests/tcg/i386/test-i386-ssse3.c
ui/vgafont.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20181213223737.11793-3-pbonzini@redhat.com>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Eric Blake <eblake@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Stefan Markovic <smarkovic@wavecomp.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The guest OS reads RSTAT, RSEQ, and RINTR, and expects those registers
to reflect a consistent state. However, it is possible that the registers
can change after RSTAT was read, but before RINTR is read, when
esp_command_complete() is called.
Guest OS qemu
-------- ----
[handle interrupt]
Read RSTAT
esp_command_complete()
RSTAT = STAT_ST
esp_dma_done()
RSTAT |= STAT_TC
RSEQ = 0
RINTR = INTR_BS
Read RSEQ
Read RINTR RINTR = 0
RSTAT &= ~STAT_TC
RSEQ = SEQ_CD
The guest OS would then try to handle INTR_BS combined with an old
value of RSTAT. This sometimes resulted in lost events, spurious
interrupts, guest OS confusion, and stalled SCSI operations.
A typical guest error log (observed with various versions of Linux)
looks as follows.
scsi host1: Spurious irq, sreg=13.
...
scsi host1: Aborting command [84531f10:2a]
scsi host1: Current command [f882eea8:35]
scsi host1: Queued command [84531f10:2a]
scsi host1: Active command [f882eea8:35]
scsi host1: Dumping command log
scsi host1: ent[15] CMD val[44] sreg[90] seqreg[00] sreg2[00] ireg[20] ss[00] event[0c]
scsi host1: ent[16] CMD val[01] sreg[90] seqreg[00] sreg2[00] ireg[20] ss[02] event[0c]
scsi host1: ent[17] CMD val[43] sreg[90] seqreg[00] sreg2[00] ireg[20] ss[02] event[0c]
scsi host1: ent[18] EVENT val[0d] sreg[92] seqreg[04] sreg2[00] ireg[18] ss[00] event[0c]
...
Defer handling command completion until previous interrupts have been
handled to fix the problem.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Per AM53C974 datasheet, definition of "SCSI Bus and Control (SBAC)"
register:
Bit 24 'STATUS' Write Erase Control
This bit controls the Write Erase feature on bits 3:1 and bit 6 of the DMA
Status Register ((B)+54h). When this bit is programmed to '1', the state
of bits 3:1 are preserved when read. Bits 3:1 are only cleared when a '1'
is written to the corresponding bit location. For example, to clear bit 1,
the value of '0000_0010b' should be written to the register. When the DMA
Status Preserve bit is '0', bits 3:1 are cleared when read.
The status register is currently defined to bit 12, not bit 24.
Also, its implementation is reversed: The status is auto-cleared if
the bit is set to 1, and must be cleared explicitly when the bit is
set to 0. This results in spurious interrupts reported by the Linux
kernel, and in some cases even results in stalled SCSI operations.
Set SBAC_STATUS to bit 24 and reverse the logic to fix the problem.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-Id: <1543442171-24863-1-git-send-email-linux@roeck-us.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because they are supposed to remain const.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181114132931.22624-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Under heavy IO (e.g. fio) the queue is not checked frequently enough for
pending commands. As a result some pending commands are timed out by the
linux sym53c8xx driver, which sends SCSI Abort messages for the timed out
commands. The SCSI Abort messages result in linux errors, which show up
on the console and in /var/log/messages.
e.g.
sd 0:0:3:0: [sdd] tag#33 ABORT operation started
scsi target0:0:3: control msgout:
80 20 47 d
sd 0:0:3:0: ABORT operation complete.
scsi target0:0:4: message d sent on bad reselection
Now following a WAIT DISCONNECT Script instruction, and if there is no
current command, check for a pending command on the queue and if one
exists call lsi_reselect().
Signed-off-by: George Kennedy <george.kennedy@oracle.com>
Message-Id: <1541776692-12271-1-git-send-email-george.kennedy@oracle.com>
[For safety, add a s->current check in lsi_update_irq - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 40dce4ee6 "scsi-disk: fix rerror/werror=ignore" introduced a
bug which causes qemu to crash with the assertion error below if the
host file or disk returns an error:
qemu-system-x86_64: hw/scsi/scsi-bus.c:1374: scsi_req_complete:
Assertion `req->status == -1' failed.
Kevin Wolf suggested this fix:
< kwolf> Hm, should the final return false; in that patch
actually be a return true?
< kwolf> Because I think he didn't intend to change anything
except BLOCK_ERROR_ACTION_IGNORE
Buglink: https://bugs.launchpad.net/qemu/+bug/1804323
Fixes: 40dce4ee61
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Pass other sense, such as UNIT_ATTENTION or BUSY, directly to the
guest.
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Emulation of the block limits VPD page called back into scsi-disk.c,
which however expected the request to be for a SCSIDiskState and
accessed a scsi-generic device outside the bounds of its struct
(namely to retrieve s->max_unmap_size and s->max_io_size).
To avoid this, move the emulation code to a separate function that
takes a new SCSIBlockLimits struct and marshals it into the VPD
response format.
Reported-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A device can report an excessive number of VPD pages when asked for a
list; this can cause an out-of-bounds access to buf in
scsi_generic_set_vpd_bl_emulation. It should not happen, but
it is technically not incorrect so handle it: do not check any byte
past the allocation length that was sent to the INQUIRY command.
Reported-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Block limits emulation is just placing 0xb0 as the final byte of the
VPD pages list. However, VPD page numbers must be sorted, so change
that to an in-place insert. Since I couldn't find any disk that triggered
the loop more than once, this was tested by adding manually 0xb1
at the end of the list and checking that 0xb0 was added before.
Reported-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While writing a message in 'lsi_do_msgin', message length value
in 'msg_len' could be invalid due to an invalid migration stream.
Add an assertion to avoid an out of bounds access, and reject
the incoming migration data if it contains an invalid message
length.
Discovered by Deja vu Security. Reported by Oracle.
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20181026194314.18663-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are 3 virtqueues (ctrl, event and cmd) for virtio scsi device,
but seabios will only set the physical address for the 3rd one (cmd).
Then in vhost_virtqueue_start(), virtio_queue_get_desc_addr()
will be 0 for ctrl and event vq.
In this case, ctrl and event vq are not initialized.
vhost_verify_ring_mappings may use uninitialized vhost_virtqueue
such that vhost_verify_ring_part_mapping returns ENOMEM.
When encountered this problem, we got the following logs:
qemu-system-x86_64: Unable to map available ring for ring 0
qemu-system-x86_64: Verify ring failure on region 0
Signed-off-by: Forrest Liu <forrestl@synology.com>
Signed-off-by: yuchenlin <yuchenlin@synology.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
rerror=ignore was returning true from scsi_handle_rw_error but the callers were not
calling scsi_req_complete when rerror=ignore returns true (this is the correct thing
to do when true is returned after executing a passthrough command). Fix this by
calling it in scsi_handle_rw_error.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If a command fails with a sense that scsi_sense_buf_to_errno converts to
ECANCELED/EAGAIN/ENOTCONN or with a unit attention, scsi_req_complete is
called twice. This caused a crash.
Reported-by: Wangguang <wang.guangA@h3c.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When [2] was fixed it was agreed that adding and calling post_plug()
callback after device_reset() was low risk approach to hotfix issue
right before release. So it was merged instead of moving already
existing plug() callback after device_reset() is called which would
be more risky and require all plug() callbacks audit.
Looking at the current plug() callbacks, it doesn't seem that moving
plug() callback after device_reset() is breaking anything, so here
goes agreed upon [3] proper fix which essentially reverts [1][2]
and moves plug() callback after device_reset().
This way devices always comes to plug() stage, after it's been fully
initialized (including being reset), which fixes race condition [2]
without need for an extra post_plug() callback.
1. (25e897881 "qdev: add HotplugHandler->post_plug() callback")
2. (8449bcf94 "virtio-scsi: fix hotplug ->reset() vs event race")
3. https://www.mail-archive.com/qemu-devel@nongnu.org/msg549915.html
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1539696820-273275-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Pierre Morel<pmorel@linux.ibm.com>
Acked-by: Pierre Morel<pmorel@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Taking the address of a field in a packed struct is a bad idea, because
it might not be actually aligned enough for that pointer type (and
thus cause a crash on dereference on some host architectures). Newer
versions of clang warn about this. Avoid the bug by not using the
"modify in place" byte swapping functions.
This patch was produced with the following simple spatch script:
@@
expression E;
@@
-le16_to_cpus(&E);
+E = le16_to_cpu(E);
@@
expression E;
@@
-le32_to_cpus(&E);
+E = le32_to_cpu(E);
@@
expression E;
@@
-le64_to_cpus(&E);
+E = le64_to_cpu(E);
@@
expression E;
@@
-cpu_to_le16s(&E);
+E = cpu_to_le16(E);
@@
expression E;
@@
-cpu_to_le32s(&E);
+E = cpu_to_le32(E);
@@
expression E;
@@
-cpu_to_le64s(&E);
+E = cpu_to_le64(E);
followed by some minor tidying of overlong lines and bad indent.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180927134852.21490-1-peter.maydell@linaro.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20180917053229.4853-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This option is added together with scsi-disk but is never honoured,
becuase we don't emulate the VPD page for scsi-block. We could intercept
and inject the user specified value like for max xfer len, but it's
probably not helpful since the intent of 070f80095a was for random
entropy aspects, not for performance. If emulated rotation rate is
desired, scsi-hd is more suitable.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180917083138.3948-1-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On some early machines the on-board PCI devices IRQs are wired directly to
the interrupt controller instead of via the PCI host bridge.
Add an optional external IRQ that if wired up via qdev will replace the
in-built PCI IRQ.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that these functions are no longer required they can be removed.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is the function that will soon be used to replace lsi53c895a_create() and
lsi53c810_create().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Allow toggling on/off the VIRTIO_SCSI_F_T10_PI feature bit for both
vhost-scsi and vhost-user-scsi devices.
Signed-off-by: Greg Edwards <gedwards@ddn.com>
Message-Id: <20180808195235.5843-4-gedwards@ddn.com>
Reviewed-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the enablement of preset host features into the common
vhost_scsi_common_get_features() function. This is in preparation for
having vhost-scsi also make use of host_features.
Signed-off-by: Greg Edwards <gedwards@ddn.com>
Message-Id: <20180808195235.5843-3-gedwards@ddn.com>
Reviewed-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In preparation for having vhost-scsi also make use of host_features,
move it from struct VHostUserSCSI into struct VHostSCSICommon.
Signed-off-by: Greg Edwards <gedwards@ddn.com>
Message-Id: <20180808195235.5843-2-gedwards@ddn.com>
Reviewed-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The LSI 53c895a code does not handle the PPR Extended Message. Add
support to handle PPR Extended Message like SDTR and WDTR are handled.
That is, to skip past the message bytes and ignore the message.
Signed-off-by: George Kennedy <george.kennedy@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
mptsas1068 is currently listed as uncategorized device.
Mark it as storage device.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-Id: <1533076133-22745-1-git-send-email-linux@roeck-us.net>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reinstates commit b008326744,
which was temporarily reverted for the 3.0 release so that libvirt gets
some extra time to update their command lines.
The -drive option serial was deprecated in QEMU 2.10. It's time to
remove it.
Tests need to be updated to set the serial number with -global instead
of using the -drive option.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
There is a race condition during hotplug when iothread is used. It
occurs because virtio-scsi may be processing command queues in the
iothread while the monitor performs SCSI device hotplug.
When a SCSI device is hotplugged the HotplugHandler->plug() callback is
invoked and virtio-scsi emits a rescan event to the guest.
If the guest submits a SCSI command at this point then it may be
cancelled before hotplug completes. This happens because ->reset() is
called by hw/core/qdev.c:device_set_realized() after
HotplugHandler->plug() has been called and
hw/scsi/scsi-disk.c:scsi_disk_reset() purges all requests.
This patch uses the new HotplugHandler->post_plug() callback to emit the
rescan event after ->reset(). This eliminates the race conditions where
requests could be cancelled.
Reported-by: l00284672 <lizhengui@huawei.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180716083732.3347-3-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The current BDC VPD page (page 0xb1) is too short. This can be
seen running sg_utils:
$ sg_vpd --page=bdc /dev/sda
Block device characteristics VPD page (SBC):
Block device characteristics VPD page length too short=8
By the SCSI spec, the expected size of the SBC page is 0x40.
There is no telling how the guest will behave with a shorter
message - it can ignore it, or worse, make (wrong)
assumptions.
This patch fixes the emulation by setting the size to 0x40.
This is the output of the previous sg_vpd command after
applying it:
$ sg_vpd --page=bdc /dev/sda -v
inquiry cdb: 12 01 b1 00 fc 00
Block device characteristics VPD page (SBC):
[PQual=0 Peripheral device type: disk]
Medium rotation rate is not reported
Product type: Not specified
WABEREQ=0
WACEREQ=0
Nominal form factor not reported
FUAB=0
VBULS=0
To improve readability, this patch also adds the VBULS value
explictly and add comments on the existing fields we're
setting.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This reverts commit b008326744.
Hold off removing this for one more QEMU release (current libvirt
release still uses it.)
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It eases code review, unit is explicit.
Patch generated using:
$ git grep -E '(1024|2048|4096|8192|(<<|>>).?(10|20|30))' hw/ include/hw/
and modified manually.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180625124238.25339-10-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The error handling policy was traditionally set with -drive, but with
-blockdev it is no longer possible to set frontend options. scsi-disk
(and other block devices) have long supported qdev properties to
configure the error handling policy, so let's add these options to
usb-storage as well and just forward them to the internal scsi-disk
instance.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
The VPD Block Limits Inquiry page is optional, allowing SCSI devices
to not implement it. This is the case for devices like the MegaRAID
SAS 9361-8i and Microsemi PM8069.
In case of SCSI passthrough, the response of this request is used by
the QEMU SCSI layer to set the max_io_sectors that the guest
device will support, based on the value of the max_sectors_kb that
the device has set in the host at that time. Without this response,
the guest kernel is free to assume any value of max_io_sectors
for the SCSI device. If this value is greater than the value from
the host, SCSI Sense errors will occur because the guest will send
read/write requests that are larger than the underlying host device
is configured to support. An example of this behavior can be seen
in [1].
A workaround is to set the max_sectors_kb host value back in the guest
kernel (a process that can be automated using rc.local startup scripts
and the like), but this has several drawbacks:
- it can be troublesome if the guest has many passthrough devices that
needs this tuning;
- if a change in max_sectors_kb is made in the host side, manual change
in the guests will also be required;
- during an OS install it is difficult, and sometimes not possible, to
go to a terminal and change the max_sectors_kb prior to the installation.
This means that the disk can't be used during the install process. The
easiest alternative here is to roll back to scsi-hd, install the guest
and then go back to SCSI passthrough when the installation is done and
max_sectors_kb can be set.
An easier way would be to QEMU handle the absence of the Block Limits
VPD device response, setting max_io_sectors accordingly and allowing
the guest to use the device without the hassle.
This patch adds emulation of the Block Limits VPD response for
SCSI passthrough devices of type TYPE_DISK that doesn't support
it. The following changes were made:
- scsi_handle_inquiry_reply will now check the available VPD
pages from the Inquiry EVPD reply. In case the device does not
- a new function called scsi_generic_set_vpd_bl_emulation,
that is called during device realize, was created to set a
new flag 'needs_vpd_bl_emulation' of the device. This function
retrieves the Inquiry EVPD response of the device to check for
VPD BL support.
- scsi_handle_inquiry_reply will now check the available VPD
pages from the Inquiry EVPD reply in case the device needs
VPD BL emulation, adding the Block Limits page (0xb0) to
the list. This will make the guest kernel aware of the
support that we're now providing by emulation.
- a new function scsi_emulate_block_limits creates the
emulated Block Limits response. This function is called
inside scsi_read_complete in case the device requires
Block Limits VPD emulation and we detected a SCSI Sense
error in the VPD Block Limits reply that was issued
from the guest kernel to the device. This error is
expected: we're reporting support from our side, but
the device isn't aware of it.
With this patch, the guest now queries the Block Limits
page during the device configuration because it is being
advertised in the Supported Pages response. It will either
receive the Block Limits page from the hardware, if it supports
it, or will receive an emulated response from QEMU. At any rate,
the guest now has the information to set the max_sectors_kb
parameter accordingly, sparing the user of SCSI sense errors
that would happen without the emulated response and in the
absence of Block Limits support from the hardware.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1566195
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1566195
Reported-by: Dac Nguyen <dacng@us.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20180627172432.11120-4-danielhb413@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For the VPD Block Limits emulation with SCSI passthrough,
we'll issue an Inquiry request with EVPD set to retrieve
the available VPD pages of the device. This would be done in
a way similar of what scsi_generic_read_device_identification
does: create a SCSI command and a reply buffer, fill in the
sg_io_hdr_t structure, call blk_ioctl, check if an error
occurred, process the response.
This same process is done in other 2 functions, get_device_type
and get_stream_blocksize. They differ in the command/reply
buffer and post-processing, everything else is almost a
copy/paste.
Instead of adding a forth copy/pasted-ish code when adding
the passthrough VPD BL emulation, this patch extirpates
this repetition of those 3 functions and put it into
a new one called scsi_SG_IO_FROM_DEV. Any future code that
wants to execute an SG_DXFER_FROM_DEV to the device can
use it, avoiding filling sg_io_hdr_t again and et cetera.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20180627172432.11120-3-danielhb413@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
To add support for the emulation of Block Limits VPD page
for passthrough devices, a few adjustments in the current code
base is required to avoid repetition and improve clarity.
In scsi-generic.c, detach the Inquiry handling from
scsi_read_complete and put it into a new function called
scsi_handle_inquiry_reply. This change aims to avoid
cluttering of scsi_read_complete when we more logic in the
Inquiry response handling is added in the next patches,
centralizing the changes in the new function.
In scsi-disk.c, take the build of all emulated VPD pages
from scsi_disk_emulate_inquiry and make it available to
other files into a non-static function called
scsi_disk_emulate_vpd_page. Making it public will allow
the future VPD BL emulation code for passthrough devices
to use it from scsi-generic.c, avoiding copy/pasting this
code solely for that purpose. It also has the advantage of
providing emulation of all VPD pages in case we need to
emulate other pages in other scenarios. As a bonus,
scsi_disk_emulate_inquiry got tidier.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20180627172432.11120-2-danielhb413@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove the legacy esp_init() function now that there are no more remaining
users.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20180613094727.11326-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
The preferred way to select the KVM accelerator is to use "-accel kvm"
these days, so let's be consistent in our documentation and help texts.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1528866321-23886-3-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The -drive option serial was deprecated in QEMU 2.10. It's time to
remove it.
Tests need to be updated to set the serial number with -global instead
of using the -drive option.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
vDPA support, fix to vhost blk RO bit handling, some include path
cleanups, NFIT ACPI table.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbEXNvAAoJECgfDbjSjVRpc8gH/R8xrcFrV+k9wwbgYcOcGb6Y
LWjseE31pqJcxRV80vLOdzYEuLStZQKQQY7xBDMlA5vdyvZxIA6FLO2IsiJSbFAk
EK8pclwhpwQAahr8BfzenabohBv2UO7zu5+dqSvuJCiMWF3jGtPAIMxInfjXaOZY
odc1zY2D2EgsC7wZZ1hfraRbISBOiRaez9BoGDKPOyBY9G1ASEgxJgleFgoBLfsK
a1XU+fDM6hAVdxftfkTm0nibyf7PWPDyzqghLqjR9WXLvZP3Cqud4p8N29mY51pR
KSTjA4FYk6Z9EVMltyBHfdJs6RQzglKjxcNGdlrvacDfyFi79fGdiosVllrjfJM=
=3+V0
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
acpi, vhost, misc: fixes, features
vDPA support, fix to vhost blk RO bit handling, some include path
cleanups, NFIT ACPI table.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 01 Jun 2018 17:25:19 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (31 commits)
vhost-blk: turn on pre-defined RO feature bit
ACPI testing: test NFIT platform capabilities
nvdimm, acpi: support NFIT platform capabilities
tests/.gitignore: add entry for generated file
arch_init: sort architectures
ui: use local path for local headers
qga: use local path for local headers
colo: use local path for local headers
migration: use local path for local headers
usb: use local path for local headers
sd: fix up include
vhost-scsi: drop an unused include
ppc: use local path for local headers
rocker: drop an unused include
e1000e: use local path for local headers
ioapic: fix up includes
ide: use local path for local headers
display: use local path for local headers
trace: use local path for local headers
migration: drop an unused include
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
No reason for vhost-scsi to pull in migration headers directly.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This is only half of the work, because the proxy devices (virtio-*-pci,
virtio-*-ccw, etc.) are still included unconditionally. It is still a
move in the right direction.
Based-on: <20180522194943.24871-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove those unneeded includes to speed up the compilation
process a little bit.
Code change produced with:
$ git grep '#include "sysemu/blockdev.h"' | \
cut -d: -f-1 | \
xargs egrep -L "(BlockInterfaceType|DriveInfo|drive_get|blk_legacy_dinfo|blockdev_mark_auto_del)" | \
xargs sed -i.bak '/#include "sysemu\/blockdev.h"/d'
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180528232719.4721-15-f4bug@amsat.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove those unneeded includes to speed up the compilation
process a little bit. (Continue 7eceff5b5a cleanup)
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180528232719.4721-13-f4bug@amsat.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As part of plumbing MemTxAttrs down to the IOMMU translate method,
add MemTxAttrs as an argument to the MemoryRegion valid.accepts
callback. We'll need this for subpage_accepts().
We could take the approach we used with the read and write
callbacks and add new a new _with_attrs version, but since there
are so few implementations of the accepts hook we just change
them all.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-9-peter.maydell@linaro.org
When multi queue is enabled e.g. for a virtio-net device,
each queue pair will have a vhost_dev, and the only thing
shared between vhost devs currently is the chardev. This
patch introduces a vhost-user state structure which will
be shared by all vhost devs of the same virtio device.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
QEMU SCSI code makes assumptions about how the PROTECT and BYTCHK
works in the protocol, denying support for PI (Protection
Information) in case the guest OS requests it. However, in SCSI versions 2
and older, there is no PI concept in the protocol.
This means that when dealing with such devices:
- there is no PROTECT bit in byte 5 of the standard INQUIRY response. The
whole byte is marked as "Reserved";
- there is no RDPROTECT in byte 2 of READ. We have 'Logical Unit Number'
in this field instead;
- there is no VRPROTECT in byte 2 of VERIFY. We have 'Logical Unit Number'
in this field instead. This also means that the BYTCHK bit in this case
is not related to PI.
Since QEMU does not consider these changes, a SCSI passthrough using
a SCSI-2 device will not work. It will mistake these fields with
PI information and return Illegal Request SCSI SENSE thinking
that the driver is asking for PI support.
This patch fixes it by adding a new attribute called 'scsi_version'
that is read from the standard INQUIRY response of passthrough
devices. This allows for a version verification before applying
conditions related to PI that doesn't apply for older versions.
Reported-by: Dac Nguyen <dacng@us.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Message-Id: <20180327211451.14647-1-danielhb@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We would like to have different behavior for passthrough devices
depending on the SCSI version they expose. To prepare for that,
allow the user of emulated devices to specify the desired SCSI
level, and adjust the emulation according to the property value.
The next patch will set the level for scsi-block and scsi-generic
devices.
Based on a patch by Daniel Henrique Barboza
<danielhb@linux.vnet.ibm.com>.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some backends report big max_io_sectors. Making min_io_size the same
value in this case will make it impossible for guest to align memory,
therefore the disk may not be usable at all.
Do not enlarge them when they are zero.
Reported-by: David Gibson <dgibson@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180327164141.19075-1-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If the user does not have permissions to send ioctls to the device (due to
SELinux or cgroups, for example), the output can look like
qemu-kvm: -device scsi-block,drive=disk: cannot get SG_IO version number:
Operation not permitted. Is this a SCSI device?
but this is confusing because the ioctl was blocked _before_ the device
even received the SG_GET_VERSION_NUM ioctl. Therefore, for EPERM errors
the suggestion should be eliminated. To make that simpler, change the
code to use error_append_hint.
Reported-by: Ala Hino <ahino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The global hack for creating SCSI devices has recently been removed,
but this apparently broke SCSI devices on some boards that were not
ready for this change yet. For the 40p machine you now get:
$ ppc64-softmmu/qemu-system-ppc64 -M 40p -cdrom x.iso
qemu-system-ppc64: -cdrom x.iso: machine type does not support if=scsi,bus=0,unit=2
Fix it by providing a lsi53c810_create() function that takes care
of calling scsi_bus_legacy_handle_cmdline() after creating the
corresponding SCSI controller.
Fixes: 1454509726
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The global hack for creating SCSI devices has recently been removed,
but this apparently broke SCSI devices on some boards that were not
ready for this change yet. For the pica61 machine you now get:
$ mips64-softmmu/qemu-system-mips64 -M pica61 -cdrom x.iso
qemu-system-mips64: -cdrom x.iso: machine type does not support if=scsi,bus=0,unit=2
Fix it by calling scsi_bus_legacy_handle_cmdline() after creating the
corresponding SCSI controller.
Fixes: 1454509726
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1520414644-11535-1-git-send-email-thuth@redhat.com>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A NDOB bit set to one specifies that the disk shall not transfer data
from the data-out buffer and shall process the command as if the data-out
buffer contained user data set to all zeroes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After reviewing a patch from Philippe that removes block-backend.h
from hw/lm32/milkymist.c, I noticed that this header is included
unnecessarily in a lot of other files, too. Remove those unneeded
includes to speed up the compilation process a little bit.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1518684912-31637-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The calculation of the max_transfer atribute of BlockDriverState
makes considerations such as max_segments and transfer_length via
the BLKSECTGET ioctl (if available).
However, bl->max_transfer isn't considered when emulating the INQUIRY
'Block Limit' response to the scsi-hd devices. This leads to situations
where the declared max_sectors from the INQUIRY response is inconsistent
with the block limits, which isn't ideal. It can also be misleading to the
user that sets /sys/block/<dev>/queue/max_sectors_kb to a certain
value, then finds a different value in the guest OS for the same disk.
Following the same logic scsi_read_complete from scsi-generic.c does
when patching the response of the Block Limits VPD back to the guest,
change the max_io_sectors value of the emulated Block Limits VPD
response by considering the blk_get_max_transfer of the related
BlockDriverState. Use MIN_NOT_ZERO to be sure that the minimal
value is chosen.
Given that we're changing max_io_sectors, consider that min_io_sectors
and opt_io_sectors can't be greater than the new calculated value.
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180306154411.18462-1-danielhb@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If the main loop thread invokes .ioeventfd_stop() just as the vq handler
function begins in the IOThread then the handler may lose the race for
the AioContext lock. By the time the vq handler is able to acquire the
AioContext lock the ioeventfd has already been removed and the handler
isn't supposed to run anymore!
Use the new aio_wait_bh_oneshot() function to perform ioeventfd removal
from within the IOThread. This way no races with the vq handler are
possible.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180307144205.20619-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Automatic creation of SCSI controllers for "-drive if=scsi" for x86
machines was quite a bad idea (see description of commit f778a82f0c
for details). This is marked as deprecated since QEMU v2.9.0, and as
far as I know, nobody complained that this is still urgently required
anymore. Time to remove this now.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1519123357-13225-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- new stats in virtio balloon
- virtio eventfd rework for boot speedup
- vhost memory rework for boot speedup
- fixes and cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJagxKDAAoJECgfDbjSjVRp5qAH/3gmgBaIzL3KRHd5i0RZifJv
PvyAVYgZd7h0+/1r9GM7guHKyEPZ08JtbHSm/HuDV4BD/Vf3/8joy8roExIfde2A
6k8fd6ANVQmE3t5zUxNXi9qiG4pO4xDIu4cMAbixzgN9x5ttlcfTw7fTT0e0VJxJ
8SN02/uCPPR/DY4/cpjah+slSyv6rBKT1v1ONy7djyRTYHi6h3Meoh05YfEALkwA
goxTKBZHi0L1IZ3HP/ZpXJDohQ5n2P09DX0fQgb8PgmW6WIWB/Qpi5pD53LZpMCV
n9waTF0U0ahneFd2FHo22QMMrwWvQyrjv+w5uXVr+qmHb/OyH2tUt7PgGF9+QKA=
=78s5
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio,vhost,pci,pc: features, fixes and cleanups
- new stats in virtio balloon
- virtio eventfd rework for boot speedup
- vhost memory rework for boot speedup
- fixes and cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 13 Feb 2018 16:29:55 GMT
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (22 commits)
virtio-balloon: include statistics of disk/file caches
acpi-test: update FADT
lpc: drop pcie host dependency
tests: acpi: fix FADT not being compared to reference table
hw/pci-bridge: fix pcie root port's IO hints capability
libvhost-user: Support across-memory-boundary access
libvhost-user: Fix resource leak
virtio-balloon: unref the memory region before continuing
pci: removed the is_express field since a uniform interface was inserted
virtio-blk: enable multiple vectors when using multiple I/O queues
pci/bus: let it has higher migration priority
pci-bridge/i82801b11: clear bridge registers on platform reset
vhost: Move log_dirty check
vhost: Merge and delete unused callbacks
vhost: Clean out old vhost_set_memory and friends
vhost: Regenerate region list from changed sections list
vhost: Merge sections added to temporary list
vhost: Simplify ring verification checks
vhost: Build temporary section list and deref after commit
virtio: improve virtio devices initialization time
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
qemu-common.h includes qemu/option.h, but most places that include the
former don't actually need the latter. Drop the include, and add it
to the places that actually need it.
While there, drop superfluous includes of both headers, and
separate #include from file comment with a blank line.
This cleanup makes the number of objects depending on qemu/option.h
drop from 4545 (out of 4743) to 284 in my "build everything" tree.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-20-armbru@redhat.com>
[Semantic conflict with commit bdd6a90a9e in block/nvme.c resolved]
This cleanup makes the number of objects depending on qapi/error.h
drop from 1910 (out of 4743) to 1612 in my "build everything" tree.
While there, separate #include from file comment with a blank line,
and drop a useless comment on why qemu/osdep.h is included first.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-5-armbru@redhat.com>
[Semantic conflict with commit 34e304e975 resolved, OSX breakage fixed]
according to Eduardo Habkost's commit fd3b02c889 all PCIEs now implement
INTERFACE_PCIE_DEVICE so we don't need is_express field anymore.
Devices that implements only INTERFACE_PCIE_DEVICE (is_express == 1)
or
devices that implements only INTERFACE_CONVENTIONAL_PCI_DEVICE (is_express == 0)
where not affected by the change.
The only devices that were affected are those that are hybrid and also
had (is_express == 1) - therefor only:
- hw/vfio/pci.c
- hw/usb/hcd-xhci.c
- hw/xen/xen_pt.c
For those 3 I made sure that QEMU_PCI_CAP_EXPRESS is on in instance_init()
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Yoni Bettan <ybettan@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtio_bus_set_host_notifier function no longer calls
event_notifier_cleanup when a event notifier is removed.
The commit updates the code to match the new behavior and calls
virtio_bus_cleanup_host_notifier after the notifier was de-assign
and no longer in use.
This change is a preparation to allow executing the
virtio_bus_set_host_notifier function in a memory region
transaction.
Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Coverity doesn't like the ignored return value introduced in
9d3b155186 (hw/block: Fix the return type), and other callers are
converted already in ceff3e1f01.
This one was added lately in d9bcd6f7f2 and missed the train. Do it
now.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180118025245.13042-1-famz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because usb-storage creates an internal scsi device, we should propagate
options. We already do so for bootindex etc, but failed to take care of
share-rw. Fix it in an apparent way: add a new parameter to
scsi_bus_legacy_add_drive and pass in s->conf.share_rw.
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20180117005222.4781-1-famz@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
scsi_write_same_complete() can retry the write if the request was
unaligned. Make sure to release the AioContext when that code path is
taken!
This patch fixes a hang when QEMU terminates after an unaligned WRITE
SAME request has been processed with dataplane. The hang occurs because
iothread_stop_all() cannot acquire the AioContext lock that was leaked
by the IOThread in scsi_write_same_complete().
Fixes: b9e413dd37 ("block: explicitly acquire aiocontext in aio callbacks that need it").
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Reported-by: Cong Li <coli@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180104142502.15175-1-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the property to the device model, then parse it by calling
blkconf_apply_backend_options().
In addition to blk_set_perm(), the called function also handles error
options and wce. For error options we've already checked that the
default values are used, for wce we don't have the option either so it
is always the default (true). In other words there is no change of
behavior in these regards.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20171205151553.7834-1-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Extract the common parts of scsi_sense_buf_to_errno, scsi_convert_sense
and scsi_target_send_command's REQUEST SENSE handling into two new
functions scsi_parse_sense_buf and scsi_build_sense_buf.
Fix a bug in scsi_target_send_command along the way; the length was
written in buf[10] rather than buf[7].
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: b07fbce634 ("scsi-bus: correct responses for INQUIRY and REQUEST SENSE")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Scsi-block doesn't use the DEFINE_BLOCK_PROPERTIES() macro so it didn't
gain the share-rw back when it was added to all other storage devices.
This option is meaningful here, and need to be used when attaching a
shared storage to guest.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20171205071928.30242-1-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Drop virtio_blk_data_plane_create() change that misinterprets return
value when the virtio transport does not support dataplane.
--Stefan]
Cc: John Snow <jsnow@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: e77848d3735ba590f23ffbf8094379c646c33d79.1511317952.git.maozy.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
applied using ./scripts/clean-includes
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Ben Warren <ben@skyportsystems.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The bus pointer in PCIDevice is basically redundant with QOM information.
It's always initialized to the qdev_get_parent_bus(), the only difference
is the type.
Therefore this patch eliminates the field, instead creating a pci_get_bus()
helper to do the type mangling to derive it conveniently from the QOM
Device object underneath.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
A fair proportion of the users of pci_bus_num() want to get the bus
number on a specific device, so first have to look up the bus from the
device then call it. This adds a helper to do that (since we're going
to make looking up the bus slightly more verbose).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Commit 5c0919d020 ("virtio-scsi: Add virtqueue_size parameter allowing
virtqueue size to be set.") introduced a new parameter to virtio-scsi.
Later, commit 9200361060 ("vhost-user-scsi: add missing virtqueue_size
param") added that parameter to the new vhost-user-scsi interface but
neglected the existing vhost-scsi interface it was built on.
Apply the same change to vhost-scsi, so that we can boot a guest with
a device defined. This also avoids crashing a guest when hotplugging
a vhost-scsi device.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-id: 20171201151538.6844-2-farman@linux.vnet.ibm.com
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 5c0919d0 [1] introduced virtqueue_size parameter
for common virtio-scsi path, without updaing the vhost-user-scsi
code. vhost-user-scsi devices right now report size 0 for each vq.
This patch introduces virtqueue_size param to vhost-user-scsi,
that can now be set by the user. However, the most importantly, it
now has a default value of 128 (same as QEMU's virtio-scsi).
[1] 5c0919d0 ("virtio-scsi: Add virtqueue_size parameter
allowing virtqueue size to be set.")
Change-Id: I70e87eab702ebf1196c028dbf17d54fdc0c89a14
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Message-Id: <1510676916-76409-1-git-send-email-dariuszx.stojaczyk@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This enables them to be used outside of esp.c.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
CC: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Logical block size of a SCSI disk should never be larger than
physical block size. From an ATA/SCSI perspective, it makes no sense
to have the logical block size greater than the physical block size,
and it cannot even be effectively expressed in the command set. The
whole point of adding the physical block size to the ATA/SCSI command
set was to communicate a desire for a larger block size (than logical),
while maintaining backwards compatibility with legacy 512 byte block
size.
When setting logical_block_size > physical_block_size, QEMU cannot express
it in READ CAPACITY(16) output, and all it can do is set the physical
block exponent to 0 (i.e. logical_block_size == physical_block_size).
Reporting the error properly, however, is better.
Signed-off-by: Mark Kanda <mark.kanda@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Message-Id: <1508185024-5840-1-git-send-email-mark.kanda@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add INTERFACE_CONVENTIONAL_PCI_DEVICE to all direct subtypes of
TYPE_PCI_DEVICE, except:
1) The ones that already have INTERFACE_PCIE_DEVICE set:
* base-xhci
* e1000e
* nvme
* pvscsi
* vfio-pci
* virtio-pci
* vmxnet3
2) base-pci-bridge
Not all PCI bridges are Conventional PCI devices, so
INTERFACE_CONVENTIONAL_PCI_DEVICE is added only to the subtypes
that are actually Conventional PCI:
* dec-21154-p2p-bridge
* i82801b11-bridge
* pbm-bridge
* pci-bridge
The direct subtypes of base-pci-bridge not touched by this patch
are:
* xilinx-pcie-root: Already marked as PCIe-only.
* pcie-pci-bridge: Already marked as PCIe-only.
* pcie-port: all non-abstract subtypes of pcie-port are already
marked as PCIe-only devices.
3) megasas-base
Not all megasas devices are Conventional PCI devices, so the
interface names are added to the subclasses registered by
megasas_register_types(), according to information in the
megasas_devices[] array.
"megasas-gen2" already implements INTERFACE_PCIE_DEVICE, so add
INTERFACE_CONVENTIONAL_PCI_DEVICE only to "megasas".
Acked-by: Alberto Garcia <berto@igalia.com>
Acked-by: John Snow <jsnow@redhat.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Change all devices that set is_express=1 to implement
INTERFACE_PCIE_DEVICE.
Cc: Keith Busch <keith.busch@intel.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Dmitry Fleytman <dmitry@daynix.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: qemu-block@nongnu.org
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The following devices support both PCI Express and Conventional
PCI, by including special code to handle the QEMU_PCI_CAP_EXPRESS
flag and/or conditional pcie_endpoint_cap_init() calls:
* vfio-pci (is_express=1, but legacy PCI handled by
vfio_populate_device())
* vmxnet3 (is_express=0, but PCIe handled by vmxnet3_realize())
* pvscsi (is_express=0, but PCIe handled by pvscsi_realize())
* virtio-pci (is_express=0, but PCIe handled by
virtio_pci_dc_realize(), and additional legacy PCI code at
virtio_pci_realize())
* base-xhci (is_express=1, but pcie_endpoint_cap_init() call
is conditional on pci_bus_is_express(dev->bus)
* Note that xhci does not clear QEMU_PCI_CAP_EXPRESS like the
other hybrid devices
Cc: Dmitry Fleytman <dmitry@daynix.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The Linux kernel will query the SCSI "Block device characteristics"
VPD to determine the rotations per minute of the disk. If this has
the value 1, it is taken to be an SSD and so Linux sets the
'rotational' flag to 0 for the I/O queue and will stop using that
disk as a source of random entropy. Other operating systems may
also take into account rotation rate when setting up default
behaviour.
Mgmt apps should be able to set the rotation rate for virtualized
block devices, based on characteristics of the host storage in use,
so that the guest OS gets sensible behaviour out of the box. This
patch thus adds a 'rotation-rate' parameter for 'scsi-hd' and
'scsi-block' device types. For the latter, this parameter will be
ignored unless the host device has TYPE_DISK.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20171004114008.14849-2-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Modify the pre_save method on VMStateDescription to return an int
rather than void so that it potentially can fail.
Changed zillions of devices to make them return 0; the only
case I've made it return non-0 is hw/intc/s390_flic_kvm.c that already
had an error_report/return case.
Note: If you add an error exit in your pre_save you must emit
an error_report to say why.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170925112917.21340-2-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
SunOS defines ESP (x86 register) in <sys/regset.h> as 7.
This fixes build on SmartOS (Joyent).
Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Message-Id: <20170909142116.26816-1-n54@gmx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Complete the transition by renaming this header, which was
shared by block/iscsi.c and the SCSI emulation code.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move more knowledge of SG_IO out of hw/scsi/scsi-generic.c, for
reusability.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move more knowledge of sense data format out of hw/scsi/scsi-bus.c
for reusability.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>