We want every user to be specific about the permissions it needs, so
we'll pass the initial permissions as parameters to blk_new(). A user
only needs to call blk_set_perm() if it wants to change the permissions
after the fact.
The permissions are stored in the BlockBackend and applied whenever a
BlockDriverState should be attached in blk_insert_bs().
This does not include actually choosing the right set of permissions
everywhere yet. Instead, the usual FIXME comment is added to each place
and will be addressed in individual patches.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-16-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-15-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This covers both file descriptor callbacks and polling callbacks,
since they execute related code.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-14-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In virtio_queue_host_notifier_aio_poll, not all "!virtio_queue_empty()"
cases are making true progress.
Currently the offending one is virtio-scsi event queue, whose handler
does nothing if no event is pending. As a result aio_poll() will spin on
the "non-empty" VQ and take 100% host CPU.
Fix this by reporting actual progress from virtio queue aio handlers.
Reported-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Ed Swierk <eswierk@skyportsystems.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
msix_init() reports errors with error_report(), which is wrong when
it's used in realize(). The same issue was fixed for msi_init() in
commit 1108b2f. In order to make the API change as small as possible,
leave the return value check to later patch.
For some devices(like e1000e, vmxnet3, nvme) who won't fail because of
msix_init's failure, suppress the error report by passing NULL error
object.
Bonus: add comment for msix_init.
CC: Jiri Pirko <jiri@resnulli.us>
CC: Gerd Hoffmann <kraxel@redhat.com>
CC: Dmitry Fleytman <dmitry@daynix.com>
CC: Jason Wang <jasowang@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Hannes Reinecke <hare@suse.de>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Alex Williamson <alex.williamson@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There are a number of unused trace events that
scripts/cleanup-trace-events.pl finds. The "hw/vfio/pci-quirks.c"
filename was typoed and "qapi/qapi-visit-core.c" was missing the qapi/
directory prefix.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170126171613.1399-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The trace-events for a given source file should generally
always live in the same directory as the source file.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170125161417.31949-3-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
For configurations of the pflash_cfi01 device which set it up with a
device-width not equal to the width (ie where we are emulating
multiple narrow flash devices wired up in parallel), we were giving
incorrect values in the CFI data table:
(1) the sector length entry should specify the sector length for a
single device, not the length for the overall collection of
devices
(2) the number of blocks per device must not be divided by the
number of devices because the resulting device size would not
match the overall size
(3) this then means that the overall write block size must be
modified depending on the number of devices because the entry is
per device and when the guest writes into the flash it
calculates the write size by using the CFI entry (write size
per device) multiplied by the number of chips.
(It would alternatively be possible to modify the write
block size in the CFI table (currently hardcoded at 2048) and
leave the overall write block size alone.)
This commit corrects these bugs, and adds a hw-compat property
to retain the old behaviour on 2.8 and earlier versions. (The
only board we have which uses this sort of flash config and
has machine versioning is the "virt" board -- the PC uses a
single flash device and so behaviour is unaffected whether
using old-multiple-chip-handling or not.)
Here is a configuration example from the vexpress board:
VEXPRESS_FLASH_SIZE = 64M
VEXPRESS_FLASH_SECT_SIZE 256K
num-blocks = VEXPRESS_FLASH_SIZE / VEXPRESS_FLASH_SECT_SIZE = 256
sector-length = 256K
width = 4
device-width = 2
The code will fill the CFI entry with the following entries:
num-blocks = 256
sector-length = 128K
writeblock_size = 2048
This results in two chips, each with 256 * 128K = 32M device size and
a write block size of 2048.
A sector erase will be sent to both chips, thus 256K must be erased.
When the guest sends a block write command, it will write 4096 bytes
data at once (2048 per device).
Signed-off-by: David Engraf <david.engraf@sysgo.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: cleaned up and expanded commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
n25q00 and mt25q01 devices share the same JEDEC ID. The difference
between those two devices is number of dies and one bit in extended
JEDEC bytes. This commit adds proper entry for both devices by
introduction the number of dies and and new 25q00 entries.
Signed-off-by: Marcin Krzeminski <mar.krzeminski@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20170108083854.5006-4-mar.krzeminski@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Modern big flash NOR devices consist of more than one die.
Some of them do not support chip erase and instead have a die
erase command that can erase one die only. This commit adds
support for defining the number of dies in the chip, and adds
support for die erase command.
The NOR flash model is not strict, so no option to
disable chip erase has been added.
Signed-off-by: Marcin Krzeminski <mar.krzeminski@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20170108083854.5006-3-mar.krzeminski@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some flash chips have additional page program opcode that
takes only 4 byte address. This commit adds support
for such command in Qemu.
Signed-off-by: Marcin Krzeminski <mar.krzeminski@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20170108083854.5006-2-mar.krzeminski@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently, all virtio devices bypass IOMMU completely. This is because
address_space_memory is assumed and used during DMA emulation. This
patch converts the virtio core API to use DMA API. This idea is
- introducing a new transport specific helper to query the dma address
space. (only pci version is implemented).
- query and use this address space during virtio device guest memory
accessing when iommu platform (VIRTIO_F_IOMMU_PLATFORM) was enabled
for this device.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In normal operation we should never attempt to put more
data into the data[] array than it can hold. However if the
SPI controller connected to us misbehaves then it can send
us a sequence of commands that attempt this. Since the
controller might be in the guest (if the hardware does SPI
via bit-banging), catch the possible overrun conditions and
reset the flash internal state, logging them as guest errors.
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 20170107111631.24444-1-jcd@tribudubois.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: rewrote commit message to be more exact about when
this can happen]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The guest does not need to kick the virtqueue while we are processing
it. This reduces the number of vmexits during periods of heavy I/O.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-8-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1480434248-27138-3-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The patch is to fix the confusing assert fail message caused by
un-initialized device structure (from bite sized tasks).
The bug can be reproduced by
./qemu-system-x86_64 -nographic -device cfi.pflash01
The CFI hardware is dynamically loaded by QOM realizing mechanism,
however the realizing function in pflash_cfi01_realize function
requires the device being initialized manually before calling, like
./qemu-system-x86_64 -nographic
-device cfi.pflash01,num-blocks=1024,sector-length=4096,name=testcard
Once the initializing parameters are left off in the command, it will
leave the device structure not initialized, which makes
pflash_cfi01_realize try to realize a zero-volume card, causing
/mnt/EXT_volume/projects/qemu/qemu-dev/exec.c:1378:
find_ram_offset: Assertion `size != 0\' failed.
Through my test, at least the flash device's block-number, sector-length
and its name is needed for pflash_cfi01_realize to behave correctly. So
I think the new asserts are needed to hint the QEMU user to specify
the device's parameters correctly.
Signed-off-by: Ziyue Yang <skiver.cloud.yzy@gmail.com>
Message-Id: <1481810693-13733-1-git-send-email-skiver.cloud.yzy@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ziyue Yang <yzylivezh@hotmail.com>
The guest sends discard requests as u64 sector/count pairs, but the
block layer operates internally with s64/s32 pairs. The conversion
leads to IO errors in the guest, the discard request is not processed.
domU.cfg:
'vdev=xvda, format=qcow2, backendtype=qdisk, target=/x.qcow2'
domU:
mkfs.ext4 -F /dev/xvda
Discarding device blocks: failed - Input/output error
Fix this by splitting the request into chunks of BDRV_REQUEST_MAX_SECTORS.
Add input range checking to avoid overflow.
Fixes f313520 ("xen_disk: add discard support")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Dataplane has been omitting forever the step of setting ISR when
an interrupt is raised. This caused little breakage, because the
specification actually says that ISR may not be updated in MSI mode.
Some versions of the Windows drivers however didn't clear MSI mode
correctly, and proceeded using polling mode (using ISR, not the used
ring index!) for crashdump and hibernation. If it were just crashdump
and hibernation it would not be a big deal, but recent releases of
Windows do not really shut down, but rather log out and hibernate to
make the next startup faster. Hence, this manifested as a more serious
hang during shutdown with e.g. Windows 8.1 and virtio-win 1.8.0 RPMs.
Newer versions fixed this, while older versions do not use MSI at all.
The failure has always been there for virtio dataplane, but it became
visible after commits 9ffe337 ("virtio-blk: always use dataplane path
if ioeventfd is active", 2016-10-30) and ad07cd6 ("virtio-scsi: always
use dataplane path if ioeventfd is active", 2016-10-30) made virtio-blk
and virtio-scsi always use the dataplane code under KVM. The good news
therefore is that it was not a bug in the patches---they were doing
exactly what they were meant for, i.e. shake out remaining dataplane bugs.
The fix is not hard, so it's worth arranging for the broken drivers.
The virtio_should_notify+event_notifier_set pair that is common to
virtio-blk and virtio-scsi dataplane is replaced with a new public
function virtio_notify_irqfd that also sets ISR. The irqfd emulation
code now need not set ISR anymore, so virtio_irq is removed.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Override start_ioeventfd and stop_ioeventfd to start/stop the
whole dataplane logic. This has some positive side effects:
- no need anymore for virtio_add_queue_aio (i.e. a revert of
commit 0ff841f6d1)
- no need anymore to switch from generic ioeventfd handlers to
dataplane
It detects some errors better:
$ qemu-system-x86_64 -object iothread,id=io \
-drive id=null,file=null-aio://,if=none,format=raw \
-device virtio-blk-pci,ioeventfd=off,iothread=io,drive=null
qemu-system-x86_64: -device virtio-blk-pci,ioeventfd=off,iothread=io,drive=null:
ioeventfd is required for iothread
while previously it would have started just fine.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This simplifies the code and removes the ioeventfd_started
and ioeventfd_set_started callback. The only difference is
in how virtio-ccw handles an error---it doesn't disable
ioeventfd forever anymore. It was the only backend to do
so, and if desired this behavior should be implemented in
virtio-bus.c.
Instead of ioeventfd_started, the ioeventfd_assign callback now
determines whether the virtio bus supports host notifiers.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Prepare xen_be_send_notify to be shared with frontends:
* xen_be_send_notify -> xen_pv_send_notify
Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Prepare xen_be_unbind_evtchn to be shared with frontends:
* xen_be_unbind_evtchn -> xen_pv_unbind_evtchn
Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Prepare xen_be_printf to be used by both backend and frontends:
* xen_be_printf -> xen_pv_printf
Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Fixes:
* WARNING: line over 80 characters
Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
The condition '!A || (A && B)' is equivalent to '!A || B'.
Buglink: https://bugs.launchpad.net/qemu/+bug/1464611
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This makes the FloppyDrive qdev object actually useful: Now that it has
all properties that don't belong to the controller, you can actually
use '-device floppy' and get a working result.
Command line semantics is consistent with CD-ROM drives: By default you
get a single empty floppy drive. You can override it with -drive and
using the same index, but if you use -drive to add a floppy to a
different index, you get both of them. However, as soon as you use any
'-device floppy', even to a different slot, the default drive is
disabled.
Using '-device floppy' without specifying the unit will choose the first
free slot on the controller.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1477386868-21826-4-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Floppy controllers automatically create two floppy drive devices in qdev
now. (They always created two drives, but managed them only internally.)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1477386868-21826-3-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
This adds a qbus to the floppy controller that should contain the floppy
drives eventually. At the moment it just exists and is empty.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1477386868-21826-2-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
The hard-coded default alignment is BDRV_SECTOR_SIZE, however this is not
necessarily the case for all platforms. Use this as the default alignment for
all current callers.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Message-id: 1476445266-27503-2-git-send-email-mark.cave-ayland@ilande.co.uk
Signed-off-by: John Snow <jsnow@redhat.com>
Now all the usages of the old version of VMSTATE_VIRTIO_DEVICE are gone,
so we can get rid of the conditionals, and the old macro.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the new VMSTATE_VIRTIO_DEVICE macro.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
All these errors are caused by a buggy guest: QEMU should not exit.
With this patch, if virtio_blk_handle_request() detects a buggy request, it
marks the device as broken and returns an error to the caller so it takes
appropriate action.
In the case of virtio_blk_handle_vq(), we detach the request from the
virtqueue, free its allocated memory and stop popping new requests.
We don't need to bother about multireq since virtio_blk_handle_request()
errors out early and mrb.num_reqs == 0.
In the case of virtio_blk_dma_restart_bh(), we need to detach and free all
queued requests as well.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some functions that were called from the dataplane code are now only used
locally:
virtio_blk_init_request()
virtio_blk_handle_request()
virtio_blk_submit_multireq()
since commit "03de2f527499 virtio-blk: do not use vring in dataplane", and
virtio_blk_free_request()
since commit "6aa46d8ff1ee virtio: move VirtQueueElement at the beginning
of the structs".
This patch converts them to static.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make sure to unmap the scatter-gather list and decrement vq->inuse
before freeing requests in virtio_blk_reset().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Almost all block devices are qdevified by now. This allows us to go back
from the BlockBackend to the DeviceState. xen_disk is the last device
that is missing. We'll remember in the BlockBackend if a xen_disk is
attached and can then disable any features that require going from a BB
to the DeviceState.
While at it, clearly mark the function used by xen_disk as legacy even
in its name, not just in TODO comments.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Copy data operated on during request from/to local buffers to/from
the grant references.
Before grant copy operation local buffers must be allocated what is
done by calling ioreq_init_copy_buffers. For the 'read' operation,
first, the qemu device invokes the read operation on local buffers
and on the completion grant copy is called and buffers are freed.
For the 'write' operation grant copy is performed before invoking
write by qemu device.
A new value 'feature_grant_copy' is added to recognize when the
grant copy operation is supported by a guest.
Signed-off-by: Paulina Szubarczyk <paulinaszubarczyk@gmail.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
The old one is confusing with @virtio_device_info in virtio.c,
so make it more appropriate.
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Message-id: 1470214147-32560-1-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The isa_register_portio_list() function allocates ioports
data/state. Let's keep the reference to this data on some owner. This
isn't enough to fix leaks, but at least, ASAN stops complaining of
direct leaks. Further cleanup would require calling
portio_list_del/destroy().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Correct bad name of the vmstate structure. Since this breaks
compatibility also update vmstate version back to 0 and make
all fields independent of the VMState version.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1473146346-27337-1-git-send-email-marcin.krzeminski@nokia.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The qdisk implementation is using the native xenbus protocol only in
case of no protocol specified at all. As using the explicit 32- or
64-bit protocol is slower than the native one due to copying requests
not by memcpy but element for element, this is not optimal.
Correct this by using the native protocol in case word sizes of
frontend and backend match.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
This is stale after commit 6e40b3bf (virtio-blk: Use blk_drain() to
drain IO requests), remove it.
Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1470278654-13525-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
At system_reset, there is no point in retrying the queued request,
because the driver that issued the request won't be around any more.
Analyzed-by: Laszlo Ersek <lersek@redhat.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1470278654-13525-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The broken Identify implementation in earlier Qemu versions means we
need to blacklist it from issueing the NVMe 1.1 Identify Namespace List
command. As we want to be able to use it in newer Qemu versions we need
a way to identify those. Bump the PCI revision as a guest visible
indicator of this bug fix.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
NVMe 1.1 requires devices to implement a Namespace List subcommand of
the identify command. Qemu not only not implements this features, but
also misinterprets it as an Identify Controller request. Due to this
any OS trying to use the Namespace List will fail the probe.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
- interrupt remapping for intel iommus
- a bunch of virtio cleanups
- fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXkQsqAAoJECgfDbjSjVRpanoIAJ9JVlc1aEjt9sa0cSBcs+NQ
J7JmgU9FqFsj+4FrNTouO3AxTjHurd1UAULP1WMPD+V3JpbnHct8r6SCBLQ5EBMN
VOjYo4DwWs1g+DqnQ9WZmbadu06XvYi/yiAKNUzWfZk0MR11D0D/S5hmarNKw0Kq
tGHeTWjGeY4WqFLV7m+qB4+cqkAByn6um99UtUvgLL05RgIEIP2IEMKYZ+rXvAa9
iGUvzqlO7mbq/+LbL18kaWywa4TCwbbd2eSGWaqhX4CuB62Rl33mWTXFcfaYhkyp
Z3FgwaJ09h0lAjSVEbyAuLFMfO/BnMcsoKqwl4xc4vkn/xBCqFtgH9JcEVm3O8U=
=ge2D
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, virtio: new features, cleanups, fixes
- interrupt remapping for intel iommus
- a bunch of virtio cleanups
- fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu 21 Jul 2016 18:49:30 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (57 commits)
intel_iommu: avoid unnamed fields
virtio: Update migration docs
virtio-gpu: Wrap in vmstate
virtio-gpu: Use migrate_add_blocker for virgl migration blocking
virtio-input: Wrap in vmstate
9pfs: Wrap in vmstate
virtio-serial: Wrap in vmstate
virtio-net: Wrap in vmstate
virtio-balloon: Wrap in vmstate
virtio-rng: Wrap in vmstate
virtio-blk: Wrap in vmstate
virtio-scsi: Wrap in vmstate
virtio: Migration helper function and macro
virtio-serial: Remove old migration version support
virtio-net: Remove old migration version support
virtio-scsi: Replace HandleOutput typedef
Revert "mirror: Workaround for unexpected iohandler events during completion"
virtio-scsi: Call virtio_add_queue_aio
virtio-blk: Call virtio_add_queue_aio
virtio: Introduce virtio_add_queue_aio
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Forcibly convert it to a vmstate wrapper; proper conversion
comes later.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
AIO based handler is more appropriate here because it will then
cooperate with bdrv_drained_begin/end. It is needed by the coming
revert patch.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Change sector-based blk_discard(), blk_co_discard(), and
blk_aio_discard() to instead be byte-based blk_pdiscard(),
blk_co_pdiscard(), and blk_aio_pdiscard(). NBD gets a lot
simpler now that ignoring the unaligned portion of a
byte-based discard request is handled under the hood by
the block layer.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1468624988-423-6-git-send-email-eblake@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
No need duplicate the judgment, there is one in function entry.
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1468814749-14510-1-git-send-email-caoj.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Winbond also support continuous read mode, but as an opposite for other
flash type read mode clock cycles are included to dummy cycles number.
This path add proper handling of read mode byte and update needed
dummy cycles. QPI mode and dummy cycles configuration are not supported.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467809036-6986-1-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The rerror/werror policies are implemented in the devices, so that's
where they should be configured. In comparison to the old options in
-drive, the qdev properties are only added to those devices that
actually support them.
If the option isn't given (or "auto" is specified), the setting of the
BlockBackend is used for compatibility with the old options. For block
jobs, "auto" is the same as "enospc".
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
As cache.writeback is a BlockBackend property and as such more related
to the guest device than the BlockDriverState, we already removed it
from the blockdev-add interface. This patch adds the new way to set it,
as a qdev property of the corresponding guest device.
For example: -drive if=none,file=test.img,node-name=img
-device ide-hd,drive=img,write-cache=off
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Tracked down with an ugly, brittle and probably buggy Perl script.
Also move includes converted to <...> up so they get included before
ours where that's obviously okay.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
These two are spot by Coverity 1357232 and 1357233.
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1467684998-12076-1-git-send-email-zhaoshenglong@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Sector-based limits are awkward to think about; in our on-going
quest to move to byte-based interfaces, convert max_transfer_length
and opt_transfer_length. Rename them (dropping the _length suffix)
so that the compiler will help us catch the change in semantics
across any rebased code, and improve the documentation. Use unsigned
values, so that we don't have to worry about negative values and
so that bit-twiddling is easier; however, we are still constrained
by 2^31 of signed int in most APIs.
When a value comes from an external source (iscsi and raw-posix),
sanitize the results to ensure that opt_transfer is a power of 2.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Making all callers special-case 0 as unlimited is awkward,
and we DO have a hard maximum of BDRV_REQUEST_MAX_SECTORS given
our current block layer API limits.
In the case of scsi, this means that we now always advertise a
limit to the guest, even in cases where the underlying layers
previously use 0 for no inherent limit beyond the block layer.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This allows specifying the property via -drive if=none and creating
the flash device with -device.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-6-git-send-email-clg@kaod.org
[clg: added an extra fix for sabrelite_init()
keeping the test on flash_dev did not seem necessary. ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The maximum amount of storage that can be addressed by the m25p80 command
set is 4 GiB. However, cur_addr is currently a 64-bit integer. To avoid
further problems related to sign extension of signed 32-bit integer
expressions, change cur_addr to a 32 bit integer. Preserve migration
format by adding a dummy 4-byte field in place of the (big-endian)
high four bytes in the formerly 64-bit cur_addr field.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-5-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
s->cur_addr can be made to point outside s->storage, either by
writing a value >= 128 to s->ear (because s->ear * MAX_3BYTES_SIZE
is a signed integer and sign-extends into the 64-bit cur_addr),
or just by writing an address beyond the size of the flash being
emulated. Avoid the sign extension to make the code cleaner, and
on top of that mask s->cur_addr to s->size.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-4-git-send-email-clg@kaod.org
Reviewed by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When doing a read-modify-write cycle, QEMU uses the iovec after returning
from blk_aio_pwritev. m25p80 puts the iovec on the stack of blk_aio_pwritev's
caller, which causes trouble in this case. This has been a problem
since commit 243e6f6 ("m25p80: Switch to byte-based block access",
2016-05-12) started doing writes at a smaller granularity than 512 bytes.
In principle however it could have broken before when using -drive
if=mtd,cache=none on a disk with 4K native sectors.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-3-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This enables qemu to handle late inits and report errors. All the SSI
slave routine names were changed accordingly. Code was modified to
handle errors when possible (m25p80 and ssi-sd)
Tested with the m25p80 slave object.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-2-git-send-email-clg@kaod.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
qdev API can be used to create CFI pflash devices despite existance of helper
functions. The type name is needed in course of such creation. Using the
preprocessor alias instead of the string literal itself is preferable.
The patch makes the aliases accessible through the header.
Signed-off-by: Efimov Vasily <real@ispras.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Multiqueue virtio-blk can be enabled as follows:
qemu -device virtio-blk-pci,num-queues=8
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-8-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Monitor ioeventfds for all virtqueues in the device's AioContext. This
is not true multiqueue because requests from all virtqueues are
processed in a single IOThread. In the future it will be possible to
use multiple IOThreads when the QEMU block layer supports multiqueue.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-7-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add a field for the virtqueue index when migrating the s->rq request
list. The new field is only needed when num_queues > 1. Existing QEMUs
are unaffected by this change and therefore virtio-blk migration stays
compatible.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-6-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Multiqueue requires that each request knows to which virtqueue it
belongs.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-5-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Let the virtio_blk_data_plane_notify() caller decide which virtqueue to
notify. This will allow the function to be used with multiqueue.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-4-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The batch notification BH needs to know which virtqueues to notify when
multiqueue is enabled. Use a bitmap to track the virtqueues with
pending notifications.
At this point there is only one virtqueue so hard-code virtqueue index
0. A later patch will switch to real virtqueue indices.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-3-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The num_queues field is always 1 for the time being. A later patch will
make it a configurable device property so that multiqueue can be
enabled.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-2-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This reverts commit 10a06fd65f.
Dataplane has used the same virtqueue code as non-dataplane since
commits e24a47c5b7 ("virtio-scsi: do not
use vring in dataplane") and 03de2f5274
("virtio-blk: do not use vring in dataplane"). It is no longer
necessary to stop dataplane in order to sync state since there is no
duplicated virtqueue state.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Message-id: 1466503331-9831-1-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This commit fix obvious bug in WINBOND command handling.
Datasheet states that default dummy cycles is 8 so fix it.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Message-id: 1466755631-25201-11-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Macronix: mx66u51235f and mx66u1g45g
Micron: mt25ql01g and mt25qu01g
Spansion: s25fs512s and s70fs01gs
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-10-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Support for Spansion and Macronix flashes.
Additionally Numonyx(Micron) moved from default
in fast read commands family. Also moved fast read
command decoding to functions.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-9-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Configuration registers for Spansion and Macronix devices.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-8-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Quad and Equad modes for Spansion and Macronix flash devices.
This commit also includes modification and new command to manipulate
quad mode (status registers and dedicated commands).
This work is based on Pawel Lenkow work.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-7-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Page program 4byte/quad and erase 32K sectors 4 bytes.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-6-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some flash allows to stop read at any time.
Allow framework to support this.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-5-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Allow to have more than four 16MiB regions for bigger flash devices.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-4-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since it is now longer than 4. This work based on Pawel Lenkow
changes and the kernel SPI framework.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-3-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Instead of always reading and comparing jededc ID,
replace it by function.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Message-id: 1466755631-25201-2-git-send-email-marcin.krzeminski@nokia.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
All users have been converted to the new ioevent callbacks.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Have vhost and dataplane use the new api for transports that
have been converted.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In case the word size of the domU and qemu running the qdisk backend
differ BLKIF_OP_DISCARD will not work reliably, as the request
structure in the ring have different layouts for different word size.
Correct this by copying the request structure in case of different
word size element by element in the BLKIF_OP_DISCARD case, too.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Move all trace-events for files in the hw/block/ directory to
their own file.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-8-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qemu/osdep.h checks whether MAP_ANONYMOUS is defined, but this check
is bogus without a previous inclusion of sys/mman.h. Include it in
sysemu/os-posix.h and remove it from everywhere else.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
commit 243e6f69c1 ("m25p80: Switch to byte-based block access")
replaced blk_read() calls with blk_pread() but return values are
different.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit f9e98e5d7a ("xen/blkif: Avoid double access to
src->nr_segments") didn't go far enough: src->operation is also being
used twice. And nothing was done to prevent the compiler from using the
source side of the copy done by blk_get_request() (granted that's very
unlikely).
Move the barrier()s up, and add another one to blk_get_request().
Note that for completing XSA-155, the barrier() getting added to
blk_get_request() would suffice, and hence the changes to xen_blkif.h
are more like just cleanup. And since, as said, the unpatched code
getting compiled to something vulnerable is very unlikely (and not
observed in practice), this isn't being viewed as a new security issue.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).
This patch is the result of coccinelle script
scripts/coccinelle/round.cocci
CC: qemu-block@nongnu.org
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Block layer is prepared to unspecialize dataplane, an evidence is this
almost complete list of unblocked operations. It has all types except
two (actually three if DATAPLANE itself counts but blockdev.c makes sure
attaching twice is not possible): MIRROR_TARGET and BACKUP_TARGET.
blockdev-mirror refuses to start if target is attached, so the first is
not a problem.
By removing BACKUP_TARGET, blockdev-backup will become permissive to
write to a virtio-blk dataplane disk, but that is not worse than
non-dataplane given the latter is already possible. In either case,
blockdev.c always checks the target and source are on the same
AioContext, or bring them together if possible.
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1463969978-24970-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Move it to the actual users. There are still a few includes of
qemu/bswap.h in headers; removing them is left for future work.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sector-based blk_read() should die; switch to byte-based
blk_pread() instead.
Likewise for blk_aio_readv() and blk_aio_writev().
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based blk_write() should die; switch to byte-based
blk_pwrite() instead. Likewise for blk_read().
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based blk_write() should die; switch to byte-based
blk_pwrite() instead. Likewise for blk_read().
This particular device picks its size during onenand_initfn(),
and can be at most 0x80000000 bytes; therefore, shifting an
'int sec' request to get back to a byte offset should never
overflow 32 bits. But adding assertions to document that point
should not hurt.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based blk_write() should die; switch to byte-based
blk_pwrite() instead. Likewise for blk_read().
This file is doing some complex computations to map various
flash page sizes (256, 512, and 2048) atop generic uses of
512-byte sector operations. Perhaps someone will want to tidy
up the file for fewer gymnastics in managing addresses and
offsets, and less wasteful visits of 256-byte pages, but it
was out of scope for this series, where I just went with the
mechanical conversion.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based blk_write() should die; switch to byte-based
blk_pwrite() instead. Likewise for blk_read().
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based blk_aio_readv() and blk_aio_writev() should die; switch
to byte-based blk_aio_preadv() and blk_aio_pwritev() instead.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based blk_aio_readv() and blk_aio_writev() should die; switch
to byte-based blk_aio_preadv() and blk_aio_pwritev() instead.
The trace is modified at the same time, and nb_sectors is now
unused. Fix a comment typo while in the vicinity.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based blk_read() should die; convert the one-off
variant blk_read_unthrottled().
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Pflash migration (e.g. q35 + EFI variable storage) fails
with the assert:
bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed.
This avoids the problem by delaying the pflash update until after
the device loads complete.
Tested by:
Migrating Q35/EFI vm.
Changing efi variable content (with efiboot in the guest)
md5sum'ing the variable file before migration and after.
This is a fix that Paolo posted in the message
570244B3.4070105@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eliminating the reentrancy is actually a nice thing that we can do
with the API that Michael proposed, so let's make it first class.
This also hides the complex assign/set_handler conventions from
callers of virtio_queue_aio_set_host_notifier_handler, which in
fact was always called with assign=true.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In addition to handling IO in vcpu thread and in io thread, dataplane
introduces yet another mode: handling it by AioContext.
This reuses the same handler as previous modes, which triggers races as
these were not designed to be reentrant. Use a separate handler just
for aio, and disable regular handlers when dataplane is active.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We must not call virtio_blk_data_plane_notify if dataplane is
disabled: we would hit a segmentation fault in notify_guest_bh as
s->guest_notifier has not been setup and is NULL.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-12-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-11-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Implements FSR register, it is used for busy waits.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-10-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Adds fast read and 4bytes commands family.
This work is based on Pawel Lenkow patch from v1.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-9-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use the setting from the volatile cfg register to correctly
set the number of dummy cycles.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-8-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds both volatile and non volatile configuration registers
and commands to allow modify them. It is needed for proper handling
dummy cycles. Initialization of those registers and flash state
has been included as well.
Some of this registers are used by kernel.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Acked-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-7-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds only 4byte address mode (does not cover dummy cycles).
This mode is needed to access more than 16 MiB of flash.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-6-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Extend address mode allows to switch flash 16 MiB banks,
allowing user to access all flash sectors.
This access mode is used by u-boot.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-5-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Extend the width of the flags variable to support the already existing
(but unused) WR_1 flag, which is above the range of 8 bits.
This allows support of EEPROM emulation which requires the WR_1 feature.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-4-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-3-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-2-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Move declarations out of qemu-common.h for functions declared in
utils/ files: e.g. include/qemu/path.h for utils/path.c.
Move inline functions out of qemu-common.h and into new files (e.g.
include/qemu/bcd.h)
Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch replaces get_ticks_per_sec() calls with the macro
NANOSECONDS_PER_SECOND. Also, as there are no callers, get_ticks_per_sec()
is then removed. This replacement improves the readability and
understandability of code.
For example,
timer_mod(fdctrl->result_timer,
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + (get_ticks_per_sec() / 50));
NANOSECONDS_PER_SECOND makes it obvious that qemu_clock_get_ns
matches the unit of the expression on the right side of the plus.
Signed-off-by: Rutuja Shah <rutu.shah.26@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef. Since then, we've moved to include qemu/osdep.h
everywhere. Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h. That's in excess of
100KiB of crap most .c files don't actually need.
Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h. Include qapi/error.h in .c files that need it and don't
get it now. Include qapi-types.h in qom/object.h for uint16List.
Update scripts/clean-includes accordingly. Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h
comment quoted above similarly.
This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third. Unfortunately, the number depending on
qapi-types.h shrinks only a little. More work is needed for that one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before this patch, blk_new() automatically assigned a name to the new
BlockBackend and considered it referenced by the monitor. This patch
removes the implicit monitor_add_blk() call from blk_new() (and
consequently the monitor_remove_blk() call from blk_delete(), too) and
thus blk_new() (and related functions) no longer take a BB name
argument.
In fact, there is only a single point where blk_new()/blk_new_open() is
called and the new BB is monitor-owned, and that is in blockdev_init().
Besides thus relieving us from having to invent names for all of the BBs
we use in qemu-img, this fixes a bug where qemu cannot create a new
image if there already is a monitor-owned BB named "image".
If a BB and its BDS tree are created in a single operation, as of this
patch the BDS tree will be created before the BB is given a name
(whereas it was the other way around before). This results in minor
change to the output of iotest 087, whose reference output is amended
accordingly.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When populating ACPI objects for floppy drives one needs to provide the
maximum values for cylinder, sector, and head number the drive supports.
This patch adds a function that iterates through the array of predefined
floppy drive formats and returns the maximum values of c, h, s, out of
those matching the given floppy drive type.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Kevin O'Connor <kevin@koconnor.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
In disabled mode, virtio-blk dataplane seems to be enabled, but flow
actually goes through the normal virtio path. This patch simplifies a bit
the handling of disabled mode. In disabled mode, virtio_blk_handle_output
might be called even if s->dataplane is not NULL.
This is a bit tricky, because the current check for s->dataplane will
always trigger, causing a continuous stream of calls to
virtio_blk_data_plane_start. Unfortunately, these calls will not
do anything. To fix this, set the "started" flag even in disabled
mode, and skip virtio_blk_data_plane_start if the started flag is true.
The resulting changes also prepare the code for the next patch, were
virtio-blk dataplane will reuse the same virtio_blk_handle_output function
as "regular" virtio-blk.
Because struct VirtIOBlockDataPlane is opaque in virtio-blk.c, we have
to move s->dataplane->started inside struct VirtIOBlock.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Make the API more similar to the regular virtqueue API. This will
help when modifying the code to not use vring.c anymore.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Include osdep.h as the first header in nand.c; this has to be
done manually because coccinelle gets confused by the way that
this C file includes itself.
We fix some odd spacing in #includes while we are in the area.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Coverity noticed that some variables are only used by debug prints, and
called them unused. Always compile the print statements. While we're
here, print to stderr as well.
Bonus: Fix a debug printf I broke in f31937aa8
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Touched up commit message. --js]
Message-id: 1454971529-14830-1-git-send-email-jsnow@redhat.com
Included here:
Refactoring and bugfix patches in PC/ACPI.
New commands for ipmi.
Virtio optimizations.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJWtj8KAAoJECgfDbjSjVRpBIQIAJSB9xwTcBLXwD0+8z5lqjKC
GTtuVbHU0+Y/eO8O3llN5l+SzaRtPHo18Ele20Oz7IQc0ompANY273K6TOlyILwB
rOhrub71uqpOKbGlxXJflroEAXb78xVK02lohSUvOzCDpwV+6CS4ZaSer7yDCYkA
MODZj7rrEuN0RmBWqxbs1R7Mj2CeQJzlgTUNTBGCLEstoZGFOJq8FjVdG5P1q8vI
fnI9mGJ1JsDnmcUZe/bTFfB4VreqeQ7UuGyNAMMGnvIbr0D1a+CoaMdV7/HZ+KyT
5TIs0siVdhZei60A/Cq2OtSVCbj5QdxPBLhZfwJCp6oU4lh2U5tSvva0mh7MwJ0=
=D/cA
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc and misc cleanups and fixes, virtio optimizations
Included here:
Refactoring and bugfix patches in PC/ACPI.
New commands for ipmi.
Virtio optimizations.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sat 06 Feb 2016 18:44:26 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream: (45 commits)
net: set endianness on all backend devices
fix MSI injection on Xen
intel_iommu: large page support
dimm: Correct type of MemoryHotplugState->base
pc: set the OEM fields in the RSDT and the FADT from the SLIC
acpi: add function to extract oem_id and oem_table_id from the user's SLIC
acpi: expose oem_id and oem_table_id in build_rsdt()
acpi: take oem_id in build_header(), optionally
pc: Eliminate PcGuestInfo struct
pc: Move APIC and NUMA data from PcGuestInfo to PCMachineState
pc: Move PcGuestInfo.fw_cfg to PCMachineState
pc: Remove PcGuestInfo.isapc_ram_fw field
pc: Remove RAM size fields from PcGuestInfo
pc: Remove compat fields from PcGuestInfo
acpi: Don't save PcGuestInfo on AcpiBuildState
acpi: Remove guest_info parameters from functions
pc: Simplify xen_load_linux() signature
pc: Simplify pc_memory_init() signature
pc: Eliminate struct PcGuestInfoState
pc: Move PcGuestInfo declaration to top of file
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Move allocation to virtio functions also when loading/saving a
VirtQueueElement. This will also let the load/save functions
keep backwards compatibility when the VirtQueueElement layout
is changed.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The return code of virtqueue_pop/vring_pop is unused except to check for
errors or 0. We can thus easily move allocation inside the functions
and just return a pointer to the VirtQueueElement.
The advantage is that we will be able to allocate only the space that
is needed for the actual size of the s/g list instead of the full
VIRTQUEUE_MAX_SIZE items. Currently VirtQueueElement takes about 48K
of memory, and this kind of allocation puts a lot of stress on malloc.
By cutting the size by two or three orders of magnitude, malloc can
use much more efficient algorithms.
The patch is pretty large, but changes to each device are testable
more or less independently. Splitting it would mostly add churn.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
All functions relative to DMA (DMA_*() functions) are stubs on sparc platform.
Disable the DMA in the floppy controller, instead of calling these stubs.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-14-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
Accidentally, I removed a "feature" where empty drives had geometry
values applied to them, which allows seek on empty drives to work
"by accident," as QEMU actually tries to disallow that.
Seeks on empty drives should work, though, but the easiest thing is to
restore the misfeature where empty drives have non-zero geometries
applied.
Document the hack accordingly.
[Maintainer edit]
This fix corrects a regression introduced in d5d47efc, where
pick_geometry was modified such that it would not operate on empty
drives, and as a result if there is no diskette inserted, QEMU
no longer populates it with geometry bounds. As a result, seek fails
when QEMU denies to move the current track, but reports success anyway.
This can confuse the guest, leading to kernel panics in the guest.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1454106932-17236-1-git-send-email-jsnow@redhat.com
Put the code for setting up and removing op blockers into an own
function, respectively. Then, we can invoke those functions whenever a
BDS is removed from an virtio-blk BB or inserted into it.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This reverts the changes that commit
2e1280e8ff applied to hw/block/fdc.c;
also, an additional case of drv->media_inserted use has crept in since,
which is replaced by a call to blk_is_inserted().
That commit changed tests/fdc-test.c, too, because after it, one less
TRAY_MOVED event would be emitted when executing 'change' on an empty
drive. However, now, no TRAY_MOVED events will be emitted at all, and
the tray_open status returned by query-block will always be false,
necessitating (different) changes to tests/fdc-test.c and iotest 118,
which is why this patch is not a pure revert of said commit.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 1454096953-31773-4-git-send-email-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Background on QEMU boot indices
-------------------------------
Normally, the "bootindex" property is configured for bootable devices
with:
DEVICE_instance_init()
device_add_bootindex_property(..., "bootindex", ...)
object_property_add(..., device_get_bootindex,
device_set_bootindex, ...)
and when the bootindex is set on the QEMU command line, with
-device DEVICE,...,bootindex=N
the setter that was configured above is invoked:
device_set_bootindex()
/* parse boot index */
visit_type_int32()
/* verify unicity */
check_boot_index()
/* store parsed boot index */
...
/* insert device path to boot order */
add_boot_device_path()
In the last step, add_boot_device_path() ensures that an OpenFirmware
device path will show up in the "bootorder" fw_cfg file, at a position
corresponding to the device's boot index. Thus guest firmware (SeaBIOS and
OVMF) can try to boot off the device with the right priority.
NVMe boot index
---------------
In QEMU commit 33739c7129,
nvma: ide: add bootindex to qom property
the following generic setters / getters:
- device_set_bootindex()
- device_get_bootindex()
were open-coded for NVMe, under the names
- nvme_set_bootindex()
- nvme_get_bootindex()
Plus nvme_instance_init() was added to configure the "bootindex" property
manually, designating the open-coded getter & setter, rather than calling
device_add_bootindex_property().
Crucially, nvme_set_bootindex() avoided the final add_boot_device_path()
call. This fact is spelled out in the message of commit 33739c7129, and
it was presumably the entire reason for all of the code duplication.
Now, Vladislav filed an RFE for OVMF
<https://github.com/tianocore/edk2/issues/48>; OVMF should boot off NVMe
devices. It is simple to build edk2's existent NvmExpressDxe driver into
OVMF, but the boot order matching logic in OVMF can only handle NVMe if
the "bootorder" fw_cfg file includes such devices.
Therefore this patch converts the NVMe device model to
device_set_bootindex() all the way.
Device paths
------------
device_set_bootindex() accepts an optional parameter called "suffix". When
present, it is expected to take the form of an OpenFirmware device path
node, and it gets appended as last node to the otherwise auto-generated
OFW path.
For NVMe, the auto-generated part is
/pci@i0cf8/pci8086,5845@6[,1]
^ ^ ^ ^
| | PCI slot and (present when nonzero)
| | function of the NVMe controller, both hex
| "driver name" component, built from PCI vendor & device IDs
PCI root at system bus port, PIO
to which here we append the suffix
/namespace@1,0
^ ^
| big endian (MSB at lowest address) numeric interpretation
| of the 64-bit IEEE Extended Unique Identifier, aka EUI-64,
| hex
32-bit NVMe namespace identifier, aka NSID, hex
resulting in the OFW device path
/pci@i0cf8/pci8086,5845@6[,1]/namespace@1,0
The reason for including the NSID and the EUI-64 is that an NVMe device
can in theory produce several different namespaces (distinguished by
NSID). Additionally, each of those may (optionally) have an EUI-64 value.
For now, QEMU only provides namespace 1.
Furthermore, QEMU doesn't even represent the EUI-64 as a standalone field;
it is embedded (and left unused) inside the "NvmeIdNs.res30" array, at the
last eight bytes. (Which is fine, since EUI-64 can be left zero-filled if
unsupported by the device.)
Based on the above, we set the "unit address" part of the last
("namespace") node to fixed "1,0".
OVMF will then map the above OFW device path to the following UEFI device
path fragment, for boot order processing:
PciRoot(0x0)/Pci(0x6,0x1)/NVMe(0x1,00-00-00-00-00-00-00-00)
^ ^ ^ ^ ^ ^
| | | | | octets of the EUI-64 in address order
| | | | NSID
| | | NVMe namespace messaging device path node
| PCI slot and function
PCI root bridge
Cc: Keith Busch <keith.busch@intel.com> (supporter:nvme)
Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core)
Cc: qemu-block@nongnu.org (open list:nvme)
Cc: Gonglei <arei.gonglei@huawei.com>
Cc: Vladislav Vovchenko <vladislav.vovchenko@sk.com>
Cc: Feng Tian <feng.tian@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Kevin O'Connor <kevin@koconnor.net>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Gonglei <arei.gonglei@huawei.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Tested-by: Vladislav Vovchenko <vladislav.vovchenko@sk.com>
Message-id: 1453850483-27511-1-git-send-email-lersek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
In Xen 4.7 we are refactoring parts libxenctrl into a number of
separate libraries which will provide backward and forward API and ABI
compatiblity.
One such library will be libxengnttab which provides access to grant
tables.
In preparation for this switch the compatibility layer in xen_common.h
(which support building with older versions of Xen) to use what will
be the new library API. This means that the gnttab shim will disappear
for versions of Xen which include libxengnttab.
To simplify things for the <= 4.0.0 support we wrap the int fd in a
malloc(sizeof int) such that the handle is always a pointer. This
leads to less typedef headaches and the need for
XC_HANDLER_INITIAL_VALUE etc for these interfaces.
Note that this patch does not add any support for actually using
libxengnttab, it just adjusts the existing shims.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
The 2.88 drive is more suitable as a default because
it can still read 1.44 images correctly, but the reverse
is not true.
Since there exist virtio-win drivers that are shipped on
2.88 floppy images, this patch will allow VMs booted without
a floppy disk inserted to later insert a 2.88MB floppy and
have that work.
This patch has been tested with msdos, freedos, fedora,
windows 8 and windows 10 without issue: if problems do
arise for certain guests being unable to cope with 2.88MB
drives as the default, they are in the minority and can use
type=144 as needed (or insert a proper boot medium and omit
type=144/288 or use type=auto) to obtain different drive types.
As icing, the default will remain auto/144 for any pre-2.6
machine types, hopefully minimizing the impact of this change
in legacy hw to basically zero.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-13-git-send-email-jsnow@redhat.com
This one is the crazy one.
fd_revalidate currently uses pick_geometry to tell if the diskette
geometry has changed upon an eject/insert event, but it won't allow us
to insert a 1.44MB diskette into a 2.88MB drive. This is inflexible.
The new algorithm applies a new heuristic to guessing disk geometries
that allows us to switch diskette types as long as the physical size
matches before falling back to the old heuristic.
The old one is roughly:
- If the size (sectors) and type matches, choose it.
- Fall back to the first geometry that matched our type.
The new one is:
- If the size (sectors) and type matches, choose it.
- If the size (sectors) and physical size match, choose it.
- Fall back to the first geometry that matched our type.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1453495865-9649-11-git-send-email-jsnow@redhat.com
2.88MB capable drives can accept 1.44MB floppies,
for instance. To rework the pick_geometry function,
we need to know if our current drive can even accept
the type of disks we're considering.
NB: This allows us to distinguish between all of the
"total sectors" collisions between 1.20MB and 1.44MB
diskette types, by using the physical drive size as a
differentiator.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-10-git-send-email-jsnow@redhat.com
This patch adds a new explicit Floppy Drive Type option. The existing
behavior in QEMU is to automatically guess a drive type based on the
media inserted, or if a diskette is not present, arbitrarily assign one.
This behavior can be described as "auto." This patch adds the option
to pick an explicit behavior: 120, 144, 288 or none. The new "auto"
option is intended to mimic current behavior, while the other types
pick one explicitly.
Set the type given by the CLI during fd_init. If the type remains the
default (auto), we'll attempt to scan an inserted diskette if present
to determine a type. If auto is selected but no diskette is present,
we fall back to a predetermined default (currently 1.44MB to match
legacy QEMU behavior.)
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-9-git-send-email-jsnow@redhat.com
Currently, QEMU chooses a drive type automatically based on the inserted
media. If there is no disk inserted, it chooses a 1.44MB drive type.
Change this behavior to be configurable, but leave it defaulted to 1.44.
This is not earnestly intended to be used by a user or a management
library, but rather exists so that pre-2.6 board types can configure it
to be a legacy value.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-8-git-send-email-jsnow@redhat.com
Split apart pick_geometry by creating a pick_drive routine that will only
ever called during device bring-up instead of relying on pick_geometry to
be used in both cases.
With this change, the drive field is changed to be 'write once'. It is
not altered after the initialization routines exit.
media_validated does not need to be migrated. The target VM
will just revalidate the media on post_load anyway.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-7-git-send-email-jsnow@redhat.com
pick_geometry is a convoluted function that makes it difficult to tell
at a glance what QEMU's current behavior for choosing a floppy drive
type is when it can't quite identify the diskette.
The code iterates over all entries in the candidate geometry table
("fd_formats") and if our specific drive type matches a row in the table,
then either "match" is set to that entry (an exact match) and the loop
exits, or "first_match" will be non-negative (the first such entry that
shares the same drive type), and the loop continues. If our specific
drive type is NONE, then all drive types in the candidate geometry table
are considered. After iteration, if "match" was not set, we fall back to
"first match".
This means that either "match" was set, or we exited the loop without an
exact match, in which case:
- If drive type is NONE, the default is truly fd_formats[0], a 1.44MB
type, because "first_match" will always get set to the first item.
- If drive type is not NONE, pick_geometry's iteration was fussier and
only looked at rows that matched our drive type. However, since all
possible drive types are represented in the table, we still know that
"first match" was set.
- If drive type is not NONE and the fd_formats table lists no options for
our drive type, we choose fd_formats[1], an incomprehensibly bizarre
choice that can never happen anyway.
Correct this: If first_match is -1, it can ONLY mean we didn't edit our
fd_formats table correctly. Throw an assertion instead.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-6-git-send-email-jsnow@redhat.com
Currently, 'drive' is used both to represent the current diskette
type as well as the current drive type.
This patch adds a 'disk' field that is updated explicitly to match
the type of the disk.
As of this patch, disk and drive are always the same, but forthcoming
patches to change the behavior of pick_geometry will invalidate this
assumption.
disk does not need to be migrated because it is not user-visible state
nor is it currently used for any calculations. It is purely informative,
and will be rebuilt automatically via fd_revalidate on the new host.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-5-git-send-email-jsnow@redhat.com
Change the floppy drive type to a QAPI enum type, to allow us to
specify the floppy drive type from the CLI in a forthcoming patch.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-4-git-send-email-jsnow@redhat.com
Modify this function to operate directly on FDrive objects instead of
unpacking and passing all of those parameters manually. Reduces the
complexity in the caller and reduces the number of args to just one.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-3-git-send-email-jsnow@redhat.com
Code motion: I want to refactor this function to work with FDrive
directly, so shuffle it below that definition.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-2-git-send-email-jsnow@redhat.com
Move the ssi.h include file into the ssi directory.
While touching the code also fix the typdef lines as
checkpatch complains.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add the sst25wf080 SPI flash device.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Done with this Coccinelle semantic patch
@@
expression FMT, E1, E2;
expression list ARGS;
@@
- error_setg(E1, FMT, ARGS, error_get_pretty(E2));
+ error_propagate(E1, E2);/*###*/
+ error_prepend(E1, FMT/*@@@*/, ARGS);
followed by manual cleanup, first because I can't figure out how to
make Coccinelle transform strings, and second to get rid of now
superfluous error_propagate().
We now use or propagate the original error whole instead of just its
message obtained with error_get_pretty(). This avoids suppressing its
hint (see commit 50b7b00), but I can't see how the errors touched in
this commit could come with hints. It also improves the message
printed with &error_abort when we screw up (see commit 1e9b65b).
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1449764955-10741-3-git-send-email-armbru@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
It's necessary to distinguish source and target before we can add
blockdev-mirror, because we would want a concrete type of operation to
check on target bs before starting.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1450932306-13717-2-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Request merging must not result in a huge request that exceeds the
maximum number of iovec elements. Use BlockLimits.max_iov instead of
hardcoding IOV_MAX.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
1. avoid possible superflous checking
2. make code more robustness
["make code more robustness" refers to avoiding integer
underflows/overflows.
--Stefan]
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-id: 1447207166-12612-1-git-send-email-arei.gonglei@huawei.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
src is stored in shared memory and src->nr_segments is dereferenced
twice at the end of the function. If a compiler decides to compile this
into two separate memory accesses then the size limitation could be
bypassed.
Fix it by removing the double access to src->nr_segments.
This is part of XSA-155.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
The Xen toolstack uses "vhd" to specify a disk in VHD format, however
the name of the driver in QEMU is "vpc". Replace "vhd" with "vpc", so
that QEMU can find the right driver to use for it.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
The official way of enabling dataplane is through the "iothread"
property that references an iothread object created by "-object
iothread". Since the old "x-data-plane=on" way now even crashes, it's
probably easier to just drop it:
$ qemu-system-x86_64 -drive file=null-co://,id=d0,if=none \
-device virtio-blk-pci,drive=d0,x-data-plane=on
ERROR:/home/fam/work/qemu/qom/object.c:1515:
object_get_canonical_path_component: assertion failed: (obj->parent != NULL)
Aborted
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1449485967-19240-1-git-send-email-famz@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For the "main area on file, oob in memory" case, fix the shifts so that
we erase the correct number of pages.
Signed-off-by: Ricard Wanderlöf <ricardw@axis.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This code has been dead for three years (since commit 7e7b7cba1).
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
"werror=report" would free the req in virtio_blk_handle_rw_error, we
mustn't write to it in that case.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1448239280-15025-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The shifts of the address mask and value shift beyond 32 bits when there
are 5 address cycles.
Cc: qemu-stable@nongnu.org
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When a request R is absorbed by request M, it is appended to the
"mr_next" queue led by M, and is completed together with the completion
of M, in virtio_blk_rw_complete.
During DMA restart in virtio_blk_dma_restart_bh, requests in s->rq are
parsed and submitted again, possibly with a stale req->mr_next. It could
be a problem if the request merging in virtio_blk_handle_request hasn't
refreshed every mr_next pointer, in which case, virtio_blk_rw_complete
could walk through unexpected requests following the stale pointers.
Fix this by unsetting the pointer in virtio_blk_rw_complete. It is safe
because this req is either completed and freed right away, or it will be
restarted and parsed from scratch out of the vq later.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently both BLKIF_OP_WRITE and BLKIF_OP_FLUSH_DISKCACHE are being
accounted as write operations.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 7a2a14e3ac62027aa6267a6c02abc70717be9c0a.1446044837.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When creating snapshot with the dataplane enabled, the snapshot file gets
not the actual state of virtqueue, because the current state is stored in
VirtIOBlockDataPlane. Therefore, before saving snapshot need to sync
the dataplane vring state to the virtqueue. The dataplane will resume its
work at the next notify virtqueue.
When snapshot loads with loadvm we get a message:
VQ 0 size 0x80 Guest index 0x15f5 inconsistent with Host index 0x0:
delta 0x15f5
error while loading state for instance 0x0 of device
'0000:00:08.0/virtio-blk'
Error -1 while loading VM state
to reproduce the error I used the following hmp commands:
savevm snap1
loadvm snap1
qemu parameters:
--enable-kvm -smp 4 -m 1024 -drive file=/var/lib/libvirt/images/centos6.4.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk0,id=virtio-disk0 -set device.virtio-disk0.x-data-plane=on
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Message-id: 1445859777-2982-1-git-send-email-den@openvz.org
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Devices that are compliant with virtio-1 do not support scsi
passthrough any more (and it has not been a recommended setup
anyway for quite some time). To avoid having to switch it off
explicitly in newer qemus that turn on virtio-1 by default, let's
switch the default to scsi=false for 2.5.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-id: 1444991154-79217-4-git-send-email-cornelia.huck@de.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
They will be excluded by type in the nested event loops in block layer,
so that unwanted events won't be processed there.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All callers pass in false, and the real external ones will switch to
true in coming patches.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blk_bs() will not necessarily return a non-NULL value any more (unless
blk_is_available() is true or it can be assumed to otherwise, e.g.
because it is called immediately after a successful blk_new_with_bs() or
blk_new_open()).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The tray of an FDD is open iff there is no medium inserted (there are
only two states for an FDD: "medium inserted" or "no medium inserted").
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Simplify memory allocation by sticking with a single API. GSlice
is not that fast anyway (tcmalloc/jemalloc are better).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The raw-posix block driver implements Linux AIO batching so multiple
requests can be submitted with a single io_submit(2) system call.
Batching is currently only used by virtio-scsi and
virtio-blk-data-plane.
Enable batching for regular virtio-blk so the number of io_submit(2)
system calls is reduced for workloads with queue depth > 1.
In 4KB random read performance tests with queue depth 32, the CPU
utilization on the host is reduced by 9.4%. The fio job is as follows:
[global]
bs=4k
ioengine=libaio
iodepth=32
direct=1
sync=0
time_based=1
runtime=30
clocksource=gettimeofday
ramp_time=5
[job1]
rw=randread
filename=/dev/vdb
size=4096M
write_bw_log=fio
write_iops_log=fio
write_lat_log=fio
log_avg_msec=1000
This benchmark was run on an raw image on LVM. The disk was an SSD
drive and -drive cache=none,aio=native was used.
Tested-by: Pradeep Surisetty <psuriset@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Symptom:
$ qemu-system-x86_64 -m 10000000
Unexpected error in ram_block_add() at /work/armbru/qemu/exec.c:1456:
upstream-qemu: cannot set up guest memory 'pc.ram': Cannot allocate memory
Aborted (core dumped)
Root cause: commit ef701d7 screwed up handling of out-of-memory
conditions. Before the commit, we report the error and exit(1), in
one place, ram_block_add(). The commit lifts the error handling up
the call chain some, to three places. Fine. Except it uses
&error_abort in these places, changing the behavior from exit(1) to
abort(), and thus undoing the work of commit 3922825 "exec: Don't
abort when we can't allocate guest memory".
The three places are:
* memory_region_init_ram()
Commit 4994653 (right after commit ef701d7) lifted the error
handling further, through memory_region_init_ram(), multiplying the
incorrect use of &error_abort. Later on, imitation of existing
(bad) code may have created more.
* memory_region_init_ram_ptr()
The &error_abort is still there.
* memory_region_init_rom_device()
Doesn't need fixing, because commit 33e0eb5 (soon after commit
ef701d7) lifted the error handling further, and in the process
changed it from &error_abort to passing it up the call chain.
Correct, because the callers are realize() methods.
Fix the error handling after memory_region_init_ram() with a
Coccinelle semantic patch:
@r@
expression mr, owner, name, size, err;
position p;
@@
memory_region_init_ram(mr, owner, name, size,
(
- &error_abort
+ &error_fatal
|
err@p
)
);
@script:python@
p << r.p;
@@
print "%s:%s:%s" % (p[0].file, p[0].line, p[0].column)
When the last argument is &error_abort, it gets replaced by
&error_fatal. This is the fix.
If the last argument is anything else, its position is reported. This
lets us check the fix is complete. Four positions get reported:
* ram_backend_memory_alloc()
Error is passed up the call chain, ultimately through
user_creatable_complete(). As far as I can tell, it's callers all
handle the error sanely.
* fsl_imx25_realize(), fsl_imx31_realize(), dp8393x_realize()
DeviceClass.realize() methods, errors handled sanely further up the
call chain.
We're good. Test case again behaves:
$ qemu-system-x86_64 -m 10000000
qemu-system-x86_64: cannot set up guest memory 'pc.ram': Cannot allocate memory
[Exit 1 ]
The next commits will repair the rest of commit ef701d7's damage.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1441983105-26376-3-git-send-email-armbru@redhat.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
* qemu_mutex_lock_iothread "No such process" fix
* cutils: qemu_strto* wrappers
* iohandler.c simplification
* Many other fixes and misc patches.
And some MTTCG work (with Emilio's fixes squashed):
* Signal-free TCG kick
* Removing spinlock in favor of QemuMutex
* User-mode emulation multi-threading fixes/docs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJV8Tk7AAoJEL/70l94x66Ds3QH/3bi0RRR2NtKIXAQrGo5tfuD
NPMu1K5Hy+/26AC6mEVNRh4kh7dPH5E4NnDGbxet1+osvmpjxAjc2JrxEybhHD0j
fkpzqynuBN6cA2Gu5GUNoKzxxTmi2RrEYigWDZqCftRXBeO2Hsr1etxJh9UoZw5H
dgpU3j/n0Q8s08jUJ1o789knZI/ckwL4oXK4u2KhSC7ZTCWhJT7Qr7c0JmiKReaF
JEYAsKkQhICVKRVmC8NxML8U58O8maBjQ62UN6nQpVaQd0Yo/6cstFTZsRrHMHL3
7A2Tyg862cMvp+1DOX3Bk02yXA+nxnzLF8kUe0rYo6llqDBDStzqyn1j9R0qeqA=
=nB06
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Support for jemalloc
* qemu_mutex_lock_iothread "No such process" fix
* cutils: qemu_strto* wrappers
* iohandler.c simplification
* Many other fixes and misc patches.
And some MTTCG work (with Emilio's fixes squashed):
* Signal-free TCG kick
* Removing spinlock in favor of QemuMutex
* User-mode emulation multi-threading fixes/docs
# gpg: Signature made Thu 10 Sep 2015 09:03:07 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
* remotes/bonzini/tags/for-upstream: (44 commits)
cutils: work around platform differences in strto{l,ul,ll,ull}
cpu-exec: fix lock hierarchy for user-mode emulation
exec: make mmap_lock/mmap_unlock globally available
tcg: comment on which functions have to be called with mmap_lock held
tcg: add memory barriers in page_find_alloc accesses
remove unused spinlock.
replace spinlock by QemuMutex.
cpus: remove tcg_halt_cond and tcg_cpu_thread globals
cpus: protect work list with work_mutex
scripts/dump-guest-memory.py: fix after RAMBlock change
configure: Add support for jemalloc
add macro file for coccinelle
configure: factor out adding disas configure
vhost-scsi: fix wrong vhost-scsi firmware path
checkpatch: remove tests that are not relevant outside the kernel
checkpatch: adapt some tests to QEMU
CODING_STYLE: update mixed declaration rules
qmp: Add example usage of strto*l() qemu wrapper
cutils: Add qemu_strtoull() wrapper
cutils: Add qemu_strtoll() wrapper
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A number of files were including signal.h but not using any
of the functions it provides
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit ef546f1275 ("virtio: add
feature checking helpers") introduced a helper __virtio_has_feature.
We don't want to use reserved identifiers, though, so let's
rename __virtio_has_feature to virtio_has_feature and virtio_has_feature
to virtio_vdev_has_feature.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The i8257 DMA controller uses an idle bottom half, which by default
does not cause the main loop to exit. Therefore, the DMA_schedule
function is there to ensure that the CPU relinquishes the iothread
mutex to the iothread.
However, this is not enough since the iothread will call
aio_compute_timeout() and go to sleep again. In the iothread
world, forcing execution of the idle bottom half is much simpler,
and only requires a call to qemu_notify_event(). Do it, removing
the need for the "cpu_request_exit" pseudo-irq. The next patch
will remove it.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use pow2ceil() to round up to the next power of 2, rather
than an inline calculation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1437741192-20955-4-git-send-email-peter.maydell@linaro.org
Chapter 6.3 of spec said
"
Transitional devices MUST offer, and if offered by the device
transitional drivers MUST accept the following:
VIRTIO_F_ANY_LAYOUT (27)
"
So this patch only clear VIRTIO_F_LAYOUT for legacy device.
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
SCSI passthrough was no longer supported in virtio 1.0, so this patch
fail the get_features() when both 1.0 and scsi is set. And also only
advertise VIRTIO_BLK_F_SCSI for legacy virtio-blk device.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Implement support in Identify and Get/Set Features to properly report
and allow to change the Volatile Write Cache status reported by the
virtual NVMe device.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Implement a real flush instead of faking it. This is especially important
as Qemu assume Write back cashing by default and thus requires a working
cache flush operation for data integrity.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
in_num = req->elem.in_num, and req->elem.in_num is
checked in line 489, so the check about in_num variable
is superflous, let's drop it.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1435138164-11728-1-git-send-email-arei.gonglei@huawei.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Each call of the virtio_blk_reset() function calls blk_drain_all(),
which works for all existing BlockDriverStates, while draining only
one is needed.
This patch replaces blk_drain_all() by blk_drain() in
virtio_blk_reset(). virtio_blk_data_plane_stop() should be called
after draining because it restores vblk->complete_request.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Message-id: 1434537440-28236-3-git-send-email-yarygin@linux.vnet.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We create optional sections with this patch. But we already have
optional subsections. Instead of having two mechanism that do the
same, we can just generalize it.
For subsections we just change:
- Add a needed function to VMStateDescription
- Remove VMStateSubsection (after removal of the needed function
it is just a VMStateDescription)
- Adjust the whole tree, moving the needed function to the corresponding
VMStateDescription
Signed-off-by: Juan Quintela <quintela@redhat.com>
* CONFIG_PARALLEL fix from Mirek
* Atomic/optimized dirty bitmap access from myself and Stefan
* BUILD_DIR convenience/bugfix from Peter C
* Memory leak fix from Shannon
* SMM improvements (though still TCG only) from myself and Gerd, acked by mst
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJVceAwAAoJEL/70l94x66Dyz4H/RHS/OUGo6HOwG1FZ4l8RxRl
FY+pwJqinxFyGySmMLVHEeQCsIfxgi8bOmuWblG7sdt245nhMIj2jglyEOCUA3RN
Q9qxQr6QyXBWiwK4bfB7xI1z3/mc8cVvuxjtkLaBMa16A4MXMunWCDcyhsX9/0Vw
VySgTgBbn5AyY5x58TbkB7Tl6hMZgxF0yNwU6IGQvP079dgREAL2tzR1Wk8kPC80
ltLWlrwTAzF2km5m6rmstpMeZ/XIaq3DD2LU03SyUhefMsYowGKK+7Boo4lHpVm9
XAlxflahN7VGtQuno5RpYNNSzGqSJgqu5X5JxCMnbWdPi4sX3bijQdcUhW3/0oo=
=KPIz
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* KVM error improvement from Laurent
* CONFIG_PARALLEL fix from Mirek
* Atomic/optimized dirty bitmap access from myself and Stefan
* BUILD_DIR convenience/bugfix from Peter C
* Memory leak fix from Shannon
* SMM improvements (though still TCG only) from myself and Gerd, acked by mst
# gpg: Signature made Fri Jun 5 18:45:20 2015 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (62 commits)
update Linux headers from kvm/next
atomics: add explicit compiler fence in __atomic memory barriers
ich9: implement SMI_LOCK
q35: implement TSEG
q35: add test for SMRAM.D_LCK
q35: implement SMRAM.D_LCK
q35: add config space wmask for SMRAM and ESMRAMC
q35: fix ESMRAMC default
q35: implement high SMRAM
hw/i386: remove smram_update
target-i386: use memory API to implement SMRAM
hw/i386: add a separate region that tracks the SMRAME bit
target-i386: create a separate AddressSpace for each CPU
vl: run "late" notifiers immediately
qom: add object_property_add_const_link
vl: allow full-blown QemuOpts syntax for -global
pflash_cfi01: add secure property
pflash_cfi01: change to new-style MMIO accessors
pflash_cfi01: change big-endian property to BIT type
target-i386: wake up processors that receive an SMI
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When this property is set, MMIO accesses are only allowed with the
MEMTXATTRS_SECURE attribute. This is used for secure access to UEFI
variables stored in flash.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The RQM bit in MSR should be set whenever the guest is supposed to
access the FIFO, and it should be cleared in all other cases. This is
important so the guest can't continue writing/reading the FIFO beyond
the length that it's suppossed to access (see CVE-2015-3456).
Commit e9077462 fixed the CVE by adding code that avoids the buffer
overflow; however it doesn't correct the wrong behaviour of the floppy
controller which should already have cleared RQM.
Currently, RQM stays set all the time and during all phases while a
command is being processed. This is error-prone because the command has
to explicitly clear the flag if it doesn't need data (and indeed, the
two buggy commands that are the culprits for the CVE just forgot to do
that).
This patch clears RQM immediately as soon as all bytes that are expected
have been received. If the the FIFO is used in the next phase, the flag
has to be set explicitly there.
It also clear RQM after receiving all bytes even if the phase transition
immediately sets it again. While it's technically not necessary at the
moment because the state between clearing and setting RQM is not
observable by the guest, this is more explicit and matches how real
hardware works. It will actually become necessary in qemu once
asynchronous code paths are introduced.
This alone should have been enough to fix the CVE, but now we have two
lines of defense - even better.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-8-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
This commit makes similar improvements as have already been made to the
write function: Instead of relying on a flag in the MSR to distinguish
controller phases, use the explicit phase that we store now. Assertions
of the right MSR flags are added.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-7-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Factor out a few common lines of code, reformat, improve comments.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-6-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Instead of relying on a flag in the MSR to distinguish controller phases,
use the explicit phase that we store now. Assertions of the right MSR
flags are added.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-5-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
The floppy controller spec describes three different controller phases,
which are currently not explicitly modelled in our emulation. Instead,
each phase is represented by a combination of flags in registers.
This patch makes explicit in which phase the controller currently is.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-4-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
What callers really do with this function is to switch from execution
phase (including data transfers) to result phase where the guest can
read out one or more status bytes from the FIFO (the number depends on
the command).
Rename the function accordingly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-3-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
What all callers of fdctrl_reset_fifo() really want to do is to start
the command phase, where writes to the data port initiate a new command.
The function doesn't only clear the FIFO, but also sets up the state so
that a new command can be received. Rename it to reflect this.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-2-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Make features 64bit wide everywhere.
On migration a full 64bit guest_features field is sent if one of the
high bits is set, in addition to the lower 32bit guest_features field
which must stay for compatibility reasons. That way we send the lower
32 feature bits twice, but the code is simpler because we don't have
to split and compose the 64bit features into two 32bit fields.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The SCSI emulation in the Linux NVMe driver really wants to know
if a device has a volatile write cache. Given that qemu has moved
away from a model where we report the backing store WCE bit to
one where the WCE bit is supposed to be part of the migratable
guest-visible state we always return 1 here.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
During processing of certain commands such as FD_CMD_READ_ID and
FD_CMD_DRIVE_SPECIFICATION_COMMAND the fifo memory access could
get out of bounds leading to memory corruption with values coming
from the guest.
Fix this by making sure that the index is always bounded by the
allocated memory.
This is CVE-2015-3456.
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Windows seems to send two separate calls to NVMe controller configuration. The
first sends configuration info and the second the enable bit. I couldn't
enable the Windows 8.1 in-box NVMe driver with base Qemu. I made the
following change to store the configuration data and then handle enable and
NVMe driver works on Windows 8.1.
I am not a Windows expert and I'm not entirely sure this is the correct
approach. I'm offering it for anyone who wishes to use NVMe on Windows 8.1
using Qemu.
I have tested this change with Linux and Windows guests with NVMe devices.
Signed-off-by: Daniel Stekloff <dan@wendan.org>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All of them were reported by codespell.
Most typos are in comments, one is in an error message.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
They were introduced in 6f7e9aec5e and
82407d1a40 and lots of bug fixes were done after that.
This fixes (at least) the detection of the floppy controller on Debian 4.0r9/SPARC,
and SS-5's OBP initialization routine still works.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-id: 1426351846-6497-1-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
Delay the call to blk_blockalign() until s->blk has been assigned.
This never caused a crash because blk_blockalign(NULL, size) defaults to
4096 alignment but it's technically incorrect.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1429091024-25098-1-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Of the block devices that poked into -drive options via drive_get_next,
m25p80 was the only one who also did not attach itself to the BlockBackend.
Since sd does it, and all other devices go through a "drive" property,
with this change all block backends attached to the guest will have a
non-NULL result for blk_get_attached_dev().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 1429025387-11077-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
After qemu_iovec_destroy, the QEMUIOVector's size is zeroed and
the zero size ultimately is used to compute virtqueue_push's len
argument. Therefore, reads from virtio-blk devices did not
migrate their results correctly. (Writes were okay).
Save the size in virtio_blk_handle_request, and use it when the request
is completed.
Based on a patch by Wen Congyang.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Message-id: 1427997044-392-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Drives defined with if!=none are for board initialization to wire up.
Board code calls drive_get() or similar to find them, and creates
devices with their qdev drive properties set accordingly.
Except a few devices go on a fishing expedition for a suitable backend
instead of exposing a drive property for board code to set: they call
driver_get() or drive_get_next() in their realize() or init() method
to implicitly connect to the "next" backend with a certain interface
type.
Picking up backends that way works when the devices are created by
board code. But it's inappropriate for -device or device_add. Not
only is this inconsistent with how the other block device models work
(they connect to a backend explicitly identified by a "drive"
property), it breaks when the "next" backend has been picked up by the
board already.
Example:
$ qemu-system-arm -S -M connex -pflash flash.img -device ssi-sd
Aborted (core dumped)
Mark them with suitable FIXME comments.
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Cc: "Andreas Färber" <andreas.faerber@web.de>
Cc: Michael Walle <michael@walle.cc>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The shift operation on nlb gives a 32 bit result if no type cast is
applied. This bug was reported by Coverity.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1426348844-8793-1-git-send-email-sw@weilnetz.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Coverity/sparse fix for iscsi driver
- RCU fallout: fix -daemonize and s390x system emulation
- KVM: kvm_stat improvements and new man page
- x86: SYSRET fix for VxWorks
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJU/sUFAAoJEL/70l94x66D1JwIAJ28Lan2DQwi+xHvNxF8zW6n
v7eMc04/fepuon0TYmUZC3qbqc00sccEQZQ+yAAauT9epZ/kdSDudDOzG+3F4MuQ
/X3crXw2/jrhtWedGq49vFCONX4MKoaoudqK8kOFMe1ImQgkOYeAzOoqeFXyHsFh
jINlKTJZB6oKzrZ+SYryY14cO7pvGaIhyqaCC+6GcVihTjm9Yq13lP1lFj7LsVRV
aGfd6xH9RSV/mwzvZwD4i3cUWSUaV/wY0NDhAEzDPCUcxX0/nAj3XF1YeJUF30Qd
ETaCLo/Nxq2R6POK3c/Zm/FRLvjzZ2caD+q1LcwB/bCYdc2lJ1JDxE/hr48ANv0=
=OWXY
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
- scsi: improvements to error reporting and conversion to realize,
Coverity/sparse fix for iscsi driver
- RCU fallout: fix -daemonize and s390x system emulation
- KVM: kvm_stat improvements and new man page
- x86: SYSRET fix for VxWorks
# gpg: Signature made Tue Mar 10 10:18:45 2015 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
x86: fix SS selector in SYSRET
scsi: Convert remaining PCI HBAs to realize()
scsi: Improve error reporting for invalid drive property
hw: Propagate errors through qdev_prop_set_drive()
scsi: Clean up duplicated error in legacy if=scsi code
cpus: initialize cpu->memory_dispatch
rcu: handle forks safely
qemu-thread: do not use PTHREAD_MUTEX_ERRORCHECK
kvm_stat: add kvm_stat.1 man page
kvm_stat: add column headers to text UI
iscsi: Fix check for username
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJU/uuVAAoJEH8JsnLIjy/WULwP/jeARjYkFuG3ahSWpeY0JnTK
QCkLF06iSQQUiirXI4H+Tofl8kNVBd/Iinv+LbkF27iWbTiwalmLz7NiyboX8dl+
NJZtCrqp44q7KFbl3g19/jop/zdZ9N5Gxp8BARVUILHQb1y5cXJwrDhBxTmNRDL+
sSZXfomCgKtMP40nGLa0CcNIYKlm8MePJEM2TsMoWv7tYz4CXgBG39EqK6NJluCY
kTTMcbdrLbR0imfKOVPutCgV8rhRXJ0oGVD3Q+D3/LFmPG++hoRnWCcDm6ZZ62Hi
Ra7u87TBfAUUtiT+vFQJnd7hTpN+stQidsCDBLEY3qPTKYhzm648PHvcEwOAv6YW
sjAELF2Rrsbe4vkL3/qgYDusnaPMElrHVEdbKtHofWtg6KctLnYIhusV+qKq1Fpa
cRQEbQIZMVFeWN1G9WuYH8RBYrwJqp+/qq7DcnV62lUAdY4e3iO7E3yMLFDwpxku
PLl7eofU/ZpnAOrrU2QAQvgXZRqy1ie/Unv8jFwefQkK5mXHoCtkAeBlOM8t4kJf
HjkC/hYO7kwPdaz6xK80wpXqYd3vT6jKi7mlJqC5oQQLGJbRigxlMZ16UIAx+IrL
NxhnQChp7IP21KMATFbpvYjcJyGMw3ZuVRaUhQBgqQArIomVHvM5WcN9M6S5dsmj
vClFOIqjlSbtsmChceWr
=hlbC
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches for 2.3
# gpg: Signature made Tue Mar 10 13:03:17 2015 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream: (73 commits)
MAINTAINERS: Add jcody as blockjobs, block devices maintainer
iotests: add O_DIRECT alignment probing test
block/raw-posix: fix launching with failed disks
MAINTAINERS: Add jsnow as IDE maintainer
sheepdog: Fix misleading error messages in sd_snapshot_create()
Add testcase for scsi-hd devices without drive property
scsi-hd: fix property unset case
block/vdi: Add locking for parallel requests
iotests: Drop vpc from 004's and 104's format list
iotests: Remove 006
iotests: Fix 051's reference output
virtio-blk: Remove the stale FIXME comment
tests: Check QVIRTIO_F_ANY_LAYOUT flag in virtio-blk test
libqos: Solve bug in interrupt checking when using MSIX in virtio-pci.c
sheepdog: fix confused return values
qtest/ahci: add fragmented dma test
qtest/ahci: Add PIO and LBA48 tests
qtest/ahci: Add DMA test variants
libqos/ahci: add ahci command helpers
qtest/ahci: Add a macro bootup routine
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
By default, we have ioeventfd enabled, so the IO request processing is
in IO thread; in the vcpu thread, guest mode is returned to as quickly
as possible, and completion is delivered via irqfd. Therefore this
comment from the initial implementation is barely relevant.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
geometry: hd_geometry_guess function autodetects the drive geometry.
This patch adds a block backend call, that probes the backing device
geometry. If the inner driver method is implemented and succeeds
(currently only for DASDs), the blkconf_geometry will pass-through
the backing device geometry. Otherwise will fallback to old logic.
blocksize: This patch initializes blocksize properties to 0.
In order to set the property a blkconf_blocksizes was introduced.
If user didn't set physical or logical blocksize, it will
retrieve its value from a driver (only succeeds for DASD), otherwise
it will set default 512 value.
The blkconf_blocksizes call was added to all users of BlkConf.
Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1424087278-49393-6-git-send-email-tumanova@linux.vnet.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since commit 1dc936aa84 (virtio-blk: Use blk_aio_ioctl) we silently lose
the request if blk_aio_ioctl returns NULL (not implemented).
Fix it by directly returning VIRTIO_BLK_S_UNSUPP as we used to do.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[ kwolf: Fixed build error on win32 ]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Three kinds of callers:
1. On failure, report the error and abort
Passing &error_abort does the job. No functional change.
2. On failure, report the error and exit()
This is qdev_prop_set_drive_nofail(). Error reporting moves from
qdev_prop_set_drive() to its caller. Because hiding away the error
in the monitor right before exit() isn't helpful, replace
qerror_report_err() by error_report_err(). Shouldn't make a
difference, because qdev_prop_set_drive_nofail() should never be
used in QMP context.
3. On failure, report the error and recover
This is usb_msd_init() and scsi_bus_legacy_add_drive(). Error
reporting and freeing the error object moves from
qdev_prop_set_drive() to its callers.
Because usb_msd_init() can't run in QMP context, replace
qerror_report_err() by error_report_err() there.
No functional change.
scsi_bus_legacy_add_drive() calling qerror_report_err() is of
course inappropriate, but this commit merely makes it more obvious.
The next one will clean it up.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-Id: <1425925048-15482-3-git-send-email-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a helper function for checking whether a bit is set in the guest
features for a vdev as well as one that works on a feature bit set.
Convert code that open-coded this: It cleans up the code and makes it
easier to extend the guest feature bits.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add virtio_{add,clear}_feature helper functions for manipulating a
feature bits variable. This has some benefits over open coding:
- add check that the bit is in a sane range
- make it obvious at a glance what is going on
- have a central point to change when we want to extend feature bits
Convert existing code manipulating features to use the new helpers.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Drop duplicated code. Minor codechanges were required
as geometry is a sub-structure now.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
As part of the required changes, this fixes a bug where specifying an
invalid driver would result in the block layer probing the image format;
now it will result in an error, unless "<unset>" is specified as the
driver name. Fixing this would require further work on the xen_disk code
which does not seem worth it (at this point and for this patch).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1423162705-32065-7-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The vring.c code currently assumes that guest and host endianness match,
which is not true for a number of cases:
- emulating targets with a different endianness than the host
- bi-endian targets, where the correct endianness depends on the virtio
device
- upcoming support for the virtio-1 standard mandates little-endian
accesses even for big-endian targets and hosts
Make sure to use accessors that depend on the virtio device.
Note that dataplane now needs to be built per-target.
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1422289602-17874-2-git-send-email-cornelia.huck@de.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
we check and adjust request sizes at several places with
sometimes inconsistent checks or default values:
INT_MAX
INT_MAX >> BDRV_SECTOR_BITS
UINT_MAX >> BDRV_SECTOR_BITS
SIZE_MAX >> BDRV_SECTOR_BITS
This patches introdocues a macro for the maximal allowed sectors
per request and uses it at several places.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
this adds a knob to disable request merging for debugging or benchmarks if dedired.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
this patch finally introduces multiread support to virtio-blk. While
multiwrite support was there for a long time, read support was missing.
The complete merge logic is moved into virtio-blk.c which has
been the only user of request merging ever since. This is required
to be able to merge chunks of requests and immediately invoke callbacks
for those requests. Secondly, this is required to switch to
direct invocation of coroutines which is planned at a later stage.
The following benchmarks show the performance of running fio with
4 worker threads on a local ram disk. The numbers show the average
of 10 test runs after 1 run as warmup phase.
| 4k | 64k | 4k
MB/s | rd seq | rd rand | rd seq | rd rand | wr seq | wr rand
--------------+--------+---------+--------+---------+--------+--------
master | 1221 | 1187 | 4178 | 4114 | 1745 | 1213
multiread | 1829 | 1189 | 4639 | 4110 | 1894 | 1216
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>