Multiqueue requires that each request knows to which virtqueue it
belongs.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-5-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Let the virtio_blk_data_plane_notify() caller decide which virtqueue to
notify. This will allow the function to be used with multiqueue.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-4-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The batch notification BH needs to know which virtqueues to notify when
multiqueue is enabled. Use a bitmap to track the virtqueues with
pending notifications.
At this point there is only one virtqueue so hard-code virtqueue index
0. A later patch will switch to real virtqueue indices.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-3-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The num_queues field is always 1 for the time being. A later patch will
make it a configurable device property so that multiqueue can be
enabled.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-2-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This reverts commit 10a06fd65f.
Dataplane has used the same virtqueue code as non-dataplane since
commits e24a47c5b7 ("virtio-scsi: do not
use vring in dataplane") and 03de2f5274
("virtio-blk: do not use vring in dataplane"). It is no longer
necessary to stop dataplane in order to sync state since there is no
duplicated virtqueue state.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Message-id: 1466503331-9831-1-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This commit fix obvious bug in WINBOND command handling.
Datasheet states that default dummy cycles is 8 so fix it.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Message-id: 1466755631-25201-11-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Macronix: mx66u51235f and mx66u1g45g
Micron: mt25ql01g and mt25qu01g
Spansion: s25fs512s and s70fs01gs
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-10-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Support for Spansion and Macronix flashes.
Additionally Numonyx(Micron) moved from default
in fast read commands family. Also moved fast read
command decoding to functions.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-9-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Configuration registers for Spansion and Macronix devices.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-8-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Quad and Equad modes for Spansion and Macronix flash devices.
This commit also includes modification and new command to manipulate
quad mode (status registers and dedicated commands).
This work is based on Pawel Lenkow work.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-7-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Page program 4byte/quad and erase 32K sectors 4 bytes.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-6-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some flash allows to stop read at any time.
Allow framework to support this.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-5-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Allow to have more than four 16MiB regions for bigger flash devices.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-4-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since it is now longer than 4. This work based on Pawel Lenkow
changes and the kernel SPI framework.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-3-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Instead of always reading and comparing jededc ID,
replace it by function.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Message-id: 1466755631-25201-2-git-send-email-marcin.krzeminski@nokia.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
All users have been converted to the new ioevent callbacks.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Have vhost and dataplane use the new api for transports that
have been converted.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In case the word size of the domU and qemu running the qdisk backend
differ BLKIF_OP_DISCARD will not work reliably, as the request
structure in the ring have different layouts for different word size.
Correct this by copying the request structure in case of different
word size element by element in the BLKIF_OP_DISCARD case, too.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Move all trace-events for files in the hw/block/ directory to
their own file.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-8-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qemu/osdep.h checks whether MAP_ANONYMOUS is defined, but this check
is bogus without a previous inclusion of sys/mman.h. Include it in
sysemu/os-posix.h and remove it from everywhere else.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
commit 243e6f69c1 ("m25p80: Switch to byte-based block access")
replaced blk_read() calls with blk_pread() but return values are
different.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit f9e98e5d7a ("xen/blkif: Avoid double access to
src->nr_segments") didn't go far enough: src->operation is also being
used twice. And nothing was done to prevent the compiler from using the
source side of the copy done by blk_get_request() (granted that's very
unlikely).
Move the barrier()s up, and add another one to blk_get_request().
Note that for completing XSA-155, the barrier() getting added to
blk_get_request() would suffice, and hence the changes to xen_blkif.h
are more like just cleanup. And since, as said, the unpatched code
getting compiled to something vulnerable is very unlikely (and not
observed in practice), this isn't being viewed as a new security issue.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).
This patch is the result of coccinelle script
scripts/coccinelle/round.cocci
CC: qemu-block@nongnu.org
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Block layer is prepared to unspecialize dataplane, an evidence is this
almost complete list of unblocked operations. It has all types except
two (actually three if DATAPLANE itself counts but blockdev.c makes sure
attaching twice is not possible): MIRROR_TARGET and BACKUP_TARGET.
blockdev-mirror refuses to start if target is attached, so the first is
not a problem.
By removing BACKUP_TARGET, blockdev-backup will become permissive to
write to a virtio-blk dataplane disk, but that is not worse than
non-dataplane given the latter is already possible. In either case,
blockdev.c always checks the target and source are on the same
AioContext, or bring them together if possible.
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1463969978-24970-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Move it to the actual users. There are still a few includes of
qemu/bswap.h in headers; removing them is left for future work.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sector-based blk_read() should die; switch to byte-based
blk_pread() instead.
Likewise for blk_aio_readv() and blk_aio_writev().
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based blk_write() should die; switch to byte-based
blk_pwrite() instead. Likewise for blk_read().
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based blk_write() should die; switch to byte-based
blk_pwrite() instead. Likewise for blk_read().
This particular device picks its size during onenand_initfn(),
and can be at most 0x80000000 bytes; therefore, shifting an
'int sec' request to get back to a byte offset should never
overflow 32 bits. But adding assertions to document that point
should not hurt.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based blk_write() should die; switch to byte-based
blk_pwrite() instead. Likewise for blk_read().
This file is doing some complex computations to map various
flash page sizes (256, 512, and 2048) atop generic uses of
512-byte sector operations. Perhaps someone will want to tidy
up the file for fewer gymnastics in managing addresses and
offsets, and less wasteful visits of 256-byte pages, but it
was out of scope for this series, where I just went with the
mechanical conversion.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based blk_write() should die; switch to byte-based
blk_pwrite() instead. Likewise for blk_read().
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based blk_aio_readv() and blk_aio_writev() should die; switch
to byte-based blk_aio_preadv() and blk_aio_pwritev() instead.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based blk_aio_readv() and blk_aio_writev() should die; switch
to byte-based blk_aio_preadv() and blk_aio_pwritev() instead.
The trace is modified at the same time, and nb_sectors is now
unused. Fix a comment typo while in the vicinity.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sector-based blk_read() should die; convert the one-off
variant blk_read_unthrottled().
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Pflash migration (e.g. q35 + EFI variable storage) fails
with the assert:
bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed.
This avoids the problem by delaying the pflash update until after
the device loads complete.
Tested by:
Migrating Q35/EFI vm.
Changing efi variable content (with efiboot in the guest)
md5sum'ing the variable file before migration and after.
This is a fix that Paolo posted in the message
570244B3.4070105@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Eliminating the reentrancy is actually a nice thing that we can do
with the API that Michael proposed, so let's make it first class.
This also hides the complex assign/set_handler conventions from
callers of virtio_queue_aio_set_host_notifier_handler, which in
fact was always called with assign=true.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In addition to handling IO in vcpu thread and in io thread, dataplane
introduces yet another mode: handling it by AioContext.
This reuses the same handler as previous modes, which triggers races as
these were not designed to be reentrant. Use a separate handler just
for aio, and disable regular handlers when dataplane is active.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We must not call virtio_blk_data_plane_notify if dataplane is
disabled: we would hit a segmentation fault in notify_guest_bh as
s->guest_notifier has not been setup and is NULL.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-12-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-11-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Implements FSR register, it is used for busy waits.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-10-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Adds fast read and 4bytes commands family.
This work is based on Pawel Lenkow patch from v1.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-9-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use the setting from the volatile cfg register to correctly
set the number of dummy cycles.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-8-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds both volatile and non volatile configuration registers
and commands to allow modify them. It is needed for proper handling
dummy cycles. Initialization of those registers and flash state
has been included as well.
Some of this registers are used by kernel.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Acked-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-7-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds only 4byte address mode (does not cover dummy cycles).
This mode is needed to access more than 16 MiB of flash.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-6-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Extend address mode allows to switch flash 16 MiB banks,
allowing user to access all flash sectors.
This access mode is used by u-boot.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-5-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Extend the width of the flags variable to support the already existing
(but unused) WR_1 flag, which is above the range of 8 bits.
This allows support of EEPROM emulation which requires the WR_1 feature.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-4-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-3-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-2-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Move declarations out of qemu-common.h for functions declared in
utils/ files: e.g. include/qemu/path.h for utils/path.c.
Move inline functions out of qemu-common.h and into new files (e.g.
include/qemu/bcd.h)
Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch replaces get_ticks_per_sec() calls with the macro
NANOSECONDS_PER_SECOND. Also, as there are no callers, get_ticks_per_sec()
is then removed. This replacement improves the readability and
understandability of code.
For example,
timer_mod(fdctrl->result_timer,
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + (get_ticks_per_sec() / 50));
NANOSECONDS_PER_SECOND makes it obvious that qemu_clock_get_ns
matches the unit of the expression on the right side of the plus.
Signed-off-by: Rutuja Shah <rutu.shah.26@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef. Since then, we've moved to include qemu/osdep.h
everywhere. Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h. That's in excess of
100KiB of crap most .c files don't actually need.
Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h. Include qapi/error.h in .c files that need it and don't
get it now. Include qapi-types.h in qom/object.h for uint16List.
Update scripts/clean-includes accordingly. Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h
comment quoted above similarly.
This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third. Unfortunately, the number depending on
qapi-types.h shrinks only a little. More work is needed for that one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before this patch, blk_new() automatically assigned a name to the new
BlockBackend and considered it referenced by the monitor. This patch
removes the implicit monitor_add_blk() call from blk_new() (and
consequently the monitor_remove_blk() call from blk_delete(), too) and
thus blk_new() (and related functions) no longer take a BB name
argument.
In fact, there is only a single point where blk_new()/blk_new_open() is
called and the new BB is monitor-owned, and that is in blockdev_init().
Besides thus relieving us from having to invent names for all of the BBs
we use in qemu-img, this fixes a bug where qemu cannot create a new
image if there already is a monitor-owned BB named "image".
If a BB and its BDS tree are created in a single operation, as of this
patch the BDS tree will be created before the BB is given a name
(whereas it was the other way around before). This results in minor
change to the output of iotest 087, whose reference output is amended
accordingly.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When populating ACPI objects for floppy drives one needs to provide the
maximum values for cylinder, sector, and head number the drive supports.
This patch adds a function that iterates through the array of predefined
floppy drive formats and returns the maximum values of c, h, s, out of
those matching the given floppy drive type.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Kevin O'Connor <kevin@koconnor.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
In disabled mode, virtio-blk dataplane seems to be enabled, but flow
actually goes through the normal virtio path. This patch simplifies a bit
the handling of disabled mode. In disabled mode, virtio_blk_handle_output
might be called even if s->dataplane is not NULL.
This is a bit tricky, because the current check for s->dataplane will
always trigger, causing a continuous stream of calls to
virtio_blk_data_plane_start. Unfortunately, these calls will not
do anything. To fix this, set the "started" flag even in disabled
mode, and skip virtio_blk_data_plane_start if the started flag is true.
The resulting changes also prepare the code for the next patch, were
virtio-blk dataplane will reuse the same virtio_blk_handle_output function
as "regular" virtio-blk.
Because struct VirtIOBlockDataPlane is opaque in virtio-blk.c, we have
to move s->dataplane->started inside struct VirtIOBlock.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Make the API more similar to the regular virtqueue API. This will
help when modifying the code to not use vring.c anymore.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Include osdep.h as the first header in nand.c; this has to be
done manually because coccinelle gets confused by the way that
this C file includes itself.
We fix some odd spacing in #includes while we are in the area.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Coverity noticed that some variables are only used by debug prints, and
called them unused. Always compile the print statements. While we're
here, print to stderr as well.
Bonus: Fix a debug printf I broke in f31937aa8
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Touched up commit message. --js]
Message-id: 1454971529-14830-1-git-send-email-jsnow@redhat.com
Included here:
Refactoring and bugfix patches in PC/ACPI.
New commands for ipmi.
Virtio optimizations.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJWtj8KAAoJECgfDbjSjVRpBIQIAJSB9xwTcBLXwD0+8z5lqjKC
GTtuVbHU0+Y/eO8O3llN5l+SzaRtPHo18Ele20Oz7IQc0ompANY273K6TOlyILwB
rOhrub71uqpOKbGlxXJflroEAXb78xVK02lohSUvOzCDpwV+6CS4ZaSer7yDCYkA
MODZj7rrEuN0RmBWqxbs1R7Mj2CeQJzlgTUNTBGCLEstoZGFOJq8FjVdG5P1q8vI
fnI9mGJ1JsDnmcUZe/bTFfB4VreqeQ7UuGyNAMMGnvIbr0D1a+CoaMdV7/HZ+KyT
5TIs0siVdhZei60A/Cq2OtSVCbj5QdxPBLhZfwJCp6oU4lh2U5tSvva0mh7MwJ0=
=D/cA
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc and misc cleanups and fixes, virtio optimizations
Included here:
Refactoring and bugfix patches in PC/ACPI.
New commands for ipmi.
Virtio optimizations.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sat 06 Feb 2016 18:44:26 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream: (45 commits)
net: set endianness on all backend devices
fix MSI injection on Xen
intel_iommu: large page support
dimm: Correct type of MemoryHotplugState->base
pc: set the OEM fields in the RSDT and the FADT from the SLIC
acpi: add function to extract oem_id and oem_table_id from the user's SLIC
acpi: expose oem_id and oem_table_id in build_rsdt()
acpi: take oem_id in build_header(), optionally
pc: Eliminate PcGuestInfo struct
pc: Move APIC and NUMA data from PcGuestInfo to PCMachineState
pc: Move PcGuestInfo.fw_cfg to PCMachineState
pc: Remove PcGuestInfo.isapc_ram_fw field
pc: Remove RAM size fields from PcGuestInfo
pc: Remove compat fields from PcGuestInfo
acpi: Don't save PcGuestInfo on AcpiBuildState
acpi: Remove guest_info parameters from functions
pc: Simplify xen_load_linux() signature
pc: Simplify pc_memory_init() signature
pc: Eliminate struct PcGuestInfoState
pc: Move PcGuestInfo declaration to top of file
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Move allocation to virtio functions also when loading/saving a
VirtQueueElement. This will also let the load/save functions
keep backwards compatibility when the VirtQueueElement layout
is changed.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The return code of virtqueue_pop/vring_pop is unused except to check for
errors or 0. We can thus easily move allocation inside the functions
and just return a pointer to the VirtQueueElement.
The advantage is that we will be able to allocate only the space that
is needed for the actual size of the s/g list instead of the full
VIRTQUEUE_MAX_SIZE items. Currently VirtQueueElement takes about 48K
of memory, and this kind of allocation puts a lot of stress on malloc.
By cutting the size by two or three orders of magnitude, malloc can
use much more efficient algorithms.
The patch is pretty large, but changes to each device are testable
more or less independently. Splitting it would mostly add churn.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
All functions relative to DMA (DMA_*() functions) are stubs on sparc platform.
Disable the DMA in the floppy controller, instead of calling these stubs.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-14-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
Accidentally, I removed a "feature" where empty drives had geometry
values applied to them, which allows seek on empty drives to work
"by accident," as QEMU actually tries to disallow that.
Seeks on empty drives should work, though, but the easiest thing is to
restore the misfeature where empty drives have non-zero geometries
applied.
Document the hack accordingly.
[Maintainer edit]
This fix corrects a regression introduced in d5d47efc, where
pick_geometry was modified such that it would not operate on empty
drives, and as a result if there is no diskette inserted, QEMU
no longer populates it with geometry bounds. As a result, seek fails
when QEMU denies to move the current track, but reports success anyway.
This can confuse the guest, leading to kernel panics in the guest.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1454106932-17236-1-git-send-email-jsnow@redhat.com
Put the code for setting up and removing op blockers into an own
function, respectively. Then, we can invoke those functions whenever a
BDS is removed from an virtio-blk BB or inserted into it.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This reverts the changes that commit
2e1280e8ff applied to hw/block/fdc.c;
also, an additional case of drv->media_inserted use has crept in since,
which is replaced by a call to blk_is_inserted().
That commit changed tests/fdc-test.c, too, because after it, one less
TRAY_MOVED event would be emitted when executing 'change' on an empty
drive. However, now, no TRAY_MOVED events will be emitted at all, and
the tray_open status returned by query-block will always be false,
necessitating (different) changes to tests/fdc-test.c and iotest 118,
which is why this patch is not a pure revert of said commit.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 1454096953-31773-4-git-send-email-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Background on QEMU boot indices
-------------------------------
Normally, the "bootindex" property is configured for bootable devices
with:
DEVICE_instance_init()
device_add_bootindex_property(..., "bootindex", ...)
object_property_add(..., device_get_bootindex,
device_set_bootindex, ...)
and when the bootindex is set on the QEMU command line, with
-device DEVICE,...,bootindex=N
the setter that was configured above is invoked:
device_set_bootindex()
/* parse boot index */
visit_type_int32()
/* verify unicity */
check_boot_index()
/* store parsed boot index */
...
/* insert device path to boot order */
add_boot_device_path()
In the last step, add_boot_device_path() ensures that an OpenFirmware
device path will show up in the "bootorder" fw_cfg file, at a position
corresponding to the device's boot index. Thus guest firmware (SeaBIOS and
OVMF) can try to boot off the device with the right priority.
NVMe boot index
---------------
In QEMU commit 33739c7129,
nvma: ide: add bootindex to qom property
the following generic setters / getters:
- device_set_bootindex()
- device_get_bootindex()
were open-coded for NVMe, under the names
- nvme_set_bootindex()
- nvme_get_bootindex()
Plus nvme_instance_init() was added to configure the "bootindex" property
manually, designating the open-coded getter & setter, rather than calling
device_add_bootindex_property().
Crucially, nvme_set_bootindex() avoided the final add_boot_device_path()
call. This fact is spelled out in the message of commit 33739c7129, and
it was presumably the entire reason for all of the code duplication.
Now, Vladislav filed an RFE for OVMF
<https://github.com/tianocore/edk2/issues/48>; OVMF should boot off NVMe
devices. It is simple to build edk2's existent NvmExpressDxe driver into
OVMF, but the boot order matching logic in OVMF can only handle NVMe if
the "bootorder" fw_cfg file includes such devices.
Therefore this patch converts the NVMe device model to
device_set_bootindex() all the way.
Device paths
------------
device_set_bootindex() accepts an optional parameter called "suffix". When
present, it is expected to take the form of an OpenFirmware device path
node, and it gets appended as last node to the otherwise auto-generated
OFW path.
For NVMe, the auto-generated part is
/pci@i0cf8/pci8086,5845@6[,1]
^ ^ ^ ^
| | PCI slot and (present when nonzero)
| | function of the NVMe controller, both hex
| "driver name" component, built from PCI vendor & device IDs
PCI root at system bus port, PIO
to which here we append the suffix
/namespace@1,0
^ ^
| big endian (MSB at lowest address) numeric interpretation
| of the 64-bit IEEE Extended Unique Identifier, aka EUI-64,
| hex
32-bit NVMe namespace identifier, aka NSID, hex
resulting in the OFW device path
/pci@i0cf8/pci8086,5845@6[,1]/namespace@1,0
The reason for including the NSID and the EUI-64 is that an NVMe device
can in theory produce several different namespaces (distinguished by
NSID). Additionally, each of those may (optionally) have an EUI-64 value.
For now, QEMU only provides namespace 1.
Furthermore, QEMU doesn't even represent the EUI-64 as a standalone field;
it is embedded (and left unused) inside the "NvmeIdNs.res30" array, at the
last eight bytes. (Which is fine, since EUI-64 can be left zero-filled if
unsupported by the device.)
Based on the above, we set the "unit address" part of the last
("namespace") node to fixed "1,0".
OVMF will then map the above OFW device path to the following UEFI device
path fragment, for boot order processing:
PciRoot(0x0)/Pci(0x6,0x1)/NVMe(0x1,00-00-00-00-00-00-00-00)
^ ^ ^ ^ ^ ^
| | | | | octets of the EUI-64 in address order
| | | | NSID
| | | NVMe namespace messaging device path node
| PCI slot and function
PCI root bridge
Cc: Keith Busch <keith.busch@intel.com> (supporter:nvme)
Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core)
Cc: qemu-block@nongnu.org (open list:nvme)
Cc: Gonglei <arei.gonglei@huawei.com>
Cc: Vladislav Vovchenko <vladislav.vovchenko@sk.com>
Cc: Feng Tian <feng.tian@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Kevin O'Connor <kevin@koconnor.net>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Gonglei <arei.gonglei@huawei.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Tested-by: Vladislav Vovchenko <vladislav.vovchenko@sk.com>
Message-id: 1453850483-27511-1-git-send-email-lersek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
In Xen 4.7 we are refactoring parts libxenctrl into a number of
separate libraries which will provide backward and forward API and ABI
compatiblity.
One such library will be libxengnttab which provides access to grant
tables.
In preparation for this switch the compatibility layer in xen_common.h
(which support building with older versions of Xen) to use what will
be the new library API. This means that the gnttab shim will disappear
for versions of Xen which include libxengnttab.
To simplify things for the <= 4.0.0 support we wrap the int fd in a
malloc(sizeof int) such that the handle is always a pointer. This
leads to less typedef headaches and the need for
XC_HANDLER_INITIAL_VALUE etc for these interfaces.
Note that this patch does not add any support for actually using
libxengnttab, it just adjusts the existing shims.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
The 2.88 drive is more suitable as a default because
it can still read 1.44 images correctly, but the reverse
is not true.
Since there exist virtio-win drivers that are shipped on
2.88 floppy images, this patch will allow VMs booted without
a floppy disk inserted to later insert a 2.88MB floppy and
have that work.
This patch has been tested with msdos, freedos, fedora,
windows 8 and windows 10 without issue: if problems do
arise for certain guests being unable to cope with 2.88MB
drives as the default, they are in the minority and can use
type=144 as needed (or insert a proper boot medium and omit
type=144/288 or use type=auto) to obtain different drive types.
As icing, the default will remain auto/144 for any pre-2.6
machine types, hopefully minimizing the impact of this change
in legacy hw to basically zero.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-13-git-send-email-jsnow@redhat.com
This one is the crazy one.
fd_revalidate currently uses pick_geometry to tell if the diskette
geometry has changed upon an eject/insert event, but it won't allow us
to insert a 1.44MB diskette into a 2.88MB drive. This is inflexible.
The new algorithm applies a new heuristic to guessing disk geometries
that allows us to switch diskette types as long as the physical size
matches before falling back to the old heuristic.
The old one is roughly:
- If the size (sectors) and type matches, choose it.
- Fall back to the first geometry that matched our type.
The new one is:
- If the size (sectors) and type matches, choose it.
- If the size (sectors) and physical size match, choose it.
- Fall back to the first geometry that matched our type.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1453495865-9649-11-git-send-email-jsnow@redhat.com
2.88MB capable drives can accept 1.44MB floppies,
for instance. To rework the pick_geometry function,
we need to know if our current drive can even accept
the type of disks we're considering.
NB: This allows us to distinguish between all of the
"total sectors" collisions between 1.20MB and 1.44MB
diskette types, by using the physical drive size as a
differentiator.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-10-git-send-email-jsnow@redhat.com
This patch adds a new explicit Floppy Drive Type option. The existing
behavior in QEMU is to automatically guess a drive type based on the
media inserted, or if a diskette is not present, arbitrarily assign one.
This behavior can be described as "auto." This patch adds the option
to pick an explicit behavior: 120, 144, 288 or none. The new "auto"
option is intended to mimic current behavior, while the other types
pick one explicitly.
Set the type given by the CLI during fd_init. If the type remains the
default (auto), we'll attempt to scan an inserted diskette if present
to determine a type. If auto is selected but no diskette is present,
we fall back to a predetermined default (currently 1.44MB to match
legacy QEMU behavior.)
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-9-git-send-email-jsnow@redhat.com
Currently, QEMU chooses a drive type automatically based on the inserted
media. If there is no disk inserted, it chooses a 1.44MB drive type.
Change this behavior to be configurable, but leave it defaulted to 1.44.
This is not earnestly intended to be used by a user or a management
library, but rather exists so that pre-2.6 board types can configure it
to be a legacy value.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-8-git-send-email-jsnow@redhat.com
Split apart pick_geometry by creating a pick_drive routine that will only
ever called during device bring-up instead of relying on pick_geometry to
be used in both cases.
With this change, the drive field is changed to be 'write once'. It is
not altered after the initialization routines exit.
media_validated does not need to be migrated. The target VM
will just revalidate the media on post_load anyway.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-7-git-send-email-jsnow@redhat.com
pick_geometry is a convoluted function that makes it difficult to tell
at a glance what QEMU's current behavior for choosing a floppy drive
type is when it can't quite identify the diskette.
The code iterates over all entries in the candidate geometry table
("fd_formats") and if our specific drive type matches a row in the table,
then either "match" is set to that entry (an exact match) and the loop
exits, or "first_match" will be non-negative (the first such entry that
shares the same drive type), and the loop continues. If our specific
drive type is NONE, then all drive types in the candidate geometry table
are considered. After iteration, if "match" was not set, we fall back to
"first match".
This means that either "match" was set, or we exited the loop without an
exact match, in which case:
- If drive type is NONE, the default is truly fd_formats[0], a 1.44MB
type, because "first_match" will always get set to the first item.
- If drive type is not NONE, pick_geometry's iteration was fussier and
only looked at rows that matched our drive type. However, since all
possible drive types are represented in the table, we still know that
"first match" was set.
- If drive type is not NONE and the fd_formats table lists no options for
our drive type, we choose fd_formats[1], an incomprehensibly bizarre
choice that can never happen anyway.
Correct this: If first_match is -1, it can ONLY mean we didn't edit our
fd_formats table correctly. Throw an assertion instead.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-6-git-send-email-jsnow@redhat.com
Currently, 'drive' is used both to represent the current diskette
type as well as the current drive type.
This patch adds a 'disk' field that is updated explicitly to match
the type of the disk.
As of this patch, disk and drive are always the same, but forthcoming
patches to change the behavior of pick_geometry will invalidate this
assumption.
disk does not need to be migrated because it is not user-visible state
nor is it currently used for any calculations. It is purely informative,
and will be rebuilt automatically via fd_revalidate on the new host.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-5-git-send-email-jsnow@redhat.com
Change the floppy drive type to a QAPI enum type, to allow us to
specify the floppy drive type from the CLI in a forthcoming patch.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-4-git-send-email-jsnow@redhat.com
Modify this function to operate directly on FDrive objects instead of
unpacking and passing all of those parameters manually. Reduces the
complexity in the caller and reduces the number of args to just one.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-3-git-send-email-jsnow@redhat.com
Code motion: I want to refactor this function to work with FDrive
directly, so shuffle it below that definition.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1453495865-9649-2-git-send-email-jsnow@redhat.com
Move the ssi.h include file into the ssi directory.
While touching the code also fix the typdef lines as
checkpatch complains.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add the sst25wf080 SPI flash device.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Done with this Coccinelle semantic patch
@@
expression FMT, E1, E2;
expression list ARGS;
@@
- error_setg(E1, FMT, ARGS, error_get_pretty(E2));
+ error_propagate(E1, E2);/*###*/
+ error_prepend(E1, FMT/*@@@*/, ARGS);
followed by manual cleanup, first because I can't figure out how to
make Coccinelle transform strings, and second to get rid of now
superfluous error_propagate().
We now use or propagate the original error whole instead of just its
message obtained with error_get_pretty(). This avoids suppressing its
hint (see commit 50b7b00), but I can't see how the errors touched in
this commit could come with hints. It also improves the message
printed with &error_abort when we screw up (see commit 1e9b65b).
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1449764955-10741-3-git-send-email-armbru@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
It's necessary to distinguish source and target before we can add
blockdev-mirror, because we would want a concrete type of operation to
check on target bs before starting.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1450932306-13717-2-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Request merging must not result in a huge request that exceeds the
maximum number of iovec elements. Use BlockLimits.max_iov instead of
hardcoding IOV_MAX.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
1. avoid possible superflous checking
2. make code more robustness
["make code more robustness" refers to avoiding integer
underflows/overflows.
--Stefan]
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-id: 1447207166-12612-1-git-send-email-arei.gonglei@huawei.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
src is stored in shared memory and src->nr_segments is dereferenced
twice at the end of the function. If a compiler decides to compile this
into two separate memory accesses then the size limitation could be
bypassed.
Fix it by removing the double access to src->nr_segments.
This is part of XSA-155.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
The Xen toolstack uses "vhd" to specify a disk in VHD format, however
the name of the driver in QEMU is "vpc". Replace "vhd" with "vpc", so
that QEMU can find the right driver to use for it.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
The official way of enabling dataplane is through the "iothread"
property that references an iothread object created by "-object
iothread". Since the old "x-data-plane=on" way now even crashes, it's
probably easier to just drop it:
$ qemu-system-x86_64 -drive file=null-co://,id=d0,if=none \
-device virtio-blk-pci,drive=d0,x-data-plane=on
ERROR:/home/fam/work/qemu/qom/object.c:1515:
object_get_canonical_path_component: assertion failed: (obj->parent != NULL)
Aborted
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1449485967-19240-1-git-send-email-famz@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For the "main area on file, oob in memory" case, fix the shifts so that
we erase the correct number of pages.
Signed-off-by: Ricard Wanderlöf <ricardw@axis.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This code has been dead for three years (since commit 7e7b7cba1).
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
"werror=report" would free the req in virtio_blk_handle_rw_error, we
mustn't write to it in that case.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1448239280-15025-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The shifts of the address mask and value shift beyond 32 bits when there
are 5 address cycles.
Cc: qemu-stable@nongnu.org
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When a request R is absorbed by request M, it is appended to the
"mr_next" queue led by M, and is completed together with the completion
of M, in virtio_blk_rw_complete.
During DMA restart in virtio_blk_dma_restart_bh, requests in s->rq are
parsed and submitted again, possibly with a stale req->mr_next. It could
be a problem if the request merging in virtio_blk_handle_request hasn't
refreshed every mr_next pointer, in which case, virtio_blk_rw_complete
could walk through unexpected requests following the stale pointers.
Fix this by unsetting the pointer in virtio_blk_rw_complete. It is safe
because this req is either completed and freed right away, or it will be
restarted and parsed from scratch out of the vq later.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently both BLKIF_OP_WRITE and BLKIF_OP_FLUSH_DISKCACHE are being
accounted as write operations.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 7a2a14e3ac62027aa6267a6c02abc70717be9c0a.1446044837.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When creating snapshot with the dataplane enabled, the snapshot file gets
not the actual state of virtqueue, because the current state is stored in
VirtIOBlockDataPlane. Therefore, before saving snapshot need to sync
the dataplane vring state to the virtqueue. The dataplane will resume its
work at the next notify virtqueue.
When snapshot loads with loadvm we get a message:
VQ 0 size 0x80 Guest index 0x15f5 inconsistent with Host index 0x0:
delta 0x15f5
error while loading state for instance 0x0 of device
'0000:00:08.0/virtio-blk'
Error -1 while loading VM state
to reproduce the error I used the following hmp commands:
savevm snap1
loadvm snap1
qemu parameters:
--enable-kvm -smp 4 -m 1024 -drive file=/var/lib/libvirt/images/centos6.4.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk0,id=virtio-disk0 -set device.virtio-disk0.x-data-plane=on
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Message-id: 1445859777-2982-1-git-send-email-den@openvz.org
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Devices that are compliant with virtio-1 do not support scsi
passthrough any more (and it has not been a recommended setup
anyway for quite some time). To avoid having to switch it off
explicitly in newer qemus that turn on virtio-1 by default, let's
switch the default to scsi=false for 2.5.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-id: 1444991154-79217-4-git-send-email-cornelia.huck@de.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
They will be excluded by type in the nested event loops in block layer,
so that unwanted events won't be processed there.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All callers pass in false, and the real external ones will switch to
true in coming patches.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blk_bs() will not necessarily return a non-NULL value any more (unless
blk_is_available() is true or it can be assumed to otherwise, e.g.
because it is called immediately after a successful blk_new_with_bs() or
blk_new_open()).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The tray of an FDD is open iff there is no medium inserted (there are
only two states for an FDD: "medium inserted" or "no medium inserted").
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Simplify memory allocation by sticking with a single API. GSlice
is not that fast anyway (tcmalloc/jemalloc are better).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The raw-posix block driver implements Linux AIO batching so multiple
requests can be submitted with a single io_submit(2) system call.
Batching is currently only used by virtio-scsi and
virtio-blk-data-plane.
Enable batching for regular virtio-blk so the number of io_submit(2)
system calls is reduced for workloads with queue depth > 1.
In 4KB random read performance tests with queue depth 32, the CPU
utilization on the host is reduced by 9.4%. The fio job is as follows:
[global]
bs=4k
ioengine=libaio
iodepth=32
direct=1
sync=0
time_based=1
runtime=30
clocksource=gettimeofday
ramp_time=5
[job1]
rw=randread
filename=/dev/vdb
size=4096M
write_bw_log=fio
write_iops_log=fio
write_lat_log=fio
log_avg_msec=1000
This benchmark was run on an raw image on LVM. The disk was an SSD
drive and -drive cache=none,aio=native was used.
Tested-by: Pradeep Surisetty <psuriset@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Symptom:
$ qemu-system-x86_64 -m 10000000
Unexpected error in ram_block_add() at /work/armbru/qemu/exec.c:1456:
upstream-qemu: cannot set up guest memory 'pc.ram': Cannot allocate memory
Aborted (core dumped)
Root cause: commit ef701d7 screwed up handling of out-of-memory
conditions. Before the commit, we report the error and exit(1), in
one place, ram_block_add(). The commit lifts the error handling up
the call chain some, to three places. Fine. Except it uses
&error_abort in these places, changing the behavior from exit(1) to
abort(), and thus undoing the work of commit 3922825 "exec: Don't
abort when we can't allocate guest memory".
The three places are:
* memory_region_init_ram()
Commit 4994653 (right after commit ef701d7) lifted the error
handling further, through memory_region_init_ram(), multiplying the
incorrect use of &error_abort. Later on, imitation of existing
(bad) code may have created more.
* memory_region_init_ram_ptr()
The &error_abort is still there.
* memory_region_init_rom_device()
Doesn't need fixing, because commit 33e0eb5 (soon after commit
ef701d7) lifted the error handling further, and in the process
changed it from &error_abort to passing it up the call chain.
Correct, because the callers are realize() methods.
Fix the error handling after memory_region_init_ram() with a
Coccinelle semantic patch:
@r@
expression mr, owner, name, size, err;
position p;
@@
memory_region_init_ram(mr, owner, name, size,
(
- &error_abort
+ &error_fatal
|
err@p
)
);
@script:python@
p << r.p;
@@
print "%s:%s:%s" % (p[0].file, p[0].line, p[0].column)
When the last argument is &error_abort, it gets replaced by
&error_fatal. This is the fix.
If the last argument is anything else, its position is reported. This
lets us check the fix is complete. Four positions get reported:
* ram_backend_memory_alloc()
Error is passed up the call chain, ultimately through
user_creatable_complete(). As far as I can tell, it's callers all
handle the error sanely.
* fsl_imx25_realize(), fsl_imx31_realize(), dp8393x_realize()
DeviceClass.realize() methods, errors handled sanely further up the
call chain.
We're good. Test case again behaves:
$ qemu-system-x86_64 -m 10000000
qemu-system-x86_64: cannot set up guest memory 'pc.ram': Cannot allocate memory
[Exit 1 ]
The next commits will repair the rest of commit ef701d7's damage.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1441983105-26376-3-git-send-email-armbru@redhat.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
* qemu_mutex_lock_iothread "No such process" fix
* cutils: qemu_strto* wrappers
* iohandler.c simplification
* Many other fixes and misc patches.
And some MTTCG work (with Emilio's fixes squashed):
* Signal-free TCG kick
* Removing spinlock in favor of QemuMutex
* User-mode emulation multi-threading fixes/docs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJV8Tk7AAoJEL/70l94x66Ds3QH/3bi0RRR2NtKIXAQrGo5tfuD
NPMu1K5Hy+/26AC6mEVNRh4kh7dPH5E4NnDGbxet1+osvmpjxAjc2JrxEybhHD0j
fkpzqynuBN6cA2Gu5GUNoKzxxTmi2RrEYigWDZqCftRXBeO2Hsr1etxJh9UoZw5H
dgpU3j/n0Q8s08jUJ1o789knZI/ckwL4oXK4u2KhSC7ZTCWhJT7Qr7c0JmiKReaF
JEYAsKkQhICVKRVmC8NxML8U58O8maBjQ62UN6nQpVaQd0Yo/6cstFTZsRrHMHL3
7A2Tyg862cMvp+1DOX3Bk02yXA+nxnzLF8kUe0rYo6llqDBDStzqyn1j9R0qeqA=
=nB06
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Support for jemalloc
* qemu_mutex_lock_iothread "No such process" fix
* cutils: qemu_strto* wrappers
* iohandler.c simplification
* Many other fixes and misc patches.
And some MTTCG work (with Emilio's fixes squashed):
* Signal-free TCG kick
* Removing spinlock in favor of QemuMutex
* User-mode emulation multi-threading fixes/docs
# gpg: Signature made Thu 10 Sep 2015 09:03:07 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
* remotes/bonzini/tags/for-upstream: (44 commits)
cutils: work around platform differences in strto{l,ul,ll,ull}
cpu-exec: fix lock hierarchy for user-mode emulation
exec: make mmap_lock/mmap_unlock globally available
tcg: comment on which functions have to be called with mmap_lock held
tcg: add memory barriers in page_find_alloc accesses
remove unused spinlock.
replace spinlock by QemuMutex.
cpus: remove tcg_halt_cond and tcg_cpu_thread globals
cpus: protect work list with work_mutex
scripts/dump-guest-memory.py: fix after RAMBlock change
configure: Add support for jemalloc
add macro file for coccinelle
configure: factor out adding disas configure
vhost-scsi: fix wrong vhost-scsi firmware path
checkpatch: remove tests that are not relevant outside the kernel
checkpatch: adapt some tests to QEMU
CODING_STYLE: update mixed declaration rules
qmp: Add example usage of strto*l() qemu wrapper
cutils: Add qemu_strtoull() wrapper
cutils: Add qemu_strtoll() wrapper
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A number of files were including signal.h but not using any
of the functions it provides
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit ef546f1275 ("virtio: add
feature checking helpers") introduced a helper __virtio_has_feature.
We don't want to use reserved identifiers, though, so let's
rename __virtio_has_feature to virtio_has_feature and virtio_has_feature
to virtio_vdev_has_feature.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The i8257 DMA controller uses an idle bottom half, which by default
does not cause the main loop to exit. Therefore, the DMA_schedule
function is there to ensure that the CPU relinquishes the iothread
mutex to the iothread.
However, this is not enough since the iothread will call
aio_compute_timeout() and go to sleep again. In the iothread
world, forcing execution of the idle bottom half is much simpler,
and only requires a call to qemu_notify_event(). Do it, removing
the need for the "cpu_request_exit" pseudo-irq. The next patch
will remove it.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use pow2ceil() to round up to the next power of 2, rather
than an inline calculation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1437741192-20955-4-git-send-email-peter.maydell@linaro.org
Chapter 6.3 of spec said
"
Transitional devices MUST offer, and if offered by the device
transitional drivers MUST accept the following:
VIRTIO_F_ANY_LAYOUT (27)
"
So this patch only clear VIRTIO_F_LAYOUT for legacy device.
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
SCSI passthrough was no longer supported in virtio 1.0, so this patch
fail the get_features() when both 1.0 and scsi is set. And also only
advertise VIRTIO_BLK_F_SCSI for legacy virtio-blk device.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Implement support in Identify and Get/Set Features to properly report
and allow to change the Volatile Write Cache status reported by the
virtual NVMe device.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Implement a real flush instead of faking it. This is especially important
as Qemu assume Write back cashing by default and thus requires a working
cache flush operation for data integrity.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
in_num = req->elem.in_num, and req->elem.in_num is
checked in line 489, so the check about in_num variable
is superflous, let's drop it.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1435138164-11728-1-git-send-email-arei.gonglei@huawei.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Each call of the virtio_blk_reset() function calls blk_drain_all(),
which works for all existing BlockDriverStates, while draining only
one is needed.
This patch replaces blk_drain_all() by blk_drain() in
virtio_blk_reset(). virtio_blk_data_plane_stop() should be called
after draining because it restores vblk->complete_request.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Message-id: 1434537440-28236-3-git-send-email-yarygin@linux.vnet.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We create optional sections with this patch. But we already have
optional subsections. Instead of having two mechanism that do the
same, we can just generalize it.
For subsections we just change:
- Add a needed function to VMStateDescription
- Remove VMStateSubsection (after removal of the needed function
it is just a VMStateDescription)
- Adjust the whole tree, moving the needed function to the corresponding
VMStateDescription
Signed-off-by: Juan Quintela <quintela@redhat.com>
* CONFIG_PARALLEL fix from Mirek
* Atomic/optimized dirty bitmap access from myself and Stefan
* BUILD_DIR convenience/bugfix from Peter C
* Memory leak fix from Shannon
* SMM improvements (though still TCG only) from myself and Gerd, acked by mst
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJVceAwAAoJEL/70l94x66Dyz4H/RHS/OUGo6HOwG1FZ4l8RxRl
FY+pwJqinxFyGySmMLVHEeQCsIfxgi8bOmuWblG7sdt245nhMIj2jglyEOCUA3RN
Q9qxQr6QyXBWiwK4bfB7xI1z3/mc8cVvuxjtkLaBMa16A4MXMunWCDcyhsX9/0Vw
VySgTgBbn5AyY5x58TbkB7Tl6hMZgxF0yNwU6IGQvP079dgREAL2tzR1Wk8kPC80
ltLWlrwTAzF2km5m6rmstpMeZ/XIaq3DD2LU03SyUhefMsYowGKK+7Boo4lHpVm9
XAlxflahN7VGtQuno5RpYNNSzGqSJgqu5X5JxCMnbWdPi4sX3bijQdcUhW3/0oo=
=KPIz
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* KVM error improvement from Laurent
* CONFIG_PARALLEL fix from Mirek
* Atomic/optimized dirty bitmap access from myself and Stefan
* BUILD_DIR convenience/bugfix from Peter C
* Memory leak fix from Shannon
* SMM improvements (though still TCG only) from myself and Gerd, acked by mst
# gpg: Signature made Fri Jun 5 18:45:20 2015 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (62 commits)
update Linux headers from kvm/next
atomics: add explicit compiler fence in __atomic memory barriers
ich9: implement SMI_LOCK
q35: implement TSEG
q35: add test for SMRAM.D_LCK
q35: implement SMRAM.D_LCK
q35: add config space wmask for SMRAM and ESMRAMC
q35: fix ESMRAMC default
q35: implement high SMRAM
hw/i386: remove smram_update
target-i386: use memory API to implement SMRAM
hw/i386: add a separate region that tracks the SMRAME bit
target-i386: create a separate AddressSpace for each CPU
vl: run "late" notifiers immediately
qom: add object_property_add_const_link
vl: allow full-blown QemuOpts syntax for -global
pflash_cfi01: add secure property
pflash_cfi01: change to new-style MMIO accessors
pflash_cfi01: change big-endian property to BIT type
target-i386: wake up processors that receive an SMI
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When this property is set, MMIO accesses are only allowed with the
MEMTXATTRS_SECURE attribute. This is used for secure access to UEFI
variables stored in flash.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The RQM bit in MSR should be set whenever the guest is supposed to
access the FIFO, and it should be cleared in all other cases. This is
important so the guest can't continue writing/reading the FIFO beyond
the length that it's suppossed to access (see CVE-2015-3456).
Commit e9077462 fixed the CVE by adding code that avoids the buffer
overflow; however it doesn't correct the wrong behaviour of the floppy
controller which should already have cleared RQM.
Currently, RQM stays set all the time and during all phases while a
command is being processed. This is error-prone because the command has
to explicitly clear the flag if it doesn't need data (and indeed, the
two buggy commands that are the culprits for the CVE just forgot to do
that).
This patch clears RQM immediately as soon as all bytes that are expected
have been received. If the the FIFO is used in the next phase, the flag
has to be set explicitly there.
It also clear RQM after receiving all bytes even if the phase transition
immediately sets it again. While it's technically not necessary at the
moment because the state between clearing and setting RQM is not
observable by the guest, this is more explicit and matches how real
hardware works. It will actually become necessary in qemu once
asynchronous code paths are introduced.
This alone should have been enough to fix the CVE, but now we have two
lines of defense - even better.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-8-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
This commit makes similar improvements as have already been made to the
write function: Instead of relying on a flag in the MSR to distinguish
controller phases, use the explicit phase that we store now. Assertions
of the right MSR flags are added.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-7-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Factor out a few common lines of code, reformat, improve comments.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-6-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Instead of relying on a flag in the MSR to distinguish controller phases,
use the explicit phase that we store now. Assertions of the right MSR
flags are added.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-5-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
The floppy controller spec describes three different controller phases,
which are currently not explicitly modelled in our emulation. Instead,
each phase is represented by a combination of flags in registers.
This patch makes explicit in which phase the controller currently is.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-4-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
What callers really do with this function is to switch from execution
phase (including data transfers) to result phase where the guest can
read out one or more status bytes from the FIFO (the number depends on
the command).
Rename the function accordingly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-3-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
What all callers of fdctrl_reset_fifo() really want to do is to start
the command phase, where writes to the data port initiate a new command.
The function doesn't only clear the FIFO, but also sets up the state so
that a new command can be received. Rename it to reflect this.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-2-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Make features 64bit wide everywhere.
On migration a full 64bit guest_features field is sent if one of the
high bits is set, in addition to the lower 32bit guest_features field
which must stay for compatibility reasons. That way we send the lower
32 feature bits twice, but the code is simpler because we don't have
to split and compose the 64bit features into two 32bit fields.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The SCSI emulation in the Linux NVMe driver really wants to know
if a device has a volatile write cache. Given that qemu has moved
away from a model where we report the backing store WCE bit to
one where the WCE bit is supposed to be part of the migratable
guest-visible state we always return 1 here.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
During processing of certain commands such as FD_CMD_READ_ID and
FD_CMD_DRIVE_SPECIFICATION_COMMAND the fifo memory access could
get out of bounds leading to memory corruption with values coming
from the guest.
Fix this by making sure that the index is always bounded by the
allocated memory.
This is CVE-2015-3456.
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Windows seems to send two separate calls to NVMe controller configuration. The
first sends configuration info and the second the enable bit. I couldn't
enable the Windows 8.1 in-box NVMe driver with base Qemu. I made the
following change to store the configuration data and then handle enable and
NVMe driver works on Windows 8.1.
I am not a Windows expert and I'm not entirely sure this is the correct
approach. I'm offering it for anyone who wishes to use NVMe on Windows 8.1
using Qemu.
I have tested this change with Linux and Windows guests with NVMe devices.
Signed-off-by: Daniel Stekloff <dan@wendan.org>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All of them were reported by codespell.
Most typos are in comments, one is in an error message.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
They were introduced in 6f7e9aec5e and
82407d1a40 and lots of bug fixes were done after that.
This fixes (at least) the detection of the floppy controller on Debian 4.0r9/SPARC,
and SS-5's OBP initialization routine still works.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-id: 1426351846-6497-1-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
Delay the call to blk_blockalign() until s->blk has been assigned.
This never caused a crash because blk_blockalign(NULL, size) defaults to
4096 alignment but it's technically incorrect.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1429091024-25098-1-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Of the block devices that poked into -drive options via drive_get_next,
m25p80 was the only one who also did not attach itself to the BlockBackend.
Since sd does it, and all other devices go through a "drive" property,
with this change all block backends attached to the guest will have a
non-NULL result for blk_get_attached_dev().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 1429025387-11077-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
After qemu_iovec_destroy, the QEMUIOVector's size is zeroed and
the zero size ultimately is used to compute virtqueue_push's len
argument. Therefore, reads from virtio-blk devices did not
migrate their results correctly. (Writes were okay).
Save the size in virtio_blk_handle_request, and use it when the request
is completed.
Based on a patch by Wen Congyang.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Message-id: 1427997044-392-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Drives defined with if!=none are for board initialization to wire up.
Board code calls drive_get() or similar to find them, and creates
devices with their qdev drive properties set accordingly.
Except a few devices go on a fishing expedition for a suitable backend
instead of exposing a drive property for board code to set: they call
driver_get() or drive_get_next() in their realize() or init() method
to implicitly connect to the "next" backend with a certain interface
type.
Picking up backends that way works when the devices are created by
board code. But it's inappropriate for -device or device_add. Not
only is this inconsistent with how the other block device models work
(they connect to a backend explicitly identified by a "drive"
property), it breaks when the "next" backend has been picked up by the
board already.
Example:
$ qemu-system-arm -S -M connex -pflash flash.img -device ssi-sd
Aborted (core dumped)
Mark them with suitable FIXME comments.
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Cc: "Andreas Färber" <andreas.faerber@web.de>
Cc: Michael Walle <michael@walle.cc>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The shift operation on nlb gives a 32 bit result if no type cast is
applied. This bug was reported by Coverity.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1426348844-8793-1-git-send-email-sw@weilnetz.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Coverity/sparse fix for iscsi driver
- RCU fallout: fix -daemonize and s390x system emulation
- KVM: kvm_stat improvements and new man page
- x86: SYSRET fix for VxWorks
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJU/sUFAAoJEL/70l94x66D1JwIAJ28Lan2DQwi+xHvNxF8zW6n
v7eMc04/fepuon0TYmUZC3qbqc00sccEQZQ+yAAauT9epZ/kdSDudDOzG+3F4MuQ
/X3crXw2/jrhtWedGq49vFCONX4MKoaoudqK8kOFMe1ImQgkOYeAzOoqeFXyHsFh
jINlKTJZB6oKzrZ+SYryY14cO7pvGaIhyqaCC+6GcVihTjm9Yq13lP1lFj7LsVRV
aGfd6xH9RSV/mwzvZwD4i3cUWSUaV/wY0NDhAEzDPCUcxX0/nAj3XF1YeJUF30Qd
ETaCLo/Nxq2R6POK3c/Zm/FRLvjzZ2caD+q1LcwB/bCYdc2lJ1JDxE/hr48ANv0=
=OWXY
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
- scsi: improvements to error reporting and conversion to realize,
Coverity/sparse fix for iscsi driver
- RCU fallout: fix -daemonize and s390x system emulation
- KVM: kvm_stat improvements and new man page
- x86: SYSRET fix for VxWorks
# gpg: Signature made Tue Mar 10 10:18:45 2015 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
x86: fix SS selector in SYSRET
scsi: Convert remaining PCI HBAs to realize()
scsi: Improve error reporting for invalid drive property
hw: Propagate errors through qdev_prop_set_drive()
scsi: Clean up duplicated error in legacy if=scsi code
cpus: initialize cpu->memory_dispatch
rcu: handle forks safely
qemu-thread: do not use PTHREAD_MUTEX_ERRORCHECK
kvm_stat: add kvm_stat.1 man page
kvm_stat: add column headers to text UI
iscsi: Fix check for username
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJU/uuVAAoJEH8JsnLIjy/WULwP/jeARjYkFuG3ahSWpeY0JnTK
QCkLF06iSQQUiirXI4H+Tofl8kNVBd/Iinv+LbkF27iWbTiwalmLz7NiyboX8dl+
NJZtCrqp44q7KFbl3g19/jop/zdZ9N5Gxp8BARVUILHQb1y5cXJwrDhBxTmNRDL+
sSZXfomCgKtMP40nGLa0CcNIYKlm8MePJEM2TsMoWv7tYz4CXgBG39EqK6NJluCY
kTTMcbdrLbR0imfKOVPutCgV8rhRXJ0oGVD3Q+D3/LFmPG++hoRnWCcDm6ZZ62Hi
Ra7u87TBfAUUtiT+vFQJnd7hTpN+stQidsCDBLEY3qPTKYhzm648PHvcEwOAv6YW
sjAELF2Rrsbe4vkL3/qgYDusnaPMElrHVEdbKtHofWtg6KctLnYIhusV+qKq1Fpa
cRQEbQIZMVFeWN1G9WuYH8RBYrwJqp+/qq7DcnV62lUAdY4e3iO7E3yMLFDwpxku
PLl7eofU/ZpnAOrrU2QAQvgXZRqy1ie/Unv8jFwefQkK5mXHoCtkAeBlOM8t4kJf
HjkC/hYO7kwPdaz6xK80wpXqYd3vT6jKi7mlJqC5oQQLGJbRigxlMZ16UIAx+IrL
NxhnQChp7IP21KMATFbpvYjcJyGMw3ZuVRaUhQBgqQArIomVHvM5WcN9M6S5dsmj
vClFOIqjlSbtsmChceWr
=hlbC
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches for 2.3
# gpg: Signature made Tue Mar 10 13:03:17 2015 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream: (73 commits)
MAINTAINERS: Add jcody as blockjobs, block devices maintainer
iotests: add O_DIRECT alignment probing test
block/raw-posix: fix launching with failed disks
MAINTAINERS: Add jsnow as IDE maintainer
sheepdog: Fix misleading error messages in sd_snapshot_create()
Add testcase for scsi-hd devices without drive property
scsi-hd: fix property unset case
block/vdi: Add locking for parallel requests
iotests: Drop vpc from 004's and 104's format list
iotests: Remove 006
iotests: Fix 051's reference output
virtio-blk: Remove the stale FIXME comment
tests: Check QVIRTIO_F_ANY_LAYOUT flag in virtio-blk test
libqos: Solve bug in interrupt checking when using MSIX in virtio-pci.c
sheepdog: fix confused return values
qtest/ahci: add fragmented dma test
qtest/ahci: Add PIO and LBA48 tests
qtest/ahci: Add DMA test variants
libqos/ahci: add ahci command helpers
qtest/ahci: Add a macro bootup routine
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
By default, we have ioeventfd enabled, so the IO request processing is
in IO thread; in the vcpu thread, guest mode is returned to as quickly
as possible, and completion is delivered via irqfd. Therefore this
comment from the initial implementation is barely relevant.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
geometry: hd_geometry_guess function autodetects the drive geometry.
This patch adds a block backend call, that probes the backing device
geometry. If the inner driver method is implemented and succeeds
(currently only for DASDs), the blkconf_geometry will pass-through
the backing device geometry. Otherwise will fallback to old logic.
blocksize: This patch initializes blocksize properties to 0.
In order to set the property a blkconf_blocksizes was introduced.
If user didn't set physical or logical blocksize, it will
retrieve its value from a driver (only succeeds for DASD), otherwise
it will set default 512 value.
The blkconf_blocksizes call was added to all users of BlkConf.
Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1424087278-49393-6-git-send-email-tumanova@linux.vnet.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since commit 1dc936aa84 (virtio-blk: Use blk_aio_ioctl) we silently lose
the request if blk_aio_ioctl returns NULL (not implemented).
Fix it by directly returning VIRTIO_BLK_S_UNSUPP as we used to do.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[ kwolf: Fixed build error on win32 ]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Three kinds of callers:
1. On failure, report the error and abort
Passing &error_abort does the job. No functional change.
2. On failure, report the error and exit()
This is qdev_prop_set_drive_nofail(). Error reporting moves from
qdev_prop_set_drive() to its caller. Because hiding away the error
in the monitor right before exit() isn't helpful, replace
qerror_report_err() by error_report_err(). Shouldn't make a
difference, because qdev_prop_set_drive_nofail() should never be
used in QMP context.
3. On failure, report the error and recover
This is usb_msd_init() and scsi_bus_legacy_add_drive(). Error
reporting and freeing the error object moves from
qdev_prop_set_drive() to its callers.
Because usb_msd_init() can't run in QMP context, replace
qerror_report_err() by error_report_err() there.
No functional change.
scsi_bus_legacy_add_drive() calling qerror_report_err() is of
course inappropriate, but this commit merely makes it more obvious.
The next one will clean it up.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-Id: <1425925048-15482-3-git-send-email-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a helper function for checking whether a bit is set in the guest
features for a vdev as well as one that works on a feature bit set.
Convert code that open-coded this: It cleans up the code and makes it
easier to extend the guest feature bits.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add virtio_{add,clear}_feature helper functions for manipulating a
feature bits variable. This has some benefits over open coding:
- add check that the bit is in a sane range
- make it obvious at a glance what is going on
- have a central point to change when we want to extend feature bits
Convert existing code manipulating features to use the new helpers.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Drop duplicated code. Minor codechanges were required
as geometry is a sub-structure now.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
As part of the required changes, this fixes a bug where specifying an
invalid driver would result in the block layer probing the image format;
now it will result in an error, unless "<unset>" is specified as the
driver name. Fixing this would require further work on the xen_disk code
which does not seem worth it (at this point and for this patch).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1423162705-32065-7-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The vring.c code currently assumes that guest and host endianness match,
which is not true for a number of cases:
- emulating targets with a different endianness than the host
- bi-endian targets, where the correct endianness depends on the virtio
device
- upcoming support for the virtio-1 standard mandates little-endian
accesses even for big-endian targets and hosts
Make sure to use accessors that depend on the virtio device.
Note that dataplane now needs to be built per-target.
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1422289602-17874-2-git-send-email-cornelia.huck@de.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
we check and adjust request sizes at several places with
sometimes inconsistent checks or default values:
INT_MAX
INT_MAX >> BDRV_SECTOR_BITS
UINT_MAX >> BDRV_SECTOR_BITS
SIZE_MAX >> BDRV_SECTOR_BITS
This patches introdocues a macro for the maximal allowed sectors
per request and uses it at several places.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
this adds a knob to disable request merging for debugging or benchmarks if dedired.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
this patch finally introduces multiread support to virtio-blk. While
multiwrite support was there for a long time, read support was missing.
The complete merge logic is moved into virtio-blk.c which has
been the only user of request merging ever since. This is required
to be able to merge chunks of requests and immediately invoke callbacks
for those requests. Secondly, this is required to switch to
direct invocation of coroutines which is planned at a later stage.
The following benchmarks show the performance of running fio with
4 worker threads on a local ram disk. The numbers show the average
of 10 test runs after 1 run as warmup phase.
| 4k | 64k | 4k
MB/s | rd seq | rd rand | rd seq | rd rand | wr seq | wr rand
--------------+--------+---------+--------+---------+--------+--------
master | 1221 | 1187 | 4178 | 4114 | 1745 | 1213
multiread | 1829 | 1189 | 4639 | 4110 | 1894 | 1216
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
As it was not obvious (at least for me) where the 32 comes from;
add a constant for it.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
- New "edu" device (v1->v2: fix 32-bit compilation)
- Disabling HLE and RTM on Haswell & Broadwell
- kvm_stat updates
- Added --enable-modules to Travis, in preparation for switching
the default
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJUxiioAAoJEL/70l94x66D+zQIAKVq9DPm4RNJ2/c2nt6phAVr
6Z5yB+TMf4BKFORwVkionvIOEqOC0pdm3oo93/XH7DTsN7pFg7rdwJl+ADESgvl6
+tpUbrgjZuCuNQNXy/mjx0EJQUmTk8/x+a054hSo6XNvs2ZM9HjaKNX3ojS6pG1J
mhIH4cjGPMCwu2hgm2mho/1zdIs4Qk3xT8Uzfq8i5gES14YX0Fmt93idUn3DRs7m
zHdzHWr0JmXfZweNDdPgsfGO6g+NnwgGUOqeGY4Ucmurkepk9ViCaaJP7lOnuRhT
52isayOrfrZsSLm5xxwtSUjmgbxzlOEit1b8jLzpHb5b9b+LJiCtuJnN7vwHb34=
=jgup
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
- Many fixes from the floor as usual
- New "edu" device (v1->v2: fix 32-bit compilation)
- Disabling HLE and RTM on Haswell & Broadwell
- kvm_stat updates
- Added --enable-modules to Travis, in preparation for switching
the default
# gpg: Signature made Mon 26 Jan 2015 11:44:40 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
kvm_stat: Add RESET support for perf event ioctl
target-i386: Disable HLE and RTM on Haswell & Broadwell
sparse: Fix build with sparse on .S files
exec: fix madvise of NULL pointer
.travis.yml: Add "--enable-modules"
apic: do not dereference pointer before it is checked for NULL
kvm_stat: Print errno when syscall to perf_event_open() fails
kvm_stat: Update exit reasons to the latest defintion
kvm_stat: Add aarch64 support
hw: misc, add educational driver
vmstate: accept QEMUTimer in VMSTATE_TIMER*, add VMSTATE_TIMER_PTR*
qemu-timer: introduce timer_deinit
qemu-timer: add timer_init and timer_init_ns/us/ms
target-i386: make xmm_regs 512-bit wide
target-i386: use vmstate_offset_sub_array for AVX registers
tests/multiboot: Add test for modules
multiboot: Fix offset of bootloader name
tests/multiboot: Update reference output
pc: fix KVM features in pc-1.3 and earlier machine types
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use the asynchronous interface of ioctl. This will not make the VM
unresponsive if the ioctl takes a long time.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In preparation for calling blk_aio_ioctl. Also make the function static
as no other files need it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
According to NVMe specifications Bits 15:08 represent Minor Version number.
Signed-off-by: Anubhav Rakshit <anubhav.rakshit@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
According to the specification, the low 16 bits should contain the number of
I/O submission queues, and the high 16 bits should contain the number of
I/O completion queues.
Signed-off-by: Alex Friedman <alex@e8storage.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Like BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET,
block-commit involves two asymmetric devices.
This change is not user-visible (yet), because commit only works with
device names.
But once we enable backing reference in blockdev-add, or specifying
node-name in block-commit command, we don't want the user to start two
commit jobs on the same backing chain, which will corrupt things because
of the final bdrv_swap.
Before we have per category blockers, splitting this type is still
better.
[Resolved virtio-blk dataplane conflict by replacing
BLOCK_OP_TYPE_COMMIT with both BLOCK_OP_TYPE_COMMIT_{SOURCE, TARGET}.
They are safe since the block job runs in the same AioContext as the
dataplane IOThread.
--Stefan]
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUid7mAAoJEL7lnXSkw9fbX0YH/RXeoX7IN1QVEP8Z2dLq4nzt
igHm4Gd4ENX6pdluHmXAxzKoHR78yNGFegzuv0IJBfav8kGCQWW6zAf4TUk/udtn
AKPk1yauVKNzCvmYldrfbuu4HedZqkftE0tyDZuAK50pZH1hzX7qiAT1C0OlarLQ
Tqy4+ouYiRja2hLq4YJCM9mmYt0sbMDShIcHBYRTdD0cxoPZ+JZEeAQYg+FYNdIo
jioVg8NgmFZW37UWeBTCKG+DcX9NwXwyo/ASdIozM+aQcTBx/nXn7/NOAxXlxUX8
M9AS9iz+LWBfwof3HLbzLLTvTmE66Z78/TluFMmEbpK4ts0ZXRJKDHE/pfynRD0=
=PXlQ
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mjt/tags/pull-trivial-patches-2014-12-11' into staging
trivial patches for 2014-12-11
# gpg: Signature made Thu 11 Dec 2014 18:13:58 GMT using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg: aka "Michael Tokarev <mjt@corpit.ru>"
# gpg: aka "Michael Tokarev <mjt@debian.org>"
* remotes/mjt/tags/pull-trivial-patches-2014-12-11:
Sort include/qemu/typedefs.h
hpet: increase spelling precision
pflash_cfi02.c: associate "cfi.pflash02" to "Storage devices" category
vt82c686: fix coverity warning about out-of-bounds write
virtio: remove useless declaration of virtio_net_init()
qapi-schema: fix typo about change-vnc-password
fw_cfg: remove superfluous blank line
get_maintainer.pl: Remove the --git-chief-penguins option
configure: Replace which(1) with "has"
util: Use g_new() & friends where that makes obvious sense
util: Fuse g_malloc(); memset() into g_new0()
util: Drop superfluous conditionals around g_free()
Drop superfluous conditionals around g_strdup()
Drop superfluous conditionals around qemu_opts_del()
usb: delete redundant brackets in usb_host_handle_control()
virtio-bus: avoid breaking build when open DEBUG switch
acpi-build: Make DPRINTF working for acpi-build
acpi-build: adjust indention 8 -> 4 spaces
target-s390x: fix possible out of bounds read
qmp: fix typo in input-send-event examples
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Initialise our maximum page size capability to 64kB and increase
the page_size variable from 16 to 32 bits.
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The transaction QMP command performs operations atomically on a group of
drives. This command needs to acquire AioContext in order to work
safely when virtio-blk dataplane IOThreads are accessing drives.
The transactional nature of the command means that actions are split
into prepare, commit, abort, and clean functions. Acquire the
AioContext in prepare and don't release it until one of the other
functions is called. This prevents the IOThread from running the
AioContext before the transaction has completed.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416566940-4430-4-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add dataplane support to the change-backing-file QMP commands. By
acquiring the AioContext we avoid race conditions with the dataplane
thread which may also be accessing the BlockDriverState.
Note that this command operates on both bs and a node in its chain
(image_bs). The bdrv_chain_contains(bs, image_bs) check guarantees that
bs and image_bs are in the same AioContext.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
By acquiring the AioContext we avoid race conditions with the dataplane
thread which may also be accessing the BlockDriverState.
Fix up eject, change, and block_passwd in a single patch because
qmp_eject() and qmp_change_blockdev() both call eject_device(). Also
fix block_passwd while we're tackling a command that takes a block
encryption password.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add dataplane support to the blockdev-snapshot-delete-internal-sync QMP
command. By acquiring the AioContext we avoid race conditions with the
dataplane thread which may also be accessing the BlockDriverState.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Operands don't affect result (CONSTANT_EXPRESSION_RESULT)
((n->bar.aqa >> AQA_ASQS_SHIFT) & AQA_ASQS_MASK) > 4095
is always false regardless of the values of its operands.
This occurs as the logical second operand of '||'.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch fixes two issues with persistent grants and the disk PV backend
(Qdisk):
- Keep track of memory regions where persistent grants have been mapped
since we need to unmap them as a whole. It is not possible to unmap a
single grant if it has been batch-mapped. A new check has also been added
to make sure persistent grants are only used if the whole mapped region
can be persistently mapped in the batch_maps case.
- Unmap persistent grants before switching to the closed state, so the
frontend can also free them.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Now that blockjobs use AioContext they are safe for use with dataplane.
Unblock them!
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1413889440-32577-12-git-send-email-stefanha@redhat.com
blockdev_init() always creates a DriveInfo, but only drive_new() fills
it in. qmp_blockdev_add() leaves it blank. This results in a drive
with type = IF_IDE, bus = 0, unit = 0. Screwed up in commit ee13ed1c.
Board initialization code looking for IDE drive (0,0) can pick up one
of these bogus drives. The QMP command has to execute really early to
be visible. Not sure how likely that is in practice.
Fix by creating DriveInfo in drive_new(). Block backends created by
blockdev-add don't get one.
Breaks the test for "has been created by qmp_blockdev_add()" in
blockdev_mark_auto_del() and do_drive_del(), because it changes the
value of dinfo && !dinfo->enable_auto_del from true to false. Simply
test !dinfo instead.
Leaves DriveInfo member enable_auto_del unused. Drop it.
A few places assume a block backend always has a DriveInfo. Fix them
up.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Device models should access their block backends only through the
block-backend.h API. Convert them, and drop direct includes of
inappropriate headers.
Just four uses of BlockDriverState are left:
* The Xen paravirtual block device backend (xen_disk.c) opens images
itself when set up via xenbus, bypassing blockdev.c. I figure it
should go through qmp_blockdev_add() instead.
* Device model "usb-storage" prompts for keys. No other device model
does, and this one probably shouldn't do it, either.
* ide_issue_trim_cb() uses bdrv_aio_discard() instead of
blk_aio_discard() because it fishes its backend out of a BlockAIOCB,
which has only the BlockDriverState.
* PC87312State has an unused BlockDriverState[] member.
The next two commits take care of the latter two.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is consistent with how VirtIOFOOConf variables are named
elsewhere, and makes blk available for BlockBackend variables.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 12c5674 turned it into a pointer to member blk.conf.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
I'll use BlockDriverAIOCB with block backends shortly, and the name is
going to fit badly there. It's a block layer thing anyway, not just a
block driver thing.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The patch is big, but all it really does is replacing
dinfo->bdrv
by
blk_bs(blk_by_legacy_dinfo(dinfo))
The replacement is repetitive, but the conversion of device models to
BlockBackend is imminent, and will shorten it to just
blk_legacy_dinfo(dinfo).
Line wrapping muddies the waters a bit. I also omit tests whether
dinfo->bdrv is null, because it never is.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On BlockBackend destruction, unref its BlockDriverState. Replaces the
callers' unrefs.
This turns the pointer from BlockBackend to BlockDriverState into a
strong reference, managed with bdrv_ref() / bdrv_unref(). The
back-pointer remains weak.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Convenience function blk_new_with_bs() creates a BlockBackend with its
BlockDriverState. Callers have to unref both. The commit after next
will relieve them of the need to unref the BlockDriverState.
Complication: due to the silly way drive_del works, we need a way to
hide a BlockBackend, just like bdrv_make_anon(). To emphasize its
"special" status, give the function a suitably off-putting name:
blk_hide_on_behalf_of_do_drive_del(). Unfortunately, hiding turns the
BlockBackend's name into the empty string. Can't avoid that without
breaking the blk->bs->device_name equals blk->name invariant.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A block device consists of a frontend device model and a backend.
A block backend has a tree of block drivers doing the actual work.
The tree is managed by the block layer.
We currently use a single abstraction BlockDriverState both for tree
nodes and the backend as a whole. Drawbacks:
* Its API includes both stuff that makes sense only at the block
backend level (root of the tree) and stuff that's only for use
within the block layer. This makes the API bigger and more complex
than necessary. Moreover, it's not obvious which interfaces are
meant for device models, and which really aren't.
* Since device models keep a reference to their backend, the backend
object can't just be destroyed. But for media change, we need to
replace the tree. Our solution is to make the BlockDriverState
generic, with actual driver state in a separate object, pointed to
by member opaque. That lets us replace the tree by deinitializing
and reinitializing its root. This special need of the root makes
the data structure awkward everywhere in the tree.
The general plan is to separate the APIs into "block backend", for use
by device models, monitor and whatever other code dealing with block
backends, and "block driver", for use by the block layer and whatever
other code (if any) dealing with trees and tree nodes.
Code dealing with block backends, device models in particular, should
become completely oblivious of BlockDriverState. This should let us
clean up both APIs, and the tree data structures.
This commit is a first step. It creates a minimal "block backend"
API: type BlockBackend and functions to create, destroy and find them.
BlockBackend objects are created and destroyed exactly when root
BlockDriverState objects are created and destroyed. "Root" in the
sense of "in bdrv_states". They're not yet used for anything; that'll
come shortly.
A root BlockDriverState is created with bdrv_new_root(), so where to
create a BlockBackend is obvious. Where these roots get destroyed
isn't always as obvious.
It is obvious in qemu-img.c, qemu-io.c and qemu-nbd.c, and in error
paths of blockdev_init(), blk_connect(). That leaves destruction of
objects successfully created by blockdev_init() and blk_connect().
blockdev_init() is used only by drive_new() and qmp_blockdev_add().
Objects created by the latter are currently indestructible (see commit
48f364d "blockdev: Refuse to drive_del something added with
blockdev-add" and commit 2d246f0 "blockdev: Introduce
DriveInfo.enable_auto_del"). Objects created by the former get
destroyed by drive_del().
Objects created by blk_connect() get destroyed by blk_disconnect().
BlockBackend is reference-counted. Its reference count never exceeds
one so far, but that's going to change.
In drive_del(), the BB's reference count is surely one now. The BDS's
reference count is greater than one when something else is holding a
reference, such as a block job. In this case, the BB is destroyed
right away, but the BDS lives on until all extra references get
dropped.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Creating an anonymous BDS can't fail. Make that obvious.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On this way, we can assure the new bootindex take effect
during vm rebooting.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
At present, nvma cannot boot. However, it provides already
a bootindex property, so change bootindex to qom for nvma
device, but not call add_boot_device_path.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Remove bootindexA/B form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Dataplane doesn't depend on linux-aio any more, so we don't need the
compiling condition now.
Configure options are kept but just print a message.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1410329871-28885-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJUEwy3AAoJEH8JsnLIjy/WqkUQALPsR68w2bB6aiN6zUaJt1X3
VaksCQGgtZdN6itDvn6v6ktayFXXfjRE+U0hK7joXUiokq17YZmKqf+1V4LPJRSW
Tv21gIAHuIyf+8LL/xGS3W9+EEXAaKbp1t6AT/VDWv/mQ4KY5xrvhn2E/+7r0wKr
EBOHrKd4tQualV12MtrZsrWZy3oMQvkimcVIfnjFZ2gJg5dmUBXQ35Kdj9+AxDiX
1hDizBRbozvzSBCnS9PUcJ1OfCxoCRewbHn43LeCYWyB8m3ttpdPpuMaUoSNGrVY
Tw7aYvYjMArr/ChrF8eH2vKJSeHabSPbYqgNsGqpS2n5KYJbzoyv8iQQCSHjtKZe
vagoIRomF/BtOWT8mvUSHGw2vmQm6JZJdHJsXNeyDJ/P8ZSSm0vsZMjqh6vwS7sB
+AURb5BaFWNnThwm80tJl23uJLjohNsdrmuLvAiHX0e03dyyQFDBS1zqb9BTbOsP
SdBPFZy1hA0deYnJlyeLj94iyIosdsMihLkDJrIdNzn6qMF9QCdFs+rgOepwsfml
ZNG1h2V+Wo3LS1SkKpK0mhiTBFLCit8Cq03+n95zBTcPCBMGgoJVC2VZef8XXKDn
v6vuSYikCkEIDEWhsUrIZmDWKv/83AwSW+i+ir3IOVgxOJ51Z/mr5PAQQ+3/Gaat
G5gSIDmW4rGgYDk/coDf
=3He1
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches
# gpg: Signature made Fri 12 Sep 2014 16:09:43 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream: (22 commits)
qcow2: Add falloc and full preallocation option
raw-posix: Add falloc and full preallocation option
qapi: introduce PreallocMode and new PreallocModes full and falloc.
block: don't convert file size to sector size
block: round up file size to nearest sector
iotests: Send the correct fd in socket_scm_helper
blockdev: Refuse to drive_del something added with blockdev-add
block: extend BLOCK_IO_ERROR with reason string
dataplane: fix virtio_blk_data_plane_create() op blocker error path
qemu-iotests: Run 025 for Archipelago block driver
block/archipelago: Implement bdrv_truncate()
block: Make the block accounting functions operate on BlockAcctStats
block: rename BlockAcctType members to start with BLOCK_ instead of BDRV_
block: Extract the block accounting code
block: Extract the BlockAcctStats structure
IDE: MMIO IDE device control should be little endian
thread-pool: Drop unnecessary includes
xen: Drop redundant bdrv_close() from pci_piix3_xen_ide_unplug()
xen_disk: Plug memory leak on error path
qemu-io: Clean up openfile() after commit 2e40134
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
- Build: fixing block/iscsi.so and ranlib warnings on Mac OS X
- Migration fixes for x86
- The odd KVM patch.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJUEXeWAAoJEBvWZb6bTYby4AwP/0Hh55A7QzkkzZ66y65zM+G5
dsgRcLjufHSRQHoNQqm6LOcicV3Ygc/X644EY6jnZCZxFh/fsWuTPqUDGxLAnxEc
2V0PkLRIScAMOPezzxvRy6/9hkG+UYM3ZOL5D9yxA9pGuBtttw7tkts19Vqf9WZc
NYG5TBDuEGM1c596Zpo7t10m+Oiw+Jyi5luLXsb4lh5ikdFPDrtJaf0AnFvR+ym0
HXlj2K/0vHNowUeLoo+oWnZsW8mLE6OyJhgfo1tJtsH1BR+lQJnBnQ4moq4Sl/Wz
+iht/4gtz34XwLILokFR6yiNrPe+MIryyv+FYxOD5loIdGVDtKMx30UkIE2/D933
6/n5i3GBLi9JapeT9gkKTxk/UVRPzJ1PK07RWevgNZNQyTGKAUGp+p48nSzMYX7V
7GFSy3Q8uqOR8g9n+t+RURxkoMNbhhw7v53Z3PPXPCALCMDzg9RARlW/nkfiExcZ
oThUjE/8xfMTQlN1SO5HTyQXEkYjtknZhfC7/KFvkWYMbCG0KBTf212Md0zlTNkj
+C6r8Gq4ZWVIc07QyKkoCMxB+a9Uhvy4T1PKuSlm6iu94zUgZRhdf/PlOXimhFqH
9GL67Tv15kpj05xCS6jDXjeMZ416/UKw91OcsiT1UUHcq7/rc+GBycd0ngV1UgnQ
di5V12IVt8JwdzFxMeCT
=GIKW
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
- Memory: improve error reporting and avoid crashes on hotplug
- Build: fixing block/iscsi.so and ranlib warnings on Mac OS X
- Migration fixes for x86
- The odd KVM patch.
# gpg: Signature made Thu 11 Sep 2014 11:21:10 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: aka "Paolo Bonzini <bonzini@gnu.org>"
* remotes/bonzini/tags/for-upstream: (21 commits)
gdbstub: init mon_chr through qemu_chr_alloc
pckbd: adding new fields to vmstate
mc146818rtc: add missed field to vmstate
piix: do not set irq while loading vmstate
serial: fixing vmstate for save/restore
parallel: adding vmstate for save/restore
fdc: adding vmstate for save/restore
cpu: init vmstate for ticks and clock offset
apic_common: vapic_paddr synchronization fix
vl: use QLIST_FOREACH_SAFE to visit change state handlers
exec: add parameter errp to gethugepagesize
exec: report error when memory < hpagesize
hostmem-ram: don't exit qemu if size of memory-backend-ram is way too big
memory: add parameter errp to memory_region_init_rom_device
memory: add parameter errp to memory_region_init_ram
exec: add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr
rules.mak: Fix DSO build by pulling in archive symbols
util: Don't link host-utils.o if it's empty
util: Move general qemu_getauxval to util/getauxval.c
trace: Only link generated-tracers.o with "simple" backend
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 3718d8ab65 ("block: Replace in_use
with operation blocker") broke the error path because it consumed
local_err instead of propagating it.
The caller has no way to know that the function failed. This caused
virtio-blk to start "successfully" even though there was a fatal
dataplane error.
Steps to reproduce:
$ qemu-system-x86_64 -enable-kvm -object iothread,id=iothread0 \
-drive if=none,id=drive0,file=a.img \
(qemu) drive_mirror drive0 /tmp/foo.img
(qemu) device_add virtio-blk-pci,iothread=iothread0,drive=drive0
Expected result:
Since the mirror block job is using drive0 it is not possible to start
virtio-blk data-plane.
device_add fails and the PCI adapter is not added.
Actual result:
device_add completes and the PCI adapter is added.
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
VMState added by this patch preserves correct
loading of the FDC device state.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is the next step for decoupling block accounting functions from
BlockDriverState.
In a future commit the BlockAcctStats structure will be moved from
BlockDriverState to the device models structures.
Note that bdrv_get_stats was introduced so device models can retrieve the
BlockAcctStats structure of a BlockDriverState without being aware of it's
layout.
This function should go away when BlockAcctStats will be embedded in the device
models structures.
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Keith Busch <keith.busch@intel.com>
CC: Anthony Liguori <aliguori@amazon.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Michael Tokarev <mjt@tls.msk.ru>
CC: John Snow <jsnow@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Alexander Graf <agraf@suse.de>
CC: Max Reitz <mreitz@redhat.com>
Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The middle term goal is to move the BlockAcctStats structure in the device models.
(Capturing I/O accounting statistics in the device models is good for billing)
This patch make a small step in this direction by removing a reference to BDRV.
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Keith Busch <keith.busch@intel.com>
CC: Anthony Liguori <aliguori@amazon.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: John Snow <jsnow@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Markus Armbruster <armbru@redhat.com>
CC: Alexander Graf <agraf@suse.de>i
Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The Error object was leaked after failed bdrv_new(). While there,
streamline control flow a bit.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add parameter errp to memory_region_init_rom_device and update all call
sites to propagate the error.
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[Propagate the error out of realize. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add parameter errp to memory_region_init_ram and update all call sites
to pass in &error_abort.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A drive that backs a pflash device is special:
- it is very small,
- its entire contents are kept in a RAMBlock at all times, covering the
guest-phys address range that provides the guest's view of the emulated
flash chip.
The pflash device model keeps the drive (the host-side file) and the
guest-visible flash contents in sync. When migrating the guest, the
guest-visible flash contents (the RAMBlock) is migrated by default, but on
the target host, the drive (the host-side file) remains in full sync with
the RAMBlock only if:
- the source and target hosts share the storage underlying the pflash
drive,
- or the migration requests full or incremental block migration too, which
then covers all drives.
Due to the special nature of pflash drives, the following scenario makes
sense as well:
- no full nor incremental block migration, covering all drives, alongside
the base migration (justified eg. by shared storage for "normal" (big)
drives),
- non-shared storage for pflash drives.
In this case, currently only those portions of the flash drive are updated
on the target disk that the guest reprograms while running on the target
host.
In order to restore accord, dump the entire flash contents to the bdrv in
a post_load() callback.
- The read-only check follows the other call-sites of pflash_update();
- both "pfl->ro" and pflash_update() reflect / consider the case when
"pfl->bs" is NULL;
- the total size of the flash device is calculated as in
pflash_cfi01_realize().
When using shared storage, or requesting full or incremental block
migration along with the normal migration, the patch should incur a
harmless rewrite from the target side.
It is assumed that, on the target host, RAM is loaded ahead of the call to
pflash_post_load().
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Now that drive_del acquires the AioContext we can safely allow deleting
the drive. As with non-dataplane mode, all I/Os submitted by the guest
after drive_del will return EIO.
This patch makes hot unplug work with virtio-blk dataplane. Previously
drive_del reported an error because the device was busy.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
reporting from Fam and a fix for DPRINTF bitrot. Memory patches try
again to initialize name from the QOM name.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJT/zhXAAoJEBvWZb6bTYby/UMP/jyhrHAaCgPgrvM4bXzMXBoZ
l4UQXSTAmhlpr/OUI9pgT/392IGNTkDZ3mJi0sqgG6p6egWUT6a4+lzJjlhExTJy
K5GRkbgfj83nVI1Jr0uxs58dwM527IFc5RD2Fzz0QXJIMA+HDseLfSYfa3gxbdTU
iU8fK4PG1usb8FMR+Rd7SzGgGbGhgs6KOar98izH9C+SsPtCIGEu86KW9EkCh2dZ
t7RI9PAJZUA1Ci2GuFISAuxl08ZkfKo29fXfM0DsovbaQda2dph7j1y6sYqXYQBA
jZW0BEedpC1PfbWEODU81PG5t4AH5AUNmrNIsG04NiDwcRzWQckA5/x6qHsdA/33
N/GGzqfmLknvIPecuzmwmgRBVMzf+K0xXd+StSFJWR9dwP09Y0UfhdkuBsiqdd1a
H+xtsDyBl9pR9VgqtetIq8uQf3fpiHSRUnh++YYU8V/uK2C8ZyTmYYBJNwk2FK6l
2PBTD1Jsl0WYCZBScM0IK+BnDNDwrygdfAa2CF1KcfNHQSiHvjHIQwsVo6Lwev7i
1eE6/0zQ7Yumi3LOSyUx3v6JwdKH1zsN3uIjlrg4SxpmgzK/vhwrtuJti4W7HpLd
eHLHfszCWAmNO6zx0/bH44lPZNDBBFZeaZ+NVjW0nv7y3pAeLP3qyuY3pUsr+Suh
0xRPwhfmoSz9CZ+5mAIX
=iLTP
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
SCSI patches include bug fixes from Fam and Peter, improved error
reporting from Fam and a fix for DPRINTF bitrot. Memory patches try
again to initialize name from the QOM name.
# gpg: Signature made Thu 28 Aug 2014 15:10:31 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: aka "Paolo Bonzini <bonzini@gnu.org>"
* remotes/bonzini/tags/for-upstream:
memory: Lazy init name from QOM name as needed
xen: hvm: Abstract away memory region name ref
xen-hvm: Constify string
virtio-scsi: Report error if num_queues is 0 or too large
scsi-generic: remove superfluous DPRINTF avoid to break compiling
block/iscsi: fix memory corruption on iscsi resize
scsi-bus: Convert DeviceClass init to realize
block: Pass errp in blkconf_geometry
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This allows us to pass error information to caller.
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In function virtio_blk_handle_request, it may freed memory pointed by req,
So do not access member of req after calling this function.
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that block_resize acquires the AioContext we can safely allow
resizing the disk.
Reported-by: Andrey Korolyov <andrey@xdel.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
g_new(T, n) is safer than g_malloc(sizeof(*v) * n) for two reasons.
One, it catches multiplication overflowing size_t. Two, it returns
T * rather than void *, which lets the compiler catch more type
errors.
Perhaps a conversion to g_malloc_n() would be neater in places, but
that's merely four years old, and we can't use such newfangled stuff.
This commit only touches allocations with size arguments of the form
sizeof(T), plus two that use 4 instead of sizeof(uint32_t). We can
make the others safe by converting to g_malloc_n() when it becomes
available to us in a couple of years.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Memory changes for QOMification and automatic tracking of MR lifetime.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJT8et9AAoJEBvWZb6bTYbyIJAQAI3AlLSe27xWoUGfQUgWH30z
Rt/pShHz3BJMfQpD79JfTH8u6uBpkQmKtflerNT7FhXN9ULDzNq+b/jRtke8nkuy
ctCt05FhhK00rfWpUoRue4XiCuvbizBU7MK0DI3yCyNdXQyYnFvgnvsJtlqox8Zh
J5HZcBJEmdCiWBxq7UPk0qBitp4PqNoy7jlD/Ex3m7fJN5WK2cyspQIT9zmhehVn
B8Nwp+RitDDbXbwm0r18col5rFr/6Nj6+dW1gr+7sVJDLNsmJEqC2l3Kgk0wbPkG
Uqwbih29me9PC9/L1VLGHY0ApKDQ8JGE0GrYgEg162hbhoxEHkjjoHMhDUfV6Pj8
NkqcjjWl11UUhgkNqrGafayXbBVnOiEglxy8uXCeq14y9Xd/gjK9Fz6MQvRSOjms
PFmaKknhdmpxh0DuZmTix7WBmKim8zOiCE0/vrAPvwx5L+d1bn5xh6yQvtVjBMpU
Sru3Mhdm9bL9dUDBgOM/G6WCxSTVLBlExOblcYkQh03MfabD7bfplcrKYPXt5ull
Y8YLjqkoIfoy5t0ErvtlpdBJjeEz99JXU+wLQ6NYHnzwzTV+oUtSaEph14mAFOcY
XkFKdoPDI9PnyEfvy4193du8z/dSbhu7sWgHWbTCQyrcaNnSaVhlH43NUC+p23YN
8vfEsVLd1X7MFkDBUmWp
=M+/m
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
SCSI changes that enable sending vendor-specific commands via virtio-scsi.
Memory changes for QOMification and automatic tracking of MR lifetime.
# gpg: Signature made Mon 18 Aug 2014 13:03:09 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: aka "Paolo Bonzini <bonzini@gnu.org>"
* remotes/bonzini/tags/for-upstream:
mtree: remove write-only field
memory: Use canonical path component as the name
memory: Use memory_region_name for name access
memory: constify memory_region_name
exec: Abstract away ref to memory region names
loader: Abstract away ref to memory region names
tpm_tis: remove instance_finalize callback
memory: remove memory_region_destroy
memory: convert memory_region_destroy to object_unparent
ioport: split deletion and destruction
nic: do not destroy memory regions in cleanup functions
vga: do not dynamically allocate chain4_alias
sysbus: remove unused function sysbus_del_io
qom: object: move unparenting to the child property's release callback
qom: object: delete properties before calling instance_finalize
virtio-scsi: implement parse_cdb
scsi-block, scsi-generic: implement parse_cdb
scsi-block: extract scsi_block_is_passthrough
scsi-bus: introduce parse_cdb in SCSIDeviceClass and SCSIBusInfo
scsi-bus: prepare scsi_req_new for introduction of parse_cdb
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The function is empty after the previous patch, so remove it.
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Without this correction, only a three descriptor layout is accepted, and
requests with just two descriptors are not completed and no error message is
displayed.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Marc Marí <marc.mari.barcelo@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If we fail to set up guest or host notifiers, there's no use trying again
every time the guest kicks, so disable dataplane in that case.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The dataplane code is currently doing a hard exit if it fails to set
up either guest or host notifiers. In practice, this may mean that a
guest suddenly dies after a dataplane device failed to come up (e.g.,
when a file descriptor limit is hit for tne nth device).
Let's just try to unwind the setup instead and return.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Setting up guest or host notifiers may fail, but the user will have
no idea why: Let's print the error returned by the callback.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Now requests are submitted as a batch, so it is natural
to notify guest as a batch too.
This may suppress interrupt notification to VM a lot:
- in my test, decreased by ~13K/sec
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The callback has to be saved and reset in virtio_blk_data_plane_start(),
otherwise dataplane's requests will be completed in qemu aio context.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The block layer fails such reads and writes just fine. However, they
then get treated like valid operations that fail: the error action
gets executed. Unwanted; reporting the error to the guest is the only
sensible action.
Reject them before passing them to the block layer. This bypasses the
error action and I/O accounting.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When a device model's I/O operation fails, we execute the error
action. This lets layers above QEMU implement thin provisioning, or
attempt to correct errors before they reach the guest. But when the
I/O operation fails because it's invalid, reporting the error to the
guest is the only sensible action.
If the guest's read or write asks for an invalid sector range, fail
the request right away, without considering the error action. No
change with error action BDRV_ACTION_REPORT.
Furthermore, bypass I/O accounting, because we want to track only I/O
that actually reaches the block layer.
The next commit will extend "invalid sector range" to cover attempts
to read/write beyond the end of the medium.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The memory allocation between hw/block/virtio-blk.c,
hw/block/dataplane/virtio-blk.c, and hw/virtio/dataplane/vring.c is
messy. Structs are allocated in different files than they are freed in.
This is risky and makes memory leaks easier.
Embed VirtQueueElement in VirtIOBlockReq to reduce the amount of memory
allocation we need to juggle. This also makes vring.c and virtio.c
slightly more similar.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In commit de6c8042ec ("virtio-blk: Avoid
zeroing every request structure") we avoided the 40 KB memset when
allocating VirtIOBlockReq.
The memset was reintroduced in commit
671ec3f056 ("virtio-blk: Convert
VirtIOBlockReq.elem to pointer").
It must be fixed again to avoid a performance regression.
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
VirtIOBlockReq is freed later by virtio_blk_free_request() in
hw/block/virtio-blk.c. Remove this extraneous g_slice_free().
This patch fixes the following segfault:
0x00005555556373af in virtio_blk_rw_complete (opaque=0x5555565ff5e0, ret=0) at hw/block/virtio-blk.c:99
99 bdrv_acct_done(req->dev->bs, &req->acct);
(gdb) print req
$1 = (VirtIOBlockReq *) 0x5555565ff5e0
(gdb) print req->dev
$2 = (VirtIOBlock *) 0x0
(gdb) bt
#0 0x00005555556373af in virtio_blk_rw_complete (opaque=0x5555565ff5e0, ret=0) at hw/block/virtio-blk.c:99
#1 0x0000555555840ebe in bdrv_co_em_bh (opaque=0x5555566152d0) at block.c:4675
#2 0x000055555583de77 in aio_bh_poll (ctx=ctx@entry=0x5555563a8150) at async.c:81
#3 0x000055555584b7a7 in aio_poll (ctx=0x5555563a8150, blocking=blocking@entry=true) at aio-posix.c:188
#4 0x00005555556e520e in iothread_run (opaque=0x5555563a7fd8) at iothread.c:41
#5 0x00007ffff42ba124 in start_thread () from /usr/lib/libpthread.so.0
#6 0x00007ffff16d14bd in clone () from /usr/lib/libc.so.6
Reported-by: Max Reitz <mreitz@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Before commit 580b6b2aa2(dataplane: use the QEMU block
layer for I/O), dataplane for virtio-blk submits block
I/O as a batch.
This commit 580b6b2aa2 replaces the custom linux AIO
implementation(including submit I/O as a batch) with QEMU
block layer, but this commit causes ~40% throughput regression
on virtio-blk performance, and removing submitting I/O
as a batch is one of the causes.
This patch applies the newly introduced bdrv_io_plug() and
bdrv_io_unplug() interfaces to support submitting I/O
at batch for Qemu block layer, and in my test, the change
can improve throughput by ~30% with 'aio=native'.
Following my fio test script:
[global]
direct=1
size=4G
bsrange=4k-4k
timeout=40
numjobs=4
ioengine=libaio
iodepth=64
filename=/dev/vdc
group_reporting=1
[f]
rw=randread
Result on one of my small machine(host: x86_64, 2cores, 4thread, guest: 4cores):
- qemu master: 65K IOPS
- qemu master with these patches: 92K IOPS
- 2.0.0 release(dataplane using custom linux aio): 104K IOPS
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The x-data-plane=on|off option is no longer useful because the
iothread=<iothread> option conveys the same information plus which
IOThread to use.
Do not delete x-data-plane=on|off yet as a convenience to people using
this legacy experimental option. We will drop it in QEMU 2.2.
Instead, turn on data-plane when either x-data-plane=on or
iothread=<iothread> are used. The following command-line uses
data-plane:
qemu -device virtio-blk-pci,iothread=foo,drive=drive0
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Up until now -device virtio-blk-pci,x-iothread=<id> was used to assign
an IOThread. This was a temporary solution while we cleaned up QOM link
properties.
This patch switches over to a QOM link property since it is now possible
to restrict the setter to unrealized instances and automatically unref
the IOThread when the virtio-blk-pci device is freed.
Since the "iothread" property is a QOM property and not a qdev property,
we must alias it explicitly for virtio-blk-pci, as well as CCW and
s390-virtio.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
There is no need to make DEFINE_VIRTIO_BLK_PROPERTIES() public. Inline
it into virtio-blk.c so it cannot be used by mistake from other source
files.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
This function is no longer used since parent objects now use child
aliases to set the VirtIOBlkConf directly.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
If the virtio transport does not support notifiers (like s390-virtio),
we can't use dataplane. Bail out early and let the user know what is
wrong.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Note that st*_raw and ld*_raw are effectively replaced by st*_p and ld*_p.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
[ pass VirtIODevice * to memory accessors,
Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In order to migrate virtio subsections, they should be streamed after
the device itself. We need the device specific code to be called from
the common migration code to achieve this. This patch introduces load
and save methods for this purpose.
Suggested-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The old name is misleading in its new usage, so rename it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This drops request handling code from dataplane, and uses code from
hw/block/virtio-blk.c.
It starts to use multiwrite as non-dataplane does.
Dataplane sets VirtIOBlock.complete_request to vring version, and calls
into non-dataplane's process handling. In complete_request_early,
qiov.size is added to vring push length, because it's also called in rw
completion now.
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The BH must be called in the AioContext of bs. Currently it is only the
main loop, but with coming changes, it could also be a dataplane
IOThread.
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
So that dataplane can use virtio_blk_handle_request and
virtio_submit_multiwrite.
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio_blk_req_complete will call VirtIOBlock.complete_request() to push
data and notify guest. No functional change.
Later, this will allow dataplane to provide it's own (vring_) version.
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
out_sg is checked by iov_to_buf below, so it can be dropped.
Add assert and iov_discard_back around in_sg, as the in_sg is handled in
dataplane code.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
VirtIOBlockReq is allocated in process_request, and freed in command
functions.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The virtio code currently assumes that the outhdr is in its own iovec.
This is not guaranteed by the spec, so we should relax this assumption.
Convert the VirtIOBlockReq.out field to structrue so that we can use
iov_to_buf and then discard the header from the beginning of iovec.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In current virtio spec, inhdr is a single byte, and is unlikely to
change for both functionality and compatibility considerations.
Non-dataplane uses .in, and we are on the way to converge them. So
let's unify it to get cleaner code.
Remove .inhdr and use .in.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Field "inhdr" is added temporarily for a more mechanical change, and
will be dropped in the next commit.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since it's set but not used.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The block layer will handle the unaligned request.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This will make converging with dataplane code easier.
Add virtio_blk_free_request to handle the freeing of request internal
fields.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
For later reusing by dataplane code.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In order to let event defines use existing types later, instead of
redefine new ones, some old type defines for spice and vnc are changed,
and BlockErrorAction is moved from block.h to qapi schema. Note that
BlockErrorAction is not merged with BlockdevOnError.
At this point, VncInfo is not made a child of VncBasicInfo, because
VncBasicInfo has mandatory fields where VncInfo makes them optional.
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
By just never doing write-backs. This is completely invisible to the
guest, as the entire storage area is implemented as device state (at
realize time the entire drive is read in).
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
sync_page() was conditionalizing it's whole fn body on the bdrv being
non-null. Just return for the function immediately on NULL brdv and
get rid of the big if.
Makes implementation consistent with flash_zynq_area().
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If the flash device is configured with a device-width which is
not equal to the bank-width, indicating that it is actually several
narrow flash devices in parallel, the CFI table should report the
number of blocks and the size of a single device, not of the whole
combined setup. This stops Linux from complaining:
"NOR chip too large to fit in mapping. Attempting to cope..."
As usual, we retain the old broken but backwards compatible behaviour
when the device-width is not specified.
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1402409025-25694-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
After previous Peter patch, they are redundant. This way we don't
assign them except when needed. Once there, there were lots of case
where the ".fields" indentation was wrong:
.fields = (VMStateField []) {
and
.fields = (VMStateField []) {
Change all the combinations to:
.fields = (VMStateField[]){
The biggest problem (appart from aesthetics) was that checkpatch complained
when we copy&pasted the code from one place to another.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJTkeeSAAoJEJykq7OBq3PIi9sIAIKVSHE/c6Tc4rGcIgPoAXTS
li9FCQ/vXFUehIe6MFMmf6OX08zKxBw18DLS6mLyqAn5lJGUWQBXiP+nXqgzhy5T
EoCtwE/n5fpbQuCVxXm+CsaGZc4IySyZ4FiwqAe4MysYKXkJbDGEF/TEzL+0vFw4
wEHZUQ3ykydJtPLtZVI3MNRmqRtk7zbq8So6g/FYB3VyIeyhMp1peWahHPLXGOQ/
phpk5G4Yk6p8TtjTe6lo4ajBzAprsxc052Jz+MaxbuihzfHjf2i62iLixcDDTUzz
6n/sCVsvorA6O6SJ/sMtkQs1N1cXoj3tKSIu8gh6zkAtlg/V6sSRaLCnQQtFQB4=
=xvq7
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Block pull request
# gpg: Signature made Fri 06 Jun 2014 17:08:50 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/block-pull-request: (42 commits)
qapi: Extract qapi/block.json definitions
qapi: Extract qapi/block-core.json definitions
qapi: create two block related json modules
qapi: Extract qapi/common.json definitions
sheepdog: reload only header in a case of live snapshot
sheepdog: fix vdi object update after live snapshot
rbd: Fix leaks in rbd_start_aio() error path
qemu-img: Document check exit codes
block: fix wrong order in live block migration setup
blockdev: acquire AioContext in block_set_io_throttle
throttle: add detach/attach test case
throttle: add throttle_detach/attach_aio_context()
dataplane: Support VIRTIO_BLK_T_SCSI_CMD
virtio-blk: Factor out virtio_blk_handle_scsi_req from virtio_blk_handle_scsi
virtio-blk: Allow config-wce in dataplane
block: Move declaration of bdrv_get_aio_context to block.h
raw-posix: drop raw_get_aio_fd() since it is no longer used
dataplane: implement async flush
dataplane: delete IOQueue since it is no longer used
dataplane: use the QEMU block layer for I/O
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The ld_raw and st_raw definitions are only needed in code that
must compile for both user-mode and softmmu emulation. Device
models can use the equivalent ld_p/st_p which are simple
pointer accessors.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The common logic to process a scsi request in a VirtQueueElement is
extracted to a function to share with dataplane.
This makes VirtIOBlockReq.scsi unused, so drop it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Dataplane now uses block layer. Protect bdrv_set_enable_write_cache with
aio_context_acquire and aio_context_release, so we can enable config-wce
to allow guest to modify the write cache online.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stop using the raw-posix file descriptor for synchronous
qemu_fdatasync(). Use bdrv_aio_flush() instead and drop the
VirtIOBlockDataPlane->fd field.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stop using a custom Linux AIO request queue from ioq.h and instead use
the QEMU block layer for I/O.
This patch adjusts the VirtIOBlockRequest struct with fields needed for
bdrv_aio_readv()/bdrv_aio_writev(). ioq.h used struct iovec and struct
iocb, which we don't need directly anymore.
Modify dataplane start/stop to set the AioContext on the
BlockDriverState. We also no longer need to get the raw-posix file
descriptor. This means image formats are now supported with dataplane!
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This drops BlockDriverState.in_use with op_blockers:
- Call bdrv_op_block_all in place of bdrv_set_in_use(bs, 1).
- Call bdrv_op_unblock_all in place of bdrv_set_in_use(bs, 0).
- Check bdrv_op_is_blocked() in place of bdrv_in_use(bs).
The specific types are used, e.g. in place of starting block backup,
bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP, ...).
There is one exception in block_job_create, where
bdrv_op_blocker_is_empty() is used, because we don't know the operation
type here. This doesn't matter because in a few commits away we will drop
the check and move it to callers that _do_ know the type.
- Check bdrv_op_blocker_is_empty() in place of assert(!bs->in_use).
Note: there is only bdrv_op_block_all and bdrv_op_unblock_all callers at
this moment. So although the checks are specific to op types, this
changes can still be seen as identical logic with previously with
in_use. The difference is error message are improved because of blocker
error info.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJTdBkpAAoJEPSH7xhYctcj5jAQAJKiJ/43TBk0S0wlpoPjw1yA
WSC/oJ2yvf+i6EndJPdnxIA/3K3s7suRFBJVYwLVnGJTUZ6gwGcI7cc/EX/iK0vf
1Ffv/RAuh3nkohFw1ZMLQGqACFskSer0K1wZgQg+YjiJRPzzpVOpaAVm3+c04LRN
AjO9LTqudi2pFaDHfGQMFES3j/5+h2srdad7YLMu4h/17SLtcowO7Yf7GT+GHRyD
c3OHcL6vNoCKkIJQULC8X5czh6zapQZG3SRZ/nLt58yjEyVuq4L2cNRA/614K+5b
1YLz4dX6slj8y17tg8On60emU0Sdu4mOws9HtYSytfmEu045iuF+Wr4kNjO74UGd
NnnjD1pdnuF7356pz87PiVBf7qer5yrv+Bog1KRUwn3ETFqpSOMiKJo0JAGBMKdN
w6Mbkg9C7IR62OfmM5157Mj2Xd681GImNiYzerBMfuFAWk5XsMxuz9Xh03I7WpzS
ASow6SCBIr7bsiKENV7twuz1mIQclsYWXaVt8ldH8bMsTCqiefVeDs4o6A3Na+WC
LztUHzlhVLZU92uJDiIsyQnwmsw032RjhKY9sV/NMa5HLAhF/zZLcn1OfXshrv8t
mVnWngGbY185kml0GmLu9TSnSCBSD8lcwZlRA9JVyxfusxElJUOrQzKzvu69Rn6U
XBWPBwppnDQ0MjOgSZD+
=gg8m
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20140515' into staging
migration/next for 20140515
# gpg: Signature made Thu 15 May 2014 02:32:25 BST using RSA key ID 5872D723
# gpg: Can't check signature: public key not found
* remotes/juanquintela/tags/migration/20140515:
usb: fix up post load checks
migration: show average throughput when migration finishes
savevm: Remove all the unneeded version_minimum_id_old (rest)
savevm: Remove all the unneeded version_minimum_id_old (usb)
Split ram_save_block
arch_init: Simplify code for load_xbzrle()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
After previous Peter patch, they are redundant. This way we don't
assign them except when needed. Once there, there were lots of case
where the ".fields" indentation was wrong:
.fields = (VMStateField []) {
and
.fields = (VMStateField []) {
Change all the combinations to:
.fields = (VMStateField[]){
The biggest problem (appart from aesthetics) was that checkpatch complained
when we copy&pasted the code from one place to another.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
After commit 767adce2d, they are redundant. This way we don't assign them
except when needed. Once there, there were lots of cases where the ".fields"
indentation was wrong:
.fields = (VMStateField []) {
and
.fields = (VMStateField []) {
Change all the combinations to:
.fields = (VMStateField[]){
The biggest problem (apart from aesthetics) was that checkpatch complained
when we copy&pasted the code from one place to another.
Signed-off-by: Juan Quintela <quintela@redhat.com>
[PMM: fixed minor conflict, corrected commit message typos]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Implement discard support for xen_disk. It makes use of the existing
discard code in qemu.
The discard support is enabled unconditionally. The tool stack may
provide a property "discard-enable" in the backend node to optionally
disable discard support. This is helpful in case the backing file was
intentionally created non-sparse to avoid fragmentation.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
This patch adds an errp parameter to bdrv_new() and updates all its
callers. The next patches will make use of this in order to check for
duplicate IDs. Most of the callers know that their ID is fine, so they
can simply assert that there is no error.
Behaviour doesn't change with this patch yet as bdrv_new() doesn't
actually assign errors to errp.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Before IOThread was its own object, each virtio-blk device would create
its own internal thread. We need to preserve this behavior for
backwards compatibility when users do not specify -device
virtio-blk-pci,iothread=<id>.
This patch changes how the internal IOThread object is created.
Previously we used the monitor object_add() function, which is really a
layering violation. The problem is that this needs to assign a name but
we don't have a name for this internal object.
Generating names for internal objects is a pain but even worse is that
they may collide with user-defined names.
Paolo Bonzini <pbonzini@redhat.com> suggested that the internal IOThread
object should not be named. This way the conflict cannot happen and we
no longer need object_add().
One gotcha is that internal IOThread objects will not be listed by the
query-iothreads command since they are not named. This is okay though
because query-iothreads is new and the internal IOThread is just for
backwards compatibility. New users should explicitly define IOThread
objects.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It may not be sensible for normal use cases, but it allows to use
/dev/null in QTest.
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
When creating an IOThread implicitly (the user did not specify
x-iothread=<id>) remember that iothread_find() does not return the
object with an incremented refcount.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Today virtio-blk dataplane uses a 1:1 device-per-thread model. Now that
IOThreads have been introduced we can generalize this to N:M devices per
threads.
This patch drops thread code from dataplane in favor of running inside
an IOThread AioContext.
As a bonus we solve the case where a guest keeps submitting I/O requests
while dataplane is trying to stop. Previously the dataplane thread
would continue to process requests until the request gave it a break.
Now we can shut down in bounded time thanks to
aio_context_acquire/release.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If enabled, set the thread name at creation (on GNU systems with
pthread_set_np)
Fix up all the callers with a thread name
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJTB8hAAAoJEH8JsnLIjy/WQdgP/jEu5baA1/qKanDsS9l+81u1
/sIYSWpHDEJ0uavqTMBeyMOwkzel7SZRusIwA/d5pMqxbY6/86YJumTTozFWvtqc
IABqHtRKCxjcLdZRPbkuNAOiw6p76vSZa543o2t8OAhK2DIFy530wWXeoQEYvuJX
4pOh0lTradOrF1z6uW4ozgQ1efPppwh/iqwfWWNJVTgfnWxJk6qQaATEgkuSdsUN
Wp78UzOxLGO6JKJB6kP3LfNL0ANTYHpfH2/wkE6cW6TkSUduOm6hIBY+tb9khqYt
INOKxqFADK6EOgjvJBsZuZUtOnHK5oM921LepN/mOPAs6gKcn2j+FfqJrl3I1/5M
AXM3M0FPuijEKPGWw7pCLt7j84KJkD9a/rsKO37yRzw17fOma2Rpr4TrX43BF+5t
CGqQ7PzDJ6Fng4EXjyNDzviwXIK8xmG1tfn92tq/BUd6OuM9MCyzEGvEiGOMBoXv
w4iOV7UC+1P3TjnTBhMlBVGywSfdOJoHr9k4lXGNp0h8fPhM9rfruI3BFysxaas6
GmKbd7yvKwXOTptd3I9SB8BzVUL3CcD3FK24+cWKAl8GgyiDIWRlvBYyMp3p8Z8f
NDzcxYP6aRGsoddvpIWr3Tz89hw5wTW5u3RmNgxJUguz6HYKFbl30dpGT+96q2BN
YIAANTdPxn7BP6r3glQH
=ZaDG
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block patches
# gpg: Signature made Fri 21 Feb 2014 21:42:24 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream: (54 commits)
iotests: Mixed quorum child device specifications
quorum: Simplify quorum_open()
quorum: Add unit test.
quorum: Add quorum_open() and quorum_close().
quorum: Implement recursive .bdrv_recurse_is_first_non_filter in quorum.
quorum: Add quorum_co_flush().
quorum: Add quorum_invalidate_cache().
quorum: Add quorum_getlength().
quorum: Add quorum mechanism.
quorum: Add quorum_aio_readv.
blkverify: Extract qemu_iovec_clone() and qemu_iovec_compare() from blkverify.
quorum: Add quorum_aio_writev and its dependencies.
quorum: Create BDRVQuorumState and BlkDriver and do init.
quorum: Create quorum.c, add QuorumChildRequest and QuorumAIOCB.
check-qdict: Test termination of qdict_array_split()
check-qdict: Adjust test for qdict_array_split()
qdict: Extract non-QDicts in qdict_array_split()
qemu-config: Sections must consist of keys
qemu-iotests: Check qemu-img command line parsing
qemu-img: Allow -o help with incomplete argument list
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Allow bdrv_open() to handle references to existing block devices just as
bdrv_file_open() is already capable of.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make bdrv_open() take a pointer to a BDS pointer, similarly to
bdrv_file_open(). If a pointer to a NULL pointer is given, bdrv_open()
will create a new BDS with an empty name; if the BDS pointer is not
NULL, that existing BDS will be reused (in the same way as bdrv_open()
already did).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_acct_done was called unconditional. But in case the ioreq has no
segments there is no matching bdrv_acct_start call. This could lead to
bogus accounting values.
Found by code inspection.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Replace them with uint8/32/64.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Commit 7426aa72c3 (nand: Don't inherit
from Sysbus) changed the parent type of TYPE_NAND but continued to use
qdev_create(), which handled a NULL BusState as SysBus.
Use object_new() instead, and reuse the TYPE_NAND define while at it.
Reported-by: Markus Armbruster <armbru@redhat.com>
Cc: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
The alignment field is now set to the value that is promised to the
guest, rather than required by the host. The next patches will make
QEMU aware of the host-provided values, so make this clear.
The alignment is also not about memory buffers, but about the sectors on
the disk, change the documentation of the field.
At this point, the field is set by the device emulation, but completely
ignored by the block layer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
* stefanha/block:
commit: Remove unused check
qemu-iotests: Update test cases for commit active
commit: Support commit active layer
block: Add commit_active_start()
mirror: Move base to MirrorBlockJob
mirror: Don't close target
qemu-iotests: drop duplicate virtio-blk initialization failure
vmdk: Allow vmdk_create to work with protocol
vmdk: Check VMFS extent line field number
docs: updated qemu-img man page and qemu-doc to reflect VHDX support.
block: vhdx - improve error message, and .bdrv_check implementation
block/iscsi: Fix compilation for libiscsi 1.4.0 (API change)
qapi-schema: fix QEMU 1.8 references
dataplane: replace hostmem with memory_region_find
dataplane: change vring API to use VirtQueueElement
vring: factor common code for error exits
vring: create a common function to parse descriptors
sheepdog: fix dynamic grow for running qcow2 format
Message-id: 1387554416-5837-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Drop it when there's no obvious reason why device_add could not work.
Else keep and document why.
* isa-fdc: drop
* i8042: drop, even though its I/O base is hardcoded (because you
could conceivably still add one to a board that has none), and even
though PC board code wires up the A20 line (because that wiring is
optional)
* port92: keep because it needs additional wiring by port92_init()
* mc146818rtc: keep because it needs to be wired up by rtc_init()
* m48t59_isa: keep because needs to be wired up by m48t59_init_isa()
* isa-pit, kvm-pit: keep (in their abstract base pic-common) because
the PIT needs additional wiring by board code, depending on HPET
presence
* pcspk: keep because of pointer property pit, and because realize
sets global pcspk_state
* vmmouse: keep because of pointer property ps2_mouse
* vmport: keep because realize sets global port_state
* isa-i8259, kvm-i8259: keep (in their abstract base pic-common),
because the PICs' IRQ input lines are set up by board code, and the
wiring of the slave to the master is hard-coded in device model code
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
In an ideal world, machines can be built by wiring devices together
with configuration, not code. Unfortunately, that's not the world we
live in right now. We still have quite a few devices that need to be
wired up by code. If you try to device_add such a device, it'll fail
in sometimes mysterious ways. If you're lucky, you get an
unmysterious immediate crash.
To protect users from such badness, DeviceClass member no_user used to
make device models unavailable with -device / device_add, but that
regressed in commit 18b6dad. The device model is still omitted from
help, but is available anyway.
Attempts to fix the regression have been rejected with the argument
that the purpose of no_user isn't clear, and it's prone to misuse.
This commit clarifies no_user's purpose. Anthony suggested to rename
it cannot_instantiate_with_device_add_yet_due_to_internal_bugs, which
I shorten somewhat to keep checkpatch happy. While there, make it
bool.
Every use of cannot_instantiate_with_device_add_yet gets a FIXME
comment asking for rationale. The next few commits will clean them
all up, either by providing a rationale, or by getting rid of the use.
With that done, the regression fix is hopefully acceptable.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Fix NOR flash manufacturer and device ID reading. This now
properly takes into account device widths and device max widths
as required. The reading of these IDs uses the same max_width
dependent addressing as CFI queries.
The old code remains for chips that don't specify a device width,
as the new code relies on a device width being set in order to
properly operate. The existing code seems very broken.
Only ident0 and ident1 are used in the new code, as other fields
relate to the lock state of blocks in flash.
The VExpress flash configuration has been updated to match
the new code, as the existing definition was 'wrong' in order
to return the expected results with the broken device ID code.
Signed-off-by: Roy Franz <roy.franz@linaro.org>
Message-id: 1386279359-32286-8-git-send-email-roy.franz@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This change fixes the CFI query responses to handle NOR device
widths that are different from the bank width. Support is also
added for multi-width devices in a x8 configuration. This is
typically x8/x16 devices, but the CFI specification mentions
x8/x32 devices so those should be supported as well if they
exist.
The query response data is now replicated per-device in the bank,
and is adjusted for x16 or x32 parts configured in x8 mode.
The existing code is left in place for boards that have not
been updated to specify an explicit device_width. The VExpress
board has been updated in an earlier patch in this series so
this is the only board currently affected.
Signed-off-by: Roy Franz <roy.franz@linaro.org>
Message-id: 1386279359-32286-7-git-send-email-roy.franz@linaro.org
[PMM: fixed a few formatting nits]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For handling CFI and device ID reads, we need to not only know the
width that a NOR flash device is configured for, but also its maximum
width. The maximum width addressing mode is used for multi-width
parts no matter which width they are configured for. The most common
case is x16 parts that also support x8 mode. When configured for x8
operation these devices respond to CFI and device ID requests differently
than native x8 NOR parts.
Signed-off-by: Roy Franz <roy.franz@linaro.org>
Message-id: 1386279359-32286-6-git-send-email-roy.franz@linaro.org
[PMM: Added comment explaining the semantics of width vs device-width
vs max-device-width]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Now that we know how wide each flash device that makes up the bank is,
return status for each device in the bank. Leave existing code
that treats 32 bit wide banks as composed of two 16 bit devices as otherwise
we may break configurations that do not set the device_width propery.
Signed-off-by: Roy Franz <roy.franz@linaro.org>
Message-id: 1386279359-32286-4-git-send-email-roy.franz@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The width of the devices that make up the flash interface
is required to mask certain commands, in particular the
write length for buffered writes. This length will be presented
to each device on the interface by the program writing the flash,
and the flash emulation code needs to be able to determine
the length of the write as recieved by each flash device.
The device-width defaults to the bank width which should
maintain existing behavior for platforms that don't need
this change.
This change is required to support buffered writes on the
vexpress platform that has a 32 bit flash interface with 2
16 bit devices on it.
Signed-off-by: Roy Franz <roy.franz@linaro.org>
Message-id: 1386279359-32286-3-git-send-email-roy.franz@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Rename the 'width' member of the pflash_t structure
in preparation for adding a bank_width member.
Signed-off-by: Roy Franz <roy.franz@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1386279359-32286-2-git-send-email-roy.franz@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Rename variable qdev -> dev since that's what realize's argument is called
by convention.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Return an Error so that it can be propagated later.
Tested-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
[AF: Rebased]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This ensures hot-unplug is handled properly by the proxy, and avoids
leaking bus_name which is freed by virtio_device_exit.
Cc: qemu-stable@nongnu.org
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Replace the legacy cpu_to_be32wu() with stl_be_p().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383669517-25598-8-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Replace the legacy cpu_to_be16wu() with stw_be_p().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383669517-25598-7-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Adopt error_report() while at it.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
This includes some pretty big changes:
- pci master abort support by Marcel
- pci IRQ API rework by Marcel
- acpi generation support by myself
Everything has gone through several revisions, latest versions have been on
list for a while without any more comments, tested by several
people.
Please pull for 1.7.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQEcBAABAgAGBQJSXNO8AAoJECgfDbjSjVRp7VAH/0B73mCOiyVACGx7fazK3SGK
X8TxZWVtG5A77ISqKyrtjLAhK9DCQjEzQTbMNhXHM3Ar6crwo7nJZnQvH2Gh1X2p
34BOQSVc4rtXz5pwDIr48dBLrxeslwXub79chUs+IK1/4RSn3h3nuS3k6JVkmLJN
rcHMj4ljJmi4Hd9vOpmS1jo/a61usi36hhU7CMgcrsXzStZycBBzCozOB3VW8p1X
/iwyf91YjmNPkn9gA3/aViGjszu8jE91dkA0C+ljwvcGbs2yEl3LCWEJfsMvoh5P
2M+k0XXbHwq/P9PFMa/2/lWOo4EO4Oxa+G/6QvovJrteYnktr+E9DqjU8pCT7yI=
=CVfs
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pci, pc, acpi fixes, enhancements
This includes some pretty big changes:
- pci master abort support by Marcel
- pci IRQ API rework by Marcel
- acpi generation support by myself
Everything has gone through several revisions, latest versions have been on
list for a while without any more comments, tested by several
people.
Please pull for 1.7.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 15 Oct 2013 07:33:48 AM CEST using RSA key ID D28D5469
# gpg: Can't check signature: public key not found
* mst/tags/for_anthony: (39 commits)
ssdt-proc: update generated file
ssdt: fix PBLK length
i386: ACPI table generation code from seabios
pc: use new api to add builtin tables
acpi: add interface to access user-installed tables
hpet: add API to find it
pvpanic: add API to access io port
ich9: APIs for pc guest info
piix: APIs for pc guest info
acpi/piix: add macros for acpi property names
i386: define pc guest info
loader: allow adding ROMs in done callbacks
i386: add bios linker/loader
loader: use file path size from fw_cfg.h
acpi: ssdt pcihp: updat generated file
acpi: pre-compiled ASL files
acpi: add rules to compile ASL source
i386: add ACPI table files from seabios
q35: expose mmcfg size as a property
q35: use macro for MCFG property name
...
Message-id: 1381818560-18367-1-git-send-email-mst@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Report from valgrind:
==19521== Source and destination overlap in memcpy(0x31d38938, 0x31d38938, 64)
==19521== at 0x4A0A343: memcpy@@GLIBC_2.14 (in
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19521== by 0x42774E: virtio_blk_device_init (virtio-blk.c:686)
==19521== by 0x46EE9E: virtio_device_init (virtio.c:1158)
==19521== by 0x25405E: device_realize (qdev.c:178)
==19521== by 0x2559B5: device_set_realized (qdev.c:699)
==19521== by 0x3A819B: property_set_bool (object.c:1315)
==19521== by 0x3A6CE0: object_property_set (object.c:803)
Valgrind is right: blk == &s->blks, so it is a memcpy of 64 byte with
source == destination which can be removed.
Reported-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
pci_set_irq and the other pci irq wrappers use
PCI_INTERRUPT_PIN config register to compute device
INTx pin to assert/deassert.
An irq is allocated using pci_allocate_irq wrapper
only if is needed by non pci devices.
Removed irq related fields from state if not used anymore.
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
IF_NONE allows read-only, which makes forbidding it in this place
for other types pretty much pointless.
Instead, make sure that all devices for which the check would have
errored out check in their init function that they don't get a read-only
BlockDriverState. This catches even cases where IF_NONE and -device is
used.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Commit 4472beae modified the semantics of ioreq_{un,}map so that they are
idempotent if called when they're not needed (ie., twice in a row). However,
it neglected to handle the case where batch mapping is not being used (the
default), and one of the grants fails to map. In this case, ioreq_unmap will
be called to unwind and unmap any mappings already performed, but ioreq_unmap
simply returns due to the aforementioned change (the ioreq has not already
been marked as mapped).
The frontend user can therefore force xen_disk to leak grant mappings, a
per-domain limited resource.
Fix by marking the ioreq as mapped before calling ioreq_unmap in this
situation.
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
The following sequence happens:
- the SeaBIOS virtio-blk driver does not support the WCE feature, which
causes QEMU to disable writeback caching
- the Linux virtio-blk driver resets the device, finds WCE is available
but writeback caching is disabled; tells block layer to not send cache
flush commands
- the Linux virtio-blk driver sets the DRIVER_OK bit, which causes
writeback caching to be re-enabled, but the Linux virtio-blk driver does
not know of this side effect and cache flushes remain disabled
The bug is at the third step. If the guest does know about CONFIG_WCE,
QEMU should ignore the WCE feature's state. The guest will control the
cache mode solely using configuration space. This change makes Linux
do flushes correctly, but Linux will keep SeaBIOS's writethrough mode.
Hence, whenever the guest is reset, the cache mode of the disk should
be reset to whatever was specified in the "-drive" option. With this
change, the Linux virtio-blk driver finds that writeback caching is
enabled, and tells the block layer to send cache flush commands
appropriately.
Reported-by: Rusty Russell <rusty@au1.ibm.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add an Error ** parameter to bdrv_open, bdrv_file_open and associated
functions to allow more specific error messages.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Avoid trying to setup dataplane again if dataplane setup is already in
progress. This may happen if an eventfd is triggered during setup.
I saw this occasionally with an experimental s390 irqfd implementation:
virtio_blk_handle_output
-> virtio_blk_data_plane_start
-> virtio_ccw_set_host_notifier
...
-> virtio_queue_set_host_notifier_fd_handler
-> virtio_queue_host_notifier_read
-> virtio_queue_notify_vq
-> virtio_blk_handle_output
-> virtio_blk_data_plane_start
-> vring_setup
-> hostmem_init
-> memory_listener_register
-> BOOM
As virtio-ccw tries to follow what virtio-pci does, it might be triggerable
for other platforms as well.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We call bdrv_attach_dev when initializing whether or not bs is created
locally, so call bdrv_detach_dev and let the refcnt handle the
lifecycle.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Manage BlockDriverState lifecycle with refcnt, so bdrv_delete() is no
longer public and should be called by bdrv_unref() if refcnt is
decreased to 0.
This is an identical change because effectively, there's no multiple
reference of BDS now: no caller of bdrv_ref() yet, only bdrv_new() sets
bs->refcnt to 1, so all bdrv_unref() now actually delete the BDS.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If PFLASH_DEBUG is enabled then we have some build errors:
hw/block/pflash_cfi02.c: In function ‘pflash_timer’:
hw/block/pflash_cfi02.c:128:5: error: expected ‘)’ before string constant
hw/block/pflash_cfi02.c:128:5: error: too few arguments to function ‘fprintf’
This patch fixes the problem.
Signed-off-by: Antony Pavlov <antonynpavlov@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is an autogenerated patch using scripts/switch-timer-api.
Switch the entire code base to using the new timer API.
Note this patch may introduce some line length issues.
Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The .io_flush() handler no longer exists and has no users. Drop the
io_flush argument to aio_set_fd_handler() and related functions.
The AioFlushEventNotifierHandler and AioFlushHandler typedefs are no
longer used and are dropped too.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Check exit conditions before entering blocking aio_poll(). This is
mainly for consistency since it's unlikely that we are stopping in the
first event loop iteration.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Move the code to hw/i386, the sole remaining property is available
as !pci_enabled.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1376069702-22330-4-git-send-email-aliguori@us.ibm.com
Rebased.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
With the new semantics of pc_sysfw (no -pflash implies "old-style" ROM setup,
-pflash implies "new-style" ROM setup), there is no need anymore for a compat
property. Old machines simply will never use -pflash, and thus will always
use old-style setup.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1376069702-22330-3-git-send-email-aliguori@us.ibm.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The variable is not written anymore.
This cleans up after 9e1c2ec (which accidentally left variable
pc_sysfw_flash_vs_rom_bug_compatible behind, value always zero), and
buries dead code from commit dafb82e (which resurrected the pc_sysfw
code for pc_sysfw_flash_vs_rom_bug_compatible by mistake).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1376069702-22330-2-git-send-email-aliguori@us.ibm.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Introduce a type constant, use QOM casts and rename the parent field and
prepare for QOM realize.
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Since commit dd3be74207 SUNW,fdtwo's
initfn (realizefn since 940194c236)
was using SYSBUS_FDC() cast. This uses type sysbus-fdc rather than
SUNW,fdtwo.
Fix this by letting SUNW,fdtwo and sysbus-fdc both inherit from an
abstract type base-sysbus-fdc.
This allows to consolidate realizefns by using instance_init functions.
Clean up variable names and variable order while at it.
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Hu Tao <hutao@cn.fujitsu.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Andreas Färber <afaerber@suse.de>
# By Stefan Hajnoczi (4) and others
# Via Stefan Hajnoczi
* stefanha/block:
dataplane: refuse to start if device is already in use
dataplane: enable virtio-blk x-data-plane=on live migration
migration: fix spice migration
migration: notify migration state before starting thread
block: Repair the throttling code.
gluster: Add image resize support
Message-id: 1375112172-24863-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The category will be used to sort the devices displayed in
the command line help.
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Message-id: 1375107465-25767-4-git-send-email-marcel.a@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Dataplane must check whether a block device is in use before launching
the dataplane thread. This is necessary since the thread does not
synchronize with the main loop and I/O requests could cause corruption.
One example is when a drive is added and a block job is started before
hotplugging the virtio-blk-pci adapter. In this case we must not use
dataplane mode.
Cc: qemu-stable@nongnu.org
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Although the dataplane thread does not cooperate with dirty memory
logging yet it's fairly easy to temporarily disable dataplane during
live migration. This way virtio-blk can live migrate when
x-data-plane=on.
The dataplane thread will restart after migration is cancelled or if the
guest resuming virtio-blk operation after migration completes.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Support backend option "direct-io-safe". This is documented as
follows in the Xen backend specification:
* direct-io-safe
* Values: 0/1 (boolean)
* Default Value: 0
*
* The underlying storage is not affected by the direct IO memory
* lifetime bug. See:
* http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html
*
* Therefore this option gives the backend permission to use
* O_DIRECT, notwithstanding that bug.
*
* That is, if this option is enabled, use of O_DIRECT is safe,
* in circumstances where we would normally have avoided it as a
* workaround for that bug. This option is not relevant for all
* backends, and even not necessarily supported for those for
* which it is relevant. A backend which knows that it is not
* affected by the bug can ignore this option.
*
* This option doesn't require a backend to use O_DIRECT, so it
* should not be used to try to control the caching behaviour.
Also, BDRV_O_NATIVE_AIO is ignored if BDRV_O_NOCACHE, so clarify the
default flags passed to the qemu block layer.
The original proposal for a "cache" backend option has been dropped
because it was believed too wide, especially considering that at the
moment the backend doesn't have a way to tell the toolstack that it is
capable of supporting it.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
The firmware commonly used with MIPS Malta boards (YAMON) reads the
status of the pflash with a 32bit memory access. On real hardware
this results in the status byte being mirrored in the upper 16 bits
of the read value. For example if the status byte is represented by
SS then the hardware reads 0x00SS00SS. The YAMON firmware compares the
status against 32bit values expecting the mirrored value and fails
without it.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Rename fdctrl_init_common() to fdctrl_realize_common() and let
fdctrl_connect_drives() propagate an Error through it.
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Introduce type constant and replace FROM_SYSBUS().
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[AF: Renamed parent field]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Introduce type constant and replace FROM_SYSBUS().
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[AF: Renamed parent field]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Introduce type constant and avoid DO_UPCAST(), container_of(),
and use DEVICE() to avoid accessing parent qdev directly.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[AF: Renamed parent field and avoided repeated SYS_BUS_DEVICE() casts]
Signed-off-by: Andreas Färber <afaerber@suse.de>
# By Michael Tokarev (2) and others
# Via Michael Tokarev
* mjt/trivial-patches:
doc: monitor multiplexing rewording
block/m25p80: Update Micron entries
Fix command example in qemu.sasl
slirp: remove mbuf(m_hdr,m_dat) indirection
linux-user: declare sys_futex to have 6 arguments
Message-id: 1374225073-12959-1-git-send-email-mjt@msgid.tls.msk.ru
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
- Split 32Mb and 256Mb parts into a11 and a13 variants.
- Add the 4K sector flag to the 128Mb parts. (These entries were taken from
the Linux kernel list, which is missing the flag.)
- Fill out the table of sizes with entries for 64Mb parts.
Prodded by Peter Crosthwaite.
Signed-off-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Load the virtio.c state into vring.c when we start dataplane mode and
vice versa when stopping dataplane mode. This patch makes it possible
to start and stop dataplane any time while the guest is running.
This will eventually allow us to go back to QEMU main loop for
bdrv_drain_all() and live migration. In the meantime, this patch makes
the dataplane lifecycle more robust but should make no visible
difference. It may be useful in the virtio-net dataplane effort.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Nand chips are not sysbus devices - they do not have any sense of MMIO,
nor interrupts. Re-parent to TYPE_DEVICE accordingly.
Cc: afaerber@suse.de
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The prescribed transition from Sysbus::init function to a
Device::realize.
Cc: afaerber@suse.de
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Define and use standard QOM cast macro. Remove usages of DO_UPCAST and
direct -> style casting.
Cc: afaerber@suse.de
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Make this code closer to passing checkpatch. Mostly missing braces, but
a few rogue tabs in there as well.
Cc: qemu-trivial@nongnu.org
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The DMAContext is a simple pointer to an AddressSpace that is now always
already available. Make everyone hold the address space directly,
and clean up the DMA API to use the AddressSpace directly.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Initial commit for emulated Non-Volatile-Memory Express (NVMe) pci
storage device.
NVMe is an open, industry driven storage specification defining
an optimized register and command set designed to deliver the full
capabilities of non-volatile memory on PCIe SSDs. Further information
may be found on the organizations website at:
http://www.nvmexpress.org/
This commit implements the minimum from the specification to work with
existing drivers.
Cc: Keith Busch <keith.busch@gmail.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Drop ISADeviceClass::init and the resulting no-op initfn and let
children implement their own realizefn. Adapt error handling.
Split off an instance_init where sensible.
Signed-off-by: Andreas Färber <afaerber@suse.de>
We may want to include a driver in the whitelist for read only tasks
such as diagnosing or exporting guest data (with libguestfs as a good
example). This patch introduces a readonly whitelist option, and for
backward compatibility, the old configure option --block-drv-whitelist
is now an alias to rw whitelist.
Drivers in readonly list is only permitted to open file readonly, and
returns -ENOTSUP for RW opening.
E.g. To include vmdk readonly, and others read+write:
./configure --target-list=x86_64-softmmu \
--block-drv-rw-whitelist=qcow2,raw,file,qed \
--block-drv-ro-whitelist=vmdk
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When pc-sysfw.rom_only == 0, flash memory will be
usable with kvm. In order to enable flash memory mode,
a pflash device must be created. (For example, by
using the -pflash command line parameter.)
Usage of a flash memory device with kvm requires
KVM_CAP_READONLY_MEM, and kvm will abort if
a flash device is used with an older kvm which does
not support this capability.
If a flash device is not used, then qemu/kvm will
operate in the original rom-mode.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1369816047-16384-5-git-send-email-jordan.l.justen@intel.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The isapc machine with seabios currently requires the BIOS region
to be read/write memory rather than read-only memory.
KVM currently cannot support the BIOS as a ROM region, but qemu
in non-KVM mode can. Based on this, isapc machine currently only
works with KVM.
To work-around this isapc issue, this change avoids marking the
BIOS as readonly for isapc.
This change also will allow KVM to start supporting ROM mode
via KVM_CAP_READONLY_MEM.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1369816047-16384-2-git-send-email-jordan.l.justen@intel.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
"Readable" is a very unfortunate name for this flag because even a
rom_device region will always be readable from the guest POV. What
differs is the mapping, just like the comments had to explain already.
Also, readable could currently be understood as being a generic region
flag, but it only applies to rom_device regions.
So rename the flag and the function to modify it after the original term
"ROMD" which could also be interpreted as "ROM direct", i.e. ROM mode
with direct access. In any case, the scope of the flag is clearer now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit 9953f8822c.
While Markus's analysis is entirely correct, there are 1.6 patches
that fix the bug for real and without requiring machine type hacks.
Let's think of the children who will have to read this code, and
avoid a complicated mess of semantics that differ between <1.5,
1.5, and >1.5.
Conflicts:
hw/i386/pc_piix.c
hw/i386/pc_q35.c
include/hw/i386/pc.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1368189483-7915-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add new devices for various manufacturers, and re-sort Spansion list to
match the order in Linux, which requires chips with a non-zero extended ID
to come first.
With this commit the outstanding differences to Linux rev 55bf75b are:
- Erase size flag differences in s25sl032p, s25sl064p, s25fl016k, s25fl064k
(These devices have only some blocks that support small erase sizes.)
- Linux lacks n25q128
- Devices without a Jedec ID have been excluded
Signed-off-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Introduce type constant and cast macro to obsolete DO_UPCAST().
Reuse type constant for PC machine compatibility settings.
Prepares for ISA realizefn.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Message-id: 1367093935-29091-4-git-send-email-afaerber@suse.de
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Use of a flash memory device for the BIOS was added in series "[PATCH
v10 0/8] PC system flash support", commit 4732dca..1b89faf, v1.1.
Flash vs. ROM is a guest-visible difference. Thus, flash use had to
be suppressed for machine types pc-1.0 and older. This was
accomplished by adding a dummy device "pc-sysfw" with property
"rom_only":
* Non-zero rom_only means "use ROM". Default for pc-1.0 and older.
* Zero rom_only means "maybe use flash". Default for newer machines.
Not only is the dummy device ugly, it was also retroactively added to
the older machine types! Fortunately, it's not guest-visible (thus no
immediate guest ABI breakage), and has no vmstate (thus no immediate
migration breakage). Breakage occurs only if the user unwisely
enables flash by setting rom_only to zero. Patch review FAIL #1.
Why "maybe use flash"? Flash didn't (and still doesn't) work with
KVM. Therefore, rom_only=0 really means "use flash, except when KVM
is enabled, use ROM". This is a Bad Idea, because it makes enabling/
disabling KVM guest-visible. Patch review FAIL #2.
Aside: it also precludes migrating between KVM on and off, but that's
not possible for other reasons anyway.
Fix as follows:
1. Change the meaning of rom_only=0 to mean "use flash, no ifs, buts,
or maybes" for pc-i440fx-1.5 and pc-q35-1.5. Don't change anything
for older machines (to remain bug-compatible).
2. Change the default value from 0 to 1 for these machines.
Necessary, because 0 doesn't work with KVM. Once it does, we can flip
the default back to 0.
3. Don't revert the retroactive addition of device "pc-sysfw" to older
machine types. Seems not worth the trouble.
4. Add a TODO comment asking for device "pc-sysfw" to be dropped once
flash works with KVM.
Net effect is that you get a BIOS ROM again even when KVM is disabled,
just like for machines predating the introduction of flash.
To get flash instead, use "--global pc-sysfw.rom_only=0".
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1365780303-26398-4-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This clean the init and the exit functions and rename virtio_common_cleanup
to virtio_cleanup.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1366791683-5350-7-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This remove virtio-bindings, and use class instead.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1366791683-5350-6-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This remove the function pointer in VirtIODevice, and use only
VirtioDeviceClass function pointer.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1366791683-5350-5-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Drop all the infrastructure for taddr properties (ie ones which
are 'hwaddr' sized). These are now unused, and any further desired
use would be rather questionable since device properties shouldn't
generally depend on a type that is conceptually variable based on
the target CPU. 32 or 64 bit integer properties should be used instead
as appropriate for the specific device.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
I think in the early revisions of this we had an instantiation helper
for the device in devices.h. This was later removed and this header was
left over. Removed
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If there is no backing bdrv, let the debugging developer know about it.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
The debug printfs on every page program/read is extremely verbose. Add
a second level of debug for this.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Some of the debug printfs in m25p80 are really guest errors.
Changed over to qemu_log_mask(LOG_GUEST_ERROR accordingly.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Some dodgy casts were making a mess of these msgs.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
The recent rearrangement of include files had some minor errors:
devices.h is not ARM specific and should not be in arm/
arm.h should be in arm/
Move these two headers to correct this.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Minor fixes to documentation and code comments.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>