Note that resetting bits in the dirty bitmap is done _before_ actually
processing the request. Writes, instead, set bits after the request
is completed.
This way, when there are concurrent write and discard requests, the
outcome will always be that the blocks are marked dirty. This scenario
should never happen, but it is safer to do it this way.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Now that discard can take a long time, make it asynchronous.
Each LBA range entry is processed separately because discard
can be an expensive operation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
ATA-ACS-3 says "If the two byte range length is zero, then the LBA
Range Entry shall be discarded as padding." iovecs are used as if
they are linearized, so it is incorrect to discard the rest of
this iovec.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is easy with the thread pool, because we can use s->is_xfs and
s->has_discard from the worker function.
QEMU has a widespread assumption that each I/O operation writes less
than 2^32 bytes. This patch doesn't fix it throughout of course,
but it starts correcting struct RawPosixAIOData so that there is
no regression with respect to the synchronous discard implementation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Block devices use a ioctl instead of fallocate, so add a separate
implementation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Avoid sending system calls repeatedly if they shall fail. This
does not apply to XFS: if the filesystem-specific ioctl fails,
something weird is happening.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Linux 2.6.38 introduced the filesystem independent interface to
deallocate part of a file. As of Linux 3.7, btrfs, ext4, ocfs2,
tmpfs and xfs support it.
Even though the system calls here are in practice issued on Linux,
the code is structured to allow plugging in alternatives for other Unix
variants. EOPNOTSUPP is used unconditionally in this patch, but it is
supported in both OpenBSD and Mac OS X since forever (see for example
http://lists.debian.org/debian-glibc/2006/02/msg00337.html).
Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bdrv_io_limits_enable() starts a new slice, but does not set io_base
correctly for that slice.
Here is how io_base is used:
bytes_base = bs->nr_bytes[is_write] - bs->io_base.bytes[is_write];
bytes_res = (unsigned) nb_sectors * BDRV_SECTOR_SIZE;
if (bytes_base + bytes_res <= bytes_limit) {
/* no wait */
} else {
/* operation needs to be throttled */
}
As a result, any I/O operations that are triggered between now and
bs->slice_end are incorrectly limited. If 10 MB of data has been
written since the VM was started, QEMU thinks that 10 MB of data has
been written in this slice. This leads to a I/O lockup in the guest.
We fix this by delaying the start of a new slice to the next
call of bdrv_exceed_io_limits().
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
One of the recent refactoring patches (commit f50f88b9) didn't take care
to initialise l2meta properly, so with zero-length writes, which don't
even enter the write loop, qemu just segfaulted.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Fix the compilation error introduced by msg new field.
CC hw/9pfs/virtio-9p.o
In file included from /home/konradf/Documents/safe/greensocs/virtio-project/x86-qemu/qemu/hw/9pfs/virtio-9p.c:17:0:
/home/konradf/Documents/safe/greensocs/virtio-project/x86-qemu/qemu/hw/virtio-pci.h:30:16: erreur: field ‘msg’ has incomplete type
make: *** [hw/9pfs/virtio-9p.o] Erreur 1
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
virtio_pci_set_guest_notifiers() now takes an additional argument to
specify the number of virtqueues to assign a guest notifier for. This
causes a build breakage for CONFIG_VIRTIO_BLK_DATA_PLANE builds:
/home/mdroth/w/qemu2.git/hw/dataplane/virtio-blk.c: In function
‘virtio_blk_data_plane_start’:
/home/mdroth/w/qemu2.git/hw/dataplane/virtio-blk.c:451:47: error: too
few arguments to function ‘s->vdev->binding->set_guest_notifiers’
/home/mdroth/w/qemu2.git/hw/dataplane/virtio-blk.c: In function
‘virtio_blk_data_plane_stop’:
/home/mdroth/w/qemu2.git/hw/dataplane/virtio-blk.c:511:5: error: too few
arguments to function ‘s->vdev->binding->set_guest_notifiers’
make[1]: *** [hw/dataplane/virtio-blk.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [subdir-x86_64-softmmu] Error 2
Fix this by passing 1 as the number of virtqueues to assign notifiers
for.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Fixes the following:
/home/mdroth/w/qemu2.git/hw/virtio-pci.c: In function
‘kvm_virtio_pci_vector_unmask’:
/home/mdroth/w/qemu2.git/hw/virtio-pci.c:673:12: error: ‘ret’ may be
used uninitialized in this function [-Werror=uninitialized]
cc1: all warnings being treated as errors
make: *** [hw/virtio-pci.o] Error 1
make: *** Waiting for unfinished jobs....
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The GE IP-Octal 232 is an IndustryPack module that implements eight
RS-232 serial ports, each one of which can be redirected to a
character device in the host.
Signed-off-by: Alberto Garcia <agarcia@igalia.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The TPCI200 is a PCI board that supports up to 4 IndustryPack modules.
A new bus type called 'IndustryPack' has been created so any
compatible module can be attached to this board.
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alberto Garcia <agarcia@igalia.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* kraxel/usb.76:
usb-host: Initialize dev->port the obviously safe way
usb-host: Drop superfluous null test from usb_host_auto_scan()
ehci: Assert state machine is sane w.r.t. EHCIQueue
xhci: nuke transfe5rs on detach
xhci: call xhci_detach_slot on root port detach too
xhci: create xhci_detach_slot helper function
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This further optimizes MSIX handling in virtio-pci.
Also included is pci cleanup by Paolo, and pci device
assignment fix by Alex.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJQ7ZaiAAoJECgfDbjSjVRpFhcIAJkY4VQ3i7TLnLsnEDOR+FrP
66YLEDwCSiKZ/UW7WERGN3p3tm0hAXLhPoHFqMGRPPV9pdcXI+Eb8v+u0IHVlt+7
DsQ9TIemZkpSMuUJjQbu/RF8k9JV8+X7M6CKnWahq68p0UD/vDX+OgCiGKO/l/zY
tENJhwD6M1MMzbxyzd4nCnkf3CPrHFvpPt2VAqQnkCw3wLAtR34SucBjr/dXcjuT
arPiV8dNmXHTosdKvcodAWA+0YLLE7Bhz0nLK6eTt5L/UsfdbRN8q9Xdhd5nJjji
DjKBJBfwdG5n3r96g7dlb/XdHuQjbFBq3uLmc8H2OdWOrk5PyqeoUA5fdBQxkb8=
=vKSI
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pci,virtio
This further optimizes MSIX handling in virtio-pci.
Also included is pci cleanup by Paolo, and pci device
assignment fix by Alex.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* mst/tags/for_anthony:
pci-assign: Enable MSIX on device to match guest
pci: use constants for devices under the 1B36 device ID, document them
ivshmem: use symbolic constant for PCI ID, add to pci-ids.txt
virtio-9p: use symbolic constant, add to pci-ids.txt
reorganize pci-ids.txt
docs: move pci-ids.txt to docs/specs/
vhost: backend masking support
vhost: set started flag while start is in progress
virtio-net: set/clear vhost_started in reverse order
virtio: backend virtqueue notifier masking
virtio-pci: cache msix messages
kvm: add stub for update msi route
msix: add api to access msix message
virtio: don't waste irqfds on control vqs
* bonzini/stub: (27 commits)
build: improve quiet output for .stp rules
build: fold trace-obj-y into libqemuutil.a
build: some simplifications for "trace/Makefile.objs"
build: remove coroutine-obj-y
build: move version-obj-y to the generic LINK rule
build: move base QAPI files to libqemuutil.a
build: move QAPI definitions for QEMU out of qapi-obj-y
build: consolidate multiple variables into universal-obj-y
build: move qobject files to qobject/ and libqemuutil.a
build: move libqemuutil.a components to util/
build: move files away from tools-obj-y, common-obj-y, user-obj-y
build: move util-obj-y to libqemuutil.a
build: rename oslib-obj-y to util-obj-y
libcacard: list oslib-obj-y file explicitly
libcacard: link vscclient to dynamic library
libcacard: rewrite Makefile in non-recursive style
libcacard: add list of exported symbols
libcacard: use per-target variable definitions
libcacard: prepare to use -y trick in the Makefile
libcacard: require libtool to build it
...
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* qmp/queue/qmp:
monitor: assert monitor_puts()'s loop invariant
target-i386: fix bits 39:32 of the final physical address when using 4M page
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Store --extra-cflags and --extra-ldflags in config-host.mak,
then pass them on to the pixman configure script.
Cc: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Coverity worries the strcpy() could overrun the destination. It
can't, because the source always points to usb_host_scan()'s auto
port[], which has the same size. Use pstrcpy() anyway, to hush the
checker.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Coverity points out that port is later passed to usb_host_open(),
which dereferences it. It actually can't be null: it always points to
usb_host_scan()'s auto port[]. Drop the superfluous port == NULL
test.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Coverity worries the EHCIQueue pointer could be null when we pass it
to functions that reference it. The state machine ensures it can't be
null then. Assert that, to hush the checker.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
O_DIRECT on Linux has alignment requirements on I/O buffers and
misaligned requests result in -EINVAL. The Linux virtio_blk guest
driver usually submits aligned requests so I forgot to handle misaligned
requests.
It turns out that virtio-win guest drivers submit misaligned requests.
Handle them using a bounce buffer that meets alignment requirements.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Extract code for read/write command processing into do_rdwr_cmd(). This
brings together pieces that are spread across process_request().
The real motivation is to set the stage for handling misaligned
requests, which the next patch tackles.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The qiov_is_aligned() function checks whether a QEMUIOVector meets a
BlockDriverState's alignment requirements. This is needed by
virtio-blk-data-plane so:
1. Move the function from block/raw-posix.c to block/block.c.
2. Make it public in block/block.h.
3. Rename to bdrv_qiov_is_aligned().
4. Change return type from int to bool.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When the raw-posix aio=thread code was moved from posix-aio-compat.c
to block/raw-posix.c, there was an unintended change to the ioctl code.
The code used to return the ioctl command, which posix_aio_read()
would later morph into a zero. This hack is not necessary anymore,
and in fact breaks scsi-generic (which expects a zero return code).
Remove it.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sheepdog supports both writeback/writethrough write but has not yet supported
DIRECTIO semantics which bypass the cache completely even if Sheepdog daemon is
set up with cache enabled.
Suppose cache is enabled on Sheepdog daemon size, the new cache control is
cache=writeback # enable the writeback semantics for write
cache=writethrough # enable the emulated writethrough semantics for write
cache=directsync # disable cache competely
Guest WCE toggling on the run time to toggle writeback/writethrough is also
supported.
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A blank CD or DVD is visible as a zero-sized disks. Probing such
disks will lead to an EIO and a failure to start the VM. Treating
them as raw is a better solution.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
interface_set_client_capabilities() copies only the first few bits,
because it falls into a Classic C trap: you can declare a parameter
uint8_t caps[58], but the resulting parameter type is uint8_t *, not
uint8_t[58]. In particular, sizeof(caps) is sizeof(uint8_t *), not
the intended sizeof(uint8_t[58]).
Harmless, because the bits aren't used, yet. Broken in commit
c10018d6. Spotted by Coverity.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The pointer arithmetic there is safe, but ugly. Coverity grouses
about it. However, the actual comparison is off by one: <= end
instead of < end. Fix by rewriting the check in a cleaner way.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
There is no reason why for example qemu-ga should include all the
definitions for the QEMU monitor. However, there are a few
that are needed (qapi_free_SocketAddress, qapi_free_InetSocketAddress,
ErrorClass_lookup). These should be moved to a separate "core"
.json schema that goes into libqemuutil.a.
For now, make this clearer by moving the qapi-*.o definitions out
of libqemuutil.a. Once the above refactoring is done, qga-obj-y
should not include anymore qapi-types.o and qapi-visit.o.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The directory descent mechanism, and a less-flat tree both helped
in making some *-obj-y definitions very short. Many of these
often end up in universal-obj-y, and used to be separate only
because of libuser (which is now part of history...).
Consolidate these variables in a single one.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Split them between libqemuutil.a and, for those used by qemu-img/io/nbd,
block-obj-y.
Static libraries ensure that binaries such as qemu-ga do not include
unused modules.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>