Historically, sending all memory regions to vhost-user backends in a
single message imposed a limitation on the number of times memory
could be hot-added to a VM with a vhost-user device. Now that backends
which support the VHOST_USER_PROTOCOL_F_CONFIGURE_SLOTS send memory
regions individually, we no longer need to impose this limitation on
devices which support this feature.
With this change, VMs with a vhost-user device which supports the
VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS can support a configurable
number of memory slots, up to the maximum allowed by the target
platform.
Existing backends which do not support
VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS are unaffected.
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com>
Suggested-by: Mike Cui <cui@nutanix.com>
Message-Id: <1588533678-23450-6-git-send-email-raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
With this change, when the VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS
protocol feature has been negotiated, Qemu no longer sends the backend
all the memory regions in a single message. Rather, when the memory
tables are set or updated, a series of VHOST_USER_ADD_MEM_REG and
VHOST_USER_REM_MEM_REG messages are sent to transmit the regions to map
and/or unmap instead of sending send all the regions in one fixed size
VHOST_USER_SET_MEM_TABLE message.
The vhost_user struct maintains a shadow state of the VM’s memory
regions. When the memory tables are modified, the
vhost_user_set_mem_table() function compares the new device memory state
to the shadow state and only sends regions which need to be unmapped or
mapped in. The regions which must be unmapped are sent first, followed
by the new regions to be mapped in. After all the messages have been
sent, the shadow state is set to the current virtual device state.
Existing backends which do not support
VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS are unaffected.
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com>
Suggested-by: Mike Cui <cui@nutanix.com>
Message-Id: <1588533678-23450-5-git-send-email-raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This change introduces a new feature to the vhost-user protocol allowing
a backend device to specify the maximum number of ram slots it supports.
At this point, the value returned by the backend will be capped at the
maximum number of ram slots which can be supported by vhost-user, which
is currently set to 8 because of underlying protocol limitations.
The returned value will be stored inside the VhostUserState struct so
that on device reconnect we can verify that the ram slot limitation
has not decreased since the last time the device connected.
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com>
Message-Id: <1588533678-23450-4-git-send-email-raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
zstd significantly reduces cluster compression time.
It provides better compression performance maintaining
the same level of the compression ratio in comparison with
zlib, which, at the moment, is the only compression
method available.
The performance test results:
Test compresses and decompresses qemu qcow2 image with just
installed rhel-7.6 guest.
Image cluster size: 64K. Image on disk size: 2.2G
The test was conducted with brd disk to reduce the influence
of disk subsystem to the test results.
The results is given in seconds.
compress cmd:
time ./qemu-img convert -O qcow2 -c -o compression_type=[zlib|zstd]
src.img [zlib|zstd]_compressed.img
decompress cmd
time ./qemu-img convert -O qcow2
[zlib|zstd]_compressed.img uncompressed.img
compression decompression
zlib zstd zlib zstd
------------------------------------------------------------
real 65.5 16.3 (-75 %) 1.9 1.6 (-16 %)
user 65.0 15.8 5.3 2.5
sys 3.3 0.2 2.0 2.0
Both ZLIB and ZSTD gave the same compression ratio: 1.57
compressed image size in both cases: 1.4G
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
QAPI part:
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200507082521.29210-4-dplotnikov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Backing files and raw external data files are mutually exclusive.
The documentation of the raw external data bit (in autoclear_features)
already indicates that, but we should also mention it on the other
side.
Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20200410121816.8334-1-berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The feature table is supposed to advertise the name of all feature
bits that we support; however, we forgot to update the table for
autoclear bits. While at it, move the table to read-only memory in
code, and tweak the qcow2 spec to name the second autoclear bit.
Update iotests that are affected by the longer header length.
Fixes: 88ddffae
Fixes: 93c24936
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200324174233.1622067-3-eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Although qemu-ga has supported vsock since 2016 it was not documented on
the man page.
Also add the socket address representation to the qga --help output.
Fixes: 586ef5dee7
("qga: add vsock-listen method")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Description copied from Linux kernel commit from Gustavo A. R. Silva
(see [3]):
--v-- description start --v--
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to
declare variable-length types such as these ones is a flexible
array member [1], introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler
warning in case the flexible array does not occur last in the
structure, which will help us prevent some kind of undefined
behavior bugs from being unadvertenly introduced [2] to the
Linux codebase from now on.
--^-- description end --^--
Do the similar housekeeping in the QEMU codebase (which uses
C99 since commit 7be41675f7).
All these instances of code were found with the help of the
following command (then manual analysis, without modifying
structures only having a single flexible array member, such
QEDTable in block/qed.h):
git grep -F '[0];'
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=76497732932f
[3] https://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux.git/commit/?id=17642a2fbd2c1
Inspired-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For good reason, vhost-user is currently built asynchronously, that
way better performance can be obtained. However, for certain use
cases such as simulation, this is problematic.
Consider an event-based simulation in which both the device and CPU
have scheduled according to a simulation "calendar". Now, consider
the CPU sending I/O to the device, over a vring in the vhost-user
protocol. In this case, the CPU must wait for the vring interrupt
to have been processed by the device, so that the device is able to
put an entry onto the simulation calendar to obtain time to handle
the interrupt. Note that this doesn't mean the I/O is actually done
at this time, it just means that the handling of it is scheduled
before the CPU can continue running.
This cannot be done with the asynchronous eventfd based vring kick
and call design.
Extend the protocol slightly, so that a message can be used for kick
and call instead, if VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS is
negotiated. This in itself doesn't guarantee synchronisation, but both
sides can also negotiate VHOST_USER_PROTOCOL_F_REPLY_ACK and thus get
a reply to this message by setting the need_reply flag, and ensure
synchronisation this way.
To really use it in both directions, VHOST_USER_PROTOCOL_F_SLAVE_REQ
is also needed.
Since it is used for simulation purposes and too many messages on
the socket can lock up the virtual machine, document that this should
only be used together with the mentioned features.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Message-Id: <20200123081708.7817-6-johannes@sipsolutions.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Move the following tools documentation files to the new tools manual:
docs/interop/qemu-img.rst
docs/interop/qemu-nbd.rst
docs/interop/virtfs-proxy-helper.rst
docs/interop/qemu-trace-stap.rst
docs/interop/virtiofsd.rst
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200217155415.30949-4-peter.maydell@linaro.org
The qemu-option-trace.rst.inc file contains a rST documentation
fragment which describes trace options common to qemu-nbd and
qemu-img. We put this file into interop/, but we'd like to move the
qemu-nbd and qemu-img files into the tools/ manual. We could move
the .rst.inc file along with them, but we're eventually going to want
to use it for the main QEMU binary options documentation too, and
that will be in system/. So move qemu-option-trace.rst.inc to the
top-level docs/ directory, where all these files can include it via
.. include:: ../qemu-option-trace.rst.inc
This does have the slight downside that we now need to explicitly
tell Make which manuals use this file rather than relying on
a wildcard for all .rst.inc in the manual.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200217155415.30949-3-peter.maydell@linaro.org
In many cases the target of a convert operation is a newly provisioned
target that the user knows is blank (reads as zero). In this situation
there is no requirement for qemu-img to wastefully zero out the entire
device.
Add a new option, --target-is-zero, allowing the user to indicate that
an existing target device will return zeros for all reads.
Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20200205110248.2009589-2-david.edmondson@oracle.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The patch adds a new additional field to the qcow2 header: compression_type,
which specifies compression type. If field is absent or zero, default
compression type is set: ZLIB, which corresponds to current behavior.
New compression type (ZSTD) is to be added in further commit.
Suggested-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200131142219.3264-3-vsementsov@virtuozzo.com>
[mreitz: s/Bits 3-63: Reserved/Bits 4-63: Reserved/]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Make it more obvious how to add new fields to the version 3 header and
how to interpret them.
The specification is adjusted so that for new defined optional fields:
1. Software may support some of these optional fields and ignore the
others, which means that features may be backported to downstream
Qemu independently.
2. If we want to add incompatible field (or a field, for which some of
its values would be incompatible), it must be accompanied by
incompatible feature bit.
Also the concept of "default is zero" is clarified, as it's strange to
say that the value of the field is assumed to be zero for the software
version which don't know about the field at all and don't know how to
treat it be it zero or not.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200131142219.3264-2-vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
[mreitz: s/some its/some of its/]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Document the virtiofsd(1) program and its command-line options. This
man page is a rST conversion of the original texi documentation that I
wrote.
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The '-i AIO' option was accidentally placed after '-n' and '-t'. Move it
after '--flush-interval'.
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200205163008.204493-1-jusual@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The option was deprecated in 4.0.0 (commit 0ae2d546); it's now been
long enough with no complaints to follow through with that process.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200123164650.1741798-3-eblake@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virtfs-proxy-helper documentation is currently in
fsdev/qemu-trace-stap.texi in Texinfo format, which we
present to the user as:
* a virtfs-proxy-helper manpage
* but not (unusually for QEMU) part of the HTML docs
Convert the documentation to rST format that lives in
the docs/ subdirectory, and present it to the user as:
* a virtfs-proxy-helper manpage
* part of the interop/ Sphinx manual
There are minor formatting changes to suit Sphinx, but no
content changes. In particular I've split the -u and -g
options into each having their own description text.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Greg Kurz <groug@kaod.org>
Message-id: 20200124162606.8787-9-peter.maydell@linaro.org
The qemu-trace-stap documentation is currently in
scripts/qemu-trace-stap.texi in Texinfo format, which we
present to the user as:
* a qemu-trace-stap manpage
* but not (unusually for QEMU) part of the HTML docs
Convert the documentation to rST format that lives in
the docs/ subdirectory, and present it to the user as:
* a qemu-trace-stap manpage
* part of the interop/ Sphinx manual
There are minor formatting changes to suit Sphinx, but no
content changes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200124162606.8787-8-peter.maydell@linaro.org
The qemu-img documentation is currently in qemu-nbd.texi in Texinfo
format, which we present to the user as:
* a qemu-img manpage
* a section of the main qemu-doc HTML documentation
Convert the documentation to rST format, and present it to the user as:
* a qemu-img manpage
* part of the interop/ Sphinx manual
The qemu-img rST document uses the new hxtool extension
to handle pulling rST fragments out of qemu-img-cmds.hx.
The documentation of the various options and commands is rather
muddled, with some options being described inside the relevant
command description and some in a more general section near the start
of the manual. All the command synopses are replicated in the .hx
file and then again in the manual. A lot of text is also duplicated
in the qemu-img.c code for the help text. I have not attempted to
deal with any of this, but have simply transposed the existing
structure into rST.
As usual, there are some minor formatting changes but no
textual changes, except that as with one or two other conversions
I have dropped the 'see also' section since it's not very
informative and looks odd in the HTML.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200124162606.8787-6-peter.maydell@linaro.org
It's been deprecated since QEMU v3.1. The 40p machine should be
used nowadays instead.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20200114114617.28854-1-thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Import our virtiofsd.
This pulls in the daemon to drive a file system connected to the
existing qemu virtiofsd device.
It's derived from upstream libfuse with lots of changes (and a lot
trimmed out).
The daemon lives in the newly created qemu/tools/virtiofsd
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
v2
drop the docs while we discuss where they should live
and we need to redo the manpage in anything but texi
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAl4pzZ4ACgkQBRYzHrxb
/eelUg//evho+RwlOK4TjOjLJyGfMqZDQO5TFR2S2NmiCP7makND4BWll2A5Zu26
oqzZw5pcsKZmpYJ81tqe1bnlCa6SCUx6cNGt+5n2cA0MYSjpPDeB2OegjS57NUoE
eGXXIE7GOrGShHx1fW7BuA3Pi0hmFSHGRHCs006WsVktb1rP1w+7/NBohgLFkYob
fqytP/K9ACEySETPGgDUEh6ZmmalrY1WeD+a11RZstOSA+2YhR3WbyN0z8fc6lCE
puFHNEs2L0zVIUicSyJ4ux9+rbxdZIelLD91mGZhxrWy0H0AIox4bYURUJlbajI7
Yl/FInQRMhStsKn3UN25MSYgGS8ZAM3IcG605vrC4HoQh9r8mVC/H19buWFCycvL
1naK6LTqFkL0igAKTeg+DUk3tNP3i+j8JaMnopvKIfEHwV1lpVEVHI7zUynBA85d
2xfOllkJreFtniYg5nfdfhVixKHLAId0x9ZvYw3wefLDF3ugXLHbrtj0hPcJiAny
TINAzZCbxZsCEdZsrPq4Ldf7Pmb8vI8pxJVsoD28gRcHNRQvPWef07mtW370IAdJ
SJXWLlsFh/rPJx51lVIMQf6d4qLePyHfB81VQ25qlrS5CW87XMmTyr5rngGFlJ2e
vJnMb+DgwG1gf+HV4W5Y3l0wehou5GxgbLT478s+r3YzfdV13d4=
=UUVN
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20200123b' into staging
virtiofsd first pull v2
Import our virtiofsd.
This pulls in the daemon to drive a file system connected to the
existing qemu virtiofsd device.
It's derived from upstream libfuse with lots of changes (and a lot
trimmed out).
The daemon lives in the newly created qemu/tools/virtiofsd
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
v2
drop the docs while we discuss where they should live
and we need to redo the manpage in anything but texi
# gpg: Signature made Thu 23 Jan 2020 16:45:18 GMT
# gpg: using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert-gitlab/tags/pull-virtiofs-20200123b: (108 commits)
virtiofsd: add some options to the help message
virtiofsd: stop all queue threads on exit in virtio_loop()
virtiofsd/passthrough_ll: Pass errno to fuse_reply_err()
virtiofsd: Convert lo_destroy to take the lo->mutex lock itself
virtiofsd: add --thread-pool-size=NUM option
virtiofsd: fix lo_destroy() resource leaks
virtiofsd: prevent FUSE_INIT/FUSE_DESTROY races
virtiofsd: process requests in a thread pool
virtiofsd: use fuse_buf_writev to replace fuse_buf_write for better performance
virtiofsd: add definition of fuse_buf_writev()
virtiofsd: passthrough_ll: Use cache_readdir for directory open
virtiofsd: Fix data corruption with O_APPEND write in writeback mode
virtiofsd: Reset O_DIRECT flag during file open
virtiofsd: convert more fprintf and perror to use fuse log infra
virtiofsd: do not always set FUSE_FLOCK_LOCKS
virtiofsd: introduce inode refcount to prevent use-after-free
virtiofsd: passthrough_ll: fix refcounting on remove/rename
libvhost-user: Fix some memtable remap cases
virtiofsd: rename inode->refcount to inode->nlookup
virtiofsd: prevent races with lo_dirp_put()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add the --print-capabilities option as per vhost-user.rst "Backend
programs conventions". Currently there are no advertised features.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The qemu-nbd documentation is currently in qemu-nbd.texi in Texinfo
format, which we present to the user as:
* a qemu-nbd manpage
* a section of the main qemu-doc HTML documentation
Convert the documentation to rST format, and present it to the user as:
* a qemu-nbd manpage
* part of the interop/ Sphinx manual
This follows the same pattern as commit 27a296fce9 did for the
qemu-ga manpage.
All the content of the old manpage is retained, except that I have
dropped the "This is free software; see the source for copying
conditions. There is NO warranty..." text that was in the old AUTHOR
section; Sphinx's manpage builder doesn't expect that much text in
the AUTHOR section, and since none of our other manpages have it it
seems easiest to delete it rather than try to figure out where else
in the manpage to put it.
The only other textual change is that I have had to give the
--nocache option its own description ("Equivalent to --cache=none")
because Sphinx doesn't have an equivalent of using item/itemx
to share a description between two options.
Some minor aspects of the formatting have changed, to suit what is
easiest for Sphinx to output. (The most notable is that Sphinx
option section option syntax doesn't support '--option foo=bar'
with bar underlined rather than bold, so we have to switch to
'--option foo=BAR' instead.)
The contents of qemu-option-trace.texi are now duplicated in
docs/interop/qemu-option-trace.rst.inc, until such time as we complete
the conversion of the other files which use it; since it has had only
3 changes in 3 years, this shouldn't be too awkward a burden.
(We use .rst.inc because if this file fragment has a .rst extension
then Sphinx complains about not seeing it in a toctree.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200116141511.16849-2-peter.maydell@linaro.org
Bugfixes all over the place.
HMAT support.
New flags for vhost-user-blk utility.
Auto-tuning of seg max for virtio storage.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl4TaMEPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpvzgH/2LyDAzCa9h93ikSJjmyUk5FUaqve38daEb3
S3JYjwKxQx7u1ydooKhvBQnBCZ2i3S+k62gfYyKB+nBv8xvjs0Eg5D1YJ5E8hciy
lf5OFGWWtX2iPDjZwQwT13kiJe0o3JRGxJJ6XqTEG+1EYOp7cky/FEv4PD030b9m
I2wROZ/Am+onB9YJX8c0Vv1CG+AryuJNXnvwQzTXEjj4U7bEYUyJwVZaCRyAdWQ3
uYXIZN9VwjVX6BFvy9ZAJbEsUVJvOM1/aQaDqcrLz+VlzRT7bRkKHi2G3vakrm1I
r5OpgyLo84132awCncbSykKDH5o8WaxLaJBjGmuBfasMz9wPzAg=
=uL1o
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pci, pc: fixes, features
Bugfixes all over the place.
HMAT support.
New flags for vhost-user-blk utility.
Auto-tuning of seg max for virtio storage.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 06 Jan 2020 17:05:05 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (32 commits)
intel_iommu: add present bit check for pasid table entries
intel_iommu: a fix to vtd_find_as_from_bus_num()
virtio-net: delete also control queue when TX/RX deleted
virtio: reset region cache when on queue deletion
virtio-mmio: update queue size on guest write
tests: add virtio-scsi and virtio-blk seg_max_adjust test
virtio: make seg_max virtqueue size dependent
hw: fix using 4.2 compat in 5.0 machine types for i440fx/q35
vhost-user-scsi: reset the device if supported
vhost-user: add VHOST_USER_RESET_DEVICE to reset devices
hw/pci/pci_host: Let pci_data_[read/write] use unsigned 'size' argument
hw/pci/pci_host: Remove redundant PCI_DPRINTF()
virtio-mmio: Clear v2 transport state on soft reset
ACPI: add expected files for HMAT tests (acpihmat)
tests/bios-tables-test: add test cases for ACPI HMAT
tests/numa: Add case for QMP build HMAT
hmat acpi: Build Memory Side Cache Information Structure(s)
hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s)
hmat acpi: Build Memory Proximity Domain Attributes Structure(s)
numa: Extend CLI to provide memory side cache information
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When instantiated, this object will connect to the given D-Bus bus
"addr". During migration, it will take/restore the data from
org.qemu.VMState1 instances. See documentation for details.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Add a VHOST_USER_RESET_DEVICE message which will reset the vhost user
backend. Disabling all rings, and resetting all internal state, ready
for the backend to be reinitialized.
A backend has to report it supports this features with the
VHOST_USER_PROTOCOL_F_RESET_DEVICE protocol feature bit. If it does
so, the new message is used instead of sending a RESET_OWNER which has
had inconsistent implementations.
Signed-off-by: David Vrabel <david.vrabel@nutanix.com>
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <1572385083-5254-2-git-send-email-raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch is to add standard commands defined in docs/interop/vhost-user.rst
For vhost-user-* program
Signed-off-by: Micky Yun Chan (michiboo) <chanmickyyun@gmail.com>
Message-Id: <20191209015331.5455-1-chanmickyyun@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
'the' has a tendency to double up; squash them back down.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191104185202.102504-1-dgilbert@redhat.com>
[lv: removed disas/libvixl/vixl/invalset.h change]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The qemu-ga documentation is currently in qemu-ga.texi in
Texinfo format, which we present to the user as:
* a qemu-ga manpage
* a section of the main qemu-doc HTML documentation
Convert the documentation to rST format, and present it to
the user as:
* a qemu-ga manpage
* part of the interop/ Sphinx manual
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Message-id: 20190905131040.8350-1-peter.maydell@linaro.org
Commit fe0480d6 and friends added BDRV_REQ_NO_FALLBACK as a way to
avoid wasting time on a preliminary write-zero request that will later
be rewritten by actual data, if it is known that the write-zero
request will use a slow fallback; but in doing so, could not optimize
for NBD. The NBD specification is now considering an extension that
will allow passing on those semantics; this patch updates the new
protocol bits and 'qemu-nbd --list' output to recognize the bit, as
well as the new errno value possible when using the new flag; while
upcoming patches will improve the client to use the feature when
present, and the server to advertise support for it.
The NBD spec recommends (but not requires) that ENOTSUP be avoided for
all but failures of a fast zero (the only time it is mandatory to
avoid an ENOTSUP failure is when fast zero is supported but not
requested during write zeroes; the questionable use is for ENOTSUP to
other actions like a normal write request). However, clients that get
an unexpected ENOTSUP will either already be treating it the same as
EINVAL, or may appreciate the extra bit of information. We were
equally loose for returning EOVERFLOW in more situations than
recommended by the spec, so if it turns out to be a problem in
practice, a later patch can tighten handling for both error codes.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190823143726.27062-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: tweak commit message, also handle EOPNOTSUPP]
The NBD specification defines NBD_FLAG_CAN_MULTI_CONN, which can be
advertised when the server promises cache consistency between
simultaneous clients (basically, rules that determine what FUA and
flush from one client are able to guarantee for reads from another
client). When we don't permit simultaneous clients (such as qemu-nbd
without -e), the bit makes no sense; and for writable images, we
probably have a lot more work before we can declare that actions from
one client are cache-consistent with actions from another. But for
read-only images, where flush isn't changing any data, we might as
well advertise multi-conn support. What's more, advertisement of the
bit makes it easier for clients to determine if 'qemu-nbd -e' was in
use, where a second connection will succeed rather than hang until the
first client goes away.
This patch affects qemu as server in advertising the bit. We may want
to consider patches to qemu as client to attempt parallel connections
for higher throughput by spreading the load over those connections
when a server advertises multi-conn, but for now sticking to one
connection per nbd:// BDS is okay.
See also: https://bugzilla.redhat.com/1708300
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190815185024.7010-1-eblake@redhat.com>
[eblake: tweak blockdev-nbd.c to not request shared when writable,
fix iotest 233]
Reviewed-by: John Snow <jsnow@redhat.com>
Move query-target and its return type TargetInfo from misc.json to
machine.json, where they are covered by MAINTAINERS section "Machine
core". Also move its implementation from arch_init.c to
hw/core/machine-qmp-cmds, where it is likewise covered.
All users of SysEmuTarget are now in machine.json. Move it there from
common.json.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190709152053.16670-3-armbru@redhat.com>
The vhost-user specification does not explain when
VHOST_USER_PROTOCOL_F_MQ must be implemented. This may lead
implementors of vhost-user masters to believe that this protocol feature
is required for any device that has multiple virtqueues. That would be
a mistake since existing vhost-user slaves offer multiple virtqueues but
do not advertise VHOST_USER_PROTOCOL_F_MQ.
For example, a vhost-net device with one rx/tx queue pair is not
multiqueue. The slave does not need to advertise
VHOST_USER_PROTOCOL_F_MQ. Therefore the master must assume it has these
virtqueues and cannot rely on askingt the slave how many virtqueues
exist.
Extend the specification to explain the different between true
multiqueue and regular devices with a fixed virtqueue layout.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190624091304.666-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The annotated style json we use in QMP documentation is not strict json
and depending on the version of Sphinx (2.0+) or Pygments installed,
might cause the build to fail.
Use the new QMP lexer.
Further, some versions of Sphinx can not apply custom lexers to "code"
directives and require the use of "code-block" directives instead, so
make that change at this time as well.
Tested under:
- Sphinx 1.3.6 and Pygments 2.4
- Sphinx 1.7.6 and Pygments 2.2 (Fedora 29 packages)
- Sphinx 2.0.1 and Pygments 2.4
- Sphinx 3.0.0+/f396b3a783 and Pygments 2.4 (From Sphinx git c4f44bdd)
Reported-by: Aarushi Mehta <mehta.aaru20@gmail.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-id: 20190603214653.29369-4-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Pygments and Sphinx get pickier all the time; Sphinx 2.1+ now catches
these errors.
Signed-off-by: John Snow <jsnow@redhat.com>
Reported-by: Aarushi Mehta <mehta.aaru20@gmail.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190603214653.29369-2-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
The "Multiple queue support" section makes references to vhost-user-net
"queue pairs". This is confusing for two reasons:
1. This actually applies to all device types, not just vhost-user-net.
2. VHOST_USER_GET_QUEUE_NUM returns the number of virtqueues, not the
number of queue pairs.
Reword the section so that the vhost-user-net specific part is relegated
to the very end: we acknowledge that vhost-user-net historically
automatically enabled the first queue pair.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190626074815.19994-5-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190605131221.29432-1-marcandre.lureau@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a new vhost-user message to give a unix socket to a vhost-user
backend for GPU display updates.
Back when I started that work, I added a new GPU channel because the
vhost-user protocol wasn't bidirectional. Since then, there is a
vhost-user-slave channel for the slave to send requests to the master.
We could extend it with GPU messages. However, the GPU protocol is
quite orthogonal to vhost-user, thus I chose to have a new dedicated
channel.
See vhost-user-gpu.rst for the protocol details.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-2-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190315180735.13096-1-marcandre.lureau@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This just about rewrites the entirety of the bitmaps.rst document to
make it consistent with the 4.0 release. I have added new features seen
in the 4.0 release, as well as tried to clarify some points that keep
coming up when discussing this feature both in-house and upstream.
It does not yet cover pull backups or migration details, but I intend to
keep extending this document to cover those cases.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190426221528.30293-3-jsnow@redhat.com
[Adjusted commit message. --js]
Signed-off-by: John Snow <jsnow@redhat.com>
This patch introduces two new messages VHOST_USER_GET_INFLIGHT_FD
and VHOST_USER_SET_INFLIGHT_FD to support transferring a shared
buffer between qemu and backend.
Firstly, qemu uses VHOST_USER_GET_INFLIGHT_FD to get the
shared buffer from backend. Then qemu should send it back
through VHOST_USER_SET_INFLIGHT_FD each time we start vhost-user.
This shared buffer is used to track inflight I/O by backend.
Qemu should retrieve a new one when vm reset.
Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Signed-off-by: Chai Wen <chaiwen@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Message-Id: <20190228085355.9614-2-xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As discussed during "[PATCH v4 00/29] vhost-user for input & GPU"
review, let's define a common set of backend conventions to help with
management layer implementation, and interoperability.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190308140454.32437-3-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We already use (we didn't notice it) IN_USE flag for marking bitmap
metadata outdated, such as AUTO flag, which mirrors enabled/disabled
bitmaps. Now we are going to support bitmap resize, so it's good to
write IN_USE meaning with more details.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190311185147.52309-2-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
The previous commit added a way to configure firmware with -blockdev
rather than -drive if=pflash. Document it as the preferred way.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190308131445.17502-13-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>