virtio_device_stop_ioeventfd() has not been used since commit
310837de6c ("virtio: introduce
grab/release_ioeventfd to fix vhost") in 2016.
Nowadays ioeventfd is stopped implicitly by the virtio transport when
lifecycle events such as the VM pausing or device unplug occur.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20191021150343.30742-1-stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Message-Id: <20191025083527.30803-9-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch implements basic support for the packed virtqueue. Compare
the split virtqueue which has three rings, packed virtqueue only have
one which is supposed to have better cache utilization and more
hardware friendly.
Please refer virtio specification for more information.
Signed-off-by: Wei Xu <wexu@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20191025083527.30803-6-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Put QOM and main struct definition in a separate header file, so it
can be accessed from other components.
Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
The virtio-fs virtio device provides shared file system access using
the FUSE protocol carried over virtio.
The actual file server is implemented in an external vhost-user-fs device
backend process.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190930105135.27244-3-dgilbert@redhat.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The default backend is only used within virtio_rng_device_realize().
Replace VirtIORNGConf member default_backend by a local variable.
Adjust its type to reduce conversions.
While there, pass &error_abort instead of NULL when failure would be a
programming error.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20190820160615.14616-3-lvivier@redhat.com>
In my "build everything" tree, changing sysemu/sysemu.h triggers a
recompile of some 1800 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h, down from 5400 due to the
previous commit).
Several headers include sysemu/sysemu.h just to get typedef
VMChangeStateEntry. Move it from sysemu/sysemu.h to qemu/typedefs.h.
Spell its structure tag the same while there. Drop the now
superfluous includes of sysemu/sysemu.h from headers.
Touching sysemu/sysemu.h now recompiles some 1100 objects.
qemu/uuid.h also drops from 1800 to 1100, and
qapi/qapi-types-run-state.h from 5000 to 4400.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-29-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
In my "build everything" tree, changing sysemu/sysemu.h triggers a
recompile of some 5400 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).
Almost a third of its inclusions are actually superfluous. Delete
them. Downgrade two more to qapi/qapi-types-run-state.h, and move one
from char/serial.h to char/serial.c.
hw/semihosting/config.c, monitor/monitor.c, qdev-monitor.c, and
stubs/semihost.c define variables declared in sysemu/sysemu.h without
including it. The compiler is cool with that, but include it anyway.
This doesn't reduce actual use much, as it's still included into
widely included headers. The next commit will tackle that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-27-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Move the HostMemoryBackend typedef from sysemu/hostmem.h to
qemu/typedefs.h. This renders a few inclusions of sysemu/hostmem.h
superfluous; drop them.
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-25-armbru@redhat.com>
In my "build everything" tree, changing hw/qdev-properties.h triggers
a recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).
Many places including hw/qdev-properties.h (directly or via hw/qdev.h)
actually need only hw/qdev-core.h. Include hw/qdev-core.h there
instead.
hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h
and hw/qdev-properties.h, which in turn includes hw/qdev-core.h.
Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h.
While there, delete a few superfluous inclusions of hw/qdev-core.h.
Touching hw/qdev-properties.h now recompiles some 1200 objects.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190812052359.30071-22-armbru@redhat.com>
In my "build everything" tree, changing hw/hw.h triggers a recompile
of some 2600 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).
The previous commits have left only the declaration of hw_error() in
hw/hw.h. This permits dropping most of its inclusions. Touching it
now recompiles less than 200 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Drop unnecessary inclusions from headers. Downgrade a few more to
exec/hwaddr.h.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-17-armbru@redhat.com>
In my "build everything" tree, changing migration/vmstate.h triggers a
recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).
hw/hw.h supposedly includes it for convenience. Several other headers
include it just to get VMStateDescription. The previous commit made
that unnecessary.
Include migration/vmstate.h only where it's still needed. Touching it
now recompiles only some 1600 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-16-armbru@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Some of the generated qapi-types-MODULE.h are included all over the
place. Changing a QAPI type can trigger massive recompiling. Top
scorers recompile more than 1000 out of some 6600 objects (not
counting tests and objects that don't depend on qemu/osdep.h):
6300 qapi/qapi-builtin-types.h
5700 qapi/qapi-types-run-state.h
3900 qapi/qapi-types-common.h
3300 qapi/qapi-types-sockets.h
3000 qapi/qapi-types-misc.h
3000 qapi/qapi-types-crypto.h
3000 qapi/qapi-types-job.h
3000 qapi/qapi-types-block-core.h
2800 qapi/qapi-types-block.h
1300 qapi/qapi-types-net.h
Clean up headers to include generated QAPI headers only where needed.
Impact is negligible except for hw/qdev-properties.h.
This header includes qapi/qapi-types-block.h and
qapi/qapi-types-misc.h. They are used only in expansions of property
definition macros such as DEFINE_PROP_BLOCKDEV_ON_ERROR() and
DEFINE_PROP_OFF_AUTO(). Moving their inclusion from
hw/qdev-properties.h to the users of these macros avoids pointless
recompiles. This is how other property definition macros, such as
DEFINE_PROP_NETDEV(), already work.
Improves things for some of the top scorers:
3600 qapi/qapi-types-common.h
2800 qapi/qapi-types-sockets.h
900 qapi/qapi-types-misc.h
2200 qapi/qapi-types-crypto.h
2100 qapi/qapi-types-job.h
2100 qapi/qapi-types-block-core.h
270 qapi/qapi-types-block.h
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-3-armbru@redhat.com>
Back in 2016, we discussed[1] rules for headers, and these were
generally liked:
1. Have a carefully curated header that's included everywhere first. We
got that already thanks to Peter: osdep.h.
2. Headers should normally include everything they need beyond osdep.h.
If exceptions are needed for some reason, they must be documented in
the header. If all that's needed from a header is typedefs, put
those into qemu/typedefs.h instead of including the header.
3. Cyclic inclusion is forbidden.
This patch gets include/ closer to obeying 2.
It's actually extracted from my "[RFC] Baby steps towards saner
headers" series[2], which demonstrates a possible path towards
checking 2 automatically. It passes the RFC test there.
[1] Message-ID: <87h9g8j57d.fsf@blackfin.pond.sub.org>
https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg03345.html
[2] Message-Id: <20190711122827.18970-1-armbru@redhat.com>
https://lists.nongnu.org/archive/html/qemu-devel/2019-07/msg02715.html
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-2-armbru@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
We still have multiple issues in the current code
- The PBP is not freed during unrealize()
- The PBP is not reset on device resets: After a reset, the PBP is stale.
- We are not indicating VIRTIO_BALLOON_F_MUST_TELL_HOST, therefore
guests (esp. legacy guests) will reuse pages without deflating,
turning the PBP stale. Adding that would require compat handling.
Instead, let's use the PBP only temporarily, when processing one bulk of
inflation requests. This will keep guest_page_size > 4k working (with
Linux guests). There is nothing to do for deflation requests anymore.
The pbp is only used for a limited amount of time.
Fixes: ed48c59875 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
Cc: qemu-stable@nongnu.org #v4.0.0
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-7-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
The argument is not used and passing it clutters error propagation in the
callers. So, get rid of it.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The virtio-balloon config size changed in QEMU 4.0 even for existing
machine types. Migration from QEMU 3.1 to 4.0 can fail in some
circumstances with the following error:
qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10 read: a1 device: 1 cmask: ff wmask: c0 w1cmask:0
This happens because the virtio-balloon config size affects the VIRTIO
Legacy I/O Memory PCI BAR size.
Introduce a qdev property called "qemu-4-0-config-size" and enable it
only for the QEMU 4.0 machine types. This way <4.0 machine types use
the old size, 4.0 uses the larger size, and >4.0 machine types use the
appropriate size depending on enabled virtio-balloon features.
Live migration to and from old QEMUs to QEMU 4.1 works again as long as
a versioned machine type is specified (do not use just "pc"!).
Originally-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190710141440.27635-1-stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Besides virtio 1.0 transitional devices, we should also
set "start_on_kick" flag for legacy devices (virtio 0.9).
Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20190626023130.31315-3-xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In order to avoid migration issues, we introduce a "use-started"
property to the base virtio device to indicate whether use
"started" flag or not. This property will be true by default and
set to false when machine type <= 4.0.
Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Message-Id: <20190626023130.31315-2-xieyongji@baidu.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is the implementation of virtio-pmem device. Support will require
machine changes for the architectures that will support it, so it will
not yet be compiled. It can be unlocked with VIRTIO_PMEM_SUPPORTED per
machine and disabled globally via VIRTIO_PMEM.
We cannot use the "addr" property as that is already used e.g. for
virtio-pci/pci devices. And we will have e.g. virtio-pmem-pci as a proxy.
So we have to choose a different one (unfortunately). "memaddr" it is.
That name should ideally be used by all other virtio-* based memory
devices in the future.
-device virtio-pmem-pci,id=p0,bus=bux0,addr=0x01,memaddr=0x1000000...
Acked-by: Markus Armbruster <armbru@redhat.com>
[ QAPI bits ]
Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
[ MemoryDevice/MemoryRegion changes, cleanups, addr property "memaddr",
split up patches, unplug handler ]
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190619094907.10131-2-pagupta@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
No header includes qemu-common.h after this commit, as prescribed by
qemu-common.h's file comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-5-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
net/tap-bsd.c fixed up]
stricter rules for acpi tables: we now fail
on any difference that isn't whitelisted.
vhost-scsi migration.
some cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJc+B4YAAoJECgfDbjSjVRpq1EIAJR7tCxcpu9GggVlinmUA8G4
tmSAe06IryH7+nF3RsnINuGu7ius9qC2/E2y0uJUHhTqiU/RWOfWZ7PPM0EcYZaA
TLPaCe2NUF6/8afeqmvE9Usk7VspI5TDZRms+bonmZz2xP1lHIMN0qW4s7HHLWr8
sZKDtCJ+9cYII93VQwtlR0qiHgv5f0kzcuZeJaZHsAHH6XZGqRuQjI6txcFa4o53
lkdLCEwTnRuwu2wyL84eL5p+E8SzOgR/x1QI+nffrJfsvnmiT7lnOrkjnQlWAp5G
xqwqsUrUxUCuQ+zitwJqmv+H6nx79MwAM7fTHAETCWX703N5o9tZxAnHHqLoa8I=
=cQNg
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pci, pc: cleanups, features
stricter rules for acpi tables: we now fail
on any difference that isn't whitelisted.
vhost-scsi migration.
some cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Wed 05 Jun 2019 20:55:04 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
bios-tables-test: ignore identical binaries
tests: acpi: add simple arm/virt testcase
tests: add expected ACPI tables for arm/virt board
bios-tables-test: list all tables that differ
vhost-scsi: Allow user to enable migration
vhost-scsi: Add VMState descriptor
vhost-scsi: The vhost backend should be stopped when the VM is not running
bios-tables-test: add diff allowed list
vhost: fix memory leak in vhost_user_scsi_realize
vhost: fix incorrect print type
vhost: remove the dead code
docs: smbios: remove family=x from type2 entry description
pci: Fold pci_get_bus_devfn() into its sole caller
pci: Make is_bridge a bool
pcie: Simplify pci_adjust_config_limit()
acpi: pci: use build_append_foo() API to construct MCFG
hw/acpi: Consolidate build_mcfg to pci.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In order to perform a valid migration of a vhost-scsi device,
the following requirements must be met:
(1) The virtio-scsi device state needs to be saved & loaded.
(2) The vhost backend must be stopped before virtio-scsi device state
is saved:
(2.1) Sync vhost backend state to virtio-scsi device state.
(2.2) No further I/O requests are made by vhost backend to target
SCSI device.
(2.3) No further guest memory access takes place after VM is stopped.
(3) Requests in-flight to target SCSI device are completed before
migration handover.
(4) Target SCSI device state needs to be saved & loaded into the
destination host target SCSI device.
Previous commit ("vhost-scsi: Add VMState descriptor")
add support to save & load the device state using VMState.
This meets requirement (1).
When VM is stopped by migration thread (On Pre-Copy complete), the
following code path is executed:
migration_completion() -> vm_stop_force_state() -> vm_stop() ->
do_vm_stop().
do_vm_stop() calls first pause_all_vcpus() which pause all guest
vCPUs and then call vm_state_notify().
In case of vhost-scsi device, this will lead to the following code path
to be executed:
vm_state_notify() -> virtio_vmstate_change() ->
virtio_set_status() -> vhost_scsi_set_status() -> vhost_scsi_stop().
vhost_scsi_stop() then calls vhost_scsi_clear_endpoint() and
vhost_scsi_common_stop().
vhost_scsi_clear_endpoint() sends VHOST_SCSI_CLEAR_ENDPOINT ioctl to
vhost backend which will reach kernel's vhost_scsi_clear_endpoint()
which process all pending I/O requests and wait for them to complete
(vhost_scsi_flush()). This meets requirement (3).
vhost_scsi_common_stop() will stop the vhost backend.
As part of this stop, dirty-bitmap is synced and vhost backend state is
synced with virtio-scsi device state. As at this point guest vCPUs are
already paused, this meets requirement (2).
At this point we are left with requirement (4) which is target SCSI
device specific and therefore cannot be done by QEMU. Which is the main
reason why vhost-scsi adds a migration blocker.
However, as this can be handled either by an external orchestrator or
by using shared-storage (i.e. iSCSI), there is no reason to limit the
orchestrator from being able to explictly specify it wish to enable
migration even when VM have a vhost-scsi device.
Considering all the above, this commit allows orchestrator to explictly
specify that it is responsbile for taking care of requirement (4) and
therefore vhost-scsi should not add a migration blocker.
Reviewed-by: Nir Weiner <nir.weiner@oracle.com>
Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190416125912.44001-4-liran.alon@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Add new virtio-gpu devices with a "vhost-user" property. The
associated vhost-user backend is used to handle the virtio rings and
provide rendering results thanks to the vhost-user-gpu protocol.
Example usage:
-object vhost-user-backend,id=vug,cmd="./vhost-user-gpu"
-device vhost-user-vga,vhost-user=vug
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-10-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add a base class that is common to virtio-gpu and vhost-user-gpu
devices.
The VirtIOGPUBase base class provides common functionalities necessary
for both virtio-gpu and vhost-user-gpu:
- common configuration (max-outputs, initial resolution, flags)
- virtio device initialization, including queue setup
- device pre-conditions checks (iommu)
- migration blocker
- virtio device callbacks
- hooking up to qemu display subsystem
- a few common helper functions to reset the device, retrieve display
informations
- a class callback to unblock the rendering (for GL updates)
What is left to the virtio-gpu subdevice to take care of, in short,
are all the virtio queues handling, command processing and migration.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-8-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This will allow to share the format conversion function with
vhost-user-gpu.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-4-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The helper functions are useful to build the vhost-user-gpu backend.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-3-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add a new vhost-user message to give a unix socket to a vhost-user
backend for GPU display updates.
Back when I started that work, I added a new GPU channel because the
vhost-user protocol wasn't bidirectional. Since then, there is a
vhost-user-slave channel for the slave to send requests to the master.
We could extend it with GPU messages. However, the GPU protocol is
quite orthogonal to vhost-user, thus I chose to have a new dedicated
channel.
See vhost-user-gpu.rst for the protocol details.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-2-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Since we now support the message VHOST_USER_GET_INFLIGHT_FD
and VHOST_USER_SET_INFLIGHT_FD. The backend is able to restart
safely because it can track inflight I/O in shared memory.
This patch allows qemu to reconnect the backend after
connection closed.
Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Signed-off-by: Ni Xun <nixun@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Message-Id: <20190320112646.3712-7-xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtio 1.0 transitional devices support driver uses the device
before setting the DRIVER_OK status bit. So we introduce a started
flag to indicate whether driver has started the device or not.
Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Message-Id: <20190320112646.3712-2-xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We used to set backend unconditionally, this won't work for some
guests (e.g windows driver) who may not initialize all virtqueues. For
kernel backend, this will fail since it may try to validate the rings
during setting backend.
Fixing this by simply skipping the backend set when we find desc is
not ready.
Reviewed-by: Michael S. Tsirkin<mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Leading underscores are ill-advised because such identifiers are
reserved. Trailing underscores are merely ugly. Strip both.
Our header guards commonly end in _H. Normalize the exceptions.
Done with scripts/clean-header-guards.pl.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190315145123.28030-7-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[Changes to slirp/ dropped, as we're about to spin it off]
Add a new virtio-input device, which connects to a vhost-user
backend.
Instead of reading configuration directly from an input device /
evdev (like virtio-input-host), it reads it over vhost-user protocol
with {SET,GET}_CONFIG messages. The vhost-user-backend handles the
queues & events setup.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190503130034.24916-5-marcandre.lureau@redhat.com
[ kraxel: drop -{non-,}transitional variants ]
[ kraxel: fix "make check" on !linux ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
If renderer_blocked is set do not call virtio_gpu_virgl_reset().
Instead set a flag indicating that virglrenderer needs a reset.
When renderer_blocked gets cleared do the actual reset call.
Without this we can trigger an assert in spice due to calling
spice_qxl_gl_scanout() while another operation is still running:
spice_qxl_gl_scanout: condition `qxl_state->gl_draw_cookie == GL_DRAW_COOKIE_INVALID' failed
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190314115358.26678-2-kraxel@redhat.com
This patch adds support for vhost-user-blk device to get/set
inflight buffer from/to backend.
Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Message-Id: <20190228085355.9614-6-xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces two new messages VHOST_USER_GET_INFLIGHT_FD
and VHOST_USER_SET_INFLIGHT_FD to support transferring a shared
buffer between qemu and backend.
Firstly, qemu uses VHOST_USER_GET_INFLIGHT_FD to get the
shared buffer from backend. Then qemu should send it back
through VHOST_USER_SET_INFLIGHT_FD each time we start vhost-user.
This shared buffer is used to track inflight I/O by backend.
Qemu should retrieve a new one when vm reset.
Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Signed-off-by: Chai Wen <chaiwen@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Message-Id: <20190228085355.9614-2-xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Take a VhostUserState* that can be pre-allocated, and initialize it
with the associated chardev.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Message-Id: <20190308140454.32437-4-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20190307080244.9011-4-kraxel@redhat.com
The new feature enables the virtio-balloon device to receive hints of
guest free pages from the free page vq.
A notifier is registered to the migration precopy notifier chain. The
notifier calls free_page_start after the migration thread syncs the dirty
bitmap, so that the free page optimization starts to clear bits of free
pages from the bitmap. It calls the free_page_stop before the migration
thread syncs the bitmap, which is the end of the current round of ram
save. The free_page_stop is also called to stop the optimization in the
case when there is an error occurred in the process of ram saving.
Note: balloon will report pages which were free at the time of this call.
As the reporting happens asynchronously, dirty bit logging must be
enabled before this free_page_start call is made. Guest reporting must be
disabled before the migration dirty bitmap is synchronized.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Message-Id: <1544516693-5395-8-git-send-email-wei.w.wang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
dgilbert: Dropped kernel header update, fixed up CMD_ID_* name change
Switch virtio's self announcement to use the AnnounceTimer.
It keeps it's own AnnounceTimer (per device), and starts running it
using a migration post-load and a virtual clock; that way the
announce happens once the guest is actually running.
The timer uses the migration parameters to set the timing of
the repeats.
Based on earlier patches by myself and
Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Lots of work on tests: BiosTablesTest UEFI app,
vhost-user testing for non-Linux hosts.
Misc cleanups and fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJccBqMAAoJECgfDbjSjVRpvSEIAKYPRNdCBX/SSS/L/tmJS5Zt
8IyU/HW1YJ249vO+aT6z4Q3QPgqNC3KjXC3brx/WRoPZnRroen4rv2Kqnk6SayPa
a52d2ubXKWxb3swdG1CAVzFRhq/ABpgAPx0dr1JW+RXgo2lxpJ4GNYxKMosQTaPE
hRNeXl1XlcIK525kJhFH3Hlij9mTRuY6T7ydpPQd8dUq2dBRaL9RrzZRrkZxCy6l
gQPUqNzPhG0XXyOiJmwYyVX0zGzbYrMLrMQAor2SBIYmU+zv2eZGPJUYxoMTUMzt
YR0WCpvkvPITlAryaBoozAIDYVz8PxBRT1KRwpDal+2rzlm6o+veKDiF8R46gn0=
=GzUz
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc, virtio: fixes, cleanups, tests
Lots of work on tests: BiosTablesTest UEFI app,
vhost-user testing for non-Linux hosts.
Misc cleanups and fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 22 Feb 2019 15:51:40 GMT
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (26 commits)
pci: Sanity test minimum downstream LNKSTA
hw/smbios: fix offset of type 3 sku field
pci: Move NVIDIA vendor id to the rest of ids
virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size
virtio-balloon: Use ram_block_discard_range() instead of raw madvise()
virtio-balloon: Rework ballon_page() interface
virtio-balloon: Corrections to address verification
virtio-balloon: Remove unnecessary MADV_WILLNEED on deflate
i386/kvm: ignore masked irqs when update msi routes
contrib/vhost-user-blk: fix the compilation issue
Revert "contrib/vhost-user-blk: fix the compilation issue"
pc-dimm: use same mechanism for [get|set]_addr
tests/data: introduce "uefi-boot-images" with the "bios-tables-test" ISOs
tests/uefi-test-tools: add build scripts
tests: introduce "uefi-test-tools" with the BiosTablesTest UEFI app
roms: build the EfiRom utility from the roms/edk2 submodule
roms: add the edk2 project as a git submodule
vhost-user-test: create a temporary directory per TestServer
vhost-user-test: small changes to init_hugepagefs
vhost-user-test: create a main loop per TestServer
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJccAIFAAoJEJykq7OBq3PImM0IAMcC92GYwSL6MgQ4NrfbVfDQ
P+qgIoxlXcYNwa12kGY6rE4tgSDab9Mc5ACpmuFdr8Xj7/JOp97AccmKZ+RzYQSj
IFlkvd8GngQR1YnFGV6PIWSt7hRhKuUZMqSIDrWro/MTdiJFEMI8/e7QLGxmaEX1
gWCNSopxJUeACSJiRyfZvBGNCs23R9ptFKBFhIXS98KPtEtF8LQV0JnQXoRUDiBL
G9C/xggdGDvct3Id4yOBCh43ErssyOrlYwjzIRWB2AFfIwHGytJcL6JBjAqy4Z7M
ClMNXaQSbfSCXrc3osF9nO6KaFduhfpUH44lL5JBHxUgH6pN2xGradRANZwysQM=
=DSBn
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Pull request
# gpg: Signature made Fri 22 Feb 2019 14:07:01 GMT
# gpg: using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request: (27 commits)
tests/virtio-blk: add test for DISCARD command
tests/virtio-blk: add test for WRITE_ZEROES command
tests/virtio-blk: add virtio_blk_fix_dwz_hdr() function
tests/virtio-blk: change assert on data_size in virtio_blk_request()
virtio-blk: add DISCARD and WRITE_ZEROES features
virtio-blk: set config size depending on the features enabled
virtio-net: make VirtIOFeature usable for other virtio devices
virtio-blk: add "discard" and "write-zeroes" properties
virtio-blk: add host_features field in VirtIOBlock
virtio-blk: add acct_failed param to virtio_blk_handle_rw_error()
hw/ide: drop iov field from IDEDMA
hw/ide: drop iov field from IDEBufferedRequest
hw/ide: drop iov field from IDEState
tests/test-bdrv-drain: use QEMU_IOVEC_INIT_BUF
migration/block: use qemu_iovec_init_buf
qemu-img: use qemu_iovec_init_buf
block/vmdk: use qemu_iovec_init_buf
block/qed: use qemu_iovec_init_buf
block/qcow2: use qemu_iovec_init_buf
block/qcow: use qemu_iovec_init_buf
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The virtio-balloon always works in units of 4kiB (BALLOON_PAGE_SIZE), but
we can only actually discard memory in units of the host page size.
Now, we handle this very badly: we silently ignore balloon requests that
aren't host page aligned, and for requests that are host page aligned we
discard the entire host page. The latter can corrupt guest memory if its
page size is smaller than the host's.
The obvious choice would be to disable the balloon if the host page size is
not 4kiB. However, that would break the special case where host and guest
have the same page size, but that's larger than 4kiB. That case currently
works by accident[1] - and is used in practice on many production POWER
systems where 64kiB has long been the Linux default page size on both host
and guest.
To make the balloon safe, without breaking that useful special case, we
need to accumulate 4kiB balloon requests until we have a whole contiguous
host page to discard.
We could in principle do that across all guest memory, but it would require
a large bitmap to track. This patch represents a compromise: we track
ballooned subpages for a single contiguous host page at a time. This means
that if the guest discards all 4kiB chunks of a host page in succession,
we will discard it. This is the expected behaviour in the (host page) ==
(guest page) != 4kiB case we want to support.
If the guest scatters 4kiB requests across different host pages, we don't
discard anything, and issue a warning. Not ideal, but at least we don't
corrupt guest memory as the previous version could.
Warning reporting is kind of a compromise here. Determining whether we're
in a problematic state at realize() time is tricky, because we'd have to
look at the host pagesizes of all memory backends, but we can't really know
if some of those backends could be for special purpose memory that's not
subject to ballooning.
Reporting only when the guest tries to balloon a partial page also isn't
great because if the guest page size happens to line up it won't indicate
that we're in a non ideal situation. It could also cause alarming repeated
warnings whenever a migration is attempted.
So, what we do is warn the first time the guest attempts balloon a partial
host page, whether or not it will end up ballooning the rest of the page
immediately afterwards.
[1] Because when the guest attempts to balloon a page, it will submit
requests for each 4kiB subpage. Most will be ignored, but the one
which happens to be host page aligned will discard the whole lot.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190214043916.22128-6-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds the support of DISCARD and WRITE_ZEROES commands,
that have been introduced in the virtio-blk protocol to have
better performance when using SSD backend.
We support only one segment per request since multiple segments
are not widely used and there are no userspace APIs that allow
applications to submit multiple segments in a single call.
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-7-sgarzare@redhat.com
Message-Id: <20190221103314.58500-7-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Starting from DISABLE and WRITE_ZEROES features, we use an array of
VirtIOFeature (as virtio-net) to properly set the config size
depending on the features enabled.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-6-sgarzare@redhat.com
Message-Id: <20190221103314.58500-6-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In order to use VirtIOFeature also in other virtio devices, we move
its declaration and the endof() macro (renamed in virtio_endof())
in virtio.h.
We add virtio_feature_get_config_size() function to iterate the array
of VirtIOFeature and to return the config size depending on the
features enabled. (as virtio_net_set_config_size() did)
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-5-sgarzare@redhat.com
Message-Id: <20190221103314.58500-5-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Since configurable features for virtio-blk are growing, this patch
adds host_features field in the struct VirtIOBlock. (as in virtio-net)
In this way, we can avoid to add new fields for new properties and
we can directly set VIRTIO_BLK_F* flags in the host_features.
We update "config-wce" and "scsi" property definition to use the new
host_features field without change the behaviour.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-3-sgarzare@redhat.com
Message-Id: <20190221103314.58500-3-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch adds EDID support to the family of virtio-gpu devices. It is
turned off by default, use the new edid property to enable it.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190221081054.13853-1-kraxel@redhat.com
Commit 2974e916df introduced the VirtioNetRscChain structure which
refer to a VirtIONet, declared later, thus required VirtIONet typedef
to use a forward declaration.
However, when compiling with Clang in -std=gnu99 mode, this triggers
the following warning/error:
CC hw/net/virtio-net.o
In file included from qemu/hw/net/virtio-net.c:22:
include/hw/virtio/virtio-net.h:189:3: error: redefinition of typedef 'VirtIONet' is a C11 feature [-Werror,-Wtypedef-redefinition]
} VirtIONet;
^
include/hw/virtio/virtio-net.h:110:26: note: previous definition is here
typedef struct VirtIONet VirtIONet;
^
1 error generated.
make: *** [rules.mak:69: hw/net/virtio-net.o] Error 1
Fix it by removing the duplicate typedef definition.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This commit adds implementation of RX packets
coalescing, compatible with requirements of Windows
Hardware compatibility kit.
The device enables feature VIRTIO_NET_F_RSC_EXT in
host features if it supports extended RSC functionality
as defined in the specification.
This feature requires at least one of VIRTIO_NET_F_GUEST_TSO4,
VIRTIO_NET_F_GUEST_TSO6. Windows guest driver acks
this feature only if VIRTIO_NET_F_CTRL_GUEST_OFFLOADS
is also present.
If the guest driver acks VIRTIO_NET_F_RSC_EXT feature,
the device coalesces TCPv4 and TCPv6 packets (if
respective VIRTIO_NET_F_GUEST_TSO feature is on,
populates extended RSC information in virtio header
and sets VIRTIO_NET_HDR_F_RSC_INFO bit in header flags.
The device does not recalculate checksums in the coalesced
packet, so they are not valid.
In this case:
All the data packets in a tcp connection are cached
to a single buffer in every receive interval, and will
be sent out via a timer, the 'virtio_net_rsc_timeout'
controls the interval, this value may impact the
performance and response time of tcp connection,
50000(50us) is an experience value to gain a performance
improvement, since the whql test sends packets every 100us,
so '300000(300us)' passes the test case, it is the default
value as well, tune it via the command line parameter
'rsc_interval' within 'virtio-net-pci' device, for example,
to launch a guest with interval set as '500000':
'virtio-net-pci,netdev=hostnet1,bus=pci.0,id=net1,mac=00,
guest_rsc_ext=on,rsc_interval=500000'
The timer will only be triggered if the packets pool is not empty,
and it'll drain off all the cached packets.
'NetRscChain' is used to save the segments of IPv4/6 in a
VirtIONet device.
A new segment becomes a 'Candidate' as well as it passed sanity check,
the main handler of TCP includes TCP window update, duplicated
ACK check and the real data coalescing.
An 'Candidate' segment means:
1. Segment is within current window and the sequence is the expected one.
2. 'ACK' of the segment is in the valid window.
Sanity check includes:
1. Incorrect version in IP header
2. An IP options or IP fragment
3. Not a TCP packet
4. Sanity size check to prevent buffer overflow attack.
5. An ECN packet
Even though, there might more cases should be considered such as
ip identification other flags, while it breaks the test because
windows set it to the same even it's not a fragment.
Normally it includes 2 typical ways to handle a TCP control flag,
'bypass' and 'finalize', 'bypass' means should be sent out directly,
while 'finalize' means the packets should also be bypassed, but this
should be done after search for the same connection packets in the
pool and drain all of them out, this is to avoid out of order fragment.
All the 'SYN' packets will be bypassed since this always begin a new'
connection, other flags such 'URG/FIN/RST/CWR/ECE' will trigger a
finalization, because this normally happens upon a connection is going
to be closed, an 'URG' packet also finalize current coalescing unit.
Statistics can be used to monitor the basic coalescing status, the
'out of order' and 'out of window' means how many retransmitting packets,
thus describe the performance intuitively.
Difference between ip v4 and v6 processing:
Fragment length in ipv4 header includes itself, while it's not
included for ipv6, thus means ipv6 can carry a real 65535 payload.
Note that main goal of implementing this feature in software
is to create reference setup for certification tests. In such
setups guest migration is not required, so the coalesced packets
not yet delivered to the guest will be lost in case of migration.
Signed-off-by: Wei Xu <wexu@redhat.com>
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
No functional change, just preparation for a followup patch
which needs a VirtIOGPU pointer.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20180829122101.29852-2-kraxel@redhat.com>
This reverts commit 93f874fe9d.
Now with virtio-vga being resetted properly the
crash workaround is not needed any more.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180821111313.27792-3-kraxel@redhat.com
We must call the reset functions for both virtio-gpu
and vga to properly reset the combo device.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180821111313.27792-2-kraxel@redhat.com
In preparation for having vhost-scsi also make use of host_features,
move it from struct VHostUserSCSI into struct VHostSCSICommon.
Signed-off-by: Greg Edwards <gedwards@ddn.com>
Message-Id: <20180808195235.5843-2-gedwards@ddn.com>
Reviewed-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With vga=775 on the Linux command line a first boot of the VM running
Linux works fine. After a warm reboot it crashes during Linux boot.
Before that, valgrind points out bad memory write to console
surface. The VGA code is not aware that virtio-gpu got a message
surface scanout when the display is disabled. Let's reset VGA graphic
mode when it is the case, so that a new display surface is created
when doing further VGA operations.
https://bugs.launchpad.net/qemu/+bug/1784900/
Reported-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Message-id: 20180803153235.4134-1-marcandre.lureau@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It eases code review, unit is explicit.
Patch generated using:
$ git grep -E '(1024|2048|4096|8192|(<<|>>).?(10|20|30))' hw/ include/hw/
and modified manually.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180625124238.25339-39-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In commit a8bff79e9f we added a definition to hw/virtio/virtio-gpu.h
for VIRTIO_GPU_CAPSET_VIRGL2, as a workaround for it not yet being
in the Linux kernel headers. In commit 77d361b13c we updated our
kernel headers to a version which does define the macro, so we can
now remove our workaround.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180622173249.29963-1-peter.maydell@linaro.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
vDPA support, fix to vhost blk RO bit handling, some include path
cleanups, NFIT ACPI table.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbEXNvAAoJECgfDbjSjVRpc8gH/R8xrcFrV+k9wwbgYcOcGb6Y
LWjseE31pqJcxRV80vLOdzYEuLStZQKQQY7xBDMlA5vdyvZxIA6FLO2IsiJSbFAk
EK8pclwhpwQAahr8BfzenabohBv2UO7zu5+dqSvuJCiMWF3jGtPAIMxInfjXaOZY
odc1zY2D2EgsC7wZZ1hfraRbISBOiRaez9BoGDKPOyBY9G1ASEgxJgleFgoBLfsK
a1XU+fDM6hAVdxftfkTm0nibyf7PWPDyzqghLqjR9WXLvZP3Cqud4p8N29mY51pR
KSTjA4FYk6Z9EVMltyBHfdJs6RQzglKjxcNGdlrvacDfyFi79fGdiosVllrjfJM=
=3+V0
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
acpi, vhost, misc: fixes, features
vDPA support, fix to vhost blk RO bit handling, some include path
cleanups, NFIT ACPI table.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 01 Jun 2018 17:25:19 BST
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (31 commits)
vhost-blk: turn on pre-defined RO feature bit
ACPI testing: test NFIT platform capabilities
nvdimm, acpi: support NFIT platform capabilities
tests/.gitignore: add entry for generated file
arch_init: sort architectures
ui: use local path for local headers
qga: use local path for local headers
colo: use local path for local headers
migration: use local path for local headers
usb: use local path for local headers
sd: fix up include
vhost-scsi: drop an unused include
ppc: use local path for local headers
rocker: drop an unused include
e1000e: use local path for local headers
ioapic: fix up includes
ide: use local path for local headers
display: use local path for local headers
trace: use local path for local headers
migration: drop an unused include
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Read only feature shouldn't be negotiable, because if the
backend device reported Read only feature supported, QEMU
host driver shouldn't change backend's RO attribute. While
here, also enable the vhost-user-blk test utility to test
RO feature.
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 5643cc94ac ("virtio-gpu-3d: add support for second capability
set (v4)") updated virtio_gpu.h with a define that does not yet(?)
exist upstream resulting in build breakage every time Linux headers
are updated via the standard update script. Conditionally define this
within QEMU code instead to avoid future breakage.
Cc: Dave Airlie <airlied@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Fixes: 5643cc94ac ("virtio-gpu-3d: add support for second capability set (v4)")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180525132755.21839-2-peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch introduces VHOST_USER_PROTOCOL_F_HOST_NOTIFIER.
With this feature negotiated, vhost-user backend can register
memory region based host notifiers. And it will allow the guest
driver in the VM to notify the hardware accelerator at the
vhost-user backend directly.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When multi queue is enabled e.g. for a virtio-net device,
each queue pair will have a vhost_dev, and the only thing
shared between vhost devs currently is the chardev. This
patch introduces a vhost-user state structure which will
be shared by all vhost devs of the same virtio device.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces a vhost op for vhost backends to allow
them to filter the memory sections that they can handle.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces the support for setting memory region
based host notifiers for virtio device. This is helpful when
using a hardware accelerator for a virtio device, because
hardware heavily depends on the notification, this will allow
the guest driver in the VM to notify the hardware directly.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There is no need to include pci.h in these files.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Although linkspeed and duplex can be set in a linux guest via 'ethtool -s',
this requires custom ethtool commands for virtio-net by default.
Introduce a new feature flag, VIRTIO_NET_F_SPEED_DUPLEX, which allows
the hypervisor to export a linkspeed and duplex setting. The user can
subsequently overwrite it later if desired via: 'ethtool -s'.
Linkspeed and duplex settings can be set as:
'-device virtio-net,speed=10000,duplex=full'
where speed is [0...INT_MAX], and duplex is ["half"|"full"].
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtio-dev@lists.oasis-open.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In prepartion for using some of the high order feature bits, make sure that
virtio-net uses 64-bit values everywhere.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtio-dev@lists.oasis-open.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce two vhost-user meassges: VHOST_USER_CREATE_CRYPTO_SESSION
and VHOST_USER_CLOSE_CRYPTO_SESSION. At this point, the QEMU side
support crypto operation in cryptodev host-user backend.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Impliment the vhost-crypto's funtions, such as startup,
stop and notification etc. Introduce an enum
QCryptoCryptoDevBackendOptionsType in order to
identify the cryptodev vhost backend is vhost-user
or vhost-kernel-module (If exist).
At this point, the cryptdoev-vhost-user works.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Due to a kernel bug we can never increase the size of capability
set 1, so introduce a new capability set in parallel, old userspace
will continue to use the old set, new userspace will start using
the new one when it detects a fixed kernel.
v2: don't use a define from virglrenderer, just probe it.
v3: fix compilation when virglrenderer disabled
v4: fix style warning, just use ?: op instead.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Message-id: 20180223023814.24459-1-airlied@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Remove the old update mechanism, vhost_set_memory, and the functions
and flags it used.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor spotted that there's a race, where a region that's unref'd
in a _del callback might be free'd before the set_mem_table call in
the _commit callback, and thus the vhost might end up using free memory.
Fix this by building a complete temporary sections list, ref'ing every
section (during add and nop) and then unref'ing the whole list right
at the end of commit.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtio_bus_set_host_notifier function no longer calls
event_notifier_cleanup when a event notifier is removed.
The commit updates the code to match the new behavior and calls
virtio_bus_cleanup_host_notifier after the notifier was de-assign
and no longer in use.
This change is a preparation to allow executing the
virtio_bus_set_host_notifier function in a memory region
transaction.
Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This commit introduces a new vhost-user device for block, it uses a
chardev to connect with the backend, same with Qemu virito-blk device,
Guest OS still uses the virtio-blk frontend driver.
To use it, start QEMU with command line like this:
qemu-system-x86_64 \
-chardev socket,id=char0,path=/path/vhost.socket \
-device vhost-user-blk-pci,chardev=char0,num-queues=2, \
bootindex=2... \
Users can use different parameters for `num-queues` and `bootindex`.
Different with exist Qemu virtio-blk host device, it makes more easy
for users to implement their own I/O processing logic, such as all
user space I/O stack against hardware block device. It uses the new
vhost messages(VHOST_USER_GET_CONFIG) to get block virtio config
information from backend process.
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages which can be
used for live migration of vhost user devices, also vhost user devices
can benefit from the messages to get/set virtio config space from/to the
I/O target. For the purpose to support virtio config space change,
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG message is added as the event notifier
in case virtio config space change in the slave I/O target.
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Depending on the configuration, it can be beneficial to adjust the virtio-blk
queue size to something other than the current default of 128. Add a new
property to make the queue size configurable.
Signed-off-by: Mark Kanda <mark.kanda@oracle.com>
Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Ameya More <ameya.more@oracle.com>
Message-id: 52e6d742811f10dbd16e996e86cf375b9577c187.1513005190.git.mark.kanda@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In case of backend crash, it is not possible to restore internal
avail index from the backend value as vhost_get_vring_base
callback fails.
This patch provides a new interface to restore internal avail index
from the vring used index, as done by some vhost-user backend on
reconnection.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
qemu uses wheel-up/down button events for mouse wheel input, however
linux applications typically want REL_WHEEL events.
This fixes wheel with linux guests. Tested with X11/wayland, and
windows virtio-input driver.
Based on a patch from Marc.
Added property to enable/disable wheel axis.
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170926113243.26081-1-kraxel@redhat.com
vmstate_save_state is called in lots of places.
Route error returns from the easier cases back up; there are lots
of more complex cases where their own error paths need fixing.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170925112917.21340-7-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Commit message fix up as Peter's review
We should guarantee that RAM will not be modified while VM has a stopped
state, otherwise it can lead to negative consequences during post-copy
migration. In RUN_STATE_FINISH_MIGRATE step, it's expected that RAM on
source side will not be modified as this could lead to non-consistent vm state
on the destination side. Also RAM access during postcopy-ram migration with
enabled release-ram capability can lead to sad consequences.
Let's add enable_backend() callback to avoid undesirable virtioqueue changes
in the guest memory.
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Message-Id: <20170919120733.22020-1-pbutsykin@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since Linux switched to blk-mq as the default in Linux commit
5c279bd9e406 ("scsi: default to scsi-mq"), virtio-scsi LUNs consume
about 10x as much guest kernel memory.
This commit allows you to choose the virtqueue size for each
virtio-scsi-pci controller like this:
-device virtio-scsi-pci,id=scsi,virtqueue_size=16
The default is still 128 as before. Using smaller virtqueue_size
allows many more disks to be added to small memory virtual machines.
For a 1 vCPU, 500 MB, no swap VM I observed:
With scsi-mq enabled (upstream kernel): 175 disks
-"- ditto -"- virtqueue_size=64: 318 disks
-"- ditto -"- virtqueue_size=16: 775 disks
With scsi-mq disabled (kernel before 5c279bd9e406): 1755 disks
Note that to have any effect, this requires a kernel patch:
https://lkml.org/lkml/2017/8/10/689
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <20170810165255.20865-1-rjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch enables the virtio-net tx queue size to be configurable
between 256 (the default queue size) and 1024 by the user when the
vhost-user backend is used.
Currently, the maximum tx queue size for other backends is 512 due
to the following limitations:
- QEMU backend: the QEMU backend implementation in some cases may
send 1024+1 iovs to writev.
- Vhost_net backend: there are possibilities that the guest sends
a vring_desc of memory which crosses a MemoryRegion thereby
generating more than 1024 iovs after translation from guest-physical
address in the backend.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This commit introduces a vhost-user device for SCSI. This is based
on the existing vhost-scsi implementation, but done over vhost-user
instead. It also uses a chardev to connect to the backend. Unlike
vhost-scsi (today), VMs using vhost-user-scsi can be live migrated.
To use it, start Qemu with a command line equivalent to:
qemu-system-x86_64 \
-chardev socket,id=vus0,path=/tmp/vus.sock \
-device vhost-user-scsi-pci,chardev=vus0,bus=pci.0,addr=...
A separate commit presents a sample application linked with libiscsi to
provide a backend for vhost-user-scsi.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Message-Id: <1488479153-21203-4-git-send-email-felipe@nutanix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch reworks IOTLB messaging to prepare for vhost-user
device IOTLB support.
IOTLB messages handling is extracted from vhost-kernel backend,
so that only the messages transport remains backend specifics.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some backends might want to know when things went wrong.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds a new internal "x-mtu-bypass-backend" property
to bypass backends for MTU feature negotiation.
When this property is set, the MTU feature is negotiated as soon
as supported by the guest and a MTU value is set via the host_mtu
parameter. In case the backend advertises the feature (e.g. DPDK's
vhost-user backend), the feature negotiation is propagated down to
the backend.
When this property is not set, the backend has to support the MTU
feature for its negotiation to succeed.
For compatibility purpose, this property is disabled for machine
types v2.9 and older.
Cc: Aaron Conole <aconole@redhat.com>
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Vlad Yasevich <vyasevic@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Move to virtio-gpu-3d.c where all the other virgl code lives too.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170505104101.30589-2-kraxel@redhat.com
In order to introduce a new vhost-user-scsi host device type, it makes
sense to abstract part of vhost-scsi into a common parent class. This
commit does exactly that.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Message-Id: <1488479153-21203-3-git-send-email-felipe@nutanix.com>