vhost_save_backend_state() and vhost_load_backend_state() can be used by
vhost front-ends to easily save and load the back-end's state to/from
the migration stream.
Because we do not know the full state size ahead of time,
vhost_save_backend_state() simply reads the data in 1 MB chunks, and
writes each chunk consecutively into the migration stream, prefixed by
its length. EOF is indicated by a 0-length chunk.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20231016134243.68248-7-hreitz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add the interface for transferring the back-end's state during migration
as defined previously in vhost-user.rst.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20231016134243.68248-6-hreitz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Multiple devices need this macro, move it to a common header.
Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20231009044735.941655-3-fengli@smartx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost-user-scsi has a VirtioDeviceClass->reset() function that calls
->vhost_reset_device(). The other vhost devices don't notify the vhost
device upon reset.
Stateful vhost devices may need to handle device reset in order to free
resources or prevent stale device state from interfering after reset.
Call ->vhost_device_reset() from virtio_reset() so that that vhost
devices are notified of device reset.
This patch affects behavior as follows:
- vhost-kernel: No change in behavior since ->vhost_reset_device() is
not implemented.
- vhost-user: back-ends that negotiate
VHOST_USER_PROTOCOL_F_RESET_DEVICE now receive a
VHOST_USER_DEVICE_RESET message upon device reset. Otherwise there is
no change in behavior. DPDK, SPDK, libvhost-user, and the
vhost-user-backend crate do not negotiate
VHOST_USER_PROTOCOL_F_RESET_DEVICE automatically.
- vhost-vdpa: an extra SET_STATUS 0 call is made during device reset.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20231004014532.1228637-4-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Let's add vhost_get_max_memslots().
Message-ID: <20230926185738.277351-12-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's return the number of free slots instead of only checking if there
is a free slot. Required to support memory devices that consume multiple
memslots.
This is a preparation for memory devices that consume multiple memslots.
Message-ID: <20230926185738.277351-6-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
The guest can disable or never enable Device-TLB. In these cases,
it can't be used even if enabled in QEMU. So, check Device-TLB state
before registering IOMMU notifier and select unmap flag depending on
that. Also, implement a way to change IOMMU notifier flag if Device-TLB
state is changed.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2001312
Signed-off-by: Viktor Prutyanov <viktor@daynix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230626091258.24453-2-viktor@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To support vIOMMU in vdpa, need to exposed the function
vhost_dev_has_iommu, vdpa will use this function to check
if vIOMMU enable.
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20230510054631.2951812-2-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add functions to support configure interrupt.
The configure interrupt process will start in vhost_dev_start
and stop in vhost_dev_stop.
Also add the functions to support vhost_config_pending and
vhost_config_mask.
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-8-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We have a bunch of variables associated with the device and the vhost
backend which are used inconsistently throughout the code base. Lets
start trying to bring some order by agreeing what each variable is
for.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Message-Id: <20221123152134.179929-2-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 02b61f38d3 ("hw/virtio: incorporate backend features in features")
properly negotiates VHOST_USER_F_PROTOCOL_FEATURES with the vhost-user
backend, but we forgot to enable vrings as specified in
docs/interop/vhost-user.rst:
If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
ring starts directly in the enabled state.
If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring is
initialized in a disabled state and is enabled by
``VHOST_USER_SET_VRING_ENABLE`` with parameter 1.
Some vhost-user front-ends already did this by calling
vhost_ops.vhost_set_vring_enable() directly:
- backends/cryptodev-vhost.c
- hw/net/virtio-net.c
- hw/virtio/vhost-user-gpio.c
But most didn't do that, so we would leave the vrings disabled and some
backends would not work. We observed this issue with the rust version of
virtiofsd [1], which uses the event loop [2] provided by the
vhost-user-backend crate where requests are not processed if vring is
not enabled.
Let's fix this issue by enabling the vrings in vhost_dev_start() for
vhost-user front-ends that don't already do this directly. Same thing
also in vhost_dev_stop() where we disable vrings.
[1] https://gitlab.com/virtio-fs/virtiofsd
[2] https://github.com/rust-vmm/vhost/blob/240fc2966/crates/vhost-user-backend/src/event_loop.rs#L217
Fixes: 02b61f38d3 ("hw/virtio: incorporate backend features in features")
Reported-by: German Maglione <gmaglione@redhat.com>
Tested-by: German Maglione <gmaglione@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20221123131630.52020-1-sgarzare@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221130112439.2527228-3-alex.bennee@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Expose vhost_virtqueue_stop(), we need to use it when resetting a
virtqueue.
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-9-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Expose vhost_virtqueue_start(), we need to use it when restarting a
virtqueue.
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-8-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Display feature names instead of bitmaps for host, guest, and
backend for VirtIODevices.
Display status names instead of bitmaps for VirtIODevices.
Display feature names instead of bitmaps for backend, protocol,
acked, and features (hdev->features) for vhost devices.
Decode features according to device ID. Decode statuses
according to configuration status bitmap (config_status_map).
Decode vhost user protocol features according to vhost user
protocol bitmap (vhost_user_protocol_map).
Transport features are on the first line. Undecoded bits (if
any) are stored in a separate field.
[Jonah: Several changes made to this patch from prev. version (v14):
- Moved all device features mappings to hw/virtio/virtio.c
- Renamed device features mappings (less generic)
- Generalized @FEATURE_ENTRY macro for all device mappings
- Virtio device feature map definitions include descriptions of
feature bits
- Moved @VHOST_USER_F_PROTOCOL_FEATURES feature bit from transport
feature map to vhost-user-supported device feature mappings
(blk, fs, i2c, rng, net, gpu, input, scsi, vsock)
- New feature bit added for virtio-vsock: @VIRTIO_VSOCK_F_SEQPACKET
- New feature bit added for virtio-iommu: @VIRTIO_IOMMU_F_BYPASS_CONFIG
- New feature bit added for virtio-mem: @VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
- New virtio transport feature bit added: @VIRTIO_F_IN_ORDER
- Added device feature map definition for virtio-rng
]
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <1660220684-24909-4-git-send-email-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The `started` field is manipulated internally within the vhost code
except for one place, vhost-user-blk via f5b22d06fb (vhost: recheck
dev state in the vhost_migration_log routine). Mark that as a FIXME
because it introduces a potential race. I think the referenced fix
should be tracking its state locally.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220802095010.3330793-12-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwittz@nutanix.com>
Try and explicitly document the various state of feature bits as
related to the vhost_dev structure. Importantly the backend_features
can advertise things like VHOST_USER_F_PROTOCOL_FEATURES which is
never exposed to the driver and is only present in the vhost-user
feature negotiation.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220802095010.3330793-7-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220524154056.2896913-5-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220524154056.2896913-4-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Vhost has error notifications, let's log them like other errors.
For each virt-queue setup eventfd for vring error notifications.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
[vsementsov: rename patch, change commit message and dump error like
other errors in the file]
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20220623161325.18813-3-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
While trying to get my head around the nest of interactions for vhost
devices I though I could start by documenting the key API functions.
This patch documents the main API hooks for creating and starting a
vhost device as well as how the configuration changes are handled.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220321153037.3622127-11-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit f7220a7ce2.
Fixes: f7220a7ce2 ("vhost: add support for configure interrupt")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add functions to support configure interrupt.
The configure interrupt process will start in vhost_dev_start
and stop in vhost_dev_stop.
Also add the functions to support vhost_config_pending and
vhost_config_mask, for masked_config_notifier, we only
use the notifier saved in vq 0.
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-8-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The doc of this field pointed out that last_index is the last vq index.
This is misleading, since it's actually one past the end of the vqs.
Renaming and modifying comment.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211104085625.2054959-2-eperezma@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch introduces a new field in the vhost_dev structure to record
the last virtqueue index for the virtio device. This will be useful
for the vhost backends with 1:N model to start or stop the device
after all the vhost_dev structures were started or stopped.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211020045600.16082-9-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Switch to use unsigned int for nvqs since it's not expected to be
negative.
Reviewed-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210903091031.47303-5-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Just a small refactor patch.
vhost_set_backend_type() gets called only in vhost.c, so we can move the
function there and make it static. We can then extern the visibility of
kernel_ops, to match the other VhostOps in vhost-backend.h.
The VhostOps constants now make more sense in vhost.h
Suggested-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Tiberiu Georgescu <tiberiu.georgescu@nutanix.com>
Message-Id: <20210809134015.67941-1-tiberiu.georgescu@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Instead of just returning 0/-1 and letting the caller make up a
meaningless error message, add an Error parameter to allow reporting the
real error and switch to 0/-errno so that different kind of errors can
be distinguished in the caller.
config_len in vhost_user_get_config() is defined by the device, so if
it's larger than VHOST_USER_MAX_CONFIG_SIZE, this is a programming
error. Turn the corresponding check into an assertion.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210609154658.350308-6-kwolf@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This allows callers to return better error messages instead of making
one up while the real error ends up on stderr. Most callers can
immediately make use of this because they already have an Error
parameter themselves. The others just keep printing the error with
error_report_err().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210609154658.350308-2-kwolf@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Creating a device with a number of queues that isn't supported by the
backend is pointless, the device won't work properly and the error
messages are rather confusing.
Just fail to create the device if num-queues is higher than what the
backend supports.
Since the relationship between num-queues and the number of virtqueues
depends on the specific device, this is an additional value that needs
to be initialised by the device. For convenience, allow leaving it 0 if
the check should be skipped. This makes sense for vhost-user-net where
separate vhost devices are used for the queues and custom initialisation
code is needed to perform the check.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1935031
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20210429171316.162022-7-kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Virtqueue has split and packed, so before setting inflight,
you need to inform the back-end virtqueue format.
Signed-off-by: Jin Yu <jin.yu@intel.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20201103123617.28256-1-jin.yu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit adb29c0273.
The commit broke -device vhost-user-blk-pci because the
vhost_dev_prepare_inflight() function it introduced segfaults in
vhost_dev_set_features() when attempting to access struct vhost_dev's
vdev pointer before it has been assigned.
To reproduce the segfault simply launch a vhost-user-blk device with the
contrib vhost-user-blk device backend:
$ build/contrib/vhost-user-blk/vhost-user-blk -s /tmp/vhost-user-blk.sock -r -b /var/tmp/foo.img
$ build/qemu-system-x86_64 \
-device vhost-user-blk-pci,id=drv0,chardev=char1,addr=4.0 \
-object memory-backend-memfd,id=mem,size=1G,share=on \
-M memory-backend=mem,accel=kvm \
-chardev socket,id=char1,path=/tmp/vhost-user-blk.sock
Segmentation fault (core dumped)
Cc: Jin Yu <jin.yu@intel.com>
Cc: Raphael Norwitz <raphael.norwitz@nutanix.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20201102165709.232180-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Virtqueue has split and packed, so before setting inflight,
you need to inform the back-end virtqueue format.
Signed-off-by: Jin Yu <jin.yu@intel.com>
Message-Id: <20200910134851.7817-1-jin.yu@intel.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch tries to switch to use new kernel IOTLB format V2. Previous
version may have inconsistent ABI between 32bit and 64bit machines
because of the hole after type field. Refer kernel commit
("429711aec282 vhost: switch to use new message format") for more
information.
To enable this feature, qemu need to use a new ioctl
VHOST_SET_BACKEND_FEATURE with VHOST_BACKEND_F_IOTLB_MSG_V2 bit. A new
vhost setting backend features ops was introduced. And when we try to
set features for vhost dev, we will examine the support of new IOTLB
format and enable it. This process is total transparent to guest,
which means we can have different IOTLB message type in src and dst
during migration.
The conversion of IOTLB message is straightforward, just check the
type and behave accordingly.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20200907104903.31551-3-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently we have 2 types of vhost backends in QEMU: vhost kernel and
vhost-user. The above patch provides a generic device for vDPA purpose,
this vDPA device exposes to user space a non-vendor-specific configuration
interface for setting up a vhost HW accelerator, this patch set introduces
a third vhost backend called vhost-vdpa based on the vDPA interface.
Vhost-vdpa usage:
qemu-system-x86_64 -cpu host -enable-kvm \
......
-netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
-device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on \
Signed-off-by: Lingshan zhu <lingshan.zhu@intel.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Cindy Lu <lulu@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20200701145538.22333-14-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
In my "build everything" tree, changing hw/hw.h triggers a recompile
of some 2600 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).
The previous commits have left only the declaration of hw_error() in
hw/hw.h. This permits dropping most of its inclusions. Touching it
now recompiles less than 200 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
This patch introduces two new messages VHOST_USER_GET_INFLIGHT_FD
and VHOST_USER_SET_INFLIGHT_FD to support transferring a shared
buffer between qemu and backend.
Firstly, qemu uses VHOST_USER_GET_INFLIGHT_FD to get the
shared buffer from backend. Then qemu should send it back
through VHOST_USER_SET_INFLIGHT_FD each time we start vhost-user.
This shared buffer is used to track inflight I/O by backend.
Qemu should retrieve a new one when vm reset.
Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Signed-off-by: Chai Wen <chaiwen@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Message-Id: <20190228085355.9614-2-xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Remove the old update mechanism, vhost_set_memory, and the functions
and flags it used.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor spotted that there's a race, where a region that's unref'd
in a _del callback might be free'd before the set_mem_table call in
the _commit callback, and thus the vhost might end up using free memory.
Fix this by building a complete temporary sections list, ref'ing every
section (during add and nop) and then unref'ing the whole list right
at the end of commit.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages which can be
used for live migration of vhost user devices, also vhost user devices
can benefit from the messages to get/set virtio config space from/to the
I/O target. For the purpose to support virtio config space change,
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG message is added as the event notifier
in case virtio config space change in the slave I/O target.
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some backends might want to know when things went wrong.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We assumes the iommu_ops were attached to the root region of address
space. This may not be true for all kinds of IOMMU implementation and
especially after commit 3716d5902d ("pci: introduce a bus master
container"). So fix this by not assuming as->root has iommu_ops,
instead depending on the regions reported by memory listener through:
- register a memory listener to dma_as
- during region_add, if it's a region of IOMMU, register a specific
IOMMU notifier, and store all notifiers in a list.
- during region_del, compare and delete the IOMMU notifier from the list
This is also a must for making vhost device IOTLB works for all types
of IOMMUs. Note, since we register one notifier during each
.region_add, the IOTLB may be flushed more than one times, this is
suboptimal and could be optimized in the future.
Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Fixes: 3716d5902d ("pci: introduce a bus master container")
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patches implements Device IOTLB support for vhost kernel. This is
done through:
1) switch to use dma helpers when map/unmap vrings from vhost codes
2) introduce a set of VhostOps to:
- setting up device IOTLB request callback
- processing device IOTLB request
- processing device IOTLB invalidation
2) kernel support for Device IOTLB API:
- allow vhost-net to query the IOMMU IOTLB entry through eventfd
- enable the ability for qemu to update a specified mapping of vhost
- through ioctl.
- enable the ability to invalidate a specified range of iova for the
device IOTLB of vhost through ioctl. In x86/intel_iommu case this is
triggered through iommu memory region notifier from device IOTLB
invalidation descriptor processing routine.
With all the above, kernel vhost_net can co-operate with userspace
IOMMU. For vhost-user, the support could be easily done on top by
implementing the VhostOps.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The legacy vring layout is not used anymore as we use the separate
mappings even for legacy devices.
This patch simply removes it.
This also fixes a bug with virtio 1 devices when the vring descriptor table
is mapped at a higher address than the used vring because the following
function may return an insanely great value:
hwaddr virtio_queue_get_ring_size(VirtIODevice *vdev, int n)
{
return vdev->vq[n].vring.used - vdev->vq[n].vring.desc +
virtio_queue_get_used_size(vdev, n);
}
and the mapping fails.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With virtio 1, the vring layout is split in 3 separate regions of
contiguous memory for the descriptor table, the available ring and the
used ring, as opposed with legacy virtio which uses a single region.
In case of memory re-mapping, the code ensures it doesn't affect the
vring mapping. This is done in vhost_verify_ring_mappings() which assumes
the device is legacy.
This patch changes vhost_verify_ring_mappings() to check the mappings of
each part of the vring separately.
This works for legacy mappings as well.
Cc: qemu-stable@nongnu.org
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Not all vhost-user backends support ops->vhost_net_set_backend(). It is
a nicer to provide an assert/error than to crash trying to
call. Furthermore, it improves a bit the code by hiding vhost_ops
details.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch add the capability of basic vhost net busy polling which is
supported by recent kernel. User could configure the maximum number of
us that could be spent on busy polling through a new property of tap
"poll-us".
Cc: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
commit 1e7398a1 ("vhost: enable vhost without without MSI-X"_
dropped the implementation of vhost_dev_query,
drop it from the header file as well.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Replace the generic vhost_call() by specific functions for each
function call to help with type safety and changing arguments.
While doing this, I found that "unsigned long long" and "uint64_t" were
used interchangeably and causing compilation warnings, using uint64_t
instead, as the vhost & protocol specifies.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[Fix enum usage and MQ - Thibaut Collet]
Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Thibaut Collet <thibaut.collet@6wind.com>
If the backend is requires it, allocate shareable memory.
vhost_log_get() now uses 2 globals "vhost_log" and "vhost_log_shm", that
way there is a common non-shareable log and a common shareable one.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Thibaut Collet <thibaut.collet@6wind.com>