mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Peter Maydell	e3d0814368	hw: Use device_class_set_legacy_reset() instead of opencoding Use device_class_set_legacy_reset() instead of opencoding an assignment to DeviceClass::reset. This change was produced with: spatch --macro-file scripts/cocci-macro-file.h \ --sp-file scripts/coccinelle/device-reset.cocci \ --keep-comments --smpl-spacing --in-place --dir hw Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240830145812.1967042-8-peter.maydell@linaro.org	2024-09-13 15:31:44 +01:00
Peter Maydell	3baa3c9d62	virtio,pc,pci: features, fixes, cleanups i286 acpi speedup by precomputing _PRT by Ricardo Ribalda vhost_net speedup by using MR transactions by Zuo Boqun ich9 gained support for periodic and swsmi timer by Dominic Prinz Fixes, cleanups all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmbhoCUPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRptpUH/iR5AmJFpvAItqlPOvJiYDEch46C73tyrSws Kk/1EbGSL7mFFD5sfdSSV4Rw8CQBsmM/Dt5VDkJKsWnOLjkBQ2CYH0MYHktnrKcJ LlSk32HnY5p1DsXnJhgm5M7St8T3mV/oFdJCJAFgCmpx5uT8IRLrKETN8+30OaiY xo35xAKOAS296+xsWeVubKkMq7H4y2tdZLE/22gb8rlA8d96BJIeVLQ3y3IjeUPR 24q6c7zpObzGhYNZ/PzAKOn+YcVsV/lLAzKRZJTzTUPyG24BcjJTyyr/zNSYAgfk lLXzIZID3GThBmrCAiDZ1z6sfo3MRg2wNS/FBXtK6fPIuFxed+8= =ySRy -----END PGP SIGNATURE----- Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging virtio,pc,pci: features, fixes, cleanups i286 acpi speedup by precomputing _PRT by Ricardo Ribalda vhost_net speedup by using MR transactions by Zuo Boqun ich9 gained support for periodic and swsmi timer by Dominic Prinz Fixes, cleanups all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmbhoCUPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRptpUH/iR5AmJFpvAItqlPOvJiYDEch46C73tyrSws # Kk/1EbGSL7mFFD5sfdSSV4Rw8CQBsmM/Dt5VDkJKsWnOLjkBQ2CYH0MYHktnrKcJ # LlSk32HnY5p1DsXnJhgm5M7St8T3mV/oFdJCJAFgCmpx5uT8IRLrKETN8+30OaiY # xo35xAKOAS296+xsWeVubKkMq7H4y2tdZLE/22gb8rlA8d96BJIeVLQ3y3IjeUPR # 24q6c7zpObzGhYNZ/PzAKOn+YcVsV/lLAzKRZJTzTUPyG24BcjJTyyr/zNSYAgfk # lLXzIZID3GThBmrCAiDZ1z6sfo3MRg2wNS/FBXtK6fPIuFxed+8= # =ySRy # -----END PGP SIGNATURE----- # gpg: Signature made Wed 11 Sep 2024 14:50:29 BST # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: hw/acpi/ich9: Add periodic and swsmi timer virtio-mem: don't warn about THP sizes on a kernel without THP support hw/audio/virtio-sound: fix heap buffer overflow hw/cxl: fix physical address field in get scan media results output virtio-pci: Add lookup subregion of VirtIOPCIRegion MR vhost_net: configure all host notifiers in a single MR transaction tests/acpi: pc: update golden masters for DSDT hw/i386/acpi-build: Return a pre-computed _PRT table tests/acpi: pc: allow DSDT acpi table changes intel_iommu: Make PASID-cache and PIOTLB type invalid in legacy mode intel_iommu: Fix invalidation descriptor type field virtio: rename virtio_split_packed_update_used_idx hw/pci/pci-hmp-cmds: Avoid displaying bogus size in 'info pci' pci: don't skip function 0 occupancy verification for devfn auto assign hw/isa/vt82c686.c: Embed i8259 irq in device state instead of allocating hw: Move declaration of IRQState to header and add init function virtio: Always reset vhost devices virtio: Allow .get_vhost() without vhost_started Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-09-12 16:17:27 +01:00
David Hildenbrand	95b717a815	virtio-mem: don't warn about THP sizes on a kernel without THP support If the config directory in sysfs does not exist at all, we are dealing with a system that does not support THPs. Simply use 1 MiB block size then, instead of warning "Could not detect THP size, falling back to ..." and falling back to the default THP size. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Gavin Shan <gshan@redhat.com> Cc: Juraj Marcin <jmarcin@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20240910163433.2100295-1-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-11 09:46:14 -04:00
Gao Shiyuan	ffa8a3e3b2	virtio-pci: Add lookup subregion of VirtIOPCIRegion MR Now virtio_address_space_lookup only lookup common/isr/device/notify MR and exclude their subregions. When VHOST_USER_PROTOCOL_F_HOST_NOTIFIER enable, the notify MR has host-notifier subregions and we need use host-notifier MR to notify the hardware accelerator directly instead of eventfd notify. Further more, maybe common/isr/device MR also has subregions in the future, so need memory_region_find for each MR incluing their subregions. Add lookup subregion of VirtIOPCIRegion MR instead of only lookup container MR. Fixes: `a93c8d8` ("virtio-pci: Replace modern_as with direct access to modern_bar") Co-developed-by: Zuo Boqun <zuoboqun@baidu.com> Signed-off-by: Gao Shiyuan <gaoshiyuan@baidu.com> Signed-off-by: Zuo Boqun <zuoboqun@baidu.com> Message-Id: <20240903120304.97833-1-gaoshiyuan@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-11 09:46:14 -04:00
zuoboqun	6166799f69	vhost_net: configure all host notifiers in a single MR transaction This allows the vhost_net device which has multiple virtqueues to batch the setup of all its host notifiers. This significantly reduces the vhost_net device starting and stoping time, e.g. the time spend on enabling notifiers reduce from 630ms to 75ms and the time spend on disabling notifiers reduce from 441ms to 45ms for a VM with 192 vCPUs and 15 vhost-user-net devices (64vq per device) in our case. Signed-off-by: zuoboqun <zuoboqun@baidu.com> Message-Id: <20240816070835.8309-1-zuoboqun@baidu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-11 09:46:14 -04:00
Wenyu Huang	e667485a80	virtio: rename virtio_split_packed_update_used_idx virtio_split_packed_update_used_idx should be virtio_queue_split_update_used_idx like virtio_split_packed_update_used_idx. Signed-off-by: Wenyu Huang <huangwenyuu@outlook.com> Message-Id: <TYBP286MB036536B9015994AA5F3E4495ACB22@TYBP286MB0365.JPNP286.PROD.OUTLOOK.COM> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-11 09:46:05 -04:00
Hanna Czenczek	2688e8df60	virtio: Always reset vhost devices Requiring `vhost_started` to be true for resetting vhost devices in `virtio_reset()` seems like the wrong condition: Most importantly, the preceding `virtio_set_status(vdev, 0)` call will (for vhost devices) end up in `vhost_dev_stop()` (through vhost devices' `.set_status` implementations), setting `vdev->vhost_started = false`. Therefore, the gated `vhost_reset_device()` call is unreachable. `vhost_started` is not documented, so it is hard to say what exactly it is supposed to mean, but judging from the fact that `vhost_dev_start()` sets it and `vhost_dev_stop()` clears it, it seems like it indicates whether there is a vhost back-end, and whether that back-end is currently running and processing virtio requests. Making a reset conditional on whether the vhost back-end is processing virtio requests seems wrong; in fact, it is probably better to reset it only when it is not currently processing requests, which is exactly the current order of operations in `virtio_reset()`: First, the back-end is stopped through `virtio_set_status(vdev, 0)`, then we want to send a reset. Therefore, we should drop the `vhost_started` condition, but in its stead we then have to verify that we can indeed send a reset to this vhost device, by not just checking `k->get_vhost != NULL` (introduced by commit `95e1019a4a`), but also that the vhost back-end is connected (`hdev = k->get_vhost(); hdev != NULL && hdev->vhost_ops != NULL`). Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20240723163941.48775-3-hreitz@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-09-10 14:27:56 -04:00
Hanna Czenczek	00adced5c3	virtio: Allow .get_vhost() without vhost_started Historically, .get_vhost() was probably only called when vdev->vhost_started is true. However, we now decidedly want to call it also when vhost_started is false, specifically so we can issue a reset to the vhost back-end while device operation is stopped. Some .get_vhost() implementations dereference some pointers (or return offsets from them) that are probably guaranteed to be non-NULL when vhost_started is true, but not necessarily otherwise. This patch makes all such implementations check all such pointers, returning NULL if any is NULL. Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20240723163941.48775-2-hreitz@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2024-09-10 14:27:56 -04:00
Markus Armbruster	b1019999e8	qapi/cryptodev: Rename QCryptodevBackendAlgType to *Algo, and drop prefix QAPI's 'prefix' feature can make the connection between enumeration type and its constants less than obvious. It's best used with restraint. QCryptodevBackendAlgType has a 'prefix' that overrides the generated enumeration constants' prefix to QCRYPTODEV_BACKEND_ALG. We could simply drop 'prefix', but I think the abbreviation "alg" is less than clear. Additionally rename the type to QCryptodevBackendAlgoType. The prefix becomes QCRYPTODEV_BACKEND_ALGO_TYPE. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20240904111836.3273842-19-armbru@redhat.com>	2024-09-10 14:03:30 +02:00
Markus Armbruster	d0917d79fd	qapi/cryptodev: Drop unwanted 'prefix' QAPI's 'prefix' feature can make the connection between enumeration type and its constants less than obvious. It's best used with restraint. QCryptodevBackendServiceType has a 'prefix' that overrides the generated enumeration constants' prefix to QCRYPTODEV_BACKEND_SERVICE. Drop it. The prefix becomes QCRYPTODEV_BACKEND_SERVICE_TYPE. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20240904111836.3273842-18-armbru@redhat.com>	2024-09-10 14:03:30 +02:00
Cindy Lu	a8e63ff289	virtio-pci: Fix the use of an uninitialized irqfd The crash was reported in MAC OS and NixOS, here is the link for this bug https://gitlab.com/qemu-project/qemu/-/issues/2334 https://gitlab.com/qemu-project/qemu/-/issues/2321 In this bug, they are using the virtio_input device. The guest notifier was not supported for this device, The function virtio_pci_set_guest_notifiers() was not called, and the vector_irqfd was not initialized. So the fix is adding the check for vector_irqfd in virtio_pci_get_notifier() The function virtio_pci_get_notifier() can be used in various devices. It could also be called when VIRTIO_CONFIG_S_DRIVER_OK is not set. In this situation, the vector_irqfd being NULL is acceptable. We can allow the device continue to boot If the vector_irqfd still hasn't been initialized after VIRTIO_CONFIG_S_DRIVER_OK is set, it means that the function set_guest_notifiers was not called before the driver started. This indicates that the device is not using the notifier. At this point, we will let the check fail. This fix is verified in vyatta,MacOS,NixOS,fedora system. The bt tree for this bug is: Thread 6 "CPU 0/KVM" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7c817be006c0 (LWP 1269146)] kvm_virtio_pci_vq_vector_use () at ../qemu-9.0.0/hw/virtio/virtio-pci.c:817 817 if (irqfd->users == 0) { (gdb) thread apply all bt ... Thread 6 (Thread 0x7c817be006c0 (LWP 1269146) "CPU 0/KVM"): 0 kvm_virtio_pci_vq_vector_use () at ../qemu-9.0.0/hw/virtio/virtio-pci.c:817 1 kvm_virtio_pci_vector_use_one () at ../qemu-9.0.0/hw/virtio/virtio-pci.c:893 2 0x00005983657045e2 in memory_region_write_accessor () at ../qemu-9.0.0/system/memory.c:497 3 0x0000598365704ba6 in access_with_adjusted_size () at ../qemu-9.0.0/system/memory.c:573 4 0x0000598365705059 in memory_region_dispatch_write () at ../qemu-9.0.0/system/memory.c:1528 5 0x00005983659b8e1f in flatview_write_continue_step.isra.0 () at ../qemu-9.0.0/system/physmem.c:2713 6 0x000059836570ba7d in flatview_write_continue () at ../qemu-9.0.0/system/physmem.c:2743 7 flatview_write () at ../qemu-9.0.0/system/physmem.c:2774 8 0x000059836570bb76 in address_space_write () at ../qemu-9.0.0/system/physmem.c:2894 9 0x0000598365763afe in address_space_rw () at ../qemu-9.0.0/system/physmem.c:2904 10 kvm_cpu_exec () at ../qemu-9.0.0/accel/kvm/kvm-all.c:2917 11 0x000059836576656e in kvm_vcpu_thread_fn () at ../qemu-9.0.0/accel/kvm/kvm-accel-ops.c:50 12 0x0000598365926ca8 in qemu_thread_start () at ../qemu-9.0.0/util/qemu-thread-posix.c:541 13 0x00007c8185bcd1cf in ??? () at /usr/lib/libc.so.6 14 0x00007c8185c4e504 in clone () at /usr/lib/libc.so.6 Fixes: `2ce6cff94d` ("virtio-pci: fix use of a released vector") Cc: qemu-stable@nongnu.org Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20240806093715.65105-1-lulu@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-08-20 06:57:47 -04:00
thomas	f937309fbd	virtio-net: Fix network stall at the host side waiting for kick Patch `06b1297017` ("virtio-net: fix network stall under load") added double-check to test whether the available buffer size can satisfy the request or not, in case the guest has added some buffers to the avail ring simultaneously after the first check. It will be lucky if the available buffer size becomes okay after the double-check, then the host can send the packet to the guest. If the buffer size still can't satisfy the request, even if the guest has added some buffers, viritio-net would stall at the host side forever. The patch enables notification and checks whether the guest has added some buffers since last check of available buffers when the available buffers are insufficient. If no buffer is added, return false, else recheck the available buffers in the loop. If the available buffers are sufficient, disable notification and return true. Changes: 1. Change the return type of virtqueue_get_avail_bytes() from void to int, it returns an opaque that represents the shadow_avail_idx of the virtqueue on success, else -1 on error. 2. Add a new API: virtio_queue_enable_notification_and_check(), it takes an opaque as input arg which is returned from virtqueue_get_avail_bytes(). It enables notification firstly, then checks whether the guest has added some buffers since last check of available buffers or not by virtio_queue_poll(), return ture if yes. The patch also reverts patch "06b12970174". The case below can reproduce the stall. Guest 0 +--------+ \| iperf \| ---------------> \| server \| Host \| +--------+ +--------+ \| ... \| iperf \|---- \| client \|---- Guest n +--------+ \| +--------+ \| \| iperf \| ---------------> \| server \| +--------+ Boot many guests from qemu with virtio network: qemu ... -netdev tap,id=net_x \ -device virtio-net-pci-non-transitional,\ iommu_platform=on,mac=xx:xx:xx:xx:xx:xx,netdev=net_x Each guest acts as iperf server with commands below: iperf3 -s -D -i 10 -p 8001 iperf3 -s -D -i 10 -p 8002 The host as iperf client: iperf3 -c guest_IP -p 8001 -i 30 -w 256k -P 20 -t 40000 iperf3 -c guest_IP -p 8002 -i 30 -w 256k -P 20 -t 40000 After some time, the host loses connection to the guest, the guest can send packet to the host, but can't receive packet from the host. It's more likely to happen if SWIOTLB is enabled in the guest, allocating and freeing bounce buffer takes some CPU ticks, copying from/to bounce buffer takes more CPU ticks, compared with that there is no bounce buffer in the guest. Once the rate of producing packets from the host approximates the rate of receiveing packets in the guest, the guest would loop in NAPI. receive packets --- \| \| v \| free buf virtnet_poll \| \| v \| add buf to avail ring --- \| \| need kick the host? \| NAPI continues v receive packets --- \| \| v \| free buf virtnet_poll \| \| v \| add buf to avail ring --- \| v ... ... On the other hand, the host fetches free buf from avail ring, if the buf in the avail ring is not enough, the host notifies the guest the event by writing the avail idx read from avail ring to the event idx of used ring, then the host goes to sleep, waiting for the kick signal from the guest. Once the guest finds the host is waiting for kick singal (in virtqueue_kick_prepare_split()), it kicks the host. The host may stall forever at the sequences below: Host Guest ------------ ----------- fetch buf, send packet receive packet --- ... ... \| fetch buf, send packet add buf \| ... add buf virtnet_poll buf not enough avail idx-> add buf \| read avail idx add buf \| add buf --- receive packet --- write event idx ... \| wait for kick add buf virtnet_poll ... \| --- no more packet, exit NAPI In the first loop of NAPI above, indicated in the range of virtnet_poll above, the host is sending packets while the guest is receiving packets and adding buffers. step 1: The buf is not enough, for example, a big packet needs 5 buf, but the available buf count is 3. The host read current avail idx. step 2: The guest adds some buf, then checks whether the host is waiting for kick signal, not at this time. The used ring is not empty, the guest continues the second loop of NAPI. step 3: The host writes the avail idx read from avail ring to used ring as event idx via virtio_queue_set_notification(q->rx_vq, 1). step 4: At the end of the second loop of NAPI, recheck whether kick is needed, as the event idx in the used ring written by the host is beyound the range of kick condition, the guest will not send kick signal to the host. Fixes: `06b1297017` ("virtio-net: fix network stall under load") Cc: qemu-stable@nongnu.org Signed-off-by: Wencheng Yang <east.moutain.yang@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2024-08-02 11:09:52 +08:00
Michael S. Tsirkin	67f5b279fc	Revert "virtio-pci: Implement SR-IOV PF" This reverts commit `3f868ffb0b`. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-08-01 04:32:00 -04:00
Michael S. Tsirkin	cc91ac0a72	Revert "virtio-net: Implement SR-IOV VF" This reverts commit `c2d6db6a1f`. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-08-01 04:32:00 -04:00
Michael S. Tsirkin	024d046bf4	virtio-rng: block max-bytes=0 with max-bytes set to 0, quota is 0 and so device does not work. block this to avoid user confusion Message-Id: <73a89a42d82ec8b47358f25119b87063e4a6ea57.1721818306.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2024-08-01 04:32:00 -04:00
Eric Auger	6c027a9de3	virtio-iommu: Add trace point on virtio_iommu_detach_endpoint_from_domain Add a trace point on virtio_iommu_detach_endpoint_from_domain(). Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-7-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-22 20:15:42 -04:00
Eric Auger	1993d634d5	virtio-iommu: Remove the end point on detach We currently miss the removal of the endpoint in case of detach. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-5-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-22 20:15:42 -04:00
Eric Auger	62ac01d1de	virtio-iommu: Free [host_]resv_ranges on unset_iommu_devices We are currently missing the deallocation of the [host_]resv_regions in case of hot unplug. Also to make things more simple let's rule out the case where multiple HostIOMMUDevices would be aliased and attached to the same IOMMUDevice. This allows to remove the handling of conflicting Host reserved regions. Anyway this is not properly supported at guest kernel level. On hotunplug the reserved regions are reset to the ones set by virtio-iommu property. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-4-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-22 20:15:42 -04:00
Eric Auger	3745768918	virtio-iommu: Remove probe_done Now we have switched to PCIIOMMUOps to convey host IOMMU information, the host reserved regions are transmitted when the PCIe topology is built. This happens way before the virtio-iommu driver calls the probe request. So let's remove the probe_done flag that allowed to check the probe was not done before the IOMMU MR got enabled. Besides this probe_done flag had a flaw wrt migration since it was not saved/restored. The only case at risk is if 2 devices were plugged to a PCIe to PCI bridge and thus aliased. First of all we discovered in the past this case was not properly supported for neither SMMU nor virtio-iommu on guest kernel side: see [RFC] virtio-iommu: Take into account possible aliasing in virtio_iommu_mr() https://lore.kernel.org/all/20230116124709.793084-1-eric.auger@redhat.com/ If this were supported by the guest kernel, it is unclear what the call sequence would be from a virtio-iommu driver point of view. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-3-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-22 20:15:42 -04:00
Eric Auger	935c391418	Revert "virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged" This reverts commit `1b889d6e39`. There are different problems with that tentative fix: - Some resources are left dangling (resv_regions, host_resv_ranges) and memory subregions are left attached to the root MR although freed as embedded in the sdev IOMMUDevice. Finally the sdev->as is not destroyed and associated listeners are left. - Even when fixing the above we observe a memory corruption associated with the deallocation of the IOMMUDevice. This can be observed when a VFIO device is hotplugged, hot-unplugged and a system reset is issued. At this stage we have not been able to identify the root cause (IOMMU MR or as structs beeing overwritten and used later on?). - Another issue is HostIOMMUDevice are indexed by non aliased BDF whereas the IOMMUDevice is indexed by aliased BDF - yes the current naming is really misleading -. Given the state of the code I don't think the virtio-iommu device works in non singleton group case though. So let's revert the patch for now. This means the IOMMU MR/as survive the hotunplug. This is what is done in the intel_iommu for instance. It does not sound very logical to keep those but currently there is no symetric function to pci_device_iommu_address_space(). probe_done issue will be handled in a subsequent patch. Also resv_regions and host_resv_regions will be deallocated separately. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-2-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-22 20:15:41 -04:00
Akihiko Odaki	c2d6db6a1f	virtio-net: Implement SR-IOV VF A virtio-net device can be added as a SR-IOV VF to another virtio-pci device that will be the PF. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240715-sriov-v5-7-3f5539093ffc@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-22 20:15:41 -04:00
Akihiko Odaki	3f868ffb0b	virtio-pci: Implement SR-IOV PF Allow user to attach SR-IOV VF to a virtio-pci PF. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240715-sriov-v5-6-3f5539093ffc@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-22 20:15:41 -04:00
Jonah Palmer	c03213fdc9	vhost,vhost-user: Add VIRTIO_F_IN_ORDER to vhost feature bits Add support for the VIRTIO_F_IN_ORDER feature across a variety of vhost devices. The inclusion of VIRTIO_F_IN_ORDER in the feature bits arrays for these devices ensures that the backend is capable of offering and providing support for this feature, and that it can be disabled if the backend does not support it. Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240710125522.4168043-6-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-21 14:45:56 -04:00
Jonah Palmer	844619147c	virtio: virtqueue_ordered_flush - VIRTIO_F_IN_ORDER support Add VIRTIO_F_IN_ORDER feature support for the virtqueue_flush operation. The goal of the virtqueue_ordered_flush operation when the VIRTIO_F_IN_ORDER feature has been negotiated is to write elements to the used/descriptor ring in-order and then update used_idx. The function iterates through the VirtQueueElement used_elems array in-order starting at vq->used_idx. If the element is valid (filled), the element is written to the used/descriptor ring. This process continues until we find an invalid (not filled) element. For packed VQs, the first entry (at vq->used_idx) is written to the descriptor ring last so the guest doesn't see any invalid descriptors. If any elements were written, the used_idx is updated. Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240710125522.4168043-5-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>	2024-07-21 14:45:56 -04:00
Jonah Palmer	b44135daa3	virtio: virtqueue_ordered_fill - VIRTIO_F_IN_ORDER support Add VIRTIO_F_IN_ORDER feature support for the virtqueue_fill operation. The goal of the virtqueue_ordered_fill operation when the VIRTIO_F_IN_ORDER feature has been negotiated is to search for this now-used element, set its length, and mark the element as filled in the VirtQueue's used_elems array. By marking the element as filled, it will indicate that this element has been processed and is ready to be flushed, so long as the element is in-order. Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240710125522.4168043-4-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-21 14:45:56 -04:00
Jonah Palmer	2256e8482b	virtio: virtqueue_pop - VIRTIO_F_IN_ORDER support Add VIRTIO_F_IN_ORDER feature support in virtqueue_split_pop and virtqueue_packed_pop. VirtQueueElements popped from the available/descritpor ring are added to the VirtQueue's used_elems array in-order and in the same fashion as they would be added the used and descriptor rings, respectively. This will allow us to keep track of the current order, what elements have been written, as well as an element's essential data after being processed. Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240710125522.4168043-3-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-21 14:45:56 -04:00
Zheyu Ma	89cf6574bc	hw/virtio/virtio-crypto: Fix op_code assignment in virtio_crypto_create_asym_session Currently, if the function fails during the key_len check, the op_code does not have a proper value, causing virtio_crypto_free_create_session_req not to free the memory correctly, leading to a memory leak. By setting the op_code before performing any checks, we ensure that virtio_crypto_free_create_session_req has the correct context to perform cleanup operations properly, thus preventing memory leaks. ASAN log: ==3055068==ERROR: LeakSanitizer: detected memory leaks Direct leak of 512 byte(s) in 1 object(s) allocated from: #0 0x5586a75e6ddd in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3 #1 0x7fb6b63b6738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738) #2 0x5586a864bbde in virtio_crypto_handle_ctrl hw/virtio/virtio-crypto.c:407:19 #3 0x5586a94fc84c in virtio_queue_notify_vq hw/virtio/virtio.c:2277:9 #4 0x5586a94fc0a2 in virtio_queue_host_notifier_read hw/virtio/virtio.c:3641:9 Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Message-Id: <20240702211835.3064505-1-zheyuma97@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-21 14:31:59 -04:00
David Hildenbrand	4d13ae45ff	virtio-mem: improve error message when unplug of device fails due to plugged memory The error message is actually expressive, considering QEMU only. But when called from Libvirt, talking about "size" can be confusing, because in Libvirt "size" translates to the memory backend size in QEMU (maximum size) and "current" translates to the QEMU "size" property. Let's simply avoid talking about the "size" property and spell out that some device memory is still plugged. Message-ID: <20240416141426.588544-1-david@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Cc: Liang Cong <lcong@redhat.com> Cc: Mario Casquero <mcasquer@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2024-07-10 18:06:24 +02:00
Eric Auger	956b30b9cf	virtio-iommu: Revert transient enablement of IOMMU MR in bypass mode In `94df5b2180` ("virtio-iommu: Fix 64kB host page size VFIO device assignment"), in case of bypass mode, we transiently enabled the IOMMU MR to allow the set_page_size_mask() to be called and pass information about the page size mask constraint of cold plugged VFIO devices. Now we do not use the IOMMU MR callback anymore, we can just get rid of this hack. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 11:50:37 +02:00
Eric Auger	2457343d05	memory: remove IOMMU MR iommu_set_page_size_mask() callback Everything is now in place to use the Host IOMMU Device callbacks to retrieve the page size mask usable with a given assigned device. This new method brings the advantage to pass the info much earlier to the virtual IOMMU and before the IOMMU MR gets enabled. So let's remove the call to memory_region_iommu_set_page_size_mask in vfio common.c and remove the single implementation of the IOMMU MR callback in the virtio-iommu.c Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 11:50:37 +02:00
Eric Auger	d7c8c95fbc	virtio-iommu : Retrieve page size mask on virtio_iommu_set_iommu_device() Retrieve the Host IOMMU Device page size mask when this latter is set. This allows to get the information much sooner than when relying on IOMMU MR set_page_size_mask() call, whcih happens when the IOMMU MR gets enabled. We introduce check_page_size_mask() helper whose code is inherited from current virtio_iommu_set_page_size_mask() implementation. This callback will be removed in a subsequent patch. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 11:50:37 +02:00
Eric Auger	d59ca1ca17	HostIOMMUDevice : remove Error handle from get_iova_ranges callback The error handle argument is not used anywhere. let's remove it. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 11:50:37 +02:00
Eric Auger	37baedf8e8	virtio-iommu: Fix error handling in virtio_iommu_set_host_iova_ranges() In case no IOMMUPciBus/IOMMUDevice are found we need to properly set the error handle and return. Fixes : Coverity CID 1549006 Signed-off-by: Eric Auger <eric.auger@redhat.com> Fixes: `cf2647a76e` ("virtio-iommu: Compute host reserved regions") Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 11:50:37 +02:00
Cédric Le Goater	1b889d6e39	virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged When a VFIO device is hoplugged in a VM using virtio-iommu, IOMMUPciBus and IOMMUDevice cache entries are created in the .get_address_space() handler of the machine IOMMU device. However, these entries are never destroyed, not even when the VFIO device is detached from the machine. This can lead to an assert if the device is reattached again. When reattached, the .get_address_space() handler reuses an IOMMUDevice entry allocated when the VFIO device was first attached. virtio_iommu_set_host_iova_ranges() is called later on from the .set_iommu_device() handler an fails with an assert on 'probe_done' because the device appears to have been already probed when this is not the case. The IOMMUDevice entry is allocated in pci_device_iommu_address_space() called from under vfio_realize(), the VFIO PCI realize handler. Since pci_device_unset_iommu_device() is called from vfio_exitfn(), a sub function of the PCIDevice unrealize() handler, it seems that the .unset_iommu_device() handler is the best place to release resources allocated at realize time. Clear the IOMMUDevice cache entry there to fix hotplug. Fixes: `817ef10da2` ("virtio-iommu: Implement set\|unset]_iommu_device() callbacks") Signed-off-by: Cédric Le Goater <clg@redhat.com> Message-Id: <20240701101453.203985-1-clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-03 18:14:07 -04:00
Stefano Garzarella	7aa6492401	virtio: remove virtio_tswap16s() call in vring_packed_event_read() Commit `d152cdd6f6` ("virtio: use virtio accessor to access packed event") switched using of address_space_read_cached() to virito_lduw_phys_cached() to access packed descriptor event. When we used address_space_read_cached(), we needed to call virtio_tswap16s() to handle the endianess of the field, but virito_lduw_phys_cached() already handles it internally, so we no longer need to call virtio_tswap16s() (as the commit had done for `off_wrap`, but forgot for `flags`). Fixes: `d152cdd6f6` ("virtio: use virtio accessor to access packed event") Cc: jasowang@redhat.com Cc: qemu-stable@nongnu.org Reported-by: Xoykie <xoykie@gmail.com> Link: https://lore.kernel.org/qemu-devel/CAFU8RB_pjr77zMLsM0Unf9xPNxfr_--Tjr49F_eX32ZBc5o2zQ@mail.gmail.com Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20240701075208.19634-1-sgarzare@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-03 18:14:07 -04:00
BillXiang	7c211eb078	vhost-user: Skip unnecessary duplicated VHOST_USER_SET_LOG_BASE requests The VHOST_USER_SET_LOG_BASE requests should be categorized into non-vring specific messages, and should be sent only once. If send more than once, dpdk will munmap old log_addr which may has been used and cause segmentation fault. Signed-off-by: BillXiang <xiangwencheng@dayudpu.com> Message-Id: <20240613065150.3100-1-xiangwencheng@dayudpu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-01 17:16:05 -04:00
Manos Pitsidianakis	704391f94a	virtio-iommu: add error check before assert A fuzzer case discovered by Zheyu Ma causes an assert failure. Add a check before the assert, and respond with an error before moving on to the next queue element. To reproduce the failure: cat << EOF \| \ qemu-system-x86_64 \ -display none -machine accel=qtest -m 512M -machine q35 -nodefaults \ -device virtio-iommu -qtest stdio outl 0xcf8 0x80000804 outw 0xcfc 0x06 outl 0xcf8 0x80000820 outl 0xcfc 0xe0004000 write 0x10000e 0x1 0x01 write 0xe0004020 0x4 0x00001000 write 0xe0004028 0x4 0x00101000 write 0xe000401c 0x1 0x01 write 0x106000 0x1 0x05 write 0x100001 0x1 0x60 write 0x100002 0x1 0x10 write 0x100009 0x1 0x04 write 0x10000c 0x1 0x01 write 0x100018 0x1 0x04 write 0x10001c 0x1 0x02 write 0x101003 0x1 0x01 write 0xe0007001 0x1 0x00 EOF Reported-by: Zheyu Ma <zheyuma97@gmail.com> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2359 Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Message-Id: <20240613-fuzz-2359-fix-v2-manos.pitsidianakis@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-01 17:16:05 -04:00
Akihiko Odaki	25b8a0f40c	hw/virtio: Free vqs after vhost_dev_cleanup() This fixes LeakSanitizer warnings. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240627-san-v2-7-750bb0946dbd@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-01 17:16:05 -04:00
Jiqian Chen	5d98e18823	virtio-pci: implement No_Soft_Reset bit In current code, when guest does S3, virtio-gpu are reset due to the bit No_Soft_Reset is not set. After resetting, the display resources of virtio-gpu are destroyed, then the display can't come back and only show blank after resuming. Implement No_Soft_Reset bit of PCI_PM_CTRL register, then guest can check this bit, if this bit is set, the devices resetting will not be done, and then the display can work after resuming. No_Soft_Reset bit is implemented for all virtio devices, and was tested only on virtio-gpu device. Set it false by default for safety. Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> Message-Id: <20240606102205.114671-3-Jiqian.Chen@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-01 17:16:05 -04:00
Cindy Lu	a113d041e8	virtio-pci: Fix the failure process in kvm_virtio_pci_vector_use_one() In function kvm_virtio_pci_vector_use_one(), the function will only use the irqfd/vector for itself. Therefore, in the undo label, the failing process is incorrect. To fix this, we can just remove this label. Fixes: `f9a09ca3ea` ("vhost: add support for configure interrupt") Cc: qemu-stable@nongnu.org Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20240528084840.194538-1-lulu@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-01 17:16:04 -04:00
Li Feng	6eaf0e612b	vhost-user: fix lost reconnect again When the vhost-user is reconnecting to the backend, and if the vhost-user fails at the get_features in vhost_dev_init(), then the reconnect will fail and it will not be retriggered forever. The reason is: When the vhost-user fail at get_features, the vhost_dev_cleanup will be called immediately. vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'. The reconnect path is: vhost_user_blk_event vhost_user_async_close(.. vhost_user_blk_disconnect ..) qemu_chr_fe_set_handlers <----- clear the notifier callback schedule vhost_user_async_close_bh The vhost->vdev is null, so the vhost_user_blk_disconnect will not be called, then the event fd callback will not be reinstalled. We need to ensure that even if vhost_dev_init initialization fails, the event handler still needs to be reinstalled when s->connected is false. All vhost-user devices have this issue, including vhost-user-blk/scsi. Fixes: `71e076a07d` ("hw/virtio: generalise CHR_EVENT_CLOSED handling") Signed-off-by: Li Feng <fengli@smartx.com> Message-Id: <20240516025753.130171-3-fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-01 17:16:04 -04:00
Li Feng	9569fe0aac	Revert "vhost-user: fix lost reconnect" This reverts commit `f02a4b8e64`. Since the current patch cannot completely fix the lost reconnect problem, there is a scenario that is not considered: - When the virtio-blk driver is removed from the guest os, s->connected has no chance to be set to false, resulting in subsequent reconnection not being executed. The next patch will completely fix this issue with a better approach. Signed-off-by: Li Feng <fengli@smartx.com> Message-Id: <20240516025753.130171-2-fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael@enfabrica.net> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-01 17:16:04 -04:00
Jiqian Chen	84b58169e4	virtio-pci: only reset pm state during resetting Fix bug imported by `27ce0f3afc` ("fix Power Management Control Register for PCI Express virtio devices" After this change, observe that QEMU may erroneously clear the power status of the device, or may erroneously clear non writable registers, such as NO_SOFT_RESET, etc. Only state of PM_CTRL is writable. Only when flag VIRTIO_PCI_FLAG_INIT_PM is set, need to reset state. Fixes: `27ce0f3afc` ("fix Power Management Control Register for PCI Express virtio devices" Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> Message-Id: <20240515073526.17297-2-Jiqian.Chen@amd.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-01 17:16:04 -04:00
Wafer	33abfea239	hw/virtio: Fix obtain the buffer id from the last descriptor The virtio-1.3 specification <https://docs.oasis-open.org/virtio/virtio/v1.3/virtio-v1.3.html> writes: 2.8.6 Next Flag: Descriptor Chaining Buffer ID is included in the last descriptor in the list. If the feature (_F_INDIRECT_DESC) has been negotiated, install only one descriptor in the virtqueue. Therefor the buffer id should be obtained from the first descriptor. In descriptor chaining scenarios, the buffer id should be obtained from the last descriptor. Fixes: `86044b24e8` ("virtio: basic packed virtqueue support") Signed-off-by: Wafer <wafer@jaguarmicro.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20240510072753.26158-2-wafer@jaguarmicro.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-01 14:56:23 -04:00
Halil Pasic	a0eebd790c	vhost-vsock: add VIRTIO_F_RING_PACKED to feature_bits Not having VIRTIO_F_RING_PACKED in feature_bits[] is a problem when the vhost-vsock device does not offer the feature bit VIRTIO_F_RING_PACKED but the in QEMU device is configured to try to use the packed layout (the virtio property "packed" is on). As of today, the Linux kernel vhost-vsock device does not support the packed queue layout (as vhost does not support packed), and does not offer VIRTIO_F_RING_PACKED. Thus when for example a vhost-vsock-ccw is used with packed=on, VIRTIO_F_RING_PACKED ends up being negotiated, despite the fact that the device does not actually support it, and one gets to keep the pieces. Fixes: `74b3e46630` ("virtio: add property to enable packed virtqueue") Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com> Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Message-Id: <20240429113334.2454197-1-pasic@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-01 14:56:23 -04:00
Jonah Palmer	b937fa8963	vhost/vhost-user: Add VIRTIO_F_NOTIFICATION_DATA to vhost feature bits Add support for the VIRTIO_F_NOTIFICATION_DATA feature across a variety of vhost devices. The inclusion of VIRTIO_F_NOTIFICATION_DATA in the feature bits arrays for these devices ensures that the backend is capable of offering and providing support for this feature, and that it can be disabled if the backend does not support it. Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240315165557.26942-6-jonah.palmer@oracle.com> Acked-by: Srujana Challa <schalla@marvell.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-01 14:56:23 -04:00
Jonah Palmer	54869366be	virtio-mmio: Handle extra notification data Add support to virtio-mmio devices for handling the extra data sent from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA transport feature has been negotiated. The extra data that's passed to the virtio-mmio device when this feature is enabled varies depending on the device's virtqueue layout. The data passed to the virtio-mmio device is in the same format as the data passed to virtio-pci devices. Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240315165557.26942-4-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-01 14:56:23 -04:00
Jonah Palmer	78378f450a	virtio: Prevent creation of device using notification-data with ioeventfd Prevent the realization of a virtio device that attempts to use the VIRTIO_F_NOTIFICATION_DATA transport feature without disabling ioeventfd. Due to ioeventfd not being able to carry the extra data associated with this feature, having both enabled is a functional mismatch and therefore Qemu should not continue the device's realization process. Although the device does not yet know if the feature will be successfully negotiated, many devices using this feature wont actually work without this extra data and would fail FEATURES_OK anyway. If ioeventfd is able to work with the extra notification data in the future, this compatibility check can be removed. Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240315165557.26942-3-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-01 14:56:23 -04:00
Jonah Palmer	cf39b82860	virtio/virtio-pci: Handle extra notification data Add support to virtio-pci devices for handling the extra data sent from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA transport feature has been negotiated. The extra data that's passed to the virtio-pci device when this feature is enabled varies depending on the device's virtqueue layout. In a split virtqueue layout, this data includes: - upper 16 bits: shadow_avail_idx - lower 16 bits: virtqueue index In a packed virtqueue layout, this data includes: - upper 16 bits: 1-bit wrap counter & 15-bit shadow_avail_idx - lower 16 bits: virtqueue index Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240315165557.26942-2-jonah.palmer@oracle.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-01 14:56:23 -04:00
Si-Wei Liu	c5cd7e5f23	vhost: Perform memory section dirty scans once per iteration On setups with one or more virtio-net devices with vhost on, dirty tracking iteration increases cost the bigger the number amount of queues are set up e.g. on idle guests migration the following is observed with virtio-net with vhost=on: 48 queues -> 78.11% [.] vhost_dev_sync_region.isra.13 8 queues -> 40.50% [.] vhost_dev_sync_region.isra.13 1 queue -> 6.89% [.] vhost_dev_sync_region.isra.13 2 devices, 1 queue -> 18.60% [.] vhost_dev_sync_region.isra.14 With high memory rates the symptom is lack of convergence as soon as it has a vhost device with a sufficiently high number of queues, the sufficient number of vhost devices. On every migration iteration (every 100msecs) it will redundantly query the shared log the number of queues configured with vhost that exist in the guest. For the virtqueue data, this is necessary, but not for the memory sections which are the same. So essentially we end up scanning the dirty log too often. To fix that, select a vhost device responsible for scanning the log with regards to memory sections dirty tracking. It is selected when we enable the logger (during migration) and cleared when we disable the logger. If the vhost logger device goes away for some reason, the logger will be re-selected from the rest of vhost devices. After making mem-section logger a singleton instance, constant cost of 7%-9% (like the 1 queue report) will be seen, no matter how many queues or how many vhost devices are configured: 48 queues -> 8.71% [.] vhost_dev_sync_region.isra.13 2 devices, 8 queues -> 7.97% [.] vhost_dev_sync_region.isra.14 Co-developed-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Message-Id: <1710448055-11709-2-git-send-email-si-wei.liu@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2024-07-01 14:56:23 -04:00

1 2 3 4 5 ...

1720 Commits