mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Zhenzhong Duan	0d3e89bea8	vfio: Use g_autofree in all call site of vfio_get_region_info() There are some exceptions when pointer to vfio_region_info is reused. In that case, the pointed memory is freed manually. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:22 +02:00
Zhenzhong Duan	0a0bda0acd	vfio/pci-quirks: Make vfio_add_*_cap() return bool This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Include below functions: vfio_add_virt_caps() vfio_add_nv_gpudirect_cap() vfio_add_vmd_shadow_cap() Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:22 +02:00
Zhenzhong Duan	d3c6a18bc7	vfio/pci-quirks: Make vfio_pci_igd_opregion_init() return bool This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:21 +02:00
Zhenzhong Duan	514855e18f	vfio/pci: Use g_autofree for vfio_region_info pointer Pointer opregion is freed after vfio_pci_igd_opregion_init(). Use 'g_autofree' to avoid the g_free() calls. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:21 +02:00
Zhenzhong Duan	b771a40f9e	vfio/pci: Make capability related functions return bool The functions operating on capability don't have a consistent return style. Below functions are in bool-valued functions style: vfio_msi_setup() vfio_msix_setup() vfio_add_std_cap() vfio_add_capabilities() Below two are integer-valued functions: vfio_add_vendor_specific_cap() vfio_setup_pcie_cap() But the returned integer is only used for check succeed/failure. Change them all to return bool so now all capability related functions follow the coding standand in qapi/error.h to return bool. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:21 +02:00
Zhenzhong Duan	64410a741d	vfio/pci: Make vfio_populate_vga() return bool This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:21 +02:00
Zhenzhong Duan	c32bab074e	vfio/pci: Make vfio_intx_enable() return bool This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:21 +02:00
Zhenzhong Duan	e942d8f08d	vfio/pci: Make vfio_populate_device() return a bool Since vfio_populate_device() takes an 'Error **' argument, best practices suggest to return a bool. See the qapi/error.h Rules section. By this chance, pass errp directly to vfio_populate_device() to avoid calling error_propagate(). Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:21 +02:00
Zhenzhong Duan	713b59a674	vfio/pci: Make vfio_pci_relocate_msix() and vfio_msix_early_setup() return a bool Since vfio_pci_relocate_msix() and vfio_msix_early_setup() takes an 'Error **' argument, best practices suggest to return a bool. See the qapi/error.h Rules section. By this chance, pass errp directly to vfio_msix_early_setup() to avoid calling error_propagate(). Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:21 +02:00
Zhenzhong Duan	44cd660a99	vfio/pci: Make vfio_intx_enable_kvm() return a bool Since vfio_intx_enable_kvm() takes an 'Error **' argument, best practices suggest to return a bool. See the qapi/error.h Rules section. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:21 +02:00
Zhenzhong Duan	c6c6cf91c0	vfio/helpers: Make vfio_device_get_name() return bool This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:21 +02:00
Zhenzhong Duan	84e37d0296	vfio/helpers: Make vfio_set_irq_signaling() return bool This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:21 +02:00
Zhenzhong Duan	455c009dc4	vfio/display: Make vfio_display_*() return bool This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:21 +02:00
Zhenzhong Duan	b77548355a	vfio: Make VFIOIOMMUClass::attach_device() and its wrapper return bool Make VFIOIOMMUClass::attach_device() and its wrapper function vfio_attach_device() return bool. This is to follow the coding standand to return bool if 'Error **' is used to pass error. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-16 16:59:20 +02:00
Zhenzhong Duan	81987bd58b	vfio/pci: Use g_autofree in vfio_realize Local pointer name is allocated before vfio_attach_device() call and freed after the call. Same for tmp when calling realpath(). Use 'g_autofree' to avoid the g_free() calls. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-16 16:59:20 +02:00
Avihai Horon	5e1f8905ca	vfio/migration: Emit VFIO migration QAPI event Emit VFIO migration QAPI event when a VFIO device changes its migration state. This can be used by management applications to get updates on the current state of the VFIO device for their own purposes. A new per VFIO device capability, "migration-events", is added so events can be enabled only for the required devices. It is disabled by default. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-16 16:59:19 +02:00
Vinayak Kale	187716feeb	vfio/pci: migration: Skip config space check for Vendor Specific Information in VSC during restore/load In case of migration, during restore operation, qemu checks config space of the pci device with the config space in the migration stream captured during save operation. In case of config space data mismatch, restore operation is failed. config space check is done in function get_pci_config_device(). By default VSC (vendor-specific-capability) in config space is checked. Due to qemu's config space check for VSC, live migration is broken across NVIDIA vGPU devices in situation where source and destination host driver is different. In this situation, Vendor Specific Information in VSC varies on the destination to ensure vGPU feature capabilities exposed to the guest driver are compatible with destination host. If a vfio-pci device is migration capable and vfio-pci vendor driver is OK with volatile Vendor Specific Info in VSC then qemu should exempt config space check for Vendor Specific Info. It is vendor driver's responsibility to ensure that VSC is consistent across migration. Here consistency could mean that VSC format should be same on source and destination, however actual Vendor Specific Info may not be byte-to-byte identical. This patch skips the check for Vendor Specific Information in VSC for VFIO-PCI device by clearing pdev->cmask[] offsets. Config space check is still enforced for 3 byte VSC header. If cmask[] is not set for an offset, then qemu skips config space check for that offset. VSC check is skipped for machine types >= 9.1. The check would be enforced on older machine types (<= 9.0). Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Cédric Le Goater <clg@redhat.com> Signed-off-by: Vinayak Kale <vkale@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-16 16:59:19 +02:00
Cédric Le Goater	3783f814e7	vfio/migration: Add Error** argument to .vfio_save_config() handler Use vmstate_save_state_with_err() to improve error reporting in the callers and store a reported error under the migration stream. Add documentation while at it. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-16 16:59:19 +02:00
Zhao Liu	cf8afdfad4	hw/vfio/pci: Fix missing ERRP_GUARD() for error_prepend() As the comment in qapi/error, passing @errp to error_prepend() requires ERRP_GUARD(): * = Why, when and how to use ERRP_GUARD() = * * Without ERRP_GUARD(), use of the @errp parameter is restricted: ... * - It should not be passed to error_prepend(), error_vprepend() or * error_append_hint(), because that doesn't work with &error_fatal. * ERRP_GUARD() lifts these restrictions. * * To use ERRP_GUARD(), add it right at the beginning of the function. * @errp can then be used without worrying about the argument being * NULL or &error_fatal. ERRP_GUARD() could avoid the case when @errp is &error_fatal, the user can't see this additional information, because exit() happens in error_setg earlier than information is added [1]. In hw/vfio/pci.c, there are 2 functions passing @errp to error_prepend() without ERRP_GUARD(): - vfio_add_std_cap() - vfio_realize() The @errp of vfio_add_std_cap() is also from vfio_realize(). And vfio_realize(), as a PCIDeviceClass.realize method, its @errp is from DeviceClass.realize so that there is no guarantee that the @errp won't point to @error_fatal. To avoid the issue like [1] said, add missing ERRP_GUARD() at their beginning. [1]: Issue description in the commit message of commit `ae7c80a7bd` ("error: New macro ERRP_GUARD()"). Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Message-ID: <20240311033822.3142585-24-zhao1.liu@linux.intel.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2024-03-12 11:45:45 +01:00
Frediano Ziglio	75d5a5fe67	hw/vfio/pci.c: Make some structure static Not used outside C module. Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-03-09 18:56:36 +03:00
Cédric Le Goater	d2b668fca5	vfio/pci: Clear MSI-X IRQ index always When doing device assignment of a physical device, MSI-X can be enabled with no vectors enabled and this sets the IRQ index to VFIO_PCI_MSIX_IRQ_INDEX. However, when MSI-X is disabled, the IRQ index is left untouched if no vectors are in use. Then, when INTx is enabled, the IRQ index value is considered incompatible (set to MSI-X) and VFIO_DEVICE_SET_IRQS fails. QEMU complains with : qemu-system-x86_64: vfio 0000:08:00.0: Failed to set up TRIGGER eventfd signaling for interrupt INTX-0: VFIO_DEVICE_SET_IRQS failure: Invalid argument To avoid that, unconditionaly clear the IRQ index when MSI-X is disabled. Buglink: https://issues.redhat.com/browse/RHEL-21293 Fixes: `5ebffa4e87` ("vfio/pci: use an invalid fd to enable MSI-X") Cc: Jing Liu <jing2.liu@intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-01-29 08:26:25 +01:00
Cédric Le Goater	fdaa774e67	vfio/container: Introduce a VFIOIOMMU QOM interface VFIOContainerBase was not introduced as an abstract QOM object because it felt unnecessary to expose all the IOMMU backends to the QEMU machine and human interface. However, we can still abstract the IOMMU backend handlers using a QOM interface class. This provides more flexibility when referencing the various implementations. Simply transform the VFIOIOMMUOps struct in an InterfaceClass and do some initial name replacements. Next changes will start converting VFIOIOMMUOps. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-01-05 21:25:20 +01:00
Richard Henderson	65bd53e8d5	hw/vfio: Constify VMState Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Message-Id: <20231221031652.119827-60-richard.henderson@linaro.org>	2023-12-30 07:38:06 +11:00
Zhenzhong Duan	6106a32914	vfio: Introduce a helper function to initialize VFIODevice Introduce a helper function to replace the common code to initialize VFIODevice in pci, platform, ap and ccw VFIO device. No functional change intended. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-12-19 19:03:38 +01:00
Zhenzhong Duan	dd2fcb1716	vfio/pci: Move VFIODevice initializations in vfio_instance_init Some of the VFIODevice initializations is in vfio_realize, move all of them in vfio_instance_init. No functional change intended. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-12-19 19:03:38 +01:00
Zhenzhong Duan	da3e04b26f	vfio/pci: Make vfio cdev pre-openable by passing a file handle This gives management tools like libvirt a chance to open the vfio cdev with privilege and pass FD to qemu. This way qemu never needs to have privilege to open a VFIO or iommu cdev node. Together with the earlier support of pre-opening /dev/iommu device, now we have full support of passing a vfio device to unprivileged qemu by management tool. This mode is no more considered for the legacy backend. So let's remove the "TODO" comment. Add helper functions vfio_device_set_fd() and vfio_device_get_name() to set fd and get device name, they will also be used by other vfio devices. There is no easy way to check if a device is mdev with FD passing, so fail the x-balloon-allowed check unconditionally in this case. There is also no easy way to get BDF as name with FD passing, so we fake a name by VFIO_FD[fd]. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-12-19 19:03:38 +01:00
Eric Auger	ee42b261b0	vfio/pci: Allow the selection of a given iommu backend Now we support two types of iommu backends, let's add the capability to select one of them. This depends on whether an iommufd object has been linked with the vfio-pci device: If the user wants to use the legacy backend, it shall not link the vfio-pci device with any iommufd object: -device vfio-pci,host=0000:02:00.0 This is called the legacy mode/backend. If the user wants to use the iommufd backend (/dev/iommu) it shall pass an iommufd object id in the vfio-pci device options: -object iommufd,id=iommufd0 -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0 Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-12-19 19:03:38 +01:00
Zhenzhong Duan	c328e7e8ad	vfio/pci: Introduce a vfio pci hot reset interface Legacy vfio pci and iommufd cdev have different process to hot reset vfio device, expand current code to abstract out pci_hot_reset callback for legacy vfio, this same interface will also be used by iommufd cdev vfio device. Rename vfio_pci_hot_reset to vfio_legacy_pci_hot_reset and move it into container.c. vfio_pci_[pre/post]_reset and vfio_pci_host_match are exported so they could be called in legacy and iommufd pci_hot_reset callback. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-12-19 19:03:38 +01:00
Zhenzhong Duan	4d36ec23a7	vfio/pci: Extract out a helper vfio_pci_get_pci_hot_reset_info This helper will be used by both legacy and iommufd backends. No functional changes intended. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-12-19 19:03:38 +01:00
Cédric Le Goater	f8d6f3b16c	vfio/pci: Fix buffer overrun when writing the VF token qemu_uuid_unparse() includes a trailing NUL when writing the uuid string and the buffer size should be UUID_FMT_LEN + 1 bytes. Use the recently added UUID_STR_LEN which defines the correct size. Fixes: CID 1522913 Fixes: `2dca1b37a7` ("vfio/pci: add support for VF token") Cc: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: "Denis V. Lunev" <den@openvz.org> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-11-03 09:20:31 +01:00
Marc-André Lureau	8741781157	hw/vfio: add ramfb migration support Add a "VFIODisplay" subsection whenever "x-ramfb-migrate" is turned on. Turn it off by default on machines <= 8.1 for compatibility reasons. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Acked-by: Gerd Hoffmann <kraxel@redhat.com> [ clg: - checkpatch fixes - improved warn_report() in vfio_realize() ] Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-10-18 10:10:49 +02:00
Eric Auger	c0f527f4cc	vfio/pci: Remove vfio_detach_device from vfio_realize error path In vfio_realize, on the error path, we currently call vfio_detach_device() after a successful vfio_attach_device. While this looks natural, vfio_instance_finalize also induces a vfio_detach_device(), and it seems to be the right place instead as other resources are released there which happen to be a prerequisite to a successful UNSET_CONTAINER. So let's rely on the finalize vfio_detach_device call to free all the relevant resources. Fixes: a28e06621170 ("vfio/pci: Introduce vfio_[attach/detach]_device") Reported-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Eric Auger <eric.auger@redhat.com> Tested-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-10-18 10:10:49 +02:00
Eric Auger	5456b1867d	vfio/pci: Introduce vfio_[attach/detach]_device We want the VFIO devices to be able to use two different IOMMU backends, the legacy VFIO one and the new iommufd one. Introduce vfio_[attach/detach]_device which aim at hiding the underlying IOMMU backend (IOCTLs, datatypes, ...). Once vfio_attach_device completes, the device is attached to a security context and its fd can be used. Conversely When vfio_detach_device completes, the device has been detached from the security context. At the moment only the implementation based on the legacy container/group exists. Let's use it from the vfio-pci device. Subsequent patches will handle other devices. We also take benefit of this patch to properly free vbasedev->name on failure. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-10-18 10:10:49 +02:00
Jing Liu	eaadba6f9b	vfio/pci: enable MSI-X in interrupt restoring on dynamic allocation During migration restoring, vfio_enable_vectors() is called to restore enabling MSI-X interrupts for assigned devices. It sets the range from 0 to nr_vectors to kernel to enable MSI-X and the vectors unmasked in guest. During the MSI-X enabling, all the vectors within the range are allocated according to the VFIO_DEVICE_SET_IRQS ioctl. When dynamic MSI-X allocation is supported, we only want the guest unmasked vectors being allocated and enabled. Use vector 0 with an invalid fd to get MSI-X enabled, after that, all the vectors can be allocated in need. Signed-off-by: Jing Liu <jing2.liu@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-10-05 22:04:51 +02:00
Jing Liu	5ebffa4e87	vfio/pci: use an invalid fd to enable MSI-X Guests typically enable MSI-X with all of the vectors masked in the MSI-X vector table. To match the guest state of device, QEMU enables MSI-X by enabling vector 0 with userspace triggering and immediately release. However the release function actually does not release it due to already using userspace mode. It is no need to enable triggering on host and rely on the mask bit to avoid spurious interrupts. Use an invalid fd (i.e. fd = -1) is enough to get MSI-X enabled. After dynamic MSI-X allocation is supported, the interrupt restoring also need use such way to enable MSI-X, therefore, create a function for that. Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-10-05 22:04:51 +02:00
Jing Liu	d9e6710d7d	vfio/pci: enable vector on dynamic MSI-X allocation The vector_use callback is used to enable vector that is unmasked in guest. The kernel used to only support static MSI-X allocation. When allocating a new interrupt using "static MSI-X allocation" kernels, QEMU first disables all previously allocated vectors and then re-allocates all including the new one. The nr_vectors of VFIOPCIDevice indicates that all vectors from 0 to nr_vectors are allocated (and may be enabled), which is used to loop all the possibly used vectors when e.g., disabling MSI-X interrupts. Extend the vector_use function to support dynamic MSI-X allocation when host supports the capability. QEMU therefore can individually allocate and enable a new interrupt without affecting others or causing interrupts lost during runtime. Utilize nr_vectors to calculate the upper bound of enabled vectors in dynamic MSI-X allocation mode since looping all msix_entries_nr is not efficient and unnecessary. Signed-off-by: Jing Liu <jing2.liu@intel.com> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-10-05 22:04:51 +02:00
Jing Liu	45d85f6228	vfio/pci: detect the support of dynamic MSI-X allocation Kernel provides the guidance of dynamic MSI-X allocation support of passthrough device, by clearing the VFIO_IRQ_INFO_NORESIZE flag to guide user space. Fetch the flags from host to determine if dynamic MSI-X allocation is supported. Originally-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-10-05 22:04:51 +02:00
Zhenzhong Duan	c06327c9db	vfio/pci: rename vfio_put_device to vfio_pci_put_device vfio_put_device() is a VFIO PCI specific function, rename it with 'vfio_pci' prefix to avoid confusing. No functional change. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-10-05 22:04:51 +02:00
Cédric Le Goater	44fa20c928	spapr: Remove support for NVIDIA V100 GPU with NVLink2 NVLink2 support was removed from the PPC PowerNV platform and VFIO in Linux 5.13 with commits : 562d1e207d32 ("powerpc/powernv: remove the nvlink support") b392a1989170 ("vfio/pci: remove vfio_pci_nvlink2") This was 2.5 years ago. Do the same in QEMU with a revert of commit `ec132efaa8` ("spapr: Support NVIDIA V100 GPU with NVLink2"). Some adjustements are required on the NUMA part. Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Message-ID: <20230918091717.149950-1-clg@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2023-09-18 07:25:28 -03:00
Alex Williamson	c00aac6f14	vfio/pci: Enable AtomicOps completers on root ports Dynamically enable Atomic Ops completer support around realize/exit of vfio-pci devices reporting host support for these accesses and adhering to a minimal configuration standard. While the Atomic Ops completer bits in the root port device capabilities2 register are read-only, the PCIe spec does allow RO bits to change to reflect hardware state. We take advantage of that here around the realize and exit functions of the vfio-pci device. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Robin Voetter <robin@streamhpc.com> Tested-by: Robin Voetter <robin@streamhpc.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-07-10 09:52:52 +02:00
Avihai Horon	8af87a3ec7	vfio: Fix null pointer dereference bug in vfio_bars_finalize() vfio_realize() has the following flow: 1. vfio_bars_prepare() -- sets VFIOBAR->size. 2. msix_early_setup(). 3. vfio_bars_register() -- allocates VFIOBAR->mr. After vfio_bars_prepare() is called msix_early_setup() can fail. If it does fail, vfio_bars_register() is never called and VFIOBAR->mr is not allocated. In this case, vfio_bars_finalize() is called as part of the error flow to free the bars' resources. However, vfio_bars_finalize() calls object_unparent() for VFIOBAR->mr after checking only VFIOBAR->size, and thus we get a null pointer dereference. Fix it by checking VFIOBAR->mr in vfio_bars_finalize(). Fixes: `89d5202edc` ("vfio/pci: Allow relocating MSI-X MMIO") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-07-10 09:52:52 +02:00
Zhenzhong Duan	d4a2af747d	vfio/migration: Return bool type for vfio_migration_realize() Make vfio_migration_realize() adhere to the convention of other realize() callbacks(like qdev_realize) by returning bool instead of int. Suggested-by: Cédric Le Goater <clg@redhat.com> Suggested-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-07-10 09:52:52 +02:00
Zhenzhong Duan	0520d63c77	vfio/migration: Remove print of "Migration disabled" Property enable_migration supports [on/off/auto]. In ON mode, error pointer is passed to errp and logged. In OFF mode, we doesn't need to log "Migration disabled" as it's intentional. In AUTO mode, we should only ever see errors or warnings if the device supports migration and an error or incompatibility occurs while further probing or configuring it. Lack of support for migration shoundn't generate an error or warning. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-07-10 09:52:52 +02:00
Zhenzhong Duan	2b43b2995b	vfio/migration: Free resources when vfio_migration_realize fails When vfio_realize() succeeds, hot unplug will call vfio_exitfn() to free resources allocated in vfio_realize(); when vfio_realize() fails, vfio_exitfn() is never called and we need to free resources in vfio_realize(). In the case that vfio_migration_realize() fails, e.g: with -only-migratable & enable-migration=off, we see below: (qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off 0000:81:11.1: Migration disabled Error: disallowing migration blocker (--only-migratable) for: 0000:81:11.1: Migration is disabled for VFIO device If we hotplug again we should see same log as above, but we see: (qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,enable-migration=off Error: vfio 0000:81:11.1: device is already attached That's because some references to VFIO device isn't released. For resources allocated in vfio_migration_realize(), free them by jumping to out_deinit path with calling a new function vfio_migration_deinit(). For resources allocated in vfio_realize(), free them by jumping to de-register path in vfio_realize(). Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Fixes: `a22651053b` ("vfio: Make vfio-pci device migration capable") Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-07-10 09:52:52 +02:00
Zhenzhong Duan	3c26c80a0a	vfio/migration: Change vIOMMU blocker from global to per device Contrary to multiple device blocker which needs to consider already-attached devices to unblock/block dynamically, the vIOMMU migration blocker is a device specific config. Meaning it only needs to know whether the device is bypassing or not the vIOMMU (via machine property, or per pxb-pcie::bypass_iommu), and does not need the state of currently present devices. For this reason, the vIOMMU global migration blocker can be consolidated into the per-device migration blocker, allowing us to remove some unnecessary code. This change also makes vfio_mig_active() more accurate as it doesn't check for global blocker. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-07-10 09:52:52 +02:00
Zhenzhong Duan	adee0da036	vfio/pci: Disable INTx in vfio_realize error path When vfio realize fails, INTx isn't disabled if it has been enabled. This may confuse host side with unhandled interrupt report. Fixes: `c5478fea27` ("vfio/pci: Respond to KVM irqchip change notifier") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-07-10 09:52:52 +02:00
Zhenzhong Duan	0cc889c882	vfio/pci: Free leaked timer in vfio_realize error path When vfio_realize fails, the mmap_timer used for INTx optimization isn't freed. As this timer isn't activated yet, the potential impact is just a piece of leaked memory. Fixes: `ea486926b0` ("vfio-pci: Update slow path INTx algorithm timer related") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-06-30 06:02:51 +02:00
Zhenzhong Duan	357bd7932a	vfio/pci: Fix a segfault in vfio_realize The kvm irqchip notifier is only registered if the device supports INTx, however it's unconditionally removed in vfio realize error path. If the assigned device does not support INTx, this will cause QEMU to crash when vfio realize fails. Change it to conditionally remove the notifier only if the notify hook is setup. Before fix: (qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,xres=1 Connection closed by foreign host. After fix: (qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,xres=1 Error: vfio 0000:81:11.1: xres and yres properties require display=on (qemu) Fixes: `c5478fea27` ("vfio/pci: Respond to KVM irqchip change notifier") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-06-30 06:02:51 +02:00
Avihai Horon	8bbcb64a71	vfio/migration: Make VFIO migration non-experimental The major parts of VFIO migration are supported today in QEMU. This includes basic VFIO migration, device dirty page tracking and precopy support. Thus, at this point in time, it seems appropriate to make VFIO migration non-experimental: remove the x prefix from enable_migration property, change it to ON_OFF_AUTO and let the default value be AUTO. In addition, make the following adjustments: 1. When enable_migration is ON and migration is not supported, fail VFIO device realization. 2. When enable_migration is AUTO (i.e., not explicitly enabled), require device dirty tracking support. This is because device dirty tracking is currently the only method to do dirty page tracking, which is essential for migrating in a reasonable downtime. Setting enable_migration to ON will not require device dirty tracking. 3. Make migration error and blocker messages more elaborate. 4. Remove error prints in vfio_migration_query_flags(). 5. Rename trace_vfio_migration_probe() to trace_vfio_migration_realize(). Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-06-30 06:02:51 +02:00
Shameer Kolothum	c174088923	vfio/pci: Call vfio_prepare_kvm_msi_virq_batch() in MSI retry path When vfio_enable_vectors() returns with less than requested nr_vectors we retry with what kernel reported back. But the retry path doesn't call vfio_prepare_kvm_msi_virq_batch() and this results in, qemu-system-aarch64: vfio: Error: Failed to enable 4 MSI vectors, retry with 1 qemu-system-aarch64: ../hw/vfio/pci.c:602: vfio_commit_kvm_msi_virq_batch: Assertion `vdev->defer_kvm_irq_routing' failed Fixes: `dc580d51f7` ("vfio: defer to commit kvm irq routing when enable msi/msix") Reviewed-by: Longpeng <longpeng2@huawei.com> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-06-30 06:02:51 +02:00

1 2 3 4 5

243 Commits