mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Maciej S. Szmigiero	89b5161527	vfio/migration: Add vfio_save_block_precopy_empty_hit trace event This way it is clearly known when there's no more data to send for that device. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>	2024-11-05 15:51:14 +01:00
Maciej S. Szmigiero	2564e45171	vfio/migration: Add save_{iterate, complete_precopy}_start trace events This way both the start and end points of migrating a particular VFIO device are known. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>	2024-11-05 15:51:14 +01:00
Peter Xu	f018eb62b2	migration: Drop migration_is_setup_or_active() This helper is mostly the same as migration_is_running(), except that one has COLO reported as true, the other has CANCELLING reported as true. Per my past years experience on the state changes, none of them should matter. To make it slightly safer, report both COLO \|\| CANCELLING to be true in migration_is_running(), then drop the other one. We kept the 1st only because the name is simpler, and clear enough. Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20241024213056.1395400-5-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-10-31 15:48:18 -04:00
Alex Williamson	00b519c0bc	vfio/helpers: Align mmaps Thanks to work by Peter Xu, support is introduced in Linux v6.12 to allow pfnmap insertions at PMD and PUD levels of the page table. This means that provided a properly aligned mmap, the vfio driver is able to map MMIO at significantly larger intervals than PAGE_SIZE. For example on x86_64 (the only architecture currently supporting huge pfnmaps for PUD), rather than 4KiB mappings, we can map device MMIO using 2MiB and even 1GiB page table entries. Typically mmap will already provide PMD aligned mappings, so devices with moderately sized MMIO ranges, even GPUs with standard 256MiB BARs, will already take advantage of this support. However in order to better support devices exposing multi-GiB MMIO, such as 3D accelerators or GPUs with resizable BARs enabled, we need to manually align the mmap. There doesn't seem to be a way for userspace to easily learn about PMD and PUD mapping level sizes, therefore this takes the simple approach to align the mapping to the power-of-two size of the region, up to 1GiB, which is currently the maximum alignment we care about. Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com>	2024-10-23 14:46:24 +02:00
Alex Williamson	49915c0d2c	vfio/helpers: Refactor vfio_region_mmap() error handling Move error handling code to the end of the function so that it can more easily be shared by new mmap failure conditions. No functional change intended. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com>	2024-10-23 14:46:24 +02:00
Avihai Horon	fa4e20defe	vfio/migration: Change trace formats from hex to decimal Data sizes in VFIO migration trace events are printed in hex format while in migration core trace events they are printed in decimal format. This inconsistency makes it less readable when using both trace event types. Hence, change the data sizes print format to decimal in VFIO migration trace events. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com>	2024-10-23 14:46:24 +02:00
Avihai Horon	3b5948f808	vfio/migration: Report only stop-copy size in vfio_state_pending_exact() vfio_state_pending_exact() is used to update migration core how much device data is left for the device migration. Currently, the sum of pre-copy and stop-copy sizes of the VFIO device are reported. The pre-copy size is obtained via the VFIO_MIG_GET_PRECOPY_INFO ioctl, which returns the amount of device data available to be transferred while the device is in the PRE_COPY states. The stop-copy size is obtained via the VFIO_DEVICE_FEATURE_MIG_DATA_SIZE ioctl, which returns the total amount of device data left to be transferred in order to complete the device migration. According to the above, current implementation is wrong -- it reports extra overlapping data because pre-copy size is already contained in stop-copy size. Fix it by reporting only stop-copy size. Fixes: `eda7362af9` ("vfio/migration: Add VFIO migration pre-copy support") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com>	2024-10-23 14:46:24 +02:00
Corvin Köhne	8719224166	vfio/igd: correctly calculate stolen memory size for gen 9 and later We have to update the calculation of the stolen memory size because we've seen devices using values of 0xf0 and above for the graphics mode select field. The new calculation was taken from the linux kernel [1]. [1] `7c626ce4ba/arch/x86/kernel/early-quirks.c (L455-L460)` Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>	2024-09-17 10:37:55 +02:00
Corvin Köhne	971ca22f04	vfio/igd: don't set stolen memory size to zero The stolen memory is required for the GOP (EFI) driver and the Windows driver. While the GOP driver seems to work with any stolen memory size, the Windows driver will crash if the size doesn't match the size allocated by the host BIOS. For that reason, it doesn't make sense to overwrite the stolen memory size. It's true that this wastes some VM memory. In the worst case, the stolen memory can take up more than a GB. However, that's uncommon. Additionally, it's likely that a bunch of RAM is assigned to VMs making use of GPU passthrough. Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>	2024-09-17 10:37:55 +02:00
Corvin Köhne	9c86b9fb43	vfio/igd: add ID's for ElkhartLake and TigerLake ElkhartLake and TigerLake devices were tested in legacy mode with Linux and Windows VMs. Both are working properly. It's likely that other Intel GPUs of gen 11 and 12 like IceLake device are working too. However, we're only adding known good devices for now. Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>	2024-09-17 10:37:55 +02:00
Corvin Köhne	11b5ce95be	vfio/igd: add new bar0 quirk to emulate BDSM mirror The BDSM register is mirrored into MMIO space at least for gen 11 and later devices. Unfortunately, the Windows driver reads the register value from MMIO space instead of PCI config space for those devices [1]. Therefore, we either have to keep a 1:1 mapping for the host and guest address or we have to emulate the MMIO register too. Using the igd in legacy mode is already hard due to it's many constraints. Keeping a 1:1 mapping may not work in all cases and makes it even harder to use. An MMIO emulation has to trap the whole MMIO page. This makes accesses to this page slower compared to using second level address translation. Nevertheless, it doesn't have any constraints and I haven't noticed any performance degradation yet making it a better solution. [1] `5c351bee0f/devicemodel/hw/pci/passthrough.c (L650-L653)` Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>	2024-09-17 10:37:55 +02:00
Corvin Köhne	7bafcd1767	vfio/igd: use new BDSM register location and size for gen 11 and later Intel changed the location and size of the BDSM register for gen 11 devices and later. We have to adjust our emulation for these devices to properly support them. Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>	2024-09-17 10:37:55 +02:00
Corvin Köhne	abd9dda914	vfio/igd: support legacy mode for all known generations We're soon going to add support for legacy mode to ElkhartLake and TigerLake devices. Those are gen 11 and 12 devices. At the moment, all devices identified by our igd_gen function do support legacy mode. This won't change when adding our new devices of gen 11 and 12. Therefore, it makes more sense to accept legacy mode for all known devices instead of maintaining a long list of known good generations. If we add a new generation to igd_gen which doesn't support legacy mode for some reason, it'll be easy to advance the check to reject legacy mode for this specific generation. Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>	2024-09-17 10:37:55 +02:00
Corvin Köhne	e433f20897	vfio/igd: return an invalid generation for unknown devices Intel changes it's specification quite often e.g. the location and size of the BDSM register has change for gen 11 devices and later. This causes our emulation to fail on those devices. So, it's impossible for us to use a suitable default value for unknown devices. Instead of returning a random generation value and hoping that everthing works fine, we should verify that different devices are working and add them to our list of known devices. Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>	2024-09-17 10:37:55 +02:00
Peter Maydell	4bd683d6f1	hw/vfio/pci.c: Use correct type in trace_vfio_msix_early_setup() The tracepoint trace_vfio_msix_early_setup() uses "int" for the type of the table_bar argument, but we use this to print a uint32_t. Coverity warns that this means that we could end up treating it as a negative number. We only use this in printing the value in the tracepoint, so mishandling it as a negative number would be harmless, but it's better to use the right type in the tracepoint. Use uint64_t to match how we print the table_offset in the vfio_msix_relo() tracepoint. Resolves: Coverity CID 1547690 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com>	2024-09-17 10:37:55 +02:00
Peter Maydell	e3d0814368	hw: Use device_class_set_legacy_reset() instead of opencoding Use device_class_set_legacy_reset() instead of opencoding an assignment to DeviceClass::reset. This change was produced with: spatch --macro-file scripts/cocci-macro-file.h \ --sp-file scripts/coccinelle/device-reset.cocci \ --keep-comments --smpl-spacing --in-place --dir hw Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240830145812.1967042-8-peter.maydell@linaro.org	2024-09-13 15:31:44 +01:00
Markus Armbruster	64f5e9db77	qapi/vfio: Rename VfioMigrationState to Qapi*, and drop prefix QAPI's 'prefix' feature can make the connection between enumeration type and its constants less than obvious. It's best used with restraint. VfioMigrationState has a 'prefix' that overrides the generated enumeration constants' prefix to QAPI_VFIO_MIGRATION_STATE. We could simply drop 'prefix', but then the enumeration constants would look as if they came from kernel header linux/vfio.h. Rename the type to QapiVfioMigrationState instead, so that 'prefix' is not needed. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240904111836.3273842-20-armbru@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com>	2024-09-10 14:03:45 +02:00
Markus Armbruster	55872c7061	qapi/common: Drop temporary 'prefix' Recent commit "qapi: Smarter camel_to_upper() to reduce need for 'prefix'" added a temporary 'prefix' to delay changing the generated code. Revert it. This improves OffAutoPCIBAR's generated enumeration constant prefix from OFF_AUTOPCIBAR to OFF_AUTO_PCIBAR. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Message-ID: <20240904111836.3273842-5-armbru@redhat.com>	2024-09-10 13:22:47 +02:00
Richard Henderson	dd4bc5f1cf	vfio queue: * IOMMUFD Dirty Tracking support * Fix for a possible SEGV in IOMMU type1 container * Dropped initialization of host IOMMU device with mdev devices -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmafyVUACgkQUaNDx8/7 7KGebRAAzEYxvstDxSPNF+1xx937TKbRpiKYtspTfEgu4Ht50MwO2ZqnVWzTBSwa qcjhDf2avMBpBvkp4O9fR7nXR0HRN2KvYrBSThZ3Qpqu4KjxCAGcHI5uYmgfizYh BBLrw3eWME5Ry220TinQF5KFl50vGq7Z/mku5N5Tgj2qfTfCXYK1Kc19SyAga49n LSokTIjZAGJa4vxrE7THawaEUjFRjfCJey64JUs/TPJaGr4R1snJcWgETww6juUE 9OSw/xl0AoQhaN/ZTRC1qCsBLUI2MVPsC+x+vqVK62HlTjCx+uDRVQ8KzfDzjCeH gaLkMjxJSuJZMpm4UU7DBzDGEGcEBCGeNyFt37BSqqPPpX55CcFhj++d8vqTiwpF YzmTNd/znxcZTw6OJN9sQZohh+NeS86CVZ3x31HD3dXifhRf17jbh7NoIyi+0ZCb N+mytOH5BXsD+ddwbk+yMaxXV43Fgz7ThG5tB1tjhhNtLZHDA5ezFvGZ5F/FJrqE xAbjOhz5MC+RcOVNSzQJCULNqFpfE6Gqeys6btEDm/ltf4LpAe6W1HYuv8BJc19T UsqGK2yKAuQX8GErYxJ1zqZCttVrgpsmXFYTC5iGbxC84mvsF0Iti96IdXz9gfzN Vlb2OxoefcOwVqIhbkvTZW0ZwYGGDDPAYhLMfr5lSuRqj123OOo= =cViP -----END PGP SIGNATURE----- Merge tag 'pull-vfio-20240723-1' of https://github.com/legoater/qemu into staging vfio queue: * IOMMUFD Dirty Tracking support * Fix for a possible SEGV in IOMMU type1 container * Dropped initialization of host IOMMU device with mdev devices # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmafyVUACgkQUaNDx8/7 # 7KGebRAAzEYxvstDxSPNF+1xx937TKbRpiKYtspTfEgu4Ht50MwO2ZqnVWzTBSwa # qcjhDf2avMBpBvkp4O9fR7nXR0HRN2KvYrBSThZ3Qpqu4KjxCAGcHI5uYmgfizYh # BBLrw3eWME5Ry220TinQF5KFl50vGq7Z/mku5N5Tgj2qfTfCXYK1Kc19SyAga49n # LSokTIjZAGJa4vxrE7THawaEUjFRjfCJey64JUs/TPJaGr4R1snJcWgETww6juUE # 9OSw/xl0AoQhaN/ZTRC1qCsBLUI2MVPsC+x+vqVK62HlTjCx+uDRVQ8KzfDzjCeH # gaLkMjxJSuJZMpm4UU7DBzDGEGcEBCGeNyFt37BSqqPPpX55CcFhj++d8vqTiwpF # YzmTNd/znxcZTw6OJN9sQZohh+NeS86CVZ3x31HD3dXifhRf17jbh7NoIyi+0ZCb # N+mytOH5BXsD+ddwbk+yMaxXV43Fgz7ThG5tB1tjhhNtLZHDA5ezFvGZ5F/FJrqE # xAbjOhz5MC+RcOVNSzQJCULNqFpfE6Gqeys6btEDm/ltf4LpAe6W1HYuv8BJc19T # UsqGK2yKAuQX8GErYxJ1zqZCttVrgpsmXFYTC5iGbxC84mvsF0Iti96IdXz9gfzN # Vlb2OxoefcOwVqIhbkvTZW0ZwYGGDDPAYhLMfr5lSuRqj123OOo= # =cViP # -----END PGP SIGNATURE----- # gpg: Signature made Wed 24 Jul 2024 01:16:37 AM AEST # gpg: using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1 # gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: A0F6 6548 F048 95EB FE6B 0B60 51A3 43C7 CFFB ECA1 * tag 'pull-vfio-20240723-1' of https://github.com/legoater/qemu: vfio/common: Allow disabling device dirty page tracking vfio/migration: Don't block migration device dirty tracking is unsupported vfio/iommufd: Implement VFIOIOMMUClass::query_dirty_bitmap support vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking support vfio/iommufd: Probe and request hwpt dirty tracking capability vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during attach_device() vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCaps vfio/{iommufd,container}: Remove caps::aw_bits vfio/iommufd: Introduce auto domain creation vfio/ccw: Don't initialize HOST_IOMMU_DEVICE with mdev vfio/ap: Don't initialize HOST_IOMMU_DEVICE with mdev vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt() backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW capabilities vfio/iommufd: Don't initialize nor set a HOST_IOMMU_DEVICE with mdev vfio/pci: Extract mdev check into an helper hw/vfio/container: Fix SIGSEV on vfio_container_instance_finalize() Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2024-07-24 12:58:46 +10:00
Joao Martins	30b9167785	vfio/common: Allow disabling device dirty page tracking The property 'x-pre-copy-dirty-page-tracking' allows disabling the whole tracking of VF pre-copy phase of dirty page tracking, though it means that it will only be used at the start of the switchover phase. Add an option that disables the VF dirty page tracking, and fall back into container-based dirty page tracking. This also allows to use IOMMU dirty tracking even on VFs with their own dirty tracker scheme. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>	2024-07-23 17:14:53 +02:00
Joao Martins	f48b472450	vfio/migration: Don't block migration device dirty tracking is unsupported By default VFIO migration is set to auto, which will support live migration if the migration capability is set and also dirty page tracking is supported. For testing purposes one can force enable without dirty page tracking via enable-migration=on, but that option is generally left for testing purposes. So starting with IOMMU dirty tracking it can use to accommodate the lack of VF dirty page tracking allowing us to minimize the VF requirements for migration and thus enabling migration by default for those too. While at it change the error messages to mention IOMMU dirty tracking as well. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> [ clg: - spelling in commit log ] Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-07-23 17:14:53 +02:00
Joao Martins	7c30710bd9	vfio/iommufd: Implement VFIOIOMMUClass::query_dirty_bitmap support ioctl(iommufd, IOMMU_HWPT_GET_DIRTY_BITMAP, arg) is the UAPI that fetches the bitmap that tells what was dirty in an IOVA range. A single bitmap is allocated and used across all the hwpts sharing an IOAS which is then used in log_sync() to set Qemu global bitmaps. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>	2024-07-23 17:14:52 +02:00
Joao Martins	52ce88229c	vfio/iommufd: Implement VFIOIOMMUClass::set_dirty_tracking support ioctl(iommufd, IOMMU_HWPT_SET_DIRTY_TRACKING, arg) is the UAPI that enables or disables dirty page tracking. The ioctl is used if the hwpt has been created with dirty tracking supported domain (stored in hwpt::flags) and it is called on the whole list of iommu domains. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>	2024-07-23 17:14:52 +02:00
Joao Martins	dddfd8d667	vfio/iommufd: Probe and request hwpt dirty tracking capability In preparation to using the dirty tracking UAPI, probe whether the IOMMU supports dirty tracking. This is done via the data stored in hiod::caps::hw_caps initialized from GET_HW_INFO. Qemu doesn't know if VF dirty tracking is supported when allocating hardware pagetable in iommufd_cdev_autodomains_get(). This is because VFIODevice migration state hasn't been initialized yet hence it can't pick between VF dirty tracking vs IOMMU dirty tracking. So, if IOMMU supports dirty tracking it always creates HWPTs with IOMMU_HWPT_ALLOC_DIRTY_TRACKING even if later on VFIOMigration decides to use VF dirty tracking instead. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [ clg: - Fixed vbasedev->iommu_dirty_tracking assignment in iommufd_cdev_autodomains_get() - Added warning for heterogeneous dirty page tracking support in iommufd_cdev_autodomains_get() ] Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>	2024-07-23 17:14:52 +02:00
Joao Martins	83a4d596a9	vfio/{iommufd, container}: Invoke HostIOMMUDevice::realize() during attach_device() Move the HostIOMMUDevice::realize() to be invoked during the attach of the device before we allocate IOMMUFD hardware pagetable objects (HWPT). This allows the use of the hw_caps obtained by IOMMU_GET_HW_INFO that essentially tell if the IOMMU behind the device supports dirty tracking. Note: The HostIOMMUDevice data from legacy backend is static and doesn't need any information from the (type1-iommu) backend to be initialized. In contrast however, the IOMMUFD HostIOMMUDevice data requires the iommufd FD to be connected and having a devid to be able to successfully GET_HW_INFO. This means vfio_device_hiod_realize() is called in different places within the backend .attach_device() implementation. Suggested-by: Cédric Le Goater <clg@redhat.cm> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> [ clg: Fixed error handling in iommufd_cdev_attach() ] Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>	2024-07-23 17:14:52 +02:00
Joao Martins	21e8d3a3aa	vfio/iommufd: Add hw_caps field to HostIOMMUDeviceCaps Store the value of @caps returned by iommufd_backend_get_device_info() in a new field HostIOMMUDeviceCaps::hw_caps. Right now the only value is whether device IOMMU supports dirty tracking (IOMMU_HW_CAP_DIRTY_TRACKING). This is in preparation for HostIOMMUDevice::realize() being called early during attach_device(). Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>	2024-07-23 17:14:52 +02:00
Joao Martins	6c63532642	vfio/{iommufd,container}: Remove caps::aw_bits Remove caps::aw_bits which requires the bcontainer::iova_ranges being initialized after device is actually attached. Instead defer that to .get_cap() and call vfio_device_get_aw_bits() directly. This is in preparation for HostIOMMUDevice::realize() being called early during attach_device(). Suggested-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>	2024-07-23 17:14:52 +02:00
Joao Martins	5b1e96e654	vfio/iommufd: Introduce auto domain creation There's generally two modes of operation for IOMMUFD: 1) The simple user API which intends to perform relatively simple things with IOMMUs e.g. DPDK. The process generally creates an IOAS and attaches to VFIO and mainly performs IOAS_MAP and UNMAP. 2) The native IOMMUFD API where you have fine grained control of the IOMMU domain and model it accordingly. This is where most new feature are being steered to. For dirty tracking 2) is required, as it needs to ensure that the stage-2/parent IOMMU domain will only attach devices that support dirty tracking (so far it is all homogeneous in x86, likely not the case for smmuv3). Such invariant on dirty tracking provides a useful guarantee to VMMs that will refuse incompatible device attachments for IOMMU domains. Dirty tracking insurance is enforced via HWPT_ALLOC, which is responsible for creating an IOMMU domain. This is contrast to the 'simple API' where the IOMMU domain is created by IOMMUFD automatically when it attaches to VFIO (usually referred as autodomains) but it has the needed handling for mdevs. To support dirty tracking with the advanced IOMMUFD API, it needs similar logic, where IOMMU domains are created and devices attached to compatible domains. Essentially mimicking kernel iommufd_device_auto_get_domain(). With mdevs given there's no IOMMU domain it falls back to IOAS attach. The auto domain logic allows different IOMMU domains to be created when DMA dirty tracking is not desired (and VF can provide it), and others where it is. Here it is not used in this way given how VFIODevice migration state is initialized after the device attachment. But such mixed mode of IOMMU dirty tracking + device dirty tracking is an improvement that can be added on. Keep the 'all of nothing' of type1 approach that we have been using so far between container vs device dirty tracking. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> [ clg: Added ERRP_GUARD() in iommufd_cdev_autodomains_get() ] Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>	2024-07-23 17:14:52 +02:00
Zhenzhong Duan	8b8705e7f2	vfio/ccw: Don't initialize HOST_IOMMU_DEVICE with mdev mdevs aren't "physical" devices and when asking for backing IOMMU info, it fails the entire provisioning of the guest. Fix that by setting vbasedev->mdev true so skipping HostIOMMUDevice initialization in the presence of mdevs. Fixes: `9305895201` ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Acked-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>	2024-07-23 17:14:52 +02:00
Zhenzhong Duan	c598d65aef	vfio/ap: Don't initialize HOST_IOMMU_DEVICE with mdev mdevs aren't "physical" devices and when asking for backing IOMMU info, it fails the entire provisioning of the guest. Fix that by setting vbasedev->mdev true so skipping HostIOMMUDevice initialization in the presence of mdevs. Fixes: `9305895201` ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>	2024-07-23 17:14:52 +02:00
Joao Martins	b07dcb7d4f	vfio/iommufd: Return errno in iommufd_cdev_attach_ioas_hwpt() In preparation to implement auto domains have the attach function return the errno it got during domain attach instead of a bool. -EINVAL is tracked to track domain incompatibilities, and decide whether to create a new IOMMU domain. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>	2024-07-23 17:14:52 +02:00
Joao Martins	2d1bf25897	backends/iommufd: Extend iommufd_backend_get_device_info() to fetch HW capabilities The helper will be able to fetch vendor agnostic IOMMU capabilities supported both by hardware and software. Right now it is only iommu dirty tracking. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>	2024-07-23 17:14:52 +02:00
Joao Martins	9f17604195	vfio/iommufd: Don't initialize nor set a HOST_IOMMU_DEVICE with mdev mdevs aren't "physical" devices and when asking for backing IOMMU info, it fails the entire provisioning of the guest. Fix that by skipping HostIOMMUDevice initialization in the presence of mdevs, and skip setting an iommu device when it is known to be an mdev. Cc: Zhenzhong Duan <zhenzhong.duan@intel.com> Fixes: `9305895201` ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler") Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>	2024-07-23 17:14:52 +02:00
Joao Martins	13e522f644	vfio/pci: Extract mdev check into an helper In preparation to skip initialization of the HostIOMMUDevice for mdev, extract the checks that validate if a device is an mdev into helpers. A vfio_device_is_mdev() is created, and subsystems consult VFIODevice::mdev to check if it's mdev or not. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>	2024-07-23 17:14:52 +02:00
Eric Auger	07321a6d08	hw/vfio/container: Fix SIGSEV on vfio_container_instance_finalize() In vfio_connect_container's error path, the base container is removed twice form the VFIOAddressSpace QLIST: first on the listener_release_exit label and second, on free_container_exit label, through object_unref(container), which calls vfio_container_instance_finalize(). Let's remove the first instance. Fixes: `938026053f` ("vfio/container: Switch to QOM") Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>	2024-07-23 17:14:52 +02:00
Eric Auger	a6586419a1	hw/vfio/common: Add vfio_listener_region_del_iommu trace event Trace when VFIO gets notified about the deletion of an IOMMU MR. Also trace the name of the region in the add_iommu trace message. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20240716094619.1713905-6-eric.auger@redhat.com> Tested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-22 20:15:42 -04:00
Zhao Liu	eb92e6e3e7	hw/vfio/container: Get rid of qemu_open_old() For qemu_open_old(), osdep.h said: > Don't introduce new usage of this function, prefer the following > qemu_open/qemu_create that take an "Error **errp". So replace qemu_open_old() with qemu_open(). Cc: Alex Williamson <alex.williamson@redhat.com> Cc: "Cédric Le Goater" <clg@redhat.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-07-17 14:04:15 +03:00
Zhenzhong Duan	83d9019202	vfio/display: Fix vfio_display_edid_init() error path vfio_display_edid_init() can fail for many reasons and return silently. It would be good to report the error. Old mdev driver may not support vfio edid region and we allow to go through in this case. vfio_display_edid_update() isn't changed because it can be called at runtime when UI changes (i.e. window resize). Fixes: `08479114b0` ("vfio/display: add edid support.") Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>	2024-07-09 11:50:37 +02:00
Zhenzhong Duan	f15da599a1	vfio/display: Fix potential memleak of edid info EDID related device region info is leaked in vfio_display_edid_init() error path and VFIODisplay destroying path. Fixes: `08479114b0` ("vfio/display: add edid support.") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>	2024-07-09 11:50:37 +02:00
Eric Auger	2457343d05	memory: remove IOMMU MR iommu_set_page_size_mask() callback Everything is now in place to use the Host IOMMU Device callbacks to retrieve the page size mask usable with a given assigned device. This new method brings the advantage to pass the info much earlier to the virtual IOMMU and before the IOMMU MR gets enabled. So let's remove the call to memory_region_iommu_set_page_size_mask in vfio common.c and remove the single implementation of the IOMMU MR callback in the virtio-iommu.c Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 11:50:37 +02:00
Eric Auger	8fe0ebe15d	HostIOMMUDevice: Introduce get_page_size_mask() callback This callback will be used to retrieve the page size mask supported along a given Host IOMMU device. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 11:50:37 +02:00
Eric Auger	d59ca1ca17	HostIOMMUDevice : remove Error handle from get_iova_ranges callback The error handle argument is not used anywhere. let's remove it. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 11:50:37 +02:00
Eric Auger	3966bca539	vfio-container-base: Introduce vfio_container_get_iova_ranges() helper Introduce vfio_container_get_iova_ranges() to retrieve the usable IOVA regions of the base container and use it in the Host IOMMU device implementations of get_iova_ranges() callback. We also fix a UAF bug as the list was shallow copied while g_list_free_full() was used both on the single call site, in virtio_iommu_set_iommu_device() but also in vfio_container_instance_finalize(). Instead use g_list_copy_deep. Fixes: `cf2647a76e` ("virtio-iommu: Compute host reserved regions") Signed-off-by: Eric Auger <eric.auger@redhat.com> Suggested-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>	2024-07-09 11:50:37 +02:00
Thomas Huth	f0936cbc1d	Remove inclusion of hw/hw.h from files that don't need it hw/hw.h only contains the prototype of hw_error() nowadays, so files that don't use this function don't need to include this header. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-ID: <20240701132649.58345-1-thuth@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2024-07-02 06:58:48 +02:00
Cédric Le Goater	96b7af4388	vfio/container: Move vfio_container_destroy() to an instance_finalize() handler vfio_container_destroy() clears the resources allocated VFIOContainerBase object. Now that VFIOContainerBase is a QOM object, add an instance_finalize() handler to do the cleanup. It will be called through object_unref(). Suggested-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-06-24 23:15:31 +02:00
Cédric Le Goater	b052f73cbe	vfio/container: Introduce vfio_iommu_legacy_instance_init() Just as we did for the VFIOContainerBase object, introduce an instance_init() handler for the legacy VFIOContainer object and do the specific initialization there. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-06-24 23:15:31 +02:00
Cédric Le Goater	2f7243cb8a	vfio/container: Remove vfio_container_init() It's now empty. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-06-24 23:15:31 +02:00
Cédric Le Goater	41d698b8d6	vfio/container: Remove VFIOContainerBase::ops Instead, use VFIO_IOMMU_GET_CLASS() to get the class pointer. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-06-24 23:15:31 +02:00
Cédric Le Goater	2137d2fd17	vfio/container: Introduce an instance_init() handler This allows us to move the initialization code from vfio_container_init(), which we will soon remove. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-06-24 23:15:31 +02:00
Cédric Le Goater	938026053f	vfio/container: Switch to QOM Instead of allocating the container struct, create a QOM object of the appropriate type. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-06-24 23:15:31 +02:00

1 2 3 4 5 ...

794 Commits