mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Philippe Mathieu-Daudé	4c8a2054e7	hw/vfio/ccw: Simplify using DEVICE() macro QOM parenthood relationship is: VFIOCCWDevice -> S390CCWDevice -> CcwDevice -> DeviceState We can directly use the QOM DEVICE() macro to get the parent object. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20230213170145.45666-3-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-02-27 09:15:38 +01:00
Avihai Horon	48e4d8289f	vfio: Alphabetize migration section of VFIO trace-events file Sort the migration section of VFIO trace events file alphabetically and move two misplaced traces to common.c section. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-11-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	7429aebe1c	vfio/migration: Remove VFIO migration protocol v1 Now that v2 protocol implementation has been added, remove the deprecated v1 implementation. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-10-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	31bcbbb5be	vfio/migration: Implement VFIO migration protocol v2 Implement the basic mandatory part of VFIO migration protocol v2. This includes all functionality that is necessary to support VFIO_MIGRATION_STOP_COPY part of the v2 protocol. The two protocols, v1 and v2, will co-exist and in the following patches v1 protocol code will be removed. There are several main differences between v1 and v2 protocols: - VFIO device state is now represented as a finite state machine instead of a bitmap. - Migration interface with kernel is now done using VFIO_DEVICE_FEATURE ioctl and normal read() and write() instead of the migration region. - Pre-copy is made optional in v2 protocol. Support for pre-copy will be added later on. Detailed information about VFIO migration protocol v2 and its difference compared to v1 protocol can be found here [1]. [1] https://lore.kernel.org/all/20220224142024.147653-10-yishaih@nvidia.com/ Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Juan Quintela <quintela@redhat.com>. Link: https://lore.kernel.org/r/20230216143630.25610-9-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	6eeb290910	vfio/migration: Rename functions/structs related to v1 protocol To avoid name collisions, rename functions and structs related to VFIO migration protocol v1. This will allow the two protocols to co-exist when v2 protocol is added, until v1 is removed. No functional changes intended. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-8-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	16fe4e8ab7	vfio/migration: Move migration v1 logic to vfio_migration_init() Move vfio_dev_get_region_info() logic from vfio_migration_probe() to vfio_migration_init(). This logic is specific to v1 protocol and moving it will make it easier to add the v2 protocol implementation later. No functional changes intended. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-7-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	29d81b71aa	vfio/migration: Block multiple devices migration Currently VFIO migration doesn't implement some kind of intermediate quiescent state in which P2P DMAs are quiesced before stopping or running the device. This can cause problems in multi-device migration where the devices are doing P2P DMAs, since the devices are not stopped together at the same time. Until such support is added, block migration of multiple devices. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-6-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	8b942af393	vfio/common: Change vfio_devices_all_running_and_saving() logic to equivalent one vfio_devices_all_running_and_saving() is used to check if migration is in pre-copy phase. This is done by checking if migration is in setup or active states and if all VFIO devices are in pre-copy state, i.e. _SAVING \| _RUNNING. In VFIO migration protocol v2 pre-copy support is made optional. Hence, a matching v2 protocol pre-copy state can't be used here. As preparation for adding v2 protocol, change vfio_devices_all_running_and_saving() logic such that it doesn't use the VFIO pre-copy state. The new equivalent logic checks if migration is in active state and if all VFIO devices are in running state [1]. No functional changes intended. [1] Note that checking if migration is in setup or active states and if all VFIO devices are in running state doesn't guarantee that we are in pre-copy phase, thus we check if migration is only in active state. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-5-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	b051a3f640	vfio/migration: Allow migration without VFIO IOMMU dirty tracking support Currently, if IOMMU of a VFIO container doesn't support dirty page tracking, migration is blocked. This is because a DMA-able VFIO device can dirty RAM pages without updating QEMU about it, thus breaking the migration. However, this doesn't mean that migration can't be done at all. In such case, allow migration and let QEMU VFIO code mark all pages dirty. This guarantees that all pages that might have gotten dirty are reported back, and thus guarantees a valid migration even without VFIO IOMMU dirty tracking support. The motivation for this patch is the introduction of iommufd [1]. iommufd can directly implement the /dev/vfio/vfio container IOCTLs by mapping them into its internal ops, allowing the usage of these IOCTLs over iommufd. However, VFIO IOMMU dirty tracking is not supported by this VFIO compatibility API. This patch will allow migration by hosts that use the VFIO compatibility API and prevent migration regressions caused by the lack of VFIO IOMMU dirty tracking support. [1] https://lore.kernel.org/kvm/0-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com/ Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-4-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	5c4dbcb748	vfio/migration: Fix NULL pointer dereference bug As part of its error flow, vfio_vmstate_change() accesses MigrationState->to_dst_file without any checks. This can cause a NULL pointer dereference if the error flow is taken and MigrationState->to_dst_file is not set. For example, this can happen if VM is started or stopped not during migration and vfio_vmstate_change() error flow is taken, as MigrationState->to_dst_file is not set at that time. Fix it by checking that MigrationState->to_dst_file is set before using it. Fixes: `02a7e71b1e` ("vfio: Add VM state change handler to know state of VM") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Link: https://lore.kernel.org/r/20230216143630.25610-3-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Juan Quintela	24beea4efe	migration: Rename res_{postcopy,precopy}_only Once that res_compatible is removed, they don't make sense anymore. We remove the _only preffix. And to make things clearer we rename them to must_precopy and can_postcopy. Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-15 20:04:30 +01:00
Juan Quintela	24f254ed79	migration: Remove unused res_compatible Nothing assigns to it after previous commit. Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-15 20:04:30 +01:00
Juan Quintela	fd70385d38	migration: Remove unused threshold_size parameter Until previous commit, save_live_pending() was used for ram. Now with the split into state_pending_estimate() and state_pending_exact() it is not needed anymore, so remove them. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2023-02-06 19:22:56 +01:00
Juan Quintela	c8df4a7aef	migration: Split save_live_pending() into state_pending_* We split the function into to: - state_pending_estimate: We estimate the remaining state size without stopping the machine. - state pending_exact: We calculate the exact amount of remaining state. The only "device" that implements different functions for _estimate() and _exact() is ram. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2023-02-06 19:22:56 +01:00
Juan Quintela	255dc7af7e	migration: No save_live_pending() method uses the QEMUFile parameter So remove it everywhere. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2023-02-06 19:22:56 +01:00
Markus Armbruster	edf5ca5dbe	include/hw/pci: Split pci_device.h off pci.h PCIDeviceClass and PCIDevice are defined in pci.h. Many users of the header don't actually need them. Similar structs live in their own headers: PCIBusClass and PCIBus in pci_bus.h, PCIBridge in pci_bridge.h, PCIHostBridgeClass and PCIHostState in pci_host.h, PCIExpressHost in pcie_host.h, and PCIERootPortClass, PCIEPort, and PCIESlot in pcie_port.h. Move PCIDeviceClass and PCIDeviceClass to new pci_device.h, along with the code that needs them. Adjust include directives. This also enables the next commit. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221222100330.380143-6-armbru@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-01-08 01:54:22 -05:00
Stefan Hajnoczi	f21f1cfeb9	pci,pc,virtio: features, tests, fixes, cleanups lots of acpi rework first version of biosbits infrastructure ASID support in vhost-vdpa core_count2 support in smbios PCIe DOE emulation virtio vq reset HMAT support part of infrastructure for viommu support in vhost-vdpa VTD PASID support fixes, tests all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmNpXDkPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpD0AH/2G8ZPrgrxJC9y3uD5/5J6QRzO+TsDYbg5ut uBf4rKSHHzcu6zdyAfsrhbAKKzyD4HrEGNXZrBjnKM1xCiB/SGBcDIWntwrca2+s 5Dpbi4xvd4tg6tVD4b47XNDCcn2uUbeI0e2M5QIbtCmzdi/xKbFAfl5G8DQp431X Kmz79G4CdKWyjVlM0HoYmdCw/4FxkdjD02tE/Uc5YMrePNaEg5Bw4hjCHbx1b6ur 6gjeXAtncm9s4sO0l+sIdyiqlxiTry9FSr35WaQ0qPU+Og5zaf1EiWfdl8TRo4qU EAATw5A4hyw11GfOGp7oOVkTGvcNB/H7aIxD7emdWZV8+BMRPKo= =zTCn -----END PGP SIGNATURE----- Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging pci,pc,virtio: features, tests, fixes, cleanups lots of acpi rework first version of biosbits infrastructure ASID support in vhost-vdpa core_count2 support in smbios PCIe DOE emulation virtio vq reset HMAT support part of infrastructure for viommu support in vhost-vdpa VTD PASID support fixes, tests all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmNpXDkPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpD0AH/2G8ZPrgrxJC9y3uD5/5J6QRzO+TsDYbg5ut # uBf4rKSHHzcu6zdyAfsrhbAKKzyD4HrEGNXZrBjnKM1xCiB/SGBcDIWntwrca2+s # 5Dpbi4xvd4tg6tVD4b47XNDCcn2uUbeI0e2M5QIbtCmzdi/xKbFAfl5G8DQp431X # Kmz79G4CdKWyjVlM0HoYmdCw/4FxkdjD02tE/Uc5YMrePNaEg5Bw4hjCHbx1b6ur # 6gjeXAtncm9s4sO0l+sIdyiqlxiTry9FSr35WaQ0qPU+Og5zaf1EiWfdl8TRo4qU # EAATw5A4hyw11GfOGp7oOVkTGvcNB/H7aIxD7emdWZV8+BMRPKo= # =zTCn # -----END PGP SIGNATURE----- # gpg: Signature made Mon 07 Nov 2022 14:27:53 EST # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (83 commits) checkpatch: better pattern for inline comments hw/virtio: introduce virtio_device_should_start tests/acpi: update tables for new core count test bios-tables-test: add test for number of cores > 255 tests/acpi: allow changes for core_count2 test bios-tables-test: teach test to use smbios 3.0 tables hw/smbios: add core_count2 to smbios table type 4 vhost-user: Support vhost_dev_start vhost: Change the sequence of device start intel-iommu: PASID support intel-iommu: convert VTD_PE_GET_FPD_ERR() to be a function intel-iommu: drop VTDBus intel-iommu: don't warn guest errors when getting rid2pasid entry vfio: move implement of vfio_get_xlat_addr() to memory.c tests: virt: Update expected .acpihmatvirt tables tests: acpi: aarch64/virt: add a test for hmat nodes with no initiators hw/arm/virt: Enable HMAT on arm virt machine tests: Add HMAT AArch64/virt empty table files tests: acpi: q35: update expected blobs .hmat-noinitiators expected HMAT: tests: acpi: q35: add test for hmat nodes without initiators ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-11-07 18:43:56 -05:00
Cindy Lu	baa44bce87	vfio: move implement of vfio_get_xlat_addr() to memory.c - Move the implement vfio_get_xlat_addr to softmmu/memory.c, and change the name to memory_get_xlat_addr(). So we can use this function on other devices, such as vDPA device. - Add a new function vfio_get_xlat_addr in vfio/common.c, and it will check whether the memory is backed by a discard manager. then device can have its own warning. Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20221031031020.1405111-2-lulu@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-11-07 14:08:17 -05:00
Avihai Horon	2461e75219	vfio/migration: Fix wrong enum usage vfio_migration_init() initializes VFIOMigration->device_state using enum of VFIO migration protocol v2. Current implemented protocol is v1 so v1 enum should be used. Fix it. Fixes: `429c728006` ("vfio/migration: Fix incorrect initialization value for parameters in VFIOMigration") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Link: https://lore.kernel.org/r/20221016085752.32740-1-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-11-03 15:57:31 -06:00
Alex Williamson	85b6d2b5fc	vfio/common: Fix vfio_iommu_type1_info use after free On error, vfio_get_iommu_info() frees and clears *info, but vfio_connect_container() continues to use the pointer regardless of the return value. Restructure the code such that a failure of this function triggers an error and clean up the remainder of the function, including updating an outdated comment that had drifted from its relevant line of code and using host page size for a default for better compatibility on non-4KB systems. Reported-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/all/20220910004245.2878-1-nicolinc@nvidia.com/ Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/166326219630.3388898.12882473157184946072.stgit@omen Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-27 14:26:42 -06:00
Kunkun Jiang	429c728006	vfio/migration: Fix incorrect initialization value for parameters in VFIOMigration The structure VFIOMigration of a VFIODevice is allocated and initialized in vfio_migration_init(). "device_state" and "vm_running" are initialized to 0, indicating that VFIO device is_STOP and VM is not-running. The initialization value is incorrect. According to the agreement, default state of VFIO device is _RUNNING. And if a VFIO device is hot-plugged while the VM is running, "vm_running" should be 1. This patch fixes it. Fixes: `02a7e71b1e` ("vfio: Add VM state change handler to know state of VM") Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> Link: https://lore.kernel.org/r/20220711014651.1327-1-jiangkunkun@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-27 14:26:39 -06:00
Akihiko Odaki	362239c05f	ui/console: Do not return a value with ui_info The returned value is not used and misleading. Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com> Message-Id: <20220226115516.59830-2-akihiko.odaki@gmail.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2022-06-14 10:34:37 +02:00
Eric Auger	ec6600be0d	vfio/common: remove spurious warning on vfio_listener_region_del `851d6d1a0f` ("vfio/common: remove spurious tpm-crb-cmd misalignment warning") removed the warning on vfio_listener_region_add() path. However the same warning also hits on region_del path. Let's remove it and reword the dynamic trace as this can be called on both map and unmap path. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20220524091405.416256-1-eric.auger@redhat.com Fixes: `851d6d1a0f` ("vfio/common: remove spurious tpm-crb-cmd misalignment warning") Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-06-08 08:44:19 -06:00
Bernhard Beschow	8f1b608798	hw/vfio/pci-quirks: Resolve redundant property getters The QOM API already provides getters for uint64 and uint32 values, so reuse them. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220301225220.239065-2-shentey@gmail.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>	2022-05-24 10:38:50 +10:00
Alex Williamson	e4082063e4	linux-headers: Update to v5.18-rc6 Update to c5eb0a61238d ("Linux 5.18-rc6"). Mechanical search and replace of vfio defines with white space massaging. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-13 08:20:11 -06:00
Yi Liu	44ee6aaae0	vfio/common: Rename VFIOGuestIOMMU::iommu into ::iommu_mr Rename VFIOGuestIOMMU iommu field into iommu_mr. Then it becomes clearer it is an IOMMU memory region. no functional change intended Signed-off-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20220502094223.36384-4-yi.l.liu@intel.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:51 -06:00
Eric Auger	0d570a2572	vfio/pci: Use vbasedev local variable in vfio_realize() Using a VFIODevice handle local variable to improve the code readability. no functional change intended Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20220502094223.36384-3-yi.l.liu@intel.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Eric Auger	9d38ffc5d8	hw/vfio/pci: fix vfio_pci_hot_reset_result trace point "%m" format specifier is not interpreted by the trace infrastructure and thus "%m" is output instead of the actual errno string. Fix it by outputting strerror(errno). Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20220502094223.36384-2-yi.l.liu@intel.com [aw: replace commit log as provided by Eric] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Eric Auger	851d6d1a0f	vfio/common: remove spurious tpm-crb-cmd misalignment warning The CRB command buffer currently is a RAM MemoryRegion and given its base address alignment, it causes an error report on vfio_listener_region_add(). This region could have been a RAM device region, easing the detection of such safe situation but this option was not well received. So let's add a helper function that uses the memory region owner type to detect the situation is safe wrt the assignment. Other device types can be checked here if such kind of problem occurs again. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Acked-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20220506132510.1847942-3-eric.auger@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Xiang Chen	99510d271b	vfio/common: Fix a small boundary issue of a trace It uses [offset, offset + size - 1] to indicate that the length of range is size in most places in vfio trace code (such as trace_vfio_region_region_mmap()) execpt trace_vfio_region_sparse_mmap_entry(). So change it for trace_vfio_region_sparse_mmap_entry(), but if size is zero, the trace will be weird with an underflow, so move the trace and trace it only if size is not zero. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1650100104-130737-1-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Longpeng(Mike)	dc580d51f7	vfio: defer to commit kvm irq routing when enable msi/msix In migration resume phase, all unmasked msix vectors need to be setup when loading the VF state. However, the setup operation would take longer if the VM has more VFs and each VF has more unmasked vectors. The hot spot is kvm_irqchip_commit_routes, it'll scan and update all irqfds that are already assigned each invocation, so more vectors means need more time to process them. vfio_pci_load_config vfio_msix_enable msix_set_vector_notifiers for (vector = 0; vector < dev->msix_entries_nr; vector++) { vfio_msix_vector_do_use vfio_add_kvm_msi_virq kvm_irqchip_commit_routes <-- expensive } We can reduce the cost by only committing once outside the loop. The routes are cached in kvm_state, we commit them first and then bind irqfd for each vector. The test VM has 128 vcpus and 8 VF (each one has 65 vectors), we measure the cost of the vfio_msix_enable for each VF, and we can see 90+% costs can be reduce. VF Count of irqfds[] Original With this patch 1st 65 8 2 2nd 130 15 2 3rd 195 22 2 4th 260 24 3 5th 325 36 2 6th 390 44 3 7th 455 51 3 8th 520 58 4 Total 258ms 21ms [] Count of irqfds How many irqfds that already assigned and need to process in this round. The optimization can be applied to msi type too. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20220326060226.1892-6-longpeng2@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Longpeng(Mike)	75d546fc18	Revert "vfio: Avoid disabling and enabling vectors repeatedly in VFIO migration" Commit `ecebe53fe9` ("vfio: Avoid disabling and enabling vectors repeatedly in VFIO migration") avoids inefficiently disabling and enabling vectors repeatedly and lets the unmasked vectors be enabled one by one. But we want to batch multiple routes and defer the commit, and only commit once outside the loop of setting vector notifiers, so we cannot enable the vectors one by one in the loop now. Revert that commit and we will take another way in the next patch, it can not only avoid disabling/enabling vectors repeatedly, but also satisfy our requirement of defer to commit. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20220326060226.1892-5-longpeng2@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Longpeng(Mike)	8ab217d5d3	vfio: simplify the failure path in vfio_msi_enable Use vfio_msi_disable_common to simplify the error handling in vfio_msi_enable. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20220326060226.1892-4-longpeng2@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Longpeng(Mike)	be4a46eccf	vfio: move re-enabling INTX out of the common helper Move re-enabling INTX out, and the callers should decide to re-enable it or not. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20220326060226.1892-3-longpeng2@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Longpeng(Mike)	a6f5770fb2	vfio: simplify the conditional statements in vfio_msi_enable It's unnecessary to test against the specific return value of VFIO_DEVICE_SET_IRQS, since any positive return is an error indicating the number of vectors we should retry with. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20220326060226.1892-2-longpeng2@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Marc-André Lureau	8e3b0cbb72	Replace qemu_real_host_page variables with inlined functions Replace the global variables with inlined helper functions. getpagesize() is very likely annotated with a "const" function attribute (at least with glibc), and thus optimization should apply even better. This avoids the need for a constructor initialization too. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 10:50:38 +02:00
Markus Armbruster	b21e238037	Use g_new() & friends where that makes obvious sense g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer, for two reasons. One, it catches multiplication overflowing size_t. Two, it returns T * rather than void *, which lets the compiler catch more type errors. This commit only touches allocations with size arguments of the form sizeof(T). Patch created mechanically with: $ spatch --in-place --sp-file scripts/coccinelle/use-g_new-etc.cocci \ --macro-file scripts/cocci-macro-file.h FILES... Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220315144156.1595462-4-armbru@redhat.com> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>	2022-03-21 15:44:44 +01:00
Longpeng(Mike)	def4c5570c	kvm/msi: do explicit commit when adding msi routes We invoke the kvm_irqchip_commit_routes() for each addition to MSI route table, which is not efficient if we are adding lots of routes in some cases. This patch lets callers invoke the kvm_irqchip_commit_routes(), so the callers can decide how to optimize. [1] https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg00967.html Signed-off-by: Longpeng <longpeng2@huawei.com> Message-Id: <20220222141116.2091-3-longpeng2@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:26:20 +01:00
Bernhard Beschow	5e78c98b7c	Mark remaining global TypeInfo instances as const More than 1k of TypeInfo instances are already marked as const. Mark the remaining ones, too. This commit was created with: git grep -z -l 'static TypeInfo' -- '*.c' \| \ xargs -0 sed -i 's/static TypeInfo/static const TypeInfo/' Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Corey Minyard <cminyard@mvista.com> Message-id: 20220117145805.173070-2-shentey@gmail.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-02-21 13:30:20 +00:00
Peng Liang	f3bc3a73c9	vfio: Fix memory leak of hostwin hostwin is allocated and added to hostwin_list in vfio_host_win_add, but it is only deleted from hostwin_list in vfio_host_win_del, which causes a memory leak. Also, freeing all elements in hostwin_list is missing in vfio_disconnect_container. Fix: `2e4109de8e` ("vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2)") CC: qemu-stable@nongnu.org Signed-off-by: Peng Liang <liangpeng10@huawei.com> Link: https://lore.kernel.org/r/20211117014739.1839263-1-liangpeng10@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-11-17 11:25:55 -07:00
Kunkun Jiang	e4b3470838	vfio/common: Add a trace point when a MMIO RAM section cannot be mapped The MSI-X structures of some devices and other non-MSI-X structures may be in the same BAR. They may share one host page, especially in the case of large page granularity, such as 64K. For example, MSIX-Table size of 82599 NIC is 0x30 and the offset in Bar 3(size 64KB) is 0x0. vfio_listener_region_add() will be called to map the remaining range (0x30-0xffff). If host page size is 64KB, it will return early at 'int128_ge((int128_make64(iova), llend))' without any message. Let's add a trace point to inform users like commit `5c08600547` ("vfio: Use a trace point when a RAM section cannot be DMA mapped") did. Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> Link: https://lore.kernel.org/r/20211027090406.761-3-jiangkunkun@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-11-01 12:17:51 -06:00
Kunkun Jiang	f36d4fb85f	vfio/pci: Add support for mmapping sub-page MMIO BARs after live migration We can expand MemoryRegions of sub-page MMIO BARs in vfio_pci_write_config() to improve IO performance for some devices. However, the MemoryRegions of destination VM are not expanded any more after live migration. Because their addresses have been updated in vmstate_load_state() (vfio_pci_load_config) and vfio_sub_page_bar_update_mapping() will not be called. This may result in poor performance after live migration. So iterate BARs in vfio_pci_load_config() and try to update sub-page BARs. Reported-by: Nianyao Tang <tangnianyao@huawei.com> Reported-by: Qixin Gan <ganqixin@huawei.com> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> Link: https://lore.kernel.org/r/20211027090406.761-2-jiangkunkun@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-11-01 12:17:51 -06:00
Kevin Wolf	f3558b1b76	qdev: Base object creation on QDict rather than QemuOpts QDicts are both what QMP natively uses and what the keyval parser produces. Going through QemuOpts isn't useful for either one, so switch the main device creation function to QDicts. By sharing more code with the -object/object-add code path, we can even reduce the code size a bit. This commit doesn't remove the detour through QemuOpts from any code path yet, but it allows the following commits to do so. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20211008133442.141332-15-kwolf@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-10-15 16:11:22 +02:00
Peter Xu	142518bda5	memory: Name all the memory listeners Provide a name field for all the memory listeners. It can be used to identify which memory listener is which. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20210817013553.30584-2-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 15:30:24 +02:00
Sean Christopherson	56918a126a	memory: Add RAM_PROTECTED flag to skip IOMMU mappings Add a new RAMBlock flag to denote "protected" memory, i.e. memory that looks and acts like RAM but is inaccessible via normal mechanisms, including DMA. Use the flag to skip protected memory regions when mapping RAM for DMA in VFIO. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 14:50:19 +02:00
Cai Huoqing	631ba5a128	hw/vfio: Fix typo in comments Fix typo in comments: programatically ==> programmatically disconecting ==> disconnecting mulitple ==> multiple timout ==> timeout regsiter ==> register forumula ==> formula Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210730012613.2198-1-caihuoqing@baidu.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-09-16 11:57:01 +02:00
Cornelia Huck	759a5d3be0	vfio-ccw: forward halt/clear errors hsch and csch basically have two parts: execute the command, and perform the halt/clear function. For fully emulated subchannels, it is pretty clear how it will work: check the subchannel state, and actually 'perform the halt/clear function' and set cc 0 if everything looks good. For passthrough subchannels, some of the checking is done within QEMU, but some has to be done within the kernel. QEMU's subchannel state may be such that we can perform the async function, but the kernel may still get a cc != 0 when it is actually executing the instruction. In that case, we need to set the condition actually encountered by the kernel; if we set cc 0 on error, we would actually need to inject an interrupt as well. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Tested-by: Jared Rossi <jrossi@linux.ibm.com> Message-Id: <20210705163952.736020-2-cohuck@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2021-09-06 16:22:54 +02:00
Markus Armbruster	eb24a23e15	vfio: Avoid error_propagate() after migrate_add_blocker() When migrate_add_blocker(blocker, &err) is followed by error_propagate(errp, err), we can often just as well do migrate_add_blocker(..., errp). This is the case in vfio_migration_probe(). Prior art: commit `386f6c07d2` "error: Avoid error_propagate() after migrate_add_blocker()". Cc: Kirti Wankhede <kwankhede@nvidia.com> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-8-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed by: Kirti Wankhede <kwankhede@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-26 17:15:28 +02:00
Cai Huoqing	1bd9f1b14d	vfio/pci: Add pba_offset PCI quirk for BAIDU KUNLUN AI processor Fix pba_offset initialization value for BAIDU KUNLUN Virtual Function device. The KUNLUN hardware returns an incorrect value for the VF PBA offset, and add a quirk to instead return a hardcoded value of 0xb400. Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Link: https://lore.kernel.org/r/20210713093743.942-1-caihuoqing@baidu.com [aw: comment & whitespace tuning] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-07-14 13:47:17 -06:00
Cai Huoqing	936555bc4f	vfio/pci: Change to use vfio_pci_is() Make use of vfio_pci_is() helper function. Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Link: https://lore.kernel.org/r/20210713014831.742-1-caihuoqing@baidu.com [aw: commit log wording] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-07-14 13:47:17 -06:00
David Hildenbrand	a5dba9bc05	vfio: Fix CID 1458134 in vfio_register_ram_discard_listener() CID 1458134: Integer handling issues (BAD_SHIFT) In expression "1 << ctz64(container->pgsizes)", left shifting by more than 31 bits has undefined behavior. The shift amount, "ctz64(container->pgsizes)", is 64. Commit `5e3b981c33` ("vfio: Support for RamDiscardManager in the !vIOMMU case") added an assertion that our granularity is at least as big as the page size. Although unlikely, we could have a page size that does not fit into 32 bit. In that case, we'd try shifting by more than 31 bit. Let's use 1ULL instead and make sure we're not shifting by more than 63 bit by asserting that any bit in container->pgsizes is set. Fixes: CID 1458134 Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Auger Eric <eric.auger@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: teawater <teawaterz@linux.alibaba.com> Cc: Marek Kedzierski <mkedzier@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Link: https://lore.kernel.org/r/20210712083135.15755-1-david@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-07-14 13:47:17 -06:00
Peter Maydell	57e28d34c0	s390x updates: - add gen16 cpumodels - refactor/cleanup some code - bugfixes -----BEGIN PGP SIGNATURE----- iIgEABYIADAWIQRpo7U29cv8ZSCAJsHeiLtWQd5mwQUCYObg3RIcY29odWNrQHJl ZGhhdC5jb20ACgkQ3oi7VkHeZsGAdAD/dSZkhfgjNWJjka0hmnyQyNCSzq6jox1L PccGyqhkqU8BAM4DUa2bZdst8bLfhUuAA0M5gKkCqkzHdDraBqTL8LQJ =H7dn -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/cohuck-gitlab/tags/s390x-20210708' into staging s390x updates: - add gen16 cpumodels - refactor/cleanup some code - bugfixes # gpg: Signature made Thu 08 Jul 2021 12:26:21 BST # gpg: using EDDSA key 69A3B536F5CBFC65208026C1DE88BB5641DE66C1 # gpg: issuer "cohuck@redhat.com" # gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>" [unknown] # gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>" [full] # gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>" [full] # gpg: aka "Cornelia Huck <cohuck@kernel.org>" [unknown] # gpg: aka "Cornelia Huck <cohuck@redhat.com>" [unknown] # Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF # Subkey fingerprint: 69A3 B536 F5CB FC65 2080 26C1 DE88 BB56 41DE 66C1 * remotes/cohuck-gitlab/tags/s390x-20210708: target/s390x: split sysemu part of cpu models target/s390x: move kvm files into kvm/ target/s390x: remove kvm-stub.c target/s390x: use kvm_enabled() to wrap call to kvm_s390_get_hpage_1m target/s390x: make helper.c sysemu-only target/s390x: split cpu-dump from helper.c target/s390x: move sysemu-only code out to cpu-sysemu.c target/s390x: start moving TCG-only code to tcg/ target/s390x: rename internal.h to s390x-internal.h target/s390x: remove tcg-stub.c hw/s390x: only build tod-tcg from the CONFIG_TCG build hw/s390x: tod: make explicit checks for accelerators when initializing hw/s390x: rename tod-qemu.c to tod-tcg.c target/s390x: meson: add target_user_arch s390x/tcg: Fix m5 vs. m4 field for VECTOR MULTIPLY SUM LOGICAL target/s390x: Fix CC set by CONVERT TO FIXED/LOGICAL s390x/cpumodel: add 3931 and 3932 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-07-12 19:15:11 +01:00
David Hildenbrand	53d1b5fcfb	vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus We support coordinated discarding of RAM using the RamDiscardManager for the VFIO_TYPE1 iommus. Let's unlock support for coordinated discards, keeping uncoordinated discards (e.g., via virtio-balloon) disabled if possible. This unlocks virtio-mem + vfio on x86-64. Note that vfio used via "nvme://" by the block layer has to be implemented/unlocked separately. For now, virtio-mem only supports x86-64; we don't restrict RamDiscardManager to x86-64, though: arm64 and s390x are supposed to work as well, and we'll test once unlocking virtio-mem support. The spapr IOMMUs will need special care, to be tackled later, e.g.., once supporting virtio-mem. Note: The block size of a virtio-mem device has to be set to sane sizes, depending on the maximum hotplug size - to not run out of vfio mappings. The default virtio-mem block size is usually in the range of a couple of MBs. The maximum number of mapping is 64k, shared with other users. Assume you want to hotplug 256GB using virtio-mem - the block size would have to be set to at least 8 MiB (resulting in 32768 separate mappings). Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Auger Eric <eric.auger@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: teawater <teawaterz@linux.alibaba.com> Cc: Marek Kedzierski <mkedzier@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210413095531.25603-14-david@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-08 15:54:45 -04:00
David Hildenbrand	0fd7616e0f	vfio: Support for RamDiscardManager in the vIOMMU case vIOMMU support works already with RamDiscardManager as long as guests only map populated memory. Both, populated and discarded memory is mapped into &address_space_memory, where vfio_get_xlat_addr() will find that memory, to create the vfio mapping. Sane guests will never map discarded memory (e.g., unplugged memory blocks in virtio-mem) into an IOMMU - or keep it mapped into an IOMMU while memory is getting discarded. However, there are two cases where a malicious guests could trigger pinning of more memory than intended. One case is easy to handle: the guest trying to map discarded memory into an IOMMU. The other case is harder to handle: the guest keeping memory mapped in the IOMMU while it is getting discarded. We would have to walk over all mappings when discarding memory and identify if any mapping would be a violation. Let's keep it simple for now and print a warning, indicating that setting RLIMIT_MEMLOCK can mitigate such attacks. We have to take care of incoming migration: at the point the IOMMUs get restored and start creating mappings in vfio, RamDiscardManager implementations might not be back up and running yet: let's add runstate priorities to enforce the order when restoring. Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Auger Eric <eric.auger@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: teawater <teawaterz@linux.alibaba.com> Cc: Marek Kedzierski <mkedzier@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210413095531.25603-10-david@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-08 15:54:45 -04:00
David Hildenbrand	a74317f636	vfio: Sanity check maximum number of DMA mappings with RamDiscardManager Although RamDiscardManager can handle running into the maximum number of DMA mappings by propagating errors when creating a DMA mapping, we want to sanity check and warn the user early that there is a theoretical setup issue and that virtio-mem might not be able to provide as much memory towards a VM as desired. As suggested by Alex, let's use the number of KVM memory slots to guess how many other mappings we might see over time. Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Auger Eric <eric.auger@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: teawater <teawaterz@linux.alibaba.com> Cc: Marek Kedzierski <mkedzier@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210413095531.25603-9-david@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-08 15:54:45 -04:00
David Hildenbrand	3eed155caf	vfio: Query and store the maximum number of possible DMA mappings Let's query the maximum number of possible DMA mappings by querying the available mappings when creating the container (before any mappings are created). We'll use this informaton soon to perform some sanity checks and warn the user. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Auger Eric <eric.auger@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: teawater <teawaterz@linux.alibaba.com> Cc: Marek Kedzierski <mkedzier@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210413095531.25603-8-david@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-08 15:54:45 -04:00
David Hildenbrand	5e3b981c33	vfio: Support for RamDiscardManager in the !vIOMMU case Implement support for RamDiscardManager, to prepare for virtio-mem support. Instead of mapping the whole memory section, we only map "populated" parts and update the mapping when notified about discarding/population of memory via the RamDiscardListener. Similarly, when syncing the dirty bitmaps, sync only the actually mapped (populated) parts by replaying via the notifier. Using virtio-mem with vfio is still blocked via ram_block_discard_disable()/ram_block_discard_require() after this patch. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Auger Eric <eric.auger@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: teawater <teawaterz@linux.alibaba.com> Cc: Marek Kedzierski <mkedzier@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210413095531.25603-7-david@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-08 15:54:45 -04:00
Cho, Yu-Chen	67043607d1	target/s390x: move kvm files into kvm/ move kvm files into kvm/ After the reshuffling, update MAINTAINERS accordingly. Make use of the new directory: target/s390x/kvm/ Signed-off-by: Claudio Fontana <cfontana@suse.de> Signed-off-by: Cho, Yu-Chen <acho@suse.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210707105324.23400-14-acho@suse.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2021-07-07 14:01:59 +02:00
Eric Farman	c626710fc7	s390x/css: Add passthrough IRB Wire in the subchannel callback for building the IRB ESW and ECW space for passthrough devices, and copy the hardware's ESW into the IRB we are building. If the hardware presented concurrent sense, then copy that sense data into the IRB's ECW space. Signed-off-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20210617232537.1337506-5-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2021-06-21 08:48:21 +02:00
Kirti Wankhede	d742d064c1	vfio/migration: Correct device state from vmstate change for savevm case Set _SAVING flag for device state from vmstate change handler when it gets called from savevm. Currently State transition savevm/suspend is seen as: _RUNNING -> _STOP -> Stop-and-copy -> _STOP State transition savevm/suspend should be: _RUNNING -> Stop-and-copy -> _STOP State transition from _RUNNING to _STOP occurs from vfio_vmstate_change() where when vmstate changes from running to !running, _RUNNING flag is reset but at the same time when vfio_vmstate_change() is called for RUN_STATE_SAVE_VM, _SAVING bit should be set. Reported by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Message-Id: <1623177441-27496-1-git-send-email-kwankhede@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-06-18 09:10:35 -06:00
Kunkun Jiang	22fca190e2	vfio: Fix unregister SaveVMHandler in vfio_migration_finalize In the vfio_migration_init(), the SaveVMHandler is registered for VFIO device. But it lacks the operation of 'unregister'. It will lead to 'Segmentation fault (core dumped)' in qemu_savevm_state_setup(), if performing live migration after a VFIO device is hot deleted. Fixes: `7c2f5f75f9` (vfio: Register SaveVMHandlers for VFIO device) Reported-by: Qixin Gan <ganqixin@huawei.com> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> Message-Id: <20210527123101.289-1-jiangkunkun@huawei.com> Reviewed by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-06-18 08:38:04 -06:00
Stefano Garzarella	d0fb9657a3	docs: fix references to docs/devel/tracing.rst Commit `e50caf4a5c` ("tracing: convert documentation to rST") converted docs/devel/tracing.txt to docs/devel/tracing.rst. We still have several references to the old file, so let's fix them with the following command: sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt) Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210517151702.109066-2-sgarzare@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2021-06-02 06:51:09 +02:00
Eric Farman	dcc9cf3801	vfio-ccw: Attempt to clean up all IRQs on error The vfio_ccw_unrealize() routine makes an unconditional attempt to unregister every IRQ notifier, though they may not have been registered in the first place (when running on an older kernel, for example). Let's mirror this behavior in the error cleanups in vfio_ccw_realize() so that if/when new IRQs are added, it is less confusing to recognize the necessary procedures. The worst case scenario would be some extra messages about an undefined IRQ, but since this is an error exit that won't be the only thing to worry about. And regarding those messages, let's change it to a warning instead of an error, to better reflect their severity. The existing code in both paths handles everything anyway. Signed-off-by: Eric Farman <farman@linux.ibm.com> Acked-by: Matthew Rosato <mjrosato@linux.ibm.com> Message-Id: <20210428143652.1571487-1-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2021-05-20 14:19:30 +02:00
Eric Farman	6178d4689a	vfio-ccw: Permit missing IRQs Commit `690e29b911` ("vfio-ccw: Refactor ccw irq handler") changed one of the checks for the IRQ notifier registration from saying "the host needs to recognize the only IRQ that exists" to saying "the host needs to recognize ANY IRQ that exists." And this worked fine, because the subsequent change to support the CRW IRQ notifier doesn't get into this code when running on an older kernel, thanks to a guard by a capability region. The later addition of the REQ(uest) IRQ by commit `b2f96f9e4f` ("vfio-ccw: Connect the device request notifier") broke this assumption because there is no matching capability region. Thus, running new QEMU on an older kernel fails with: vfio: unexpected number of irqs 2 Let's adapt the message here so that there's a better clue of what IRQ is missing. Furthermore, let's make the REQ(uest) IRQ not fail when attempting to register it, to permit running vfio-ccw on a newer QEMU with an older kernel. Fixes: `b2f96f9e4f` ("vfio-ccw: Connect the device request notifier") Signed-off-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20210421152053.2379873-1-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2021-05-20 14:19:30 +02:00
Thomas Huth	2068cabd3f	Do not include cpu.h if it's not really necessary Stop including cpu.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-4-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:51 +02:00
Thomas Huth	4c386f8064	Do not include sysemu/sysemu.h if it's not really necessary Stop including sysemu/sysemu.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-2-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:50 +02:00
Thomas Huth	f6527eadeb	hw: Do not include hw/sysbus.h if it is not necessary Many files include hw/sysbus.h without needing it. Remove the superfluous include statements. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210327082804.2259480-1-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:50 +02:00
Thomas Huth	e06054368c	hw: Remove superfluous includes of hw/hw.h The include/hw/hw.h header only has a prototype for hw_error(), so it does not make sense to include this in files that do not use this function. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210326151848.2217216-1-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:50 +02:00
Keqian Zhu	758b96b61d	vfio/migrate: Move switch of dirty tracking into vfio_memory_listener For now the switch of vfio dirty page tracking is integrated into @vfio_save_handler. The reason is that some PCI vendor driver may start to track dirty base on _SAVING state of device, so if dirty tracking is started before setting device state, vfio will report full-dirty to QEMU. However, the dirty bmap of all ramblocks are fully set when setup ram saving, so it's not matter whether the device is in _SAVING state when start vfio dirty tracking. Moreover, this logic causes some problems [1]. The object of dirty tracking is guest memory, but the object of @vfio_save_handler is device state, which produces unnecessary coupling and conflicts: 1. Coupling: Their saving granule is different (perVM vs perDevice). vfio will enable dirty_page_tracking for each devices, actually once is enough. 2. Conflicts: The ram_save_setup() traverses all memory_listeners to execute their log_start() and log_sync() hooks to get the first round dirty bitmap, which is used by the bulk stage of ram saving. However, as vfio dirty tracking is not yet started, it can't get dirty bitmap from vfio. Then we give up the chance to handle vfio dirty page at bulk stage. Move the switch of vfio dirty_page_tracking into vfio_memory_listener can solve above problems. Besides, Do not require devices in SAVING state for vfio_sync_dirty_bitmap(). [1] https://www.spinics.net/lists/kvm/msg229967.html Reported-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210309031913.11508-1-zhukeqian1@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Kunkun Jiang	1eb7f64275	vfio: Support host translation granule size The cpu_physical_memory_set_dirty_lebitmap() can quickly deal with the dirty pages of memory by bitmap-traveling, regardless of whether the bitmap is aligned correctly or not. cpu_physical_memory_set_dirty_lebitmap() supports pages in bitmap of host page size. So it'd better to set bitmap_pgsize to host page size to support more translation granule sizes. [aw: The Fixes commit below introduced code to restrict migration support to configurations where the target page size intersects the host dirty page support. For example, a 4K guest on a 4K host. Due to the above flexibility in bitmap handling, this restriction unnecessarily prevents mixed target/host pages size that could otherwise be supported. Use host page size for dirty bitmap.] Fixes: `87ea529c50` ("vfio: Get migration capability flags for container") Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> Message-Id: <20210304133446.1521-1-jiangkunkun@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Shenming Lu	ecebe53fe9	vfio: Avoid disabling and enabling vectors repeatedly in VFIO migration In VFIO migration resume phase and some guest startups, there are already unmasked vectors in the vector table when calling vfio_msix_enable(). So in order to avoid inefficiently disabling and enabling vectors repeatedly, let's allocate all needed vectors first and then enable these unmasked vectors one by one without disabling. Signed-off-by: Shenming Lu <lushenming@huawei.com> Message-Id: <20210310030233.1133-4-lushenming@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Shenming Lu	8ce1ff990e	vfio: Set the priority of the VFIO VM state change handler explicitly In the VFIO VM state change handler when stopping the VM, the _RUNNING bit in device_state is cleared which makes the VFIO device stop, including no longer generating interrupts. Then we can save the pending states of all interrupts in the GIC VM state change handler (on ARM). So we have to set the priority of the VFIO VM state change handler explicitly (like virtio devices) to ensure it is called before the GIC's in saving. Signed-off-by: Shenming Lu <lushenming@huawei.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20210310030233.1133-3-lushenming@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Shenming Lu	d329f5032e	vfio: Move the saving of the config space to the right place in VFIO migration On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt setup, if the restoring of the VFIO PCI device config space is before the VGIC, an error might occur in the kernel. So we move the saving of the config space to the non-iterable process, thus it will be called after the VGIC according to their priorities. As for the possible dependence of the device specific migration data on it's config space, we can let the vendor driver to include any config info it needs in its own data stream. Signed-off-by: Shenming Lu <lushenming@huawei.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Message-Id: <20210310030233.1133-2-lushenming@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Eric Auger	8dca037b48	vfio: Do not register any IOMMU_NOTIFIER_DEVIOTLB_UNMAP notifier In an attempt to fix smmu/virtio-iommu - vhost regression, commit `958ec334bc` ("vhost: Unbreak SMMU and virtio-iommu on dev-iotlb support") broke virtio-iommu integration. This is due to the fact VFIO registers IOMMU_NOTIFIER_ALL notifiers, which includes IOMMU_NOTIFIER_DEVIOTLB_UNMAP and this latter now is rejected by the virtio-iommu. As a consequence, the registration fails. VHOST behaves like a device with an ATC cache. The VFIO device does not support this scheme yet. Let's register only legacy MAP and UNMAP notifiers. Fixes: `958ec334bc` ("vhost: Unbreak SMMU and virtio-iommu on dev-iotlb support") Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20210209213233.40985-2-eric.auger@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Philippe Mathieu-Daudé	4eda914cac	hw/vfio/pci-quirks: Replace the word 'blacklist' Follow the inclusive terminology from the "Conscious Language in your Open Source Projects" guidelines [] and replace the word "blacklist" appropriately. [] https://github.com/conscious-lang/conscious-lang-docs/blob/main/faq.md Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210205171817.2108907-9-philmd@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Zenghui Yu	4292d50193	vfio: Fix vfio_listener_log_sync function name typo There is an obvious typo in the function name of the .log_sync() callback. Spell it correctly. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Message-Id: <20201204014240.772-1-yuzenghui@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Philippe Mathieu-Daudé	538f049704	sysemu: Let VMChangeStateHandler take boolean 'running' argument The 'running' argument from VMChangeStateHandler does not require other value than 0 / 1. Make it a plain boolean. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20210111152020.1422021-3-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-03-09 23:13:57 +01:00
Eric Farman	d6cd66311f	vfio-ccw: Do not read region ret_code after write A pwrite() call returns the number of bytes written (or -1 on error), and vfio-ccw compares this number with the size of the region to determine if an error had occurred or not. If they are not equal, this is a failure and the errno is used to determine exactly how things failed. An errno of zero is possible (though unlikely) in this situation and would be translated to a successful operation. If they ARE equal, the ret_code field is read from the region to determine how to proceed. While the kernel sets the ret_code field as necessary, the region and thus this field is not "written back" to the user. So the value can only be what it was initialized to, which is zero. So, let's convert an unexpected length with errno of zero to a return code of -EFAULT, and explicitly set an expected length to a return code of zero. This will be a little safer and clearer. Suggested-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20210303160739.2179378-1-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2021-03-04 11:24:49 +01:00
Prasad J Pandit	24202d2b56	vfio: add quirk device write method Add vfio quirk device mmio write method to avoid NULL pointer dereference issue. Reported-by: Lei Sun <slei.casper@gmail.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Message-Id: <20200811114133.672647-4-ppandit@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-02-08 15:15:32 +01:00
Marc-André Lureau	a7dfbe289e	ui: add an optional get_flags callback to GraphicHwOps Those flags can be used to express different requirements for the display or other needs. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20210204105232.834642-12-marcandre.lureau@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2021-02-04 15:58:54 +01:00
Eric Farman	b2f96f9e4f	vfio-ccw: Connect the device request notifier Now that the vfio-ccw code has a notifier interface to request that a device be unplugged, let's wire that together. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20210104202057.48048-4-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2021-01-21 11:19:45 +01:00
Peter Maydell	729cc68373	Remove superfluous timer_del() calls This commit is the result of running the timer-del-timer-free.cocci script on the whole source tree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201215154107.3255-4-peter.maydell@linaro.org	2021-01-08 15:13:38 +00:00
Eduardo Habkost	1e198715e1	qdev: Rename qdev_get_prop_ptr() to object_field_prop_ptr() The function will be moved to common QOM code, as it is not specific to TYPE_DEVICE anymore. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Paul Durrant <paul@xen.org> Message-Id: <20201211220529.2290218-31-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-12-18 15:20:18 -05:00
Eduardo Habkost	ea7c1e5c3e	qdev: Move dev->realized check to qdev_property_set() Every single qdev property setter function manually checks dev->realized. We can just check dev->realized inside qdev_property_set() instead. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Paul Durrant <paul@xen.org> Message-Id: <20201211220529.2290218-24-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-12-18 15:20:17 -05:00
Eduardo Habkost	ce35e2295e	qdev: Move softmmu properties to qdev-properties-system.h Move the property types and property macros implemented in qdev-properties-system.c to a new qdev-properties-system.h header. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20201211220529.2290218-16-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-12-18 15:20:17 -05:00
Eduardo Habkost	828ade86ee	qdev: Make qdev_get_prop_ptr() get Object* arg Make the code more generic and not specific to TYPE_DEVICE. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> #s390 parts Acked-by: Paul Durrant <paul@xen.org> Message-Id: <20201211220529.2290218-10-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-12-15 10:02:07 -05:00
Kirti Wankhede	bb0990d174	vfio: Change default dirty pages tracking behavior during migration By default dirty pages tracking is enabled during iterative phase (pre-copy phase). Added per device opt-out option 'x-pre-copy-dirty-page-tracking' to disable dirty pages tracking during iterative phase. If the option 'x-pre-copy-dirty-page-tracking=off' is set for any VFIO device, dirty pages tracking during iterative phase will be disabled. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-23 10:05:58 -07:00
Alex Williamson	cf254988a5	vfio: Make migration support experimental Support for migration of vfio devices is still in flux. Developers are attempting to add support for new devices and new architectures, but none are yet readily available for validation. We have concerns whether we're transferring device resources at the right point in the migration, whether we're guaranteeing that updates during pre-copy are migrated, and whether we can provide bit-stream compatibility should any of this change. Even the question of whether devices should participate in dirty page tracking during pre-copy seems contentious. In short, migration support has not had enough soak time and it feels premature to mark it as supported. Create an experimental option such that we can continue to develop. [Retaining previous acks/reviews for a previously identical code change with different specifics in the commit log.] Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-23 08:29:29 -07:00
Stefan Weil	ac9574bc87	docs: Fix some typos (found by codespell) Fix also a similar typo in a code comment. Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-Id: <20201117193448.393472-1-sw@weilnetz.de> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-11-18 09:29:41 +01:00
Kirti Wankhede	e408aeef86	Fix use after free in vfio_migration_probe Fixes Coverity issue: CID 1436126: Memory - illegal accesses (USE_AFTER_FREE) Fixes: `a9e271ec9b` ("vfio: Add migration region initialization and finalize function") Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: David Edmondson <dme@dme.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-12 15:58:16 -07:00
Jean-Philippe Brucker	1b296c3def	vfio: Don't issue full 2^64 unmap IOMMUs may declare memory regions spanning from 0 to UINT64_MAX. When attempting to deal with such region, vfio_listener_region_del() passes a size of 2^64 to int128_get64() which throws an assertion failure. Even ignoring this, the VFIO_IOMMU_DMA_MAP ioctl cannot handle this size since the size field is 64-bit. Split the request in two. Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20201030180510.747225-11-jean-philippe@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-11-03 16:39:05 -05:00
Bharat Bhushan	b917749842	vfio: Set IOMMU page size as per host supported page size Set IOMMU supported page size mask same as host Linux supported page size mask. Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20201030180510.747225-9-jean-philippe@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-11-03 07:19:27 -05:00
Zhengui li	c624b6b312	vfio: fix incorrect print type The type of input variable is unsigned int while the printer type is int. So fix incorrect print type. Signed-off-by: Zhengui li <lizhengui@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:52 -07:00
Amey Narkhede	88eef59796	hw/vfio: Use lock guard macros Use qemu LOCK_GUARD macros in hw/vfio. Saves manual unlock calls Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:52 -07:00
Matthew Rosato	92fe289ace	vfio: Add routine for finding VFIO_DEVICE_GET_INFO capabilities Now that VFIO_DEVICE_GET_INFO supports capability chains, add a helper function to find specific capabilities in the chain. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:52 -07:00
Matthew Rosato	7486a62845	vfio: Find DMA available capability The underlying host may be limiting the number of outstanding DMA requests for type 1 IOMMU. Add helper functions to check for the DMA available capability and retrieve the current number of DMA mappings allowed. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [aw: vfio_get_info_dma_avail moved inside CONFIG_LINUX] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:51 -07:00
Matthew Rosato	3ab7a0b40d	vfio: Create shared routine for scanning info capabilities Rather than duplicating the same loop in multiple locations, create a static function to do the work. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:51 -07:00
Kirti Wankhede	3710586caa	qapi: Add VFIO devices migration stats in Migration stats Added amount of bytes transferred to the VM at destination by all VFIO devices Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:51 -07:00
Kirti Wankhede	a22651053b	vfio: Make vfio-pci device migration capable If the device is not a failover primary device, call vfio_migration_probe() and vfio_migration_finalize() to enable migration support for those devices that support it respectively to tear it down again. Removed migration blocker from VFIO PCI device specific structure and use migration blocker from generic structure of VFIO device. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:51 -07:00
Kirti Wankhede	9e7b0442f2	vfio: Add ioctl to get dirty pages bitmap during dma unmap With vIOMMU, IO virtual address range can get unmapped while in pre-copy phase of migration. In that case, unmap ioctl should return pages pinned in that range and QEMU should find its correcponding guest physical addresses and report those dirty. Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> [aw: fix error_report types, fix cpu_physical_memory_set_dirty_lebitmap() cast] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:51 -07:00

1 2 3 4 5 ...

547 Commits