mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Zhenzhong Duan	b83b40b614	vfio/pci: Fix a use-after-free issue vbasedev->name is freed wrongly which leads to garbage VFIO trace log. Fix it by allocating a dup of vbasedev->name and then free the dup. Fixes: `2dca1b37a7` ("vfio/pci: add support for VF token") Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-05-24 09:21:22 +02:00
Alex Williamson	b5048a4cbf	vfio/pci: Static Resizable BAR capability The PCI Resizable BAR (ReBAR) capability is currently hidden from the VM because the protocol for interacting with the capability does not support a mechanism for the device to reject an advertised supported BAR size. However, when assigned to a VM, the act of resizing the BAR requires adjustment of host resources for the device, which absolutely can fail. Linux does not currently allow us to reserve resources for the device independent of the current usage. The only writable field within the ReBAR capability is the BAR Size register. The PCIe spec indicates that when written, the device should immediately begin to operate with the provided BAR size. The spec however also notes that software must only write values corresponding to supported sizes as indicated in the capability and control registers. Writing unsupported sizes produces undefined results. Therefore, if the hypervisor were to virtualize the capability and control registers such that the current size is the only indicated available size, then a write of anything other than the current size falls into the category of undefined behavior, where we can essentially expose the modified ReBAR capability as read-only. This may seem pointless, but users have reported that virtualizing the capability in this way not only allows guest software to expose related features as available (even if only cosmetic), but in some scenarios can resolve guest driver issues. Additionally, no regressions in behavior have been reported for this change. A caveat here is that the PCIe spec requires for compatibility that devices report support for a size in the range of 1MB to 512GB, therefore if the current BAR size falls outside that range we revert to hiding the capability. Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20230505232308.2869912-1-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-05-09 09:30:13 -06:00
Avihai Horon	ff180c6bd7	vfio/migration: Skip log_sync during migration SETUP state Currently, VFIO log_sync can be issued while migration is in SETUP state. However, doing this log_sync is at best redundant and at worst can fail. Redundant -- all RAM is marked dirty in migration SETUP state and is transferred only after migration is set to ACTIVE state, so doing log_sync during migration SETUP is pointless. Can fail -- there is a time window, between setting migration state to SETUP and starting dirty tracking by RAM save_live_setup handler, during which dirty tracking is still not started. Any VFIO log_sync call that is issued during this time window will fail. For example, this error can be triggered by migrating a VM when a GUI is active, which constantly calls log_sync. Fix it by skipping VFIO log_sync while migration is in SETUP state. Fixes: `758b96b61d` ("vfio/migrate: Move switch of dirty tracking into vfio_memory_listener") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Link: https://lore.kernel.org/r/20230403130000.6422-1-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-05-09 09:30:13 -06:00
Minwoo Im	2dca1b37a7	vfio/pci: add support for VF token VF token was introduced [1] to kernel vfio-pci along with SR-IOV support [2]. This patch adds support VF token among PF and VF(s). To passthu PCIe VF to a VM, kernel >= v5.7 needs this. It can be configured with UUID like: -device vfio-pci,host=DDDD:BB:DD:F,vf-token=<uuid>,... [1] https://lore.kernel.org/linux-pci/158396393244.5601.10297430724964025753.stgit@gimli.home/ [2] https://lore.kernel.org/linux-pci/158396044753.5601.14804870681174789709.stgit@gimli.home/ Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Minwoo Im <minwoo.im@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Link: https://lore.kernel.org/r/20230320073522epcms2p48f682ecdb73e0ae1a4850ad0712fd780@epcms2p4 Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-05-09 09:30:13 -06:00
Richard Henderson	cc37d98bfb	*: Add missing includes of qemu/error-report.h This had been pulled in via qemu/plugin.h from hw/core/cpu.h, but that will be removed. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230310195252.210956-5-richard.henderson@linaro.org> [AJB: add various additional cases shown by CI] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230315174331.2959-15-alex.bennee@linaro.org> Reviewed-by: Emilio Cota <cota@braap.org>	2023-03-22 15:06:57 +00:00
Cédric Le Goater	969dae5448	vfio: Fix vfio_get_dev_region() trace event Simply transpose 'x8' to fix the typo and remove the ending '8' Fixes: `e61a424f05` ("vfio: Create device specific region info helper") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1526 Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/r/20230303074330.2609377-1-clg@kaod.org [aw: commit log s/revert/transpose/] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 11:19:07 -07:00
Alex Williamson	8249cffc62	vfio/migration: Rename entry points Pick names that align with the section drivers should use them from, avoiding the confusion of calling a _finalize() function from _exit() and generalizing the actual _finalize() to handle removing the viommu blocker. Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/167820912978.606734.12740287349119694623.stgit@omen Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 11:19:07 -07:00
Joao Martins	95b29658b6	vfio/migration: Query device dirty page tracking support Now that everything has been set up for device dirty page tracking, query the device for device dirty page tracking support. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20230307125450.62409-15-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 10:21:22 -07:00
Joao Martins	e46883204c	vfio/migration: Block migration with vIOMMU Migrating with vIOMMU will require either tracking maximum IOMMU supported address space (e.g. 39/48 address width on Intel) or range-track current mappings and dirty track the new ones post starting dirty tracking. This will be done as a separate series, so add a live migration blocker until that is fixed. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20230307125450.62409-14-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 10:21:22 -07:00
Joao Martins	b153402a89	vfio/common: Add device dirty page bitmap sync Add device dirty page bitmap sync functionality. This uses the device DMA logging uAPI to sync dirty page bitmap from the device. Device dirty page bitmap sync is used only if all devices within a container support device dirty page tracking. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20230307125450.62409-13-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 10:21:22 -07:00
Avihai Horon	6607109f05	vfio/common: Extract code from vfio_get_dirty_bitmap() to new function Extract the VFIO_IOMMU_DIRTY_PAGES ioctl code in vfio_get_dirty_bitmap() to its own function. This will help the code to be more readable after next patch will add device dirty page bitmap sync functionality. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20230307125450.62409-12-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 10:21:22 -07:00
Joao Martins	5255bbf4ec	vfio/common: Add device dirty page tracking start/stop Add device dirty page tracking start/stop functionality. This uses the device DMA logging uAPI to start and stop dirty page tracking by device. Device dirty page tracking is used only if all devices within a container support device dirty page tracking. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20230307125450.62409-11-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 10:21:11 -07:00
Joao Martins	62c1b0024b	vfio/common: Record DMA mapped IOVA ranges According to the device DMA logging uAPI, IOVA ranges to be logged by the device must be provided all at once upon DMA logging start. As preparation for the following patches which will add device dirty page tracking, keep a record of all DMA mapped IOVA ranges so later they can be used for DMA logging start. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20230307125450.62409-10-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 07:20:32 -07:00
Joao Martins	4ead830848	vfio/common: Add helper to consolidate iova/end calculation In preparation to be used in device dirty tracking, move the code that calculate a iova/end range from the container/section. This avoids duplication on the common checks across listener callbacks. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20230307125450.62409-9-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 07:20:32 -07:00
Joao Martins	b92f237635	vfio/common: Consolidate skip/invalid section into helper The checks are replicated against region_add and region_del and will be soon added in another memory listener dedicated for dirty tracking. Move these into a new helper for avoid duplication. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Avihai Horon <avihaih@nvidia.com> Link: https://lore.kernel.org/r/20230307125450.62409-8-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 07:20:32 -07:00
Joao Martins	1cd7fa7adc	vfio/common: Use a single tracepoint for skipped sections In preparation to turn more of the memory listener checks into common functions, one of the affected places is how we trace when sections are skipped. Right now there is one for each. Change it into one single tracepoint `vfio_listener_region_skip` which receives a name which refers to the callback i.e. region_add and region_del. Suggested-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20230307125450.62409-7-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 07:20:32 -07:00
Joao Martins	fbc6c92134	vfio/common: Add helper to validate iova/end against hostwin Move the code that finds the container host DMA window against a iova range. This avoids duplication on the common checks across listener callbacks. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Avihai Horon <avihaih@nvidia.com> Link: https://lore.kernel.org/r/20230307125450.62409-6-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 07:20:32 -07:00
Avihai Horon	725ccd7e41	vfio/common: Add VFIOBitmap and alloc function There are already two places where dirty page bitmap allocation and calculations are done in open code. To avoid code duplication, introduce VFIOBitmap struct and corresponding alloc function and use them where applicable. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20230307125450.62409-5-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 07:20:32 -07:00
Avihai Horon	236e0a45f5	vfio/common: Abort migration if dirty log start/stop/sync fails If VFIO dirty pages log start/stop/sync fails during migration, migration should be aborted as pages dirtied by VFIO devices might not be reported properly. This is not the case today, where in such scenario only an error is printed. Fix it by aborting migration in the above scenario. Fixes: `758b96b61d` ("vfio/migrate: Move switch of dirty tracking into vfio_memory_listener") Fixes: `b6dd6504e3` ("vfio: Add vfio_listener_log_sync to mark dirty pages") Fixes: `9e7b0442f2` ("vfio: Add ioctl to get dirty pages bitmap during dma unmap") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20230307125450.62409-4-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 07:20:32 -07:00
Avihai Horon	db9b829b15	vfio/common: Fix wrong %m usages There are several places where the %m conversion is used if one of vfio_dma_map(), vfio_dma_unmap() or vfio_get_dirty_bitmap() fail. The %m usage in these places is wrong since %m relies on errno value while the above functions don't report errors via errno. Fix it by using strerror() with the returned value instead. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20230307125450.62409-3-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 07:20:32 -07:00
Avihai Horon	3e2413a652	vfio/common: Fix error reporting in vfio_get_dirty_bitmap() Return -errno instead of -1 if VFIO_IOMMU_DIRTY_PAGES ioctl fails in vfio_get_dirty_bitmap(). Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/r/20230307125450.62409-2-joao.m.martins@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-03-07 07:20:32 -07:00
Philippe Mathieu-Daudé	ecba646895	hw/vfio/ccw: Replace DO_UPCAST(VFIOCCWDevice) by VFIO_CCW() Use the VFIO_CCW() QOM type-checking macro to avoid DO_UPCAST(). Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230213170145.45666-7-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-02-27 09:15:38 +01:00
Philippe Mathieu-Daudé	4b447883ed	hw/vfio/ccw: Remove pointless S390CCWDevice variable QOM parenthood relationship is: VFIOCCWDevice -> S390CCWDevice -> CcwDevice -> DeviceState No need to double-cast, call CCW_DEVICE() on VFIOCCWDevice. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20230213170145.45666-6-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-02-27 09:15:38 +01:00
Philippe Mathieu-Daudé	0cea1f62a8	hw/vfio/ccw: Replace DO_UPCAST(S390CCWDevice) by S390_CCW_DEVICE() Use the S390_CCW_DEVICE() QOM type-checking macro to avoid DO_UPCAST(). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20230213170145.45666-5-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-02-27 09:15:38 +01:00
Philippe Mathieu-Daudé	011da22c5c	hw/vfio/ccw: Use intermediate S390CCWDevice variable 'cdev' is VFIOCCWDevice's private parent object. Access it using the S390_CCW_DEVICE() QOM macro. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20230213170145.45666-4-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-02-27 09:15:38 +01:00
Philippe Mathieu-Daudé	4c8a2054e7	hw/vfio/ccw: Simplify using DEVICE() macro QOM parenthood relationship is: VFIOCCWDevice -> S390CCWDevice -> CcwDevice -> DeviceState We can directly use the QOM DEVICE() macro to get the parent object. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20230213170145.45666-3-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-02-27 09:15:38 +01:00
Avihai Horon	48e4d8289f	vfio: Alphabetize migration section of VFIO trace-events file Sort the migration section of VFIO trace events file alphabetically and move two misplaced traces to common.c section. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-11-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	7429aebe1c	vfio/migration: Remove VFIO migration protocol v1 Now that v2 protocol implementation has been added, remove the deprecated v1 implementation. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-10-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	31bcbbb5be	vfio/migration: Implement VFIO migration protocol v2 Implement the basic mandatory part of VFIO migration protocol v2. This includes all functionality that is necessary to support VFIO_MIGRATION_STOP_COPY part of the v2 protocol. The two protocols, v1 and v2, will co-exist and in the following patches v1 protocol code will be removed. There are several main differences between v1 and v2 protocols: - VFIO device state is now represented as a finite state machine instead of a bitmap. - Migration interface with kernel is now done using VFIO_DEVICE_FEATURE ioctl and normal read() and write() instead of the migration region. - Pre-copy is made optional in v2 protocol. Support for pre-copy will be added later on. Detailed information about VFIO migration protocol v2 and its difference compared to v1 protocol can be found here [1]. [1] https://lore.kernel.org/all/20220224142024.147653-10-yishaih@nvidia.com/ Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Juan Quintela <quintela@redhat.com>. Link: https://lore.kernel.org/r/20230216143630.25610-9-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	6eeb290910	vfio/migration: Rename functions/structs related to v1 protocol To avoid name collisions, rename functions and structs related to VFIO migration protocol v1. This will allow the two protocols to co-exist when v2 protocol is added, until v1 is removed. No functional changes intended. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-8-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	16fe4e8ab7	vfio/migration: Move migration v1 logic to vfio_migration_init() Move vfio_dev_get_region_info() logic from vfio_migration_probe() to vfio_migration_init(). This logic is specific to v1 protocol and moving it will make it easier to add the v2 protocol implementation later. No functional changes intended. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-7-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	29d81b71aa	vfio/migration: Block multiple devices migration Currently VFIO migration doesn't implement some kind of intermediate quiescent state in which P2P DMAs are quiesced before stopping or running the device. This can cause problems in multi-device migration where the devices are doing P2P DMAs, since the devices are not stopped together at the same time. Until such support is added, block migration of multiple devices. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-6-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	8b942af393	vfio/common: Change vfio_devices_all_running_and_saving() logic to equivalent one vfio_devices_all_running_and_saving() is used to check if migration is in pre-copy phase. This is done by checking if migration is in setup or active states and if all VFIO devices are in pre-copy state, i.e. _SAVING \| _RUNNING. In VFIO migration protocol v2 pre-copy support is made optional. Hence, a matching v2 protocol pre-copy state can't be used here. As preparation for adding v2 protocol, change vfio_devices_all_running_and_saving() logic such that it doesn't use the VFIO pre-copy state. The new equivalent logic checks if migration is in active state and if all VFIO devices are in running state [1]. No functional changes intended. [1] Note that checking if migration is in setup or active states and if all VFIO devices are in running state doesn't guarantee that we are in pre-copy phase, thus we check if migration is only in active state. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-5-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	b051a3f640	vfio/migration: Allow migration without VFIO IOMMU dirty tracking support Currently, if IOMMU of a VFIO container doesn't support dirty page tracking, migration is blocked. This is because a DMA-able VFIO device can dirty RAM pages without updating QEMU about it, thus breaking the migration. However, this doesn't mean that migration can't be done at all. In such case, allow migration and let QEMU VFIO code mark all pages dirty. This guarantees that all pages that might have gotten dirty are reported back, and thus guarantees a valid migration even without VFIO IOMMU dirty tracking support. The motivation for this patch is the introduction of iommufd [1]. iommufd can directly implement the /dev/vfio/vfio container IOCTLs by mapping them into its internal ops, allowing the usage of these IOCTLs over iommufd. However, VFIO IOMMU dirty tracking is not supported by this VFIO compatibility API. This patch will allow migration by hosts that use the VFIO compatibility API and prevent migration regressions caused by the lack of VFIO IOMMU dirty tracking support. [1] https://lore.kernel.org/kvm/0-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com/ Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Link: https://lore.kernel.org/r/20230216143630.25610-4-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Avihai Horon	5c4dbcb748	vfio/migration: Fix NULL pointer dereference bug As part of its error flow, vfio_vmstate_change() accesses MigrationState->to_dst_file without any checks. This can cause a NULL pointer dereference if the error flow is taken and MigrationState->to_dst_file is not set. For example, this can happen if VM is started or stopped not during migration and vfio_vmstate_change() error flow is taken, as MigrationState->to_dst_file is not set at that time. Fix it by checking that MigrationState->to_dst_file is set before using it. Fixes: `02a7e71b1e` ("vfio: Add VM state change handler to know state of VM") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Link: https://lore.kernel.org/r/20230216143630.25610-3-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-02-16 12:13:46 -07:00
Juan Quintela	24beea4efe	migration: Rename res_{postcopy,precopy}_only Once that res_compatible is removed, they don't make sense anymore. We remove the _only preffix. And to make things clearer we rename them to must_precopy and can_postcopy. Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-15 20:04:30 +01:00
Juan Quintela	24f254ed79	migration: Remove unused res_compatible Nothing assigns to it after previous commit. Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-15 20:04:30 +01:00
Juan Quintela	fd70385d38	migration: Remove unused threshold_size parameter Until previous commit, save_live_pending() was used for ram. Now with the split into state_pending_estimate() and state_pending_exact() it is not needed anymore, so remove them. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2023-02-06 19:22:56 +01:00
Juan Quintela	c8df4a7aef	migration: Split save_live_pending() into state_pending_* We split the function into to: - state_pending_estimate: We estimate the remaining state size without stopping the machine. - state pending_exact: We calculate the exact amount of remaining state. The only "device" that implements different functions for _estimate() and _exact() is ram. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2023-02-06 19:22:56 +01:00
Juan Quintela	255dc7af7e	migration: No save_live_pending() method uses the QEMUFile parameter So remove it everywhere. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2023-02-06 19:22:56 +01:00
Markus Armbruster	edf5ca5dbe	include/hw/pci: Split pci_device.h off pci.h PCIDeviceClass and PCIDevice are defined in pci.h. Many users of the header don't actually need them. Similar structs live in their own headers: PCIBusClass and PCIBus in pci_bus.h, PCIBridge in pci_bridge.h, PCIHostBridgeClass and PCIHostState in pci_host.h, PCIExpressHost in pcie_host.h, and PCIERootPortClass, PCIEPort, and PCIESlot in pcie_port.h. Move PCIDeviceClass and PCIDeviceClass to new pci_device.h, along with the code that needs them. Adjust include directives. This also enables the next commit. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221222100330.380143-6-armbru@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-01-08 01:54:22 -05:00
Stefan Hajnoczi	f21f1cfeb9	pci,pc,virtio: features, tests, fixes, cleanups lots of acpi rework first version of biosbits infrastructure ASID support in vhost-vdpa core_count2 support in smbios PCIe DOE emulation virtio vq reset HMAT support part of infrastructure for viommu support in vhost-vdpa VTD PASID support fixes, tests all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmNpXDkPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpD0AH/2G8ZPrgrxJC9y3uD5/5J6QRzO+TsDYbg5ut uBf4rKSHHzcu6zdyAfsrhbAKKzyD4HrEGNXZrBjnKM1xCiB/SGBcDIWntwrca2+s 5Dpbi4xvd4tg6tVD4b47XNDCcn2uUbeI0e2M5QIbtCmzdi/xKbFAfl5G8DQp431X Kmz79G4CdKWyjVlM0HoYmdCw/4FxkdjD02tE/Uc5YMrePNaEg5Bw4hjCHbx1b6ur 6gjeXAtncm9s4sO0l+sIdyiqlxiTry9FSr35WaQ0qPU+Og5zaf1EiWfdl8TRo4qU EAATw5A4hyw11GfOGp7oOVkTGvcNB/H7aIxD7emdWZV8+BMRPKo= =zTCn -----END PGP SIGNATURE----- Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging pci,pc,virtio: features, tests, fixes, cleanups lots of acpi rework first version of biosbits infrastructure ASID support in vhost-vdpa core_count2 support in smbios PCIe DOE emulation virtio vq reset HMAT support part of infrastructure for viommu support in vhost-vdpa VTD PASID support fixes, tests all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmNpXDkPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpD0AH/2G8ZPrgrxJC9y3uD5/5J6QRzO+TsDYbg5ut # uBf4rKSHHzcu6zdyAfsrhbAKKzyD4HrEGNXZrBjnKM1xCiB/SGBcDIWntwrca2+s # 5Dpbi4xvd4tg6tVD4b47XNDCcn2uUbeI0e2M5QIbtCmzdi/xKbFAfl5G8DQp431X # Kmz79G4CdKWyjVlM0HoYmdCw/4FxkdjD02tE/Uc5YMrePNaEg5Bw4hjCHbx1b6ur # 6gjeXAtncm9s4sO0l+sIdyiqlxiTry9FSr35WaQ0qPU+Og5zaf1EiWfdl8TRo4qU # EAATw5A4hyw11GfOGp7oOVkTGvcNB/H7aIxD7emdWZV8+BMRPKo= # =zTCn # -----END PGP SIGNATURE----- # gpg: Signature made Mon 07 Nov 2022 14:27:53 EST # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (83 commits) checkpatch: better pattern for inline comments hw/virtio: introduce virtio_device_should_start tests/acpi: update tables for new core count test bios-tables-test: add test for number of cores > 255 tests/acpi: allow changes for core_count2 test bios-tables-test: teach test to use smbios 3.0 tables hw/smbios: add core_count2 to smbios table type 4 vhost-user: Support vhost_dev_start vhost: Change the sequence of device start intel-iommu: PASID support intel-iommu: convert VTD_PE_GET_FPD_ERR() to be a function intel-iommu: drop VTDBus intel-iommu: don't warn guest errors when getting rid2pasid entry vfio: move implement of vfio_get_xlat_addr() to memory.c tests: virt: Update expected .acpihmatvirt tables tests: acpi: aarch64/virt: add a test for hmat nodes with no initiators hw/arm/virt: Enable HMAT on arm virt machine tests: Add HMAT AArch64/virt empty table files tests: acpi: q35: update expected blobs .hmat-noinitiators expected HMAT: tests: acpi: q35: add test for hmat nodes without initiators ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-11-07 18:43:56 -05:00
Cindy Lu	baa44bce87	vfio: move implement of vfio_get_xlat_addr() to memory.c - Move the implement vfio_get_xlat_addr to softmmu/memory.c, and change the name to memory_get_xlat_addr(). So we can use this function on other devices, such as vDPA device. - Add a new function vfio_get_xlat_addr in vfio/common.c, and it will check whether the memory is backed by a discard manager. then device can have its own warning. Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20221031031020.1405111-2-lulu@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-11-07 14:08:17 -05:00
Avihai Horon	2461e75219	vfio/migration: Fix wrong enum usage vfio_migration_init() initializes VFIOMigration->device_state using enum of VFIO migration protocol v2. Current implemented protocol is v1 so v1 enum should be used. Fix it. Fixes: `429c728006` ("vfio/migration: Fix incorrect initialization value for parameters in VFIOMigration") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Link: https://lore.kernel.org/r/20221016085752.32740-1-avihaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-11-03 15:57:31 -06:00
Alex Williamson	85b6d2b5fc	vfio/common: Fix vfio_iommu_type1_info use after free On error, vfio_get_iommu_info() frees and clears *info, but vfio_connect_container() continues to use the pointer regardless of the return value. Restructure the code such that a failure of this function triggers an error and clean up the remainder of the function, including updating an outdated comment that had drifted from its relevant line of code and using host page size for a default for better compatibility on non-4KB systems. Reported-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/all/20220910004245.2878-1-nicolinc@nvidia.com/ Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/166326219630.3388898.12882473157184946072.stgit@omen Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-27 14:26:42 -06:00
Kunkun Jiang	429c728006	vfio/migration: Fix incorrect initialization value for parameters in VFIOMigration The structure VFIOMigration of a VFIODevice is allocated and initialized in vfio_migration_init(). "device_state" and "vm_running" are initialized to 0, indicating that VFIO device is_STOP and VM is not-running. The initialization value is incorrect. According to the agreement, default state of VFIO device is _RUNNING. And if a VFIO device is hot-plugged while the VM is running, "vm_running" should be 1. This patch fixes it. Fixes: `02a7e71b1e` ("vfio: Add VM state change handler to know state of VM") Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> Link: https://lore.kernel.org/r/20220711014651.1327-1-jiangkunkun@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-09-27 14:26:39 -06:00
Akihiko Odaki	362239c05f	ui/console: Do not return a value with ui_info The returned value is not used and misleading. Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com> Message-Id: <20220226115516.59830-2-akihiko.odaki@gmail.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2022-06-14 10:34:37 +02:00
Eric Auger	ec6600be0d	vfio/common: remove spurious warning on vfio_listener_region_del `851d6d1a0f` ("vfio/common: remove spurious tpm-crb-cmd misalignment warning") removed the warning on vfio_listener_region_add() path. However the same warning also hits on region_del path. Let's remove it and reword the dynamic trace as this can be called on both map and unmap path. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20220524091405.416256-1-eric.auger@redhat.com Fixes: `851d6d1a0f` ("vfio/common: remove spurious tpm-crb-cmd misalignment warning") Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-06-08 08:44:19 -06:00
Bernhard Beschow	8f1b608798	hw/vfio/pci-quirks: Resolve redundant property getters The QOM API already provides getters for uint64 and uint32 values, so reuse them. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220301225220.239065-2-shentey@gmail.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>	2022-05-24 10:38:50 +10:00
Alex Williamson	e4082063e4	linux-headers: Update to v5.18-rc6 Update to c5eb0a61238d ("Linux 5.18-rc6"). Mechanical search and replace of vfio defines with white space massaging. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-13 08:20:11 -06:00
Yi Liu	44ee6aaae0	vfio/common: Rename VFIOGuestIOMMU::iommu into ::iommu_mr Rename VFIOGuestIOMMU iommu field into iommu_mr. Then it becomes clearer it is an IOMMU memory region. no functional change intended Signed-off-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20220502094223.36384-4-yi.l.liu@intel.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:51 -06:00
Eric Auger	0d570a2572	vfio/pci: Use vbasedev local variable in vfio_realize() Using a VFIODevice handle local variable to improve the code readability. no functional change intended Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20220502094223.36384-3-yi.l.liu@intel.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Eric Auger	9d38ffc5d8	hw/vfio/pci: fix vfio_pci_hot_reset_result trace point "%m" format specifier is not interpreted by the trace infrastructure and thus "%m" is output instead of the actual errno string. Fix it by outputting strerror(errno). Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20220502094223.36384-2-yi.l.liu@intel.com [aw: replace commit log as provided by Eric] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Eric Auger	851d6d1a0f	vfio/common: remove spurious tpm-crb-cmd misalignment warning The CRB command buffer currently is a RAM MemoryRegion and given its base address alignment, it causes an error report on vfio_listener_region_add(). This region could have been a RAM device region, easing the detection of such safe situation but this option was not well received. So let's add a helper function that uses the memory region owner type to detect the situation is safe wrt the assignment. Other device types can be checked here if such kind of problem occurs again. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Acked-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20220506132510.1847942-3-eric.auger@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Xiang Chen	99510d271b	vfio/common: Fix a small boundary issue of a trace It uses [offset, offset + size - 1] to indicate that the length of range is size in most places in vfio trace code (such as trace_vfio_region_region_mmap()) execpt trace_vfio_region_sparse_mmap_entry(). So change it for trace_vfio_region_sparse_mmap_entry(), but if size is zero, the trace will be weird with an underflow, so move the trace and trace it only if size is not zero. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1650100104-130737-1-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Longpeng(Mike)	dc580d51f7	vfio: defer to commit kvm irq routing when enable msi/msix In migration resume phase, all unmasked msix vectors need to be setup when loading the VF state. However, the setup operation would take longer if the VM has more VFs and each VF has more unmasked vectors. The hot spot is kvm_irqchip_commit_routes, it'll scan and update all irqfds that are already assigned each invocation, so more vectors means need more time to process them. vfio_pci_load_config vfio_msix_enable msix_set_vector_notifiers for (vector = 0; vector < dev->msix_entries_nr; vector++) { vfio_msix_vector_do_use vfio_add_kvm_msi_virq kvm_irqchip_commit_routes <-- expensive } We can reduce the cost by only committing once outside the loop. The routes are cached in kvm_state, we commit them first and then bind irqfd for each vector. The test VM has 128 vcpus and 8 VF (each one has 65 vectors), we measure the cost of the vfio_msix_enable for each VF, and we can see 90+% costs can be reduce. VF Count of irqfds[] Original With this patch 1st 65 8 2 2nd 130 15 2 3rd 195 22 2 4th 260 24 3 5th 325 36 2 6th 390 44 3 7th 455 51 3 8th 520 58 4 Total 258ms 21ms [] Count of irqfds How many irqfds that already assigned and need to process in this round. The optimization can be applied to msi type too. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20220326060226.1892-6-longpeng2@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Longpeng(Mike)	75d546fc18	Revert "vfio: Avoid disabling and enabling vectors repeatedly in VFIO migration" Commit `ecebe53fe9` ("vfio: Avoid disabling and enabling vectors repeatedly in VFIO migration") avoids inefficiently disabling and enabling vectors repeatedly and lets the unmasked vectors be enabled one by one. But we want to batch multiple routes and defer the commit, and only commit once outside the loop of setting vector notifiers, so we cannot enable the vectors one by one in the loop now. Revert that commit and we will take another way in the next patch, it can not only avoid disabling/enabling vectors repeatedly, but also satisfy our requirement of defer to commit. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20220326060226.1892-5-longpeng2@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Longpeng(Mike)	8ab217d5d3	vfio: simplify the failure path in vfio_msi_enable Use vfio_msi_disable_common to simplify the error handling in vfio_msi_enable. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20220326060226.1892-4-longpeng2@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Longpeng(Mike)	be4a46eccf	vfio: move re-enabling INTX out of the common helper Move re-enabling INTX out, and the callers should decide to re-enable it or not. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20220326060226.1892-3-longpeng2@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Longpeng(Mike)	a6f5770fb2	vfio: simplify the conditional statements in vfio_msi_enable It's unnecessary to test against the specific return value of VFIO_DEVICE_SET_IRQS, since any positive return is an error indicating the number of vectors we should retry with. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20220326060226.1892-2-longpeng2@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Marc-André Lureau	8e3b0cbb72	Replace qemu_real_host_page variables with inlined functions Replace the global variables with inlined helper functions. getpagesize() is very likely annotated with a "const" function attribute (at least with glibc), and thus optimization should apply even better. This avoids the need for a constructor initialization too. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 10:50:38 +02:00
Markus Armbruster	b21e238037	Use g_new() & friends where that makes obvious sense g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer, for two reasons. One, it catches multiplication overflowing size_t. Two, it returns T * rather than void *, which lets the compiler catch more type errors. This commit only touches allocations with size arguments of the form sizeof(T). Patch created mechanically with: $ spatch --in-place --sp-file scripts/coccinelle/use-g_new-etc.cocci \ --macro-file scripts/cocci-macro-file.h FILES... Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220315144156.1595462-4-armbru@redhat.com> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>	2022-03-21 15:44:44 +01:00
Longpeng(Mike)	def4c5570c	kvm/msi: do explicit commit when adding msi routes We invoke the kvm_irqchip_commit_routes() for each addition to MSI route table, which is not efficient if we are adding lots of routes in some cases. This patch lets callers invoke the kvm_irqchip_commit_routes(), so the callers can decide how to optimize. [1] https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg00967.html Signed-off-by: Longpeng <longpeng2@huawei.com> Message-Id: <20220222141116.2091-3-longpeng2@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:26:20 +01:00
Bernhard Beschow	5e78c98b7c	Mark remaining global TypeInfo instances as const More than 1k of TypeInfo instances are already marked as const. Mark the remaining ones, too. This commit was created with: git grep -z -l 'static TypeInfo' -- '*.c' \| \ xargs -0 sed -i 's/static TypeInfo/static const TypeInfo/' Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Corey Minyard <cminyard@mvista.com> Message-id: 20220117145805.173070-2-shentey@gmail.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-02-21 13:30:20 +00:00
Peng Liang	f3bc3a73c9	vfio: Fix memory leak of hostwin hostwin is allocated and added to hostwin_list in vfio_host_win_add, but it is only deleted from hostwin_list in vfio_host_win_del, which causes a memory leak. Also, freeing all elements in hostwin_list is missing in vfio_disconnect_container. Fix: `2e4109de8e` ("vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2)") CC: qemu-stable@nongnu.org Signed-off-by: Peng Liang <liangpeng10@huawei.com> Link: https://lore.kernel.org/r/20211117014739.1839263-1-liangpeng10@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-11-17 11:25:55 -07:00
Kunkun Jiang	e4b3470838	vfio/common: Add a trace point when a MMIO RAM section cannot be mapped The MSI-X structures of some devices and other non-MSI-X structures may be in the same BAR. They may share one host page, especially in the case of large page granularity, such as 64K. For example, MSIX-Table size of 82599 NIC is 0x30 and the offset in Bar 3(size 64KB) is 0x0. vfio_listener_region_add() will be called to map the remaining range (0x30-0xffff). If host page size is 64KB, it will return early at 'int128_ge((int128_make64(iova), llend))' without any message. Let's add a trace point to inform users like commit `5c08600547` ("vfio: Use a trace point when a RAM section cannot be DMA mapped") did. Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> Link: https://lore.kernel.org/r/20211027090406.761-3-jiangkunkun@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-11-01 12:17:51 -06:00
Kunkun Jiang	f36d4fb85f	vfio/pci: Add support for mmapping sub-page MMIO BARs after live migration We can expand MemoryRegions of sub-page MMIO BARs in vfio_pci_write_config() to improve IO performance for some devices. However, the MemoryRegions of destination VM are not expanded any more after live migration. Because their addresses have been updated in vmstate_load_state() (vfio_pci_load_config) and vfio_sub_page_bar_update_mapping() will not be called. This may result in poor performance after live migration. So iterate BARs in vfio_pci_load_config() and try to update sub-page BARs. Reported-by: Nianyao Tang <tangnianyao@huawei.com> Reported-by: Qixin Gan <ganqixin@huawei.com> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> Link: https://lore.kernel.org/r/20211027090406.761-2-jiangkunkun@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-11-01 12:17:51 -06:00
Kevin Wolf	f3558b1b76	qdev: Base object creation on QDict rather than QemuOpts QDicts are both what QMP natively uses and what the keyval parser produces. Going through QemuOpts isn't useful for either one, so switch the main device creation function to QDicts. By sharing more code with the -object/object-add code path, we can even reduce the code size a bit. This commit doesn't remove the detour through QemuOpts from any code path yet, but it allows the following commits to do so. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20211008133442.141332-15-kwolf@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-10-15 16:11:22 +02:00
Peter Xu	142518bda5	memory: Name all the memory listeners Provide a name field for all the memory listeners. It can be used to identify which memory listener is which. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20210817013553.30584-2-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 15:30:24 +02:00
Sean Christopherson	56918a126a	memory: Add RAM_PROTECTED flag to skip IOMMU mappings Add a new RAMBlock flag to denote "protected" memory, i.e. memory that looks and acts like RAM but is inaccessible via normal mechanisms, including DMA. Use the flag to skip protected memory regions when mapping RAM for DMA in VFIO. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 14:50:19 +02:00
Cai Huoqing	631ba5a128	hw/vfio: Fix typo in comments Fix typo in comments: programatically ==> programmatically disconecting ==> disconnecting mulitple ==> multiple timout ==> timeout regsiter ==> register forumula ==> formula Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210730012613.2198-1-caihuoqing@baidu.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-09-16 11:57:01 +02:00
Cornelia Huck	759a5d3be0	vfio-ccw: forward halt/clear errors hsch and csch basically have two parts: execute the command, and perform the halt/clear function. For fully emulated subchannels, it is pretty clear how it will work: check the subchannel state, and actually 'perform the halt/clear function' and set cc 0 if everything looks good. For passthrough subchannels, some of the checking is done within QEMU, but some has to be done within the kernel. QEMU's subchannel state may be such that we can perform the async function, but the kernel may still get a cc != 0 when it is actually executing the instruction. In that case, we need to set the condition actually encountered by the kernel; if we set cc 0 on error, we would actually need to inject an interrupt as well. Signed-off-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Tested-by: Jared Rossi <jrossi@linux.ibm.com> Message-Id: <20210705163952.736020-2-cohuck@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2021-09-06 16:22:54 +02:00
Markus Armbruster	eb24a23e15	vfio: Avoid error_propagate() after migrate_add_blocker() When migrate_add_blocker(blocker, &err) is followed by error_propagate(errp, err), we can often just as well do migrate_add_blocker(..., errp). This is the case in vfio_migration_probe(). Prior art: commit `386f6c07d2` "error: Avoid error_propagate() after migrate_add_blocker()". Cc: Kirti Wankhede <kwankhede@nvidia.com> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-8-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed by: Kirti Wankhede <kwankhede@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-26 17:15:28 +02:00
Cai Huoqing	1bd9f1b14d	vfio/pci: Add pba_offset PCI quirk for BAIDU KUNLUN AI processor Fix pba_offset initialization value for BAIDU KUNLUN Virtual Function device. The KUNLUN hardware returns an incorrect value for the VF PBA offset, and add a quirk to instead return a hardcoded value of 0xb400. Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Link: https://lore.kernel.org/r/20210713093743.942-1-caihuoqing@baidu.com [aw: comment & whitespace tuning] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-07-14 13:47:17 -06:00
Cai Huoqing	936555bc4f	vfio/pci: Change to use vfio_pci_is() Make use of vfio_pci_is() helper function. Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Link: https://lore.kernel.org/r/20210713014831.742-1-caihuoqing@baidu.com [aw: commit log wording] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-07-14 13:47:17 -06:00
David Hildenbrand	a5dba9bc05	vfio: Fix CID 1458134 in vfio_register_ram_discard_listener() CID 1458134: Integer handling issues (BAD_SHIFT) In expression "1 << ctz64(container->pgsizes)", left shifting by more than 31 bits has undefined behavior. The shift amount, "ctz64(container->pgsizes)", is 64. Commit `5e3b981c33` ("vfio: Support for RamDiscardManager in the !vIOMMU case") added an assertion that our granularity is at least as big as the page size. Although unlikely, we could have a page size that does not fit into 32 bit. In that case, we'd try shifting by more than 31 bit. Let's use 1ULL instead and make sure we're not shifting by more than 63 bit by asserting that any bit in container->pgsizes is set. Fixes: CID 1458134 Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Auger Eric <eric.auger@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: teawater <teawaterz@linux.alibaba.com> Cc: Marek Kedzierski <mkedzier@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Link: https://lore.kernel.org/r/20210712083135.15755-1-david@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-07-14 13:47:17 -06:00
Peter Maydell	57e28d34c0	s390x updates: - add gen16 cpumodels - refactor/cleanup some code - bugfixes -----BEGIN PGP SIGNATURE----- iIgEABYIADAWIQRpo7U29cv8ZSCAJsHeiLtWQd5mwQUCYObg3RIcY29odWNrQHJl ZGhhdC5jb20ACgkQ3oi7VkHeZsGAdAD/dSZkhfgjNWJjka0hmnyQyNCSzq6jox1L PccGyqhkqU8BAM4DUa2bZdst8bLfhUuAA0M5gKkCqkzHdDraBqTL8LQJ =H7dn -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/cohuck-gitlab/tags/s390x-20210708' into staging s390x updates: - add gen16 cpumodels - refactor/cleanup some code - bugfixes # gpg: Signature made Thu 08 Jul 2021 12:26:21 BST # gpg: using EDDSA key 69A3B536F5CBFC65208026C1DE88BB5641DE66C1 # gpg: issuer "cohuck@redhat.com" # gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>" [unknown] # gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>" [full] # gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>" [full] # gpg: aka "Cornelia Huck <cohuck@kernel.org>" [unknown] # gpg: aka "Cornelia Huck <cohuck@redhat.com>" [unknown] # Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF # Subkey fingerprint: 69A3 B536 F5CB FC65 2080 26C1 DE88 BB56 41DE 66C1 * remotes/cohuck-gitlab/tags/s390x-20210708: target/s390x: split sysemu part of cpu models target/s390x: move kvm files into kvm/ target/s390x: remove kvm-stub.c target/s390x: use kvm_enabled() to wrap call to kvm_s390_get_hpage_1m target/s390x: make helper.c sysemu-only target/s390x: split cpu-dump from helper.c target/s390x: move sysemu-only code out to cpu-sysemu.c target/s390x: start moving TCG-only code to tcg/ target/s390x: rename internal.h to s390x-internal.h target/s390x: remove tcg-stub.c hw/s390x: only build tod-tcg from the CONFIG_TCG build hw/s390x: tod: make explicit checks for accelerators when initializing hw/s390x: rename tod-qemu.c to tod-tcg.c target/s390x: meson: add target_user_arch s390x/tcg: Fix m5 vs. m4 field for VECTOR MULTIPLY SUM LOGICAL target/s390x: Fix CC set by CONVERT TO FIXED/LOGICAL s390x/cpumodel: add 3931 and 3932 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-07-12 19:15:11 +01:00
David Hildenbrand	53d1b5fcfb	vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus We support coordinated discarding of RAM using the RamDiscardManager for the VFIO_TYPE1 iommus. Let's unlock support for coordinated discards, keeping uncoordinated discards (e.g., via virtio-balloon) disabled if possible. This unlocks virtio-mem + vfio on x86-64. Note that vfio used via "nvme://" by the block layer has to be implemented/unlocked separately. For now, virtio-mem only supports x86-64; we don't restrict RamDiscardManager to x86-64, though: arm64 and s390x are supposed to work as well, and we'll test once unlocking virtio-mem support. The spapr IOMMUs will need special care, to be tackled later, e.g.., once supporting virtio-mem. Note: The block size of a virtio-mem device has to be set to sane sizes, depending on the maximum hotplug size - to not run out of vfio mappings. The default virtio-mem block size is usually in the range of a couple of MBs. The maximum number of mapping is 64k, shared with other users. Assume you want to hotplug 256GB using virtio-mem - the block size would have to be set to at least 8 MiB (resulting in 32768 separate mappings). Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Auger Eric <eric.auger@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: teawater <teawaterz@linux.alibaba.com> Cc: Marek Kedzierski <mkedzier@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210413095531.25603-14-david@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-08 15:54:45 -04:00
David Hildenbrand	0fd7616e0f	vfio: Support for RamDiscardManager in the vIOMMU case vIOMMU support works already with RamDiscardManager as long as guests only map populated memory. Both, populated and discarded memory is mapped into &address_space_memory, where vfio_get_xlat_addr() will find that memory, to create the vfio mapping. Sane guests will never map discarded memory (e.g., unplugged memory blocks in virtio-mem) into an IOMMU - or keep it mapped into an IOMMU while memory is getting discarded. However, there are two cases where a malicious guests could trigger pinning of more memory than intended. One case is easy to handle: the guest trying to map discarded memory into an IOMMU. The other case is harder to handle: the guest keeping memory mapped in the IOMMU while it is getting discarded. We would have to walk over all mappings when discarding memory and identify if any mapping would be a violation. Let's keep it simple for now and print a warning, indicating that setting RLIMIT_MEMLOCK can mitigate such attacks. We have to take care of incoming migration: at the point the IOMMUs get restored and start creating mappings in vfio, RamDiscardManager implementations might not be back up and running yet: let's add runstate priorities to enforce the order when restoring. Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Auger Eric <eric.auger@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: teawater <teawaterz@linux.alibaba.com> Cc: Marek Kedzierski <mkedzier@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210413095531.25603-10-david@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-08 15:54:45 -04:00
David Hildenbrand	a74317f636	vfio: Sanity check maximum number of DMA mappings with RamDiscardManager Although RamDiscardManager can handle running into the maximum number of DMA mappings by propagating errors when creating a DMA mapping, we want to sanity check and warn the user early that there is a theoretical setup issue and that virtio-mem might not be able to provide as much memory towards a VM as desired. As suggested by Alex, let's use the number of KVM memory slots to guess how many other mappings we might see over time. Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Auger Eric <eric.auger@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: teawater <teawaterz@linux.alibaba.com> Cc: Marek Kedzierski <mkedzier@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210413095531.25603-9-david@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-08 15:54:45 -04:00
David Hildenbrand	3eed155caf	vfio: Query and store the maximum number of possible DMA mappings Let's query the maximum number of possible DMA mappings by querying the available mappings when creating the container (before any mappings are created). We'll use this informaton soon to perform some sanity checks and warn the user. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Auger Eric <eric.auger@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: teawater <teawaterz@linux.alibaba.com> Cc: Marek Kedzierski <mkedzier@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210413095531.25603-8-david@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-08 15:54:45 -04:00
David Hildenbrand	5e3b981c33	vfio: Support for RamDiscardManager in the !vIOMMU case Implement support for RamDiscardManager, to prepare for virtio-mem support. Instead of mapping the whole memory section, we only map "populated" parts and update the mapping when notified about discarding/population of memory via the RamDiscardListener. Similarly, when syncing the dirty bitmaps, sync only the actually mapped (populated) parts by replaying via the notifier. Using virtio-mem with vfio is still blocked via ram_block_discard_disable()/ram_block_discard_require() after this patch. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Auger Eric <eric.auger@redhat.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: teawater <teawaterz@linux.alibaba.com> Cc: Marek Kedzierski <mkedzier@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210413095531.25603-7-david@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-08 15:54:45 -04:00
Cho, Yu-Chen	67043607d1	target/s390x: move kvm files into kvm/ move kvm files into kvm/ After the reshuffling, update MAINTAINERS accordingly. Make use of the new directory: target/s390x/kvm/ Signed-off-by: Claudio Fontana <cfontana@suse.de> Signed-off-by: Cho, Yu-Chen <acho@suse.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210707105324.23400-14-acho@suse.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2021-07-07 14:01:59 +02:00
Eric Farman	c626710fc7	s390x/css: Add passthrough IRB Wire in the subchannel callback for building the IRB ESW and ECW space for passthrough devices, and copy the hardware's ESW into the IRB we are building. If the hardware presented concurrent sense, then copy that sense data into the IRB's ECW space. Signed-off-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20210617232537.1337506-5-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2021-06-21 08:48:21 +02:00
Kirti Wankhede	d742d064c1	vfio/migration: Correct device state from vmstate change for savevm case Set _SAVING flag for device state from vmstate change handler when it gets called from savevm. Currently State transition savevm/suspend is seen as: _RUNNING -> _STOP -> Stop-and-copy -> _STOP State transition savevm/suspend should be: _RUNNING -> Stop-and-copy -> _STOP State transition from _RUNNING to _STOP occurs from vfio_vmstate_change() where when vmstate changes from running to !running, _RUNNING flag is reset but at the same time when vfio_vmstate_change() is called for RUN_STATE_SAVE_VM, _SAVING bit should be set. Reported by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Message-Id: <1623177441-27496-1-git-send-email-kwankhede@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-06-18 09:10:35 -06:00
Kunkun Jiang	22fca190e2	vfio: Fix unregister SaveVMHandler in vfio_migration_finalize In the vfio_migration_init(), the SaveVMHandler is registered for VFIO device. But it lacks the operation of 'unregister'. It will lead to 'Segmentation fault (core dumped)' in qemu_savevm_state_setup(), if performing live migration after a VFIO device is hot deleted. Fixes: `7c2f5f75f9` (vfio: Register SaveVMHandlers for VFIO device) Reported-by: Qixin Gan <ganqixin@huawei.com> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> Message-Id: <20210527123101.289-1-jiangkunkun@huawei.com> Reviewed by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-06-18 08:38:04 -06:00
Stefano Garzarella	d0fb9657a3	docs: fix references to docs/devel/tracing.rst Commit `e50caf4a5c` ("tracing: convert documentation to rST") converted docs/devel/tracing.txt to docs/devel/tracing.rst. We still have several references to the old file, so let's fix them with the following command: sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt) Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210517151702.109066-2-sgarzare@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2021-06-02 06:51:09 +02:00
Eric Farman	dcc9cf3801	vfio-ccw: Attempt to clean up all IRQs on error The vfio_ccw_unrealize() routine makes an unconditional attempt to unregister every IRQ notifier, though they may not have been registered in the first place (when running on an older kernel, for example). Let's mirror this behavior in the error cleanups in vfio_ccw_realize() so that if/when new IRQs are added, it is less confusing to recognize the necessary procedures. The worst case scenario would be some extra messages about an undefined IRQ, but since this is an error exit that won't be the only thing to worry about. And regarding those messages, let's change it to a warning instead of an error, to better reflect their severity. The existing code in both paths handles everything anyway. Signed-off-by: Eric Farman <farman@linux.ibm.com> Acked-by: Matthew Rosato <mjrosato@linux.ibm.com> Message-Id: <20210428143652.1571487-1-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2021-05-20 14:19:30 +02:00
Eric Farman	6178d4689a	vfio-ccw: Permit missing IRQs Commit `690e29b911` ("vfio-ccw: Refactor ccw irq handler") changed one of the checks for the IRQ notifier registration from saying "the host needs to recognize the only IRQ that exists" to saying "the host needs to recognize ANY IRQ that exists." And this worked fine, because the subsequent change to support the CRW IRQ notifier doesn't get into this code when running on an older kernel, thanks to a guard by a capability region. The later addition of the REQ(uest) IRQ by commit `b2f96f9e4f` ("vfio-ccw: Connect the device request notifier") broke this assumption because there is no matching capability region. Thus, running new QEMU on an older kernel fails with: vfio: unexpected number of irqs 2 Let's adapt the message here so that there's a better clue of what IRQ is missing. Furthermore, let's make the REQ(uest) IRQ not fail when attempting to register it, to permit running vfio-ccw on a newer QEMU with an older kernel. Fixes: `b2f96f9e4f` ("vfio-ccw: Connect the device request notifier") Signed-off-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20210421152053.2379873-1-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2021-05-20 14:19:30 +02:00
Thomas Huth	2068cabd3f	Do not include cpu.h if it's not really necessary Stop including cpu.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-4-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:51 +02:00
Thomas Huth	4c386f8064	Do not include sysemu/sysemu.h if it's not really necessary Stop including sysemu/sysemu.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-2-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:50 +02:00
Thomas Huth	f6527eadeb	hw: Do not include hw/sysbus.h if it is not necessary Many files include hw/sysbus.h without needing it. Remove the superfluous include statements. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210327082804.2259480-1-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:50 +02:00
Thomas Huth	e06054368c	hw: Remove superfluous includes of hw/hw.h The include/hw/hw.h header only has a prototype for hw_error(), so it does not make sense to include this in files that do not use this function. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210326151848.2217216-1-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:50 +02:00
Keqian Zhu	758b96b61d	vfio/migrate: Move switch of dirty tracking into vfio_memory_listener For now the switch of vfio dirty page tracking is integrated into @vfio_save_handler. The reason is that some PCI vendor driver may start to track dirty base on _SAVING state of device, so if dirty tracking is started before setting device state, vfio will report full-dirty to QEMU. However, the dirty bmap of all ramblocks are fully set when setup ram saving, so it's not matter whether the device is in _SAVING state when start vfio dirty tracking. Moreover, this logic causes some problems [1]. The object of dirty tracking is guest memory, but the object of @vfio_save_handler is device state, which produces unnecessary coupling and conflicts: 1. Coupling: Their saving granule is different (perVM vs perDevice). vfio will enable dirty_page_tracking for each devices, actually once is enough. 2. Conflicts: The ram_save_setup() traverses all memory_listeners to execute their log_start() and log_sync() hooks to get the first round dirty bitmap, which is used by the bulk stage of ram saving. However, as vfio dirty tracking is not yet started, it can't get dirty bitmap from vfio. Then we give up the chance to handle vfio dirty page at bulk stage. Move the switch of vfio dirty_page_tracking into vfio_memory_listener can solve above problems. Besides, Do not require devices in SAVING state for vfio_sync_dirty_bitmap(). [1] https://www.spinics.net/lists/kvm/msg229967.html Reported-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210309031913.11508-1-zhukeqian1@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Kunkun Jiang	1eb7f64275	vfio: Support host translation granule size The cpu_physical_memory_set_dirty_lebitmap() can quickly deal with the dirty pages of memory by bitmap-traveling, regardless of whether the bitmap is aligned correctly or not. cpu_physical_memory_set_dirty_lebitmap() supports pages in bitmap of host page size. So it'd better to set bitmap_pgsize to host page size to support more translation granule sizes. [aw: The Fixes commit below introduced code to restrict migration support to configurations where the target page size intersects the host dirty page support. For example, a 4K guest on a 4K host. Due to the above flexibility in bitmap handling, this restriction unnecessarily prevents mixed target/host pages size that could otherwise be supported. Use host page size for dirty bitmap.] Fixes: `87ea529c50` ("vfio: Get migration capability flags for container") Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> Message-Id: <20210304133446.1521-1-jiangkunkun@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Shenming Lu	ecebe53fe9	vfio: Avoid disabling and enabling vectors repeatedly in VFIO migration In VFIO migration resume phase and some guest startups, there are already unmasked vectors in the vector table when calling vfio_msix_enable(). So in order to avoid inefficiently disabling and enabling vectors repeatedly, let's allocate all needed vectors first and then enable these unmasked vectors one by one without disabling. Signed-off-by: Shenming Lu <lushenming@huawei.com> Message-Id: <20210310030233.1133-4-lushenming@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Shenming Lu	8ce1ff990e	vfio: Set the priority of the VFIO VM state change handler explicitly In the VFIO VM state change handler when stopping the VM, the _RUNNING bit in device_state is cleared which makes the VFIO device stop, including no longer generating interrupts. Then we can save the pending states of all interrupts in the GIC VM state change handler (on ARM). So we have to set the priority of the VFIO VM state change handler explicitly (like virtio devices) to ensure it is called before the GIC's in saving. Signed-off-by: Shenming Lu <lushenming@huawei.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20210310030233.1133-3-lushenming@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Shenming Lu	d329f5032e	vfio: Move the saving of the config space to the right place in VFIO migration On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt setup, if the restoring of the VFIO PCI device config space is before the VGIC, an error might occur in the kernel. So we move the saving of the config space to the non-iterable process, thus it will be called after the VGIC according to their priorities. As for the possible dependence of the device specific migration data on it's config space, we can let the vendor driver to include any config info it needs in its own data stream. Signed-off-by: Shenming Lu <lushenming@huawei.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Message-Id: <20210310030233.1133-2-lushenming@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Eric Auger	8dca037b48	vfio: Do not register any IOMMU_NOTIFIER_DEVIOTLB_UNMAP notifier In an attempt to fix smmu/virtio-iommu - vhost regression, commit `958ec334bc` ("vhost: Unbreak SMMU and virtio-iommu on dev-iotlb support") broke virtio-iommu integration. This is due to the fact VFIO registers IOMMU_NOTIFIER_ALL notifiers, which includes IOMMU_NOTIFIER_DEVIOTLB_UNMAP and this latter now is rejected by the virtio-iommu. As a consequence, the registration fails. VHOST behaves like a device with an ATC cache. The VFIO device does not support this scheme yet. Let's register only legacy MAP and UNMAP notifiers. Fixes: `958ec334bc` ("vhost: Unbreak SMMU and virtio-iommu on dev-iotlb support") Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20210209213233.40985-2-eric.auger@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Philippe Mathieu-Daudé	4eda914cac	hw/vfio/pci-quirks: Replace the word 'blacklist' Follow the inclusive terminology from the "Conscious Language in your Open Source Projects" guidelines [] and replace the word "blacklist" appropriately. [] https://github.com/conscious-lang/conscious-lang-docs/blob/main/faq.md Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210205171817.2108907-9-philmd@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Zenghui Yu	4292d50193	vfio: Fix vfio_listener_log_sync function name typo There is an obvious typo in the function name of the .log_sync() callback. Spell it correctly. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Message-Id: <20201204014240.772-1-yuzenghui@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Philippe Mathieu-Daudé	538f049704	sysemu: Let VMChangeStateHandler take boolean 'running' argument The 'running' argument from VMChangeStateHandler does not require other value than 0 / 1. Make it a plain boolean. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20210111152020.1422021-3-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-03-09 23:13:57 +01:00
Eric Farman	d6cd66311f	vfio-ccw: Do not read region ret_code after write A pwrite() call returns the number of bytes written (or -1 on error), and vfio-ccw compares this number with the size of the region to determine if an error had occurred or not. If they are not equal, this is a failure and the errno is used to determine exactly how things failed. An errno of zero is possible (though unlikely) in this situation and would be translated to a successful operation. If they ARE equal, the ret_code field is read from the region to determine how to proceed. While the kernel sets the ret_code field as necessary, the region and thus this field is not "written back" to the user. So the value can only be what it was initialized to, which is zero. So, let's convert an unexpected length with errno of zero to a return code of -EFAULT, and explicitly set an expected length to a return code of zero. This will be a little safer and clearer. Suggested-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20210303160739.2179378-1-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2021-03-04 11:24:49 +01:00
Prasad J Pandit	24202d2b56	vfio: add quirk device write method Add vfio quirk device mmio write method to avoid NULL pointer dereference issue. Reported-by: Lei Sun <slei.casper@gmail.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Message-Id: <20200811114133.672647-4-ppandit@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-02-08 15:15:32 +01:00
Marc-André Lureau	a7dfbe289e	ui: add an optional get_flags callback to GraphicHwOps Those flags can be used to express different requirements for the display or other needs. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20210204105232.834642-12-marcandre.lureau@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2021-02-04 15:58:54 +01:00
Eric Farman	b2f96f9e4f	vfio-ccw: Connect the device request notifier Now that the vfio-ccw code has a notifier interface to request that a device be unplugged, let's wire that together. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20210104202057.48048-4-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2021-01-21 11:19:45 +01:00
Peter Maydell	729cc68373	Remove superfluous timer_del() calls This commit is the result of running the timer-del-timer-free.cocci script on the whole source tree. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201215154107.3255-4-peter.maydell@linaro.org	2021-01-08 15:13:38 +00:00
Eduardo Habkost	1e198715e1	qdev: Rename qdev_get_prop_ptr() to object_field_prop_ptr() The function will be moved to common QOM code, as it is not specific to TYPE_DEVICE anymore. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Paul Durrant <paul@xen.org> Message-Id: <20201211220529.2290218-31-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-12-18 15:20:18 -05:00
Eduardo Habkost	ea7c1e5c3e	qdev: Move dev->realized check to qdev_property_set() Every single qdev property setter function manually checks dev->realized. We can just check dev->realized inside qdev_property_set() instead. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Paul Durrant <paul@xen.org> Message-Id: <20201211220529.2290218-24-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-12-18 15:20:17 -05:00
Eduardo Habkost	ce35e2295e	qdev: Move softmmu properties to qdev-properties-system.h Move the property types and property macros implemented in qdev-properties-system.c to a new qdev-properties-system.h header. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20201211220529.2290218-16-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-12-18 15:20:17 -05:00
Eduardo Habkost	828ade86ee	qdev: Make qdev_get_prop_ptr() get Object* arg Make the code more generic and not specific to TYPE_DEVICE. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> #s390 parts Acked-by: Paul Durrant <paul@xen.org> Message-Id: <20201211220529.2290218-10-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-12-15 10:02:07 -05:00
Kirti Wankhede	bb0990d174	vfio: Change default dirty pages tracking behavior during migration By default dirty pages tracking is enabled during iterative phase (pre-copy phase). Added per device opt-out option 'x-pre-copy-dirty-page-tracking' to disable dirty pages tracking during iterative phase. If the option 'x-pre-copy-dirty-page-tracking=off' is set for any VFIO device, dirty pages tracking during iterative phase will be disabled. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-23 10:05:58 -07:00
Alex Williamson	cf254988a5	vfio: Make migration support experimental Support for migration of vfio devices is still in flux. Developers are attempting to add support for new devices and new architectures, but none are yet readily available for validation. We have concerns whether we're transferring device resources at the right point in the migration, whether we're guaranteeing that updates during pre-copy are migrated, and whether we can provide bit-stream compatibility should any of this change. Even the question of whether devices should participate in dirty page tracking during pre-copy seems contentious. In short, migration support has not had enough soak time and it feels premature to mark it as supported. Create an experimental option such that we can continue to develop. [Retaining previous acks/reviews for a previously identical code change with different specifics in the commit log.] Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-23 08:29:29 -07:00
Stefan Weil	ac9574bc87	docs: Fix some typos (found by codespell) Fix also a similar typo in a code comment. Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-Id: <20201117193448.393472-1-sw@weilnetz.de> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-11-18 09:29:41 +01:00
Kirti Wankhede	e408aeef86	Fix use after free in vfio_migration_probe Fixes Coverity issue: CID 1436126: Memory - illegal accesses (USE_AFTER_FREE) Fixes: `a9e271ec9b` ("vfio: Add migration region initialization and finalize function") Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: David Edmondson <dme@dme.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-12 15:58:16 -07:00
Jean-Philippe Brucker	1b296c3def	vfio: Don't issue full 2^64 unmap IOMMUs may declare memory regions spanning from 0 to UINT64_MAX. When attempting to deal with such region, vfio_listener_region_del() passes a size of 2^64 to int128_get64() which throws an assertion failure. Even ignoring this, the VFIO_IOMMU_DMA_MAP ioctl cannot handle this size since the size field is 64-bit. Split the request in two. Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20201030180510.747225-11-jean-philippe@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-11-03 16:39:05 -05:00
Bharat Bhushan	b917749842	vfio: Set IOMMU page size as per host supported page size Set IOMMU supported page size mask same as host Linux supported page size mask. Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20201030180510.747225-9-jean-philippe@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-11-03 07:19:27 -05:00
Zhengui li	c624b6b312	vfio: fix incorrect print type The type of input variable is unsigned int while the printer type is int. So fix incorrect print type. Signed-off-by: Zhengui li <lizhengui@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:52 -07:00
Amey Narkhede	88eef59796	hw/vfio: Use lock guard macros Use qemu LOCK_GUARD macros in hw/vfio. Saves manual unlock calls Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:52 -07:00
Matthew Rosato	92fe289ace	vfio: Add routine for finding VFIO_DEVICE_GET_INFO capabilities Now that VFIO_DEVICE_GET_INFO supports capability chains, add a helper function to find specific capabilities in the chain. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:52 -07:00
Matthew Rosato	7486a62845	vfio: Find DMA available capability The underlying host may be limiting the number of outstanding DMA requests for type 1 IOMMU. Add helper functions to check for the DMA available capability and retrieve the current number of DMA mappings allowed. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [aw: vfio_get_info_dma_avail moved inside CONFIG_LINUX] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:51 -07:00
Matthew Rosato	3ab7a0b40d	vfio: Create shared routine for scanning info capabilities Rather than duplicating the same loop in multiple locations, create a static function to do the work. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:51 -07:00
Kirti Wankhede	3710586caa	qapi: Add VFIO devices migration stats in Migration stats Added amount of bytes transferred to the VM at destination by all VFIO devices Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:51 -07:00
Kirti Wankhede	a22651053b	vfio: Make vfio-pci device migration capable If the device is not a failover primary device, call vfio_migration_probe() and vfio_migration_finalize() to enable migration support for those devices that support it respectively to tear it down again. Removed migration blocker from VFIO PCI device specific structure and use migration blocker from generic structure of VFIO device. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:51 -07:00
Kirti Wankhede	9e7b0442f2	vfio: Add ioctl to get dirty pages bitmap during dma unmap With vIOMMU, IO virtual address range can get unmapped while in pre-copy phase of migration. In that case, unmap ioctl should return pages pinned in that range and QEMU should find its correcponding guest physical addresses and report those dirty. Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> [aw: fix error_report types, fix cpu_physical_memory_set_dirty_lebitmap() cast] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:51 -07:00
Kirti Wankhede	9a04fe0957	vfio: Dirty page tracking when vIOMMU is enabled When vIOMMU is enabled, register MAP notifier from log_sync when all devices in container are in stop and copy phase of migration. Call replay and get dirty pages from notifier callback. Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:51 -07:00
Kirti Wankhede	b6dd6504e3	vfio: Add vfio_listener_log_sync to mark dirty pages vfio_listener_log_sync gets list of dirty pages from container using VFIO_IOMMU_GET_DIRTY_BITMAP ioctl and mark those pages dirty when all devices are stopped and saving state. Return early for the RAM block section of mapped MMIO region. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> [aw: fix error_report types, fix cpu_physical_memory_set_dirty_lebitmap() cast] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:51 -07:00
Kirti Wankhede	e663f51683	vfio: Add function to start and stop dirty pages tracking Call VFIO_IOMMU_DIRTY_PAGES ioctl to start and stop dirty pages tracking for VFIO devices. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:51 -07:00
Kirti Wankhede	87ea529c50	vfio: Get migration capability flags for container Added helper functions to get IOMMU info capability chain. Added function to get migration capability information from that capability chain for IOMMU container. Similar change was proposed earlier: https://lists.gnu.org/archive/html/qemu-devel/2018-05/msg03759.html Disable migration for devices if IOMMU module doesn't support migration capability. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Cc: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Cc: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:50 -07:00
Kirti Wankhede	3336d21710	vfio: Add load state functions to SaveVMHandlers Sequence during _RESUMING device state: While data for this device is available, repeat below steps: a. read data_offset from where user application should write data. b. write data of data_size to migration region from data_offset. c. write data_size which indicates vendor driver that data is written in staging buffer. For user, data is opaque. User should write data in the same order as received. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:50 -07:00
Kirti Wankhede	1bc3c535ff	vfio: Add save state functions to SaveVMHandlers Added .save_live_pending, .save_live_iterate and .save_live_complete_precopy functions. These functions handles pre-copy and stop-and-copy phase. In _SAVING\|_RUNNING device state or pre-copy phase: - read pending_bytes. If pending_bytes > 0, go through below steps. - read data_offset - indicates kernel driver to write data to staging buffer. - read data_size - amount of data in bytes written by vendor driver in migration region. - read data_size bytes of data from data_offset in the migration region. - Write data packet to file stream as below: {VFIO_MIG_FLAG_DEV_DATA_STATE, data_size, actual data, VFIO_MIG_FLAG_END_OF_STATE } In _SAVING device state or stop-and-copy phase a. read config space of device and save to migration file stream. This doesn't need to be from vendor driver. Any other special config state from driver can be saved as data in following iteration. b. read pending_bytes. If pending_bytes > 0, go through below steps. c. read data_offset - indicates kernel driver to write data to staging buffer. d. read data_size - amount of data in bytes written by vendor driver in migration region. e. read data_size bytes of data from data_offset in the migration region. f. Write data packet as below: {VFIO_MIG_FLAG_DEV_DATA_STATE, data_size, actual data} g. iterate through steps b to f while (pending_bytes > 0) h. Write {VFIO_MIG_FLAG_END_OF_STATE} When data region is mapped, its user's responsibility to read data from data_offset of data_size before moving to next steps. Added fix suggested by Artem Polyakov to reset pending_bytes in vfio_save_iterate(). Added fix suggested by Zhi Wang to add 0 as data size in migration stream and add END_OF_STATE delimiter to indicate phase complete. Suggested-by: Artem Polyakov <artemp@nvidia.com> Suggested-by: Zhi Wang <zhi.wang.linux@gmail.com> Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:50 -07:00
Kirti Wankhede	7c2f5f75f9	vfio: Register SaveVMHandlers for VFIO device Define flags to be used as delimiter in migration stream for VFIO devices. Added .save_setup and .save_cleanup functions. Map & unmap migration region from these functions at source during saving or pre-copy phase. Set VFIO device state depending on VM's state. During live migration, VM is running when .save_setup is called, _SAVING \| _RUNNING state is set for VFIO device. During save-restore, VM is paused, _SAVING state is set for VFIO device. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:50 -07:00
Kirti Wankhede	050c588c2e	vfio: Add migration state change notifier Added migration state change notifier to get notification on migration state change. These states are translated to VFIO device state and conveyed to vendor driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:50 -07:00
Kirti Wankhede	02a7e71b1e	vfio: Add VM state change handler to know state of VM VM state change handler is called on change in VM's state. Based on VM state, VFIO device state should be changed. Added read/write helper functions for migration region. Added function to set device_state. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [aw: lx -> HWADDR_PRIx, remove redundant parens] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:50 -07:00
Kirti Wankhede	a9e271ec9b	vfio: Add migration region initialization and finalize function Whether the VFIO device supports migration or not is decided based of migration region query. If migration region query is successful and migration region initialization is successful then migration is supported else migration is blocked. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:50 -07:00
Kirti Wankhede	c5e2fb3ce4	vfio: Add save and load functions for VFIO PCI devices Added functions to save and restore PCI device specific data, specifically config space of PCI device. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:50 -07:00
Kirti Wankhede	e93b733bcf	vfio: Add vfio_get_object callback to VFIODeviceOps Hook vfio_get_object callback for PCI devices. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Suggested-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:50 -07:00
Kirti Wankhede	0f7a903ba3	vfio: Add function to unmap VFIO region This function will be used for migration region. Migration region is mmaped when migration starts and will be unmapped when migration is complete. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:50 -07:00
Cornelia Huck	c8726f7b24	vfio-ccw: plug memory leak while getting region info vfio_get_dev_region_info() unconditionally allocates memory for a passed-in vfio_region_info structure (and does not re-use an already allocated structure). Therefore, we have to free the structure we pass to that function in vfio_ccw_get_region() for every region we successfully obtained information for. Fixes: `8fadea24de` ("vfio-ccw: support async command subregion") Fixes: `46ea3841ed` ("vfio-ccw: Add support for the schib region") Fixes: `f030532f2a` ("vfio-ccw: Add support for the CRW region and IRQ") Reported-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200928101701.13540-1-cohuck@redhat.com>	2020-10-02 13:52:49 +02:00
Eduardo Habkost	8063396bf3	Use OBJECT_DECLARE_SIMPLE_TYPE when possible This converts existing DECLARE_INSTANCE_CHECKER usage to OBJECT_DECLARE_SIMPLE_TYPE when possible. $ ./scripts/codeconverter/converter.py -i \ --pattern=AddObjectDeclareSimpleType $(git grep -l '' -- '*.[ch]') Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Paul Durrant <paul@xen.org> Message-Id: <20200916182519.415636-6-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-18 14:12:32 -04:00
Daniel P. Berrangé	448058aa99	util: rename qemu_open() to qemu_open_old() We want to introduce a new version of qemu_open() that uses an Error object for reporting problems and make this it the preferred interface. Rename the existing method to release the namespace for the new impl. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-09-16 10:33:48 +01:00
Peter Maydell	f4ef8c9cc1	QOM boilerplate cleanup Documentation build fix: * memory: Remove kernel-doc comment marker (Eduardo Habkost) QOM cleanups: * Rename QOM macros for consistency between TYPE_* and type checking constants (Eduardo Habkost) QOM new macros: * OBJECT_DECLARE_* and OBJECT_DEFINE_* macros (Daniel P. Berrangé) * DECLARE__CHECKER macros (Eduardo Habkost) Automated QOM boilerplate changes: Automated changes to use DECLARE__CHECKER (Eduardo Habkost Automated changes to use OBJECT_DECLARE* (Eduardo Habkost) -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEEWjIv1avE09usz9GqKAeTb5hNxaYFAl9abc0UHGVoYWJrb3N0 QHJlZGhhdC5jb20ACgkQKAeTb5hNxaYU9Q/8CyK1w2SlItxBhos7zojqnZ9TP1Jt b1YCApQJ+bKSPAUDyefajQA0D9HeR9bFlreiOprQnmZWOqeOvnRIxNGvelJRqRRu KcIA5DIfVMJRkKJQEXairrGdnPmFLWSLEb7AmwxyAhp5G51PCP/3kbudi3T/vrNr OaccUejs5UgImPfO8Fm+0zqZPmblq/xmtU0p77FvDxGNFPPG8ddpu7eKksGD7FYd 5bTJTtUhONYG9EJMUD2TBxnJoy1pi6AYUu4+2T211RpBcxeiyNSSitI8fZTk6BGl 33VwQib9SXjGaE8VsSvHDHhLLec7sqqr2JH3rfvyKF6BOptKWzmSzFdbo2mrRkSy 8jfCImQgTBBMAHBWP+MFTeKuzfhikZx2DbBLzpppHMMvCca6Zc+oYgR2FbVwuPsw H2YL+8Wx4Ws6RXe147toNDRbv75vnS7F3fU800Pcur5VHJWTgSpT/tggzmVPWsdU GeUgceYlXyVk5/fC89ZhhtD9eurfBSzQR4eN7/nie2wD6PFMpZkOjHwLn40uWsyq xRO0F4uYghNU1N8z6NBhEYLTBtEcS1HFEisSLQrnTQH9W0I7mBx3MaZib/uK7NLC b2gT0hossTT8Z46Z8ynoZarwO5EquAMWEQtc9hfZGWacrQEpjVm2DMYMfu83krWb xhgl+mpKqVasAPk= =RjXc -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging QOM boilerplate cleanup Documentation build fix: * memory: Remove kernel-doc comment marker (Eduardo Habkost) QOM cleanups: * Rename QOM macros for consistency between TYPE_* and type checking constants (Eduardo Habkost) QOM new macros: * OBJECT_DECLARE_* and OBJECT_DEFINE_* macros (Daniel P. Berrangé) * DECLARE__CHECKER macros (Eduardo Habkost) Automated QOM boilerplate changes: Automated changes to use DECLARE__CHECKER (Eduardo Habkost Automated changes to use OBJECT_DECLARE* (Eduardo Habkost) # gpg: Signature made Thu 10 Sep 2020 19:17:49 BST # gpg: using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6 # gpg: issuer "ehabkost@redhat.com" # gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full] # Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6 * remotes/ehabkost/tags/machine-next-pull-request: (33 commits) virtio-vga: Use typedef name for instance_size vhost-user-vga: Use typedef name for instance_size xilinx_axienet: Use typedef name for instance_size lpc_ich9: Use typedef name for instance_size omap_intc: Use typedef name for instance_size xilinx_axidma: Use typedef name for instance_size tusb6010: Rename TUSB to TUSB6010 pc87312: Rename TYPE_PC87312_SUPERIO to TYPE_PC87312 vfio: Rename PCI_VFIO to VFIO_PCI usb: Rename USB_SERIAL_DEV to USB_SERIAL sabre: Rename SABRE_DEVICE to SABRE rs6000_mc: Rename RS6000MC_DEVICE to RS6000MC filter-rewriter: Rename FILTER_COLO_REWRITER to FILTER_REWRITER esp: Rename ESP_STATE to ESP ahci: Rename ICH_AHCI to ICH9_AHCI vmgenid: Rename VMGENID_DEVICE to TYPE_VMGENID vfio: Rename VFIO_AP_DEVICE_TYPE to TYPE_VFIO_AP_DEVICE dev-smartcard-reader: Rename CCID_DEV_NAME to TYPE_USB_CCID_DEV ap-device: Rename AP_DEVICE_TYPE to TYPE_AP_DEVICE gpex: Fix type checking function name ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-09-11 19:26:51 +01:00
Eduardo Habkost	01b4606440	vfio: Rename PCI_VFIO to VFIO_PCI Make the type checking macro name consistent with the TYPE_* constant. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20200902224311.1321159-56-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-09 13:20:22 -04:00
Eduardo Habkost	8b3a1ee5f2	vfio: Rename VFIO_AP_DEVICE_TYPE to TYPE_VFIO_AP_DEVICE This will make the type name constant consistent with the name of the type checking macro. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20200902224311.1321159-9-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-09 13:20:22 -04:00
Eduardo Habkost	fab2afff61	ap-device: Rename AP_DEVICE_TYPE to TYPE_AP_DEVICE This will make the type name constant consistent with the name of the type checking macro. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20200902224311.1321159-6-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-09 13:20:22 -04:00
Markus Armbruster	b15e402fc8	trace-events: Fix attribution of trace points to source Some trace points are attributed to the wrong source file. Happens when we neglect to update trace-events for code motion, or add events in the wrong place, or misspell the file name. Clean up with help of scripts/cleanup-trace-events.pl. Funnies requiring manual post-processing: * accel/tcg/cputlb.c trace points are in trace-events. * block.c and blockdev.c trace points are in block/trace-events. * hw/block/nvme.c uses the preprocessor to hide its trace point use from cleanup-trace-events.pl. * hw/tpm/tpm_spapr.c uses pseudo trace point tpm_spapr_show_buffer to guard debug code. * include/hw/xen/xen_common.h trace points are in hw/xen/trace-events. * linux-user/trace-events abbreviates a tedious list of filenames to /signal.c. net/colo-compare and net/filter-rewriter.c use pseudo trace points colo_compare_miscompare and colo_filter_rewriter_debug to guard debug code. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20200806141334.3646302-5-armbru@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-09-09 17:17:58 +01:00
Eduardo Habkost	8110fa1d94	Use DECLARE_CHECKER macros Generated using: $ ./scripts/codeconverter/converter.py -i \ --pattern=TypeCheckMacro $(git grep -l '' -- '*.[ch]') Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-12-ehabkost@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-13-ehabkost@redhat.com> Message-Id: <20200831210740.126168-14-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-09 09:27:09 -04:00
Eduardo Habkost	db1015e92e	Move QOM typedefs and add missing includes Some typedefs and macros are defined after the type check macros. This makes it difficult to automatically replace their definitions with OBJECT_DECLARE_TYPE. Patch generated using: $ ./scripts/codeconverter/converter.py -i \ --pattern=QOMStructTypedefSplit $(git grep -l '' -- '.[ch]') which will split "typdef struct { ... } TypedefName" declarations. Followed by: $ ./scripts/codeconverter/converter.py -i --pattern=MoveSymbols \ $(git grep -l '' -- '.[ch]') which will: - move the typedefs and #defines above the type check macros - add missing #include "qom/object.h" lines if necessary Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-9-ehabkost@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-10-ehabkost@redhat.com> Message-Id: <20200831210740.126168-11-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-09 09:26:43 -04:00
Chen Qun	9b83b0043f	vfio/platform: Remove dead assignment in vfio_intp_interrupt() Clang static code analyzer show warning: hw/vfio/platform.c:239:9: warning: Value stored to 'ret' is never read ret = event_notifier_test_and_clear(intp->interrupt); ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Message-Id: <20200827110311.164316-8-kuhn.chenqun@huawei.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2020-09-01 12:02:48 +02:00
Eduardo Habkost	42db0fb5e0	vfio/pci: Move QOM macros to header This will make future conversion to OBJECT_DECLARE* easier. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Tested-By: Roman Bolshakov <r.bolshakov@yadro.com> Message-Id: <20200825192110.3528606-43-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-08-27 14:04:55 -04:00
Pan Nengyuan	0216b18b79	hw/vfio/ap: Plug memleak in vfio_ap_get_group() Missing g_error_free() in vfio_ap_get_group() error path. Fix that. Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Message-Id: <20200814160241.7915-3-pannengyuan@huawei.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2020-08-27 12:37:03 +02:00
Marc-André Lureau	4f780d5629	meson: convert hw/vfio Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-21 06:30:26 -04:00
Paolo Bonzini	2becc36a3e	meson: infrastructure for building emulators Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-21 06:30:17 -04:00
Paolo Bonzini	243af0225a	trace: switch position of headers to what Meson requires Meson doesn't enjoy the same flexibility we have with Make in choosing the include path. In particular the tracing headers are using $(build_root)/$(<D). In order to keep the include directives unchanged, the simplest solution is to generate headers with patterns like "trace/trace-audio.h" and place forwarding headers in the source tree such that for example "audio/trace.h" includes "trace/trace-audio.h". This patch is too ugly to be applied to the Makefiles now. It's only a way to separate the changes to the tracing header files from the Meson rewrite of the tracing logic. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-21 06:18:24 -04:00
Gerd Hoffmann	8ec1415935	vfio: fix use-after-free in display Calling ramfb_display_update() might replace the DisplaySurface with the boot display, which in turn will free the currently active DisplaySurface. So clear our DisplaySurface pinter (dpy->region.surface pointer) to (a) avoid use-after-free and (b) force replacing the boot display with the real display when switching back. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Message-id: 20200713124520.23266-1-kraxel@redhat.com	2020-07-16 10:20:12 +02:00
Markus Armbruster	af175e85f9	error: Eliminate error_propagate() with Coccinelle, part 2 When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. The previous commit did that with a Coccinelle script I consider fairly trustworthy. This commit uses the same script with the matching of return taken out, i.e. we convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... } to if (!foo(..., errp)) { ... ... } This is unsound: @err could still be read between afterwards. I don't know how to express "no read of @err without an intervening write" in Coccinelle. Instead, I manually double-checked for uses of @err. Suboptimal line breaks tweaked manually. qdev_realize() simplified further to placate scripts/checkpatch.pl. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-36-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	668f62ec62	error: Eliminate error_propagate() with Coccinelle, part 1 When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. Convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... return ... } to if (!foo(..., errp)) { ... ... return ... } where nothing else needs @err. Coccinelle script: @rule1 forall@ identifier fun, err, errp, lbl; expression list args, args2; binary operator op; constant c1, c2; symbol false; @@ if ( ( - fun(args, &err, args2) + fun(args, errp, args2) \| - !fun(args, &err, args2) + !fun(args, errp, args2) \| - fun(args, &err, args2) op c1 + fun(args, errp, args2) op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; \| return c2; \| return false; ) } @rule2 forall@ identifier fun, err, errp, lbl; expression list args, args2; expression var; binary operator op; constant c1, c2; symbol false; @@ - var = fun(args, &err, args2); + var = fun(args, errp, args2); ... when != err if ( ( var \| !var \| var op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; \| return c2; \| return false; \| return var; ) } @depends on rule1 \|\| rule2@ identifier err; @@ - Error *err = NULL; ... when != err Not exactly elegant, I'm afraid. The "when != lbl:" is necessary to avoid transforming if (fun(args, &err)) { goto out } ... out: error_propagate(errp, err); even though other paths to label out still need the error_propagate(). For an actual example, see sclp_realize(). Without the "when strict", Coccinelle transforms vfio_msix_setup(), incorrectly. I don't know what exactly "when strict" does, only that it helps here. The match of return is narrower than what I want, but I can't figure out how to express "return where the operand doesn't use @err". For an example where it's too narrow, see vfio_intx_enable(). Silently fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Converted manually. Line breaks tidied up manually. One nested declaration of @local_err deleted manually. Preexisting unwanted blank line dropped in hw/riscv/sifive_e.c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-35-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	62a35aaa31	qapi: Use returned bool to check for failure, Coccinelle part The previous commit enables conversion of visit_foo(..., &err); if (err) { ... } to if (!visit_foo(..., errp)) { ... } for visitor functions that now return true / false on success / error. Coccinelle script: @@ identifier fun =~ "check_list\|input_type_enum\|lv_start_struct\|lv_type_bool\|lv_type_int64\|lv_type_str\|lv_type_uint64\|output_type_enum\|parse_type_bool\|parse_type_int64\|parse_type_null\|parse_type_number\|parse_type_size\|parse_type_str\|parse_type_uint64\|print_type_bool\|print_type_int64\|print_type_null\|print_type_number\|print_type_size\|print_type_str\|print_type_uint64\|qapi_clone_start_alternate\|qapi_clone_start_list\|qapi_clone_start_struct\|qapi_clone_type_bool\|qapi_clone_type_int64\|qapi_clone_type_null\|qapi_clone_type_number\|qapi_clone_type_str\|qapi_clone_type_uint64\|qapi_dealloc_start_list\|qapi_dealloc_start_struct\|qapi_dealloc_type_anything\|qapi_dealloc_type_bool\|qapi_dealloc_type_int64\|qapi_dealloc_type_null\|qapi_dealloc_type_number\|qapi_dealloc_type_str\|qapi_dealloc_type_uint64\|qobject_input_check_list\|qobject_input_check_struct\|qobject_input_start_alternate\|qobject_input_start_list\|qobject_input_start_struct\|qobject_input_type_any\|qobject_input_type_bool\|qobject_input_type_bool_keyval\|qobject_input_type_int64\|qobject_input_type_int64_keyval\|qobject_input_type_null\|qobject_input_type_number\|qobject_input_type_number_keyval\|qobject_input_type_size_keyval\|qobject_input_type_str\|qobject_input_type_str_keyval\|qobject_input_type_uint64\|qobject_input_type_uint64_keyval\|qobject_output_start_list\|qobject_output_start_struct\|qobject_output_type_any\|qobject_output_type_bool\|qobject_output_type_int64\|qobject_output_type_null\|qobject_output_type_number\|qobject_output_type_str\|qobject_output_type_uint64\|start_list\|visit_check_list\|visit_check_struct\|visit_start_alternate\|visit_start_list\|visit_start_struct\|visit_type_."; expression list args; typedef Error; Error err; @@ - fun(args, &err); - if (err) + if (!fun(args, &err)) { ... } A few line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-19-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
David Hildenbrand	aff92b8286	vfio: Convert to ram_block_discard_disable() VFIO is (except devices without a physical IOMMU or some mediated devices) incompatible with discarding of RAM. The kernel will pin basically all VM memory. Let's convert to ram_block_discard_disable(), which can now fail, in contrast to qemu_balloon_inhibit(). Leave "x-balloon-allowed" named as it is for now. Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Tony Krowiak <akrowiak@linux.ibm.com> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Pierre Morel <pmorel@linux.ibm.com> Cc: Eric Farman <farman@linux.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20200626072248.78761-4-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-07-02 05:54:59 -04:00
Farhan Ali	f030532f2a	vfio-ccw: Add support for the CRW region and IRQ The crw region can be used to obtain information about Channel Report Words (CRW) from vfio-ccw driver. Currently only channel-path related CRWs are passed to QEMU from vfio-ccw driver. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505125757.98209-7-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2020-06-18 12:13:54 +02:00
Eric Farman	690e29b911	vfio-ccw: Refactor ccw irq handler Make it easier to add new ones in the future. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505125757.98209-5-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2020-06-18 12:13:54 +02:00
Farhan Ali	46ea3841ed	vfio-ccw: Add support for the schib region The schib region can be used to obtain the latest SCHIB from the host passthrough subchannel. Since the guest SCHIB is virtualized, we currently only update the path related information so that the guest is aware of any path related changes when it issues the 'stsch' instruction. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505125757.98209-4-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2020-06-18 12:13:54 +02:00
Eric Farman	2a3b9cbaa7	vfio-ccw: Refactor cleanup of regions While we're at it, add a g_free() for the async_cmd_region that is the last thing currently created. g_free() knows how to handle NULL pointers, so this makes it easier to remember what cleanups need to be performed when new regions are added. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20200505125757.98209-3-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2020-06-18 12:13:54 +02:00
Peter Maydell	7d3660e798	* Miscellaneous fixes and feature enablement (many) * SEV refactoring (David) * Hyper-V initial support (Jon) * i386 TCG fixes (x87 and SSE, Joseph) * vmport cleanup and improvements (Philippe, Liran) * Use-after-free with vCPU hot-unplug (Nengyuan) * run-coverity-scan improvements (myself) * Record/replay fixes (Pavel) * -machine kernel_irqchip=split improvements for INTx (Peter) * Code cleanups (Philippe) * Crash and security fixes (PJP) * HVF cleanups (Roman) -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl7jpdAUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMfjwf/X7+0euuE9dwKFKDDMmIi+4lRWnq7 gSOyE1BYSfDIUXRIukf64konXe0VpiotNYlyEaYnnQjkMdGm5E9iXKF+LgEwXj/t NSGkfj5J3VeWRG4JJp642CSN/aZWO8uzkenld3myCnu6TicuN351tDJchiFwAk9f wsXtgLKd67zE8MLVt8AP0rNTbzMHttPXnPaOXDCuwjMHNvMEKnC93UeOeM0M4H5s 3Dl2HvsNWZ2SzUG9mAbWp0bWWuoIb+Ep9//87HWANvb7Z8jratRws18i6tYt1sPx 8zOnUS87sVnh1CQlXBDd9fEcqBUVgR9pAlqaaYavNhFp5eC31euvpDU8Iw== =F4sU -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * Miscellaneous fixes and feature enablement (many) * SEV refactoring (David) * Hyper-V initial support (Jon) * i386 TCG fixes (x87 and SSE, Joseph) * vmport cleanup and improvements (Philippe, Liran) * Use-after-free with vCPU hot-unplug (Nengyuan) * run-coverity-scan improvements (myself) * Record/replay fixes (Pavel) * -machine kernel_irqchip=split improvements for INTx (Peter) * Code cleanups (Philippe) * Crash and security fixes (PJP) * HVF cleanups (Roman) # gpg: Signature made Fri 12 Jun 2020 16:57:04 BST # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (116 commits) target/i386: Remove obsolete TODO file stubs: move Xen stubs to accel/ replay: fix replay shutdown for console mode exec/cpu-common: Move MUSB specific typedefs to 'hw/usb/hcd-musb.h' hw/usb: Move device-specific declarations to new 'hcd-musb.h' header exec/memory: Remove unused MemoryRegionMmio type checkpatch: reversed logic with acpi test checks target/i386: sev: Unify SEVState and SevGuestState target/i386: sev: Remove redundant handle field target/i386: sev: Remove redundant policy field target/i386: sev: Remove redundant cbitpos and reduced_phys_bits fields target/i386: sev: Partial cleanup to sev_state global target/i386: sev: Embed SEVState in SevGuestState target/i386: sev: Rename QSevGuestInfo target/i386: sev: Move local structure definitions into .c file target/i386: sev: Remove unused QSevGuestInfoClass xen: fix build without pci passthrough i386: hvf: Drop HVFX86EmulatorState i386: hvf: Move mmio_buf into CPUX86State i386: hvf: Move lazy_flags into CPUX86State ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # hw/i386/acpi-build.c	2020-06-12 23:06:22 +01:00
Thomas Huth	643a4eacef	hw/vfio/pci-quirks: Fix broken legacy IGD passthrough The #ifdef CONFIG_VFIO_IGD in pci-quirks.c is not working since the required header config-devices.h is not included, so that the legacy IGD passthrough is currently broken. Let's include the right header to fix this issue. Buglink: https://bugs.launchpad.net/qemu/+bug/1882784 Fixes: `29d62771c8` ("hw/vfio: Move the IGD quirk code to a separate file") Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-06-11 11:36:40 -06:00
Jon Derrick	ee7932b0bb	hw/vfio: Add VMD Passthrough Quirk The VMD endpoint provides a real PCIe domain to the guest, including bridges and endpoints. Because the VMD domain is enumerated by the guest kernel, the guest kernel will assign Guest Physical Addresses to the downstream endpoint BARs and bridge windows. When the guest kernel performs MMIO to VMD sub-devices, MMU will translate from the guest address space to the physical address space. Because the bridges have been programmed with guest addresses, the bridges will reject the transaction containing physical addresses. VMD device 28C0 natively assists passthrough by providing the Host Physical Address in shadow registers accessible to the guest for bridge window assignment. The shadow registers are valid if bit 1 is set in VMD VMLOCK config register 0x70. In order to support existing VMDs, this quirk provides the shadow registers in a vendor-specific PCI capability to the vfio-passthrough device for all VMD device ids which don't natively assist with passthrough. The Linux VMD driver is updated to check for this new vendor-specific capability. Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-06-11 11:36:39 -06:00
Peter Xu	97a3757616	vfio/pci: Use kvm_irqchip_add_irqfd_notifier_gsi() for irqfds VFIO is currently the only one left that is not using the generic function (kvm_irqchip_add_irqfd_notifier_gsi()) to register irqfds. Let VFIO use the common framework too. Follow up patches will introduce extra features for kvm irqfd, so that VFIO can easily leverage that after the switch. Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20200318145204.74483-3-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-10 12:10:28 -04:00
Jared Rossi	24e58a7b1d	vfio-ccw: allow non-prefetch ORBs Remove the explicit prefetch check when using vfio-ccw devices. This check does not trigger in practice as all Linux channel programs are intended to use prefetch. Newer Linux kernel versions do not require to force the PFCH flag with vfio-ccw devices anymore. Signed-off-by: Jared Rossi <jrossi@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20200512181535.18630-2-jrossi@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2020-06-05 17:13:11 +02:00
Leonardo Bras	9c7c040702	vfio/nvlink: Remove exec permission to avoid SELinux AVCs If SELinux is setup without 'execmem' permission for qemu, all mmap with (PROT_WRITE \| PROT_EXEC) will fail and print a warning in SELinux log. If "nvlink2-mr" memory allocation fails (fist diff), it will cause guest NUMA nodes to not be correctly configured (V100 memory will not be visible for guest, nor its NUMA nodes). Not having 'execmem' permission is intesting for virtual machines to avoid buffer-overflow based attacks, and it's adopted in distros like RHEL. So, removing the PROT_EXEC flag seems the right thing to do. Browsing some other code that mmaps memory for usage with memory_region_init_ram_device_ptr, I could notice it's usual to not have PROT_EXEC (only PROT_READ \| PROT_WRITE), so it should be no problem around this. Signed-off-by: Leonardo Bras <leobras.c@gmail.com> Message-Id: <20200501055448.286518-1-leobras.c@gmail.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-05-27 15:29:36 +10:00
Gerd Hoffmann	2fc979cb9d	Revert "hw/display/ramfb: initialize fw-config space with xres/ yres" This reverts commit `f79081b4b7`. Patch has broken byteorder handling: RAMFBCfg fields are in bigendian byteorder, the reset function doesn't care so native byteorder is used instead. Given this went unnoticed so far the feature is obviously unused, so just revert the patch. Cc: Hou Qiming <hqm03ster@gmail.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Message-id: 20200429115236.28709-2-kraxel@redhat.com	2020-05-18 15:42:34 +02:00
Markus Armbruster	b69c3c21a5	qdev: Unrealize must not fail Devices may have component devices and buses. Device realization may fail. Realization is recursive: a device's realize() method realizes its components, and device_set_realized() realizes its buses (which should in turn realize the devices on that bus, except bus_set_realized() doesn't implement that, yet). When realization of a component or bus fails, we need to roll back: unrealize everything we realized so far. If any of these unrealizes failed, the device would be left in an inconsistent state. Must not happen. device_set_realized() lets it happen: it ignores errors in the roll back code starting at label child_realize_fail. Since realization is recursive, unrealization must be recursive, too. But how could a partly failed unrealize be rolled back? We'd have to re-realize, which can fail. This design is fundamentally broken. device_set_realized() does not roll back at all. Instead, it keeps unrealizing, ignoring further errors. It can screw up even for a device with no buses: if the lone dc->unrealize() fails, it still unregisters vmstate, and calls listeners' unrealize() callback. bus_set_realized() does not roll back either. Instead, it stops unrealizing. Fortunately, no unrealize method can fail, as we'll see below. To fix the design error, drop parameter @errp from all the unrealize methods. Any unrealize method that uses @errp now needs an update. This leads us to unrealize() methods that can fail. Merely passing it to another unrealize method cannot cause failure, though. Here are the ones that do other things with @errp: * virtio_serial_device_unrealize() Fails when qbus_set_hotplug_handler() fails, but still does all the other work. On failure, the device would stay realized with its resources completely gone. Oops. Can't happen, because qbus_set_hotplug_handler() can't actually fail here. Pass &error_abort to qbus_set_hotplug_handler() instead. * hw/ppc/spapr_drc.c's unrealize() Fails when object_property_del() fails, but all the other work is already done. On failure, the device would stay realized with its vmstate registration gone. Oops. Can't happen, because object_property_del() can't actually fail here. Pass &error_abort to object_property_del() instead. * spapr_phb_unrealize() Fails and bails out when remove_drcs() fails, but other work is already done. On failure, the device would stay realized with some of its resources gone. Oops. remove_drcs() fails only when chassis_from_bus()'s object_property_get_uint() fails, and it can't here. Pass &error_abort to remove_drcs() instead. Therefore, no unrealize method can fail before this patch. device_set_realized()'s recursive unrealization via bus uses object_property_set_bool(). Can't drop @errp there, so pass &error_abort. We similarly unrealize with object_property_set_bool() elsewhere, always ignoring errors. Pass &error_abort instead. Several unrealize methods no longer handle errors from other unrealize methods: virtio_9p_device_unrealize(), virtio_input_device_unrealize(), scsi_qdev_unrealize(), ... Much of the deleted error handling looks wrong anyway. One unrealize methods no longer ignore such errors: usb_ehci_pci_exit(). Several realize methods no longer ignore errors when rolling back: v9fs_device_realize_common(), pci_qdev_unrealize(), spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(), virtio_device_realize(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-17-armbru@redhat.com>	2020-05-15 07:08:14 +02:00
Markus Armbruster	40c2281cc3	Drop more @errp parameters after previous commit Several functions can't fail anymore: ich9_pm_add_properties(), device_add_bootindex_property(), ppc_compat_add_property(), spapr_caps_add_properties(), PropertyInfo.create(). Drop their @errp parameter. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-16-armbru@redhat.com>	2020-05-15 07:08:14 +02:00
Markus Armbruster	d2623129a7	qom: Drop parameter @errp of object_property_add() & friends The only way object_property_add() can fail is when a property with the same name already exists. Since our property names are all hardcoded, failure is a programming error, and the appropriate way to handle it is passing &error_abort. Same for its variants, except for object_property_add_child(), which additionally fails when the child already has a parent. Parentage is also under program control, so this is a programming error, too. We have a bit over 500 callers. Almost half of them pass &error_abort, slightly fewer ignore errors, one test case handles errors, and the remaining few callers pass them to their own callers. The previous few commits demonstrated once again that ignoring programming errors is a bad idea. Of the few ones that pass on errors, several violate the Error API. The Error ** argument must be NULL, &error_abort, &error_fatal, or a pointer to a variable containing NULL. Passing an argument of the latter kind twice without clearing it in between is wrong: if the first call sets an error, it no longer points to NULL for the second call. ich9_pm_add_properties(), sparc32_ledma_realize(), sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize() are wrong that way. When the one appropriate choice of argument is &error_abort, letting users pick the argument is a bad idea. Drop parameter @errp and assert the preconditions instead. There's one exception to "duplicate property name is a programming error": the way object_property_add() implements the magic (and undocumented) "automatic arrayification". Don't drop @errp there. Instead, rename object_property_add() to object_property_try_add(), and add the obvious wrapper object_property_add(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-15-armbru@redhat.com> [Two semantic rebase conflicts resolved]	2020-05-15 07:07:58 +02:00
Daniel Brodsky	6e8a355de6	lockable: replaced locks with lock guard macros where appropriate - ran regexp "qemu_mutex_lock$.$.\n.*if" to find targets - replaced result with QEMU_LOCK_GUARD if all unlocks at function end - replaced result with WITH_QEMU_LOCK_GUARD if unlock not at end Signed-off-by: Daniel Brodsky <dnbrdsky@gmail.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-id: 20200404042108.389635-3-dnbrdsky@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-05-04 16:07:43 +01:00
Alexey Kardashevskiy	79178edd2a	vfio/spapr: Fix page size calculation Coverity detected an issue (CID 1421903) with potential call of clz64(0) which returns 64 which make it do "<<" with a negative number. This checks the mask and avoids undefined behaviour. In practice pgsizes and memory_region_iommu_get_min_page_size() always have some common page sizes and even if they did not, the resulting page size would be 0x8000.0000.0000.0000 (gcc 9.2) and ioctl(VFIO_IOMMU_SPAPR_TCE_CREATE) would fail anyway. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20200324063912.25063-1-aik@ozlabs.ru> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-04-07 08:55:10 +10:00
Philippe Mathieu-Daudé	180f3fd2d7	hw/vfio/display: Remove superfluous semicolon Fixes: `8b818e059b` Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200218094402.26625-9-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2020-02-18 20:20:49 +01:00
Michal Privoznik	b09d51c909	Report stringified errno in VFIO related errors In a few places we report errno formatted as a negative integer. This is not as user friendly as it can be. Use strerror() and/or error_setg_errno() instead. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Message-Id: <4949c3ecf1a32189b8a4b5eb4b0fd04c1122501d.1581674006.git.mprivozn@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2020-02-18 20:20:49 +01:00
Thomas Huth	29d62771c8	hw/vfio: Move the IGD quirk code to a separate file The IGD quirk code defines a separate device, the so-called "vfio-pci-igd-lpc-bridge" which shows up as a user-creatable device in all QEMU binaries that include the vfio code. This is a little bit unfortunate for two reasons: First, this device is completely useless in binaries like qemu-system-s390x. Second we also would like to disable it in downstream RHEL which currently requires some extra patches there since the device does not have a proper Kconfig-style switch yet. So it would be good if the device could be disabled more easily, thus let's move the code to a separate file instead and introduce a proper Kconfig switch for it which gets only enabled by default if we also have CONFIG_PC_PCI enabled. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-02-06 11:55:42 -07:00
Marc-André Lureau	4f67d30b5e	qdev: set properties with device_class_set_props() The following patch will need to handle properties registration during class_init time. Let's use a device_class_set_props() setter. spatch --macro-file scripts/cocci-macro-file.h --sp-file ./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place --dir . @@ typedef DeviceClass; DeviceClass *d; expression val; @@ - d->props = val + device_class_set_props(d, val) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20200110153039.1379601-20-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-01-24 20:59:15 +01:00
Peter Xu	0446f81217	vfio/pci: Don't remove irqchip notifier if not registered The kvm irqchip notifier is only registered if the device supports INTx, however it's unconditionally removed. If the assigned device does not support INTx, this will cause QEMU to crash when unplugging the device from the system. Change it to conditionally remove the notifier only if the notify hook is setup. CC: Eduardo Habkost <ehabkost@redhat.com> CC: David Gibson <david@gibson.dropbear.id.au> CC: Alex Williamson <alex.williamson@redhat.com> Cc: qemu-stable@nongnu.org # v4.2 Reported-by: yanghliu@redhat.com Debugged-by: Eduardo Habkost <ehabkost@redhat.com> Fixes: `c5478fea27` ("vfio/pci: Respond to KVM irqchip change notifier") Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1782678 Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-01-06 14:19:42 -07:00
Vladimir Sementsov-Ogievskiy	b5e45b0f48	hw/vfio/ap: drop local_err from vfio_ap_realize No reason for local_err here, use errp directly instead. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20191205174635.18758-21-vsementsov@virtuozzo.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2019-12-18 08:43:19 +01:00
Boris Fiuczynski	91f751dc11	vfio-ccw: Fix error message Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Message-Id: <20191128143015.5231-1-fiuczy@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2019-12-14 10:22:38 +01:00
David Gibson	c5478fea27	vfio/pci: Respond to KVM irqchip change notifier VFIO PCI devices already respond to the pci intx routing notifier, in order to update kernel irqchip mappings when routing is updated. However this won't handle the case where the irqchip itself is replaced by a different model while retaining the same routing. This case can happen on the pseries machine type due to PAPR feature negotiation. To handle that case, add a handler for the irqchip change notifier, which does much the same thing as the routing notifier, but is unconditional, rather than being a no-op when the routing hasn't changed. Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Alex Williamson <alex.williamson@redhat.com>	2019-11-26 10:11:30 +11:00
David Gibson	ad54dbd89d	vfio/pci: Split vfio_intx_update() This splits the vfio_intx_update() function into one part doing the actual reconnection with the KVM irqchip (vfio_intx_update(), now taking an argument with the new routing) and vfio_intx_routing_notifier() which handles calls to the pci device intx routing notifier and calling vfio_intx_update() when necessary. This will make adding support for the irqchip change notifier easier. Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Alex Williamson <alex.williamson@redhat.com>	2019-11-26 10:11:30 +11:00
Paolo Bonzini	29b95c992a	vfio: vfio-pci requires EDID hw/vfio/display.c needs the EDID subsystem, select it. Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2019-11-18 10:41:49 -07:00
Jens Freimann	ed92369a57	vfio: don't ignore return value of migrate_add_blocker When an error occurs in migrate_add_blocker() it sets a negative return value and uses error pointer we pass in. Instead of just looking at the error pointer check for a negative return value and avoid a coverity error because the return value is set but never used. This fixes CID 1407219. Reported-by: Coverity (CID 1407219) Fixes: `f045a0104c` ("vfio: unplug failover primary device before migration") Signed-off-by: Jens Freimann <jfreimann@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2019-11-18 10:41:48 -07:00
Michal Privoznik	1335d64323	hw/vfio/pci: Fix double free of migration_blocker When user tries to hotplug a VFIO device, but the operation fails somewhere in the middle (in my testing it failed because of RLIMIT_MEMLOCK forbidding more memory allocation), then a double free occurs. In vfio_realize() the vdev->migration_blocker is allocated, then something goes wrong which causes control to jump onto 'error' label where the error is freed. But the pointer is left pointing to invalid memory. Later, when vfio_instance_finalize() is called, the memory is freed again. In my testing the second hunk was sufficient to fix the bug, but I figured the first hunk doesn't hurt either. ==169952== Invalid read of size 8 ==169952== at 0xA47DCD: error_free (error.c:266) ==169952== by 0x4E0A18: vfio_instance_finalize (pci.c:3040) ==169952== by 0x8DF74C: object_deinit (object.c:606) ==169952== by 0x8DF7BE: object_finalize (object.c:620) ==169952== by 0x8E0757: object_unref (object.c:1074) ==169952== by 0x45079C: memory_region_unref (memory.c:1779) ==169952== by 0x45376B: do_address_space_destroy (memory.c:2793) ==169952== by 0xA5C600: call_rcu_thread (rcu.c:283) ==169952== by 0xA427CB: qemu_thread_start (qemu-thread-posix.c:519) ==169952== by 0x80A8457: start_thread (in /lib64/libpthread-2.29.so) ==169952== by 0x81C96EE: clone (in /lib64/libc-2.29.so) ==169952== Address 0x143137e0 is 0 bytes inside a block of size 48 free'd ==169952== at 0x4A342BB: free (vg_replace_malloc.c:530) ==169952== by 0xA47E05: error_free (error.c:270) ==169952== by 0x4E0945: vfio_realize (pci.c:3025) ==169952== by 0x76A4FF: pci_qdev_realize (pci.c:2099) ==169952== by 0x689B9A: device_set_realized (qdev.c:876) ==169952== by 0x8E2C80: property_set_bool (object.c:2080) ==169952== by 0x8E0EF6: object_property_set (object.c:1272) ==169952== by 0x8E3FC8: object_property_set_qobject (qom-qobject.c:26) ==169952== by 0x8E11DB: object_property_set_bool (object.c:1338) ==169952== by 0x5E7BDD: qdev_device_add (qdev-monitor.c:673) ==169952== by 0x5E81E5: qmp_device_add (qdev-monitor.c:798) ==169952== by 0x9E18A8: do_qmp_dispatch (qmp-dispatch.c:132) ==169952== Block was alloc'd at ==169952== at 0x4A35476: calloc (vg_replace_malloc.c:752) ==169952== by 0x51B1158: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.6000.6) ==169952== by 0xA47357: error_setv (error.c:61) ==169952== by 0xA475D9: error_setg_internal (error.c:97) ==169952== by 0x4DF8C2: vfio_realize (pci.c:2737) ==169952== by 0x76A4FF: pci_qdev_realize (pci.c:2099) ==169952== by 0x689B9A: device_set_realized (qdev.c:876) ==169952== by 0x8E2C80: property_set_bool (object.c:2080) ==169952== by 0x8E0EF6: object_property_set (object.c:1272) ==169952== by 0x8E3FC8: object_property_set_qobject (qom-qobject.c:26) ==169952== by 0x8E11DB: object_property_set_bool (object.c:1338) ==169952== by 0x5E7BDD: qdev_device_add (qdev-monitor.c:673) Fixes: `f045a0104c` ("vfio: unplug failover primary device before migration") Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2019-11-18 10:41:47 -07:00
Jens Freimann	f045a0104c	vfio: unplug failover primary device before migration As usual block all vfio-pci devices from being migrated, but make an exception for failover primary devices. This is achieved by setting unmigratable to 0 but also add a migration blocker for all vfio-pci devices except failover primary devices. These will be unplugged before migration happens by the migration handler of the corresponding virtio-net standby device. Signed-off-by: Jens Freimann <jfreimann@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Message-Id: <20191029114905.6856-12-jfreimann@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-10-29 18:55:26 -04:00
Wei Yang	038adc2f58	core: replace getpagesize() with qemu_real_host_page_size There are three page size in qemu: real host page size host page size target page size All of them have dedicate variable to represent. For the last two, we use the same form in the whole qemu project, while for the first one we use two forms: qemu_real_host_page_size and getpagesize(). qemu_real_host_page_size is defined to be a replacement of getpagesize(), so let it serve the role. [Note] Not fully tested for some arch or device. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-26 15:38:06 +02:00
Evgeny Yakovlev	d964d3b5ab	hw/vfio/pci: fix double free in vfio_msi_disable The following guest behaviour patter leads to double free in VFIO PCI: 1. Guest enables MSI interrupts vfio_msi_enable is called, but fails in vfio_enable_vectors. In our case this was because VFIO GPU device was in D3 state. Unhappy path in vfio_msi_enable will g_free(vdev->msi_vectors) but not set this pointer to NULL 2. Guest still sees MSI an enabled after that because emulated config write is done in vfio_pci_write_config unconditionally before calling vfio_msi_enable 3. Guest disables MSI interrupts vfio_msi_disable is called and tries to g_free(vdev->msi_vectors) in vfio_msi_disable_common => double free Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2019-10-10 11:07:28 -06:00
Eric Auger	549d400587	memory: allow memory_region_register_iommu_notifier() to fail Currently, when a notifier is attempted to be registered and its flags are not supported (especially the MAP one) by the IOMMU MR, we generally abruptly exit in the IOMMU code. The failure could be handled more nicely in the caller and especially in the VFIO code. So let's allow memory_region_register_iommu_notifier() to fail as well as notify_flag_changed() callback. All sites implementing the callback are updated. This patch does not yet remove the exit(1) in the amd_iommu code. in SMMUv3 we turn the warning message into an error message saying that the assigned device would not work properly. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-04 18:49:18 +02:00
Eric Auger	d7d8783647	vfio: Turn the container error into an Error handle The container error integer field is currently used to store the first error potentially encountered during any vfio_listener_region_add() call. However this fails to propagate detailed error messages up to the vfio_connect_container caller. Instead of using an integer, let's use an Error handle. Messages are slightly reworded to accomodate the propagation. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-04 18:49:18 +02:00
Chen Zhang	f75ca62723	vfio: fix a typo Signed-off-by: Chen Zhang <tgfbeta@me.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <8E5A9C27-C76D-46CF-85B0-79121A00B05F@me.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2019-09-19 11:50:37 +02:00
Tony Nguyen	d5d680cacc	memory: Access MemoryRegion with endianness Preparation for collapsing the two byte swaps adjust_endianness and handle_bswap into the former. Call memory_region_dispatch_{read\|write} with endianness encoded into the "MemOp op" operand. This patch does not change any behaviour as memory_region_dispatch_{read\|write} is yet to handle the endianness. Once it does handle endianness, callers with byte swaps can collapse them into adjust_endianness. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Tony Nguyen <tony.nguyen@bt.com> Message-Id: <8066ab3eb037c0388dfadfe53c5118429dd1de3a.1566466906.git.tony.nguyen@bt.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-09-03 08:30:39 -07:00
Tony Nguyen	475fbf0a3c	hw/vfio: Access MemoryRegion with MemOp The memory_region_dispatch_{read\|write} operand "unsigned size" is being converted into a "MemOp op". Convert interfaces by using no-op size_memop. After all interfaces are converted, size_memop will be implemented and the memory_region_dispatch_{read\|write} operand "unsigned size" will be converted into a "MemOp op". As size_memop is a no-op, this patch does not change any behaviour. Signed-off-by: Tony Nguyen <tony.nguyen@bt.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <e70ff5814ac3656974180db6375397c43b0bc8b8.1566466906.git.tony.nguyen@bt.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-09-03 08:30:38 -07:00
Markus Armbruster	54d31236b9	sysemu: Split sysemu/runstate.h off sysemu/sysemu.h sysemu/sysemu.h is a rather unfocused dumping ground for stuff related to the system-emulator. Evidence: * It's included widely: in my "build everything" tree, changing sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h, down from 5400 due to the previous two commits). * It pulls in more than a dozen additional headers. Split stuff related to run state management into its own header sysemu/runstate.h. Touching sysemu/sysemu.h now recompiles some 850 objects. qemu/uuid.h also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400 to 4200. Touching new sysemu/runstate.h recompiles some 500 objects. Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also add qemu/main-loop.h. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-30-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> [Unbreak OS-X build]	2019-08-16 13:37:36 +02:00
Markus Armbruster	d5938f29fe	Clean up inclusion of sysemu/sysemu.h In my "build everything" tree, changing sysemu/sysemu.h triggers a recompile of some 5400 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Almost a third of its inclusions are actually superfluous. Delete them. Downgrade two more to qapi/qapi-types-run-state.h, and move one from char/serial.h to char/serial.c. hw/semihosting/config.c, monitor/monitor.c, qdev-monitor.c, and stubs/semihost.c define variables declared in sysemu/sysemu.h without including it. The compiler is cool with that, but include it anyway. This doesn't reduce actual use much, as it's still included into widely included headers. The next commit will tackle that. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-27-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>	2019-08-16 13:31:53 +02:00
Markus Armbruster	a27bd6c779	Include hw/qdev-properties.h less In my "build everything" tree, changing hw/qdev-properties.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Many places including hw/qdev-properties.h (directly or via hw/qdev.h) actually need only hw/qdev-core.h. Include hw/qdev-core.h there instead. hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h and hw/qdev-properties.h, which in turn includes hw/qdev-core.h. Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h. While there, delete a few superfluous inclusions of hw/qdev-core.h. Touching hw/qdev-properties.h now recompiles some 1200 objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190812052359.30071-22-armbru@redhat.com>	2019-08-16 13:31:53 +02:00
Markus Armbruster	db72581598	Include qemu/main-loop.h less In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:52 +02:00
Markus Armbruster	dc5e9ac716	Include qemu/queue.h slightly less Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190812052359.30071-20-armbru@redhat.com>	2019-08-16 13:31:52 +02:00

... 2 3 4 5 6 ...

672 Commits