mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Peter Maydell	6fc6931231	virtio,pc,pci: features, cleanups, fixes more memslots support in libvhost-user support PCIe Gen5/Gen6 link speeds in pcie more traces in vdpa network simulation devices support in vdpa SMBIOS type 9 descriptor implementation Bump max_cpus to 4096 vcpus in q35 aw-bits and granule options in VIRTIO-IOMMU Support report NUMA nodes for device memory using GI in acpi Beginning of shutdown event support in pvpanic fixes, cleanups all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmXw0TMPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRp8x4H+gLMoGwaGAX7gDGPgn2Ix4j/3kO77ZJ9X9k/ 1KqZu/9eMS1j2Ei+vZqf05w7qRjxxhwDq3ilEXF/+UFqgAehLqpRRB8j5inqvzYt +jv0DbL11PBp/oFjWcytm5CbiVsvq8KlqCF29VNzc162XdtcduUOWagL96y8lJfZ uPrOoyeR7SMH9lp3LLLHWgu+9W4nOS03RroZ6Umj40y5B7yR0Rrppz8lMw5AoQtr 0gMRnFhYXeiW6CXdz+Tzcr7XfvkkYDi/j7ibiNSURLBfOpZa6Y8+kJGKxz5H1K1G 6ZY4PBcOpQzl+NMrktPHogczgJgOK10t+1i/R3bGZYw2Qn/93Eg= =C0UU -----END PGP SIGNATURE----- Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging virtio,pc,pci: features, cleanups, fixes more memslots support in libvhost-user support PCIe Gen5/Gen6 link speeds in pcie more traces in vdpa network simulation devices support in vdpa SMBIOS type 9 descriptor implementation Bump max_cpus to 4096 vcpus in q35 aw-bits and granule options in VIRTIO-IOMMU Support report NUMA nodes for device memory using GI in acpi Beginning of shutdown event support in pvpanic fixes, cleanups all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmXw0TMPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRp8x4H+gLMoGwaGAX7gDGPgn2Ix4j/3kO77ZJ9X9k/ # 1KqZu/9eMS1j2Ei+vZqf05w7qRjxxhwDq3ilEXF/+UFqgAehLqpRRB8j5inqvzYt # +jv0DbL11PBp/oFjWcytm5CbiVsvq8KlqCF29VNzc162XdtcduUOWagL96y8lJfZ # uPrOoyeR7SMH9lp3LLLHWgu+9W4nOS03RroZ6Umj40y5B7yR0Rrppz8lMw5AoQtr # 0gMRnFhYXeiW6CXdz+Tzcr7XfvkkYDi/j7ibiNSURLBfOpZa6Y8+kJGKxz5H1K1G # 6ZY4PBcOpQzl+NMrktPHogczgJgOK10t+1i/R3bGZYw2Qn/93Eg= # =C0UU # -----END PGP SIGNATURE----- # gpg: Signature made Tue 12 Mar 2024 22:03:31 GMT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (68 commits) docs/specs/pvpanic: document shutdown event hw/cxl: Fix missing reserved data in CXL Device DVSEC hmat acpi: Fix out of bounds access due to missing use of indirection hmat acpi: Do not add Memory Proximity Domain Attributes Structure targetting non existent memory. qemu-options.hx: Document the virtio-iommu-pci aw-bits option hw/arm/virt: Set virtio-iommu aw-bits default value to 48 hw/i386/q35: Set virtio-iommu aw-bits default value to 39 virtio-iommu: Add an option to define the input range width virtio-iommu: Trace domain range limits as unsigned int qemu-options.hx: Document the virtio-iommu-pci granule option virtio-iommu: Change the default granule to the host page size virtio-iommu: Add a granule property hw/i386/acpi-build: Add support for SRAT Generic Initiator structures hw/acpi: Implement the SRAT GI affinity structure qom: new object to associate device to NUMA node hw/i386/pc: Inline pc_cmos_init() into pc_cmos_init_late() and remove it hw/i386/pc: Set "normal" boot device order in pc_basic_device_init() hw/i386/pc: Avoid one use of the current_machine global hw/i386/pc: Remove "rtc_state" link again Revert "hw/i386/pc: Confine system flash handling to pc_sysfw" ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # hw/core/machine.c	2024-03-13 15:11:53 +00:00
Lukas Stockner	c08da86dc4	pcie: Support PCIe Gen5/Gen6 link speeds This patch extends the PCIe link speed option so that slots can be configured as supporting 32GT/s (Gen5) or 64GT/s (Gen5) speeds. This is as simple as setting the appropriate bit in LnkCap2 and the appropriate value in LnkCap and LnkCtl2. Signed-off-by: Lukas Stockner <lstockner@genesiscloud.com> Message-Id: <20240215012326.3272366-1-lstockner@genesiscloud.com> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2024-03-12 17:56:55 -04:00
Vladimir Sementsov-Ogievskiy	86f0aa1d43	hw/pci: add some convenient trace-events for pcie and shpc hotplug Add trace-events that may help to debug problems with hotplugging. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240301154146.761531-2-vsementsov@yandex-team.ru> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2024-03-11 22:09:42 +01:00
Akihiko Odaki	7c228c5f33	pcie: Specify 0 for ARI next function numbers The current implementers of ARI are all SR-IOV devices. The ARI next function number field is undefined for VF according to PCI Express Base Specification Revision 5.0 Version 1.0 section 9.3.7.7. The PF still requires some defined value so end the linked list formed with the field by specifying 0 as required for any ARI implementation according to section 7.8.7.2. For migration, the field will keep having 1 as its value on the old QEMU machine versions. Fixes: `2503461691` ("pcie: Add some SR/IOV API documentation in docs/pcie_sriov.txt") Fixes: `44c2c09488` ("hw/nvme: Add support for SR-IOV") Fixes: `3a977deebe` ("Intrdocue igb device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Message-Id: <20230710153838.33917-3-akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-07-10 18:59:32 -04:00
Akihiko Odaki	445416e301	pcie: Use common ARI next function number Currently the only implementers of ARI is SR-IOV devices, and they behave similar. Share the ARI next function number. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Ani Sinha <anisinha@redhat.com> Message-Id: <20230710153838.33917-2-akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-07-10 18:59:32 -04:00
Leonardo Bras	625b370c45	pcie: Add hotplug detect state register to cmask When trying to migrate a machine type pc-q35-6.0 or lower, with this cmdline options, -device driver=pcie-root-port,port=18,chassis=19,id=pcie-root-port18,bus=pcie.0,addr=0x12 \ -device driver=nec-usb-xhci,p2=4,p3=4,id=nex-usb-xhci0,bus=pcie-root-port18,addr=0x12.0x1 the following bug happens after all ram pages were sent: qemu-kvm: get_pci_config_device: Bad config data: i=0x6e read: 0 device: 40 cmask: ff wmask: 0 w1cmask:19 qemu-kvm: Failed to load PCIDevice:config qemu-kvm: Failed to load pcie-root-port:parent_obj.parent_obj.parent_obj qemu-kvm: error while loading state for instance 0x0 of device '0000:00:12.0/pcie-root-port' qemu-kvm: load of migration failed: Invalid argument This happens on pc-q35-6.0 or lower because of: { "ICH9-LPC", ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, "off" } In this scenario, hotplug_handler_plug() calls pcie_cap_slot_plug_cb(), which sets dev->config byte 0x6e with bit PCI_EXP_SLTSTA_PDS to signal PCI hotplug for the guest. After a while the guest will deal with this hotplug and qemu will clear the above bit. Then, during migration, get_pci_config_device() will compare the configs of both the freshly created device and the one that is being received via migration, which will differ due to the PCI_EXP_SLTSTA_PDS bit and cause the bug to reproduce. To avoid this fake incompatibility, there are tree fields in PCIDevice that can help: - wmask: Used to implement R/W bytes, and - w1cmask: Used to implement RW1C(Write 1 to Clear) bytes - cmask: Used to enable config checks on load. According to PCI Express® Base Specification Revision 5.0 Version 1.0, table 7-27 (Slot Status Register) bit 6, the "Presence Detect State" is listed as RO (read-only), so it only makes sense to make use of the cmask field. So, clear PCI_EXP_SLTSTA_PDS bit on cmask, so the fake incompatibility on get_pci_config_device() does not abort the migration. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215819 Signed-off-by: Leonardo Bras <leobras@redhat.com> Message-Id: <20230706045546.593605-3-leobras@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>	2023-07-10 18:59:32 -04:00
Alex Williamson	cf2916f6a1	pcie: Add a PCIe capability version helper Report the PCIe capability version for a device Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Robin Voetter <robin@streamhpc.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-07-10 09:52:52 +02:00
Vladimir Sementsov-Ogievskiy	1768e97b91	pcie: set power indicator to off on reset by default It should not be zero, the only valid values are ON, OFF and BLINK. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-13-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-03-02 19:13:41 -05:00
Vladimir Sementsov-Ogievskiy	5aaed9caf1	pcie: introduce pcie_sltctl_powered_off() helper In pcie_cap_slot_write_config() we check for PCI_EXP_SLTCTL_PWR_OFF in a bad form. We should distinguish PCI_EXP_SLTCTL_PWR which is a "mask" and PCI_EXP_SLTCTL_PWR_OFF which is value for that mask. Better code is in pcie_cap_slot_unplug_request_cb() and in pcie_cap_update_power(). Let's use same pattern everywhere. To simplify things add also a helper. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-12-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-03-02 03:10:49 -05:00
Vladimir Sementsov-Ogievskiy	f90d932094	pcie: pcie_cap_slot_enable_power() use correct helper _by_mask() helpers shouldn't be used here (and that's the only one). _by_mask() helpers do shift their value argument, but in pcie.c code we use values that are already shifted appropriately. Happily, PCI_EXP_SLTCTL_PWR_ON is zero, so shift doesn't matter. But if we apply same helper for PCI_EXP_SLTCTL_PWR_OFF constant it will do wrong thing. So, let's use instead pci_word_test_and_clear_mask() which is already used in the file to clear PCI_EXP_SLTCTL_PWR_OFF bit in pcie_cap_slot_init() and pcie_cap_slot_reset(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-11-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-03-02 03:10:49 -05:00
Vladimir Sementsov-Ogievskiy	0a80f1cd06	pcie_regs: drop duplicated indicator value macros We already have indicator values in include/standard-headers/linux/pci_regs.h , no reason to reinvent them in include/hw/pci/pcie_regs.h. (and we already have usage of PCI_EXP_SLTCTL_PWR_IND_BLINK and PCI_EXP_SLTCTL_PWR_IND_OFF in hw/pci/pcie.c, so let's be consistent) Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-9-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-03-02 03:10:49 -05:00
Vladimir Sementsov-Ogievskiy	cd6992c6b5	pcie: pcie_cap_slot_write_config(): use correct macro PCI_EXP_SLTCTL_PIC_OFF is a value, and PCI_EXP_SLTCTL_PIC is a mask. Happily PCI_EXP_SLTCTL_PIC_OFF is a maximum value for this mask and is equal to the mask itself. Still the code looks like a bug. Let's make it more reader-friendly. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Anton Kuchin <antonkuchin@yandex-team.ru> Message-Id: <20230216180356.156832-8-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-03-02 03:10:49 -05:00
Igor Mammedov	1d77e15718	pci: acpi hotplug: rename x-native-hotplug to x-do-not-expose-native-hotplug-cap When ACPI PCI hotplug for Q35 was introduced (6.1), it was implemented by hiding HPC capability on PCIE slot. That however led to a number of regressions and to fix it, it was decided to keep HPC cap exposed in ACPI PCI hotplug case and force guest in ACPI PCI hotplug mode by other means [1]. That reduced meaning of x-native-hotplug to a compat knob [2] for broken 6.1 machine type. Rename property to match its current purpose. 1) `211afe5c69` (hw/i386/acpi-build: Deny control on PCIe Native Hot-plug in _OSC) 2) `c318bef762` (hw/acpi/ich9: Add compat prop to keep HPC bit set for 6.1 machine type) Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20230112140312.3096331-10-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-01-28 06:21:29 -05:00
Frederic Barrat	2e8656710a	pcie: Don't try triggering a LSI when not defined This patch skips [de]asserting a LSI interrupt if the device doesn't have any LSI defined. Doing so would trigger an assert in pci_irq_handler(). The PCIE root port implementation in qemu requests a LSI (INTA), but a subclass may want to change that behavior since it's a valid configuration. For example on the POWER8/POWER9/POWER10 systems, the root bridge doesn't request any LSI. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220408131303.147840-2-fbarrat@linux.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2022-04-20 18:00:30 -03:00
Igor Mammedov	6b0969f1ec	acpi: pcihp: pcie: set power on cap on parent slot on creation a PCIDevice has power turned on at the end of pci_qdev_realize() however later on if PCIe slot isn't populated with any children it's power is turned off. It's fine if native hotplug is used as plug callback will power slot on among other things. However when ACPI hotplug is enabled it replaces native PCIe plug callbacks with ACPI specific ones (acpi_pcihp_device_*plug_cb) and as result slot stays powered off. It works fine as ACPI hotplug on guest side takes care of enumerating/initializing hotplugged device. But when later guest is migrated, call chain introduced by] commit `d5daff7d31` (pcie: implement slot power control for pcie root ports) pcie_cap_slot_post_load() -> pcie_cap_update_power() -> pcie_set_power_device() -> pci_set_power() -> pci_update_mappings() will disable earlier initialized BARs for the hotplugged device in powered off slot due to commit `23786d1344` (pci: implement power state) which disables BARs if power is off. Fix it by setting PCI_EXP_SLTCTL_PCC to PCI_EXP_SLTCTL_PWR_ON on slot (root port/downstream port) at the time a device hotplugged into it. As result PCI_EXP_SLTCTL_PWR_ON is migrated to target and above call chain keeps device plugged into it powered on. Fixes: `d5daff7d31` ("pcie: implement slot power control for pcie root ports") Fixes: `23786d1344` ("pci: implement power state") Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 Suggested-by: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20220301151200.3507298-3-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-06 05:08:23 -05:00
Knut Omang	7c0fa8dff8	pcie: Add support for Single Root I/O Virtualization (SR/IOV) This patch provides the building blocks for creating an SR/IOV PCIe Extended Capability header and register/unregister SR/IOV Virtual Functions. Signed-off-by: Knut Omang <knuto@ifi.uio.no> Message-Id: <20220217174504.1051716-2-lukasz.maniak@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-06 05:08:23 -05:00
Daniella Lee	bacf58ca18	Fix bad overflow check in hw/pci/pcie.c Orginal qemu commit hash:14d02cfbe4adaeebe7cb833a8cc71191352cf03b In function pcie_add_capability, an assert contains the "offset < offset + size" expression. Both variable offset and variable size are uint16_t, the comparison is always true due to type promotion. The next expression may be the same. It might be like this: Thread 1 "qemu-system-x86" hit Breakpoint 1, pcie_add_capability ( dev=0x555557ce5f10, cap_id=1, cap_ver=2 '\002', offset=256, size=72) at ../hw/pci/pcie.c:930 930 { (gdb) n 931 assert(offset >= PCI_CONFIG_SPACE_SIZE); (gdb) n 932 assert(offset < offset + size); (gdb) p offset $1 = 256 (gdb) p offset < offset + size $2 = 1 (gdb) set offset=65533 (gdb) p offset < offset + size $3 = 1 (gdb) p offset < (uint16_t)(offset + size) $4 = 0 Signed-off-by: Daniella Lee <daniellalee111@gmail.com> Message-Id: <20211126061324.47331-1-daniellalee111@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-29 08:49:36 -05:00
Gerd Hoffmann	18416c62e3	pcie: expire pending delete Add an expire time for pending delete, once the time is over allow pressing the attention button again. This makes pcie hotplug behave more like acpi hotplug, where one can try sending an 'device_del' monitor command again in case the guest didn't respond to the first attempt. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-7-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-15 11:10:11 -05:00
Gerd Hoffmann	0d33415a4e	pcie: fast unplug when slot power is off In case the slot is powered off (and the power indicator turned off too) we can unplug right away, without round-trip to the guest. Also clear pending attention button press, there is nothing to care about any more. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-6-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-15 11:10:11 -05:00
Gerd Hoffmann	44242d4d3d	pcie: factor out pcie_cap_slot_unplug() No functional change. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-5-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-15 11:10:11 -05:00
Gerd Hoffmann	81124b3c7a	pcie: add power indicator blink check Refuse to push the attention button in case the guest is busy with some hotplug operation (as indicated by the power indicator blinking). Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-4-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-15 11:10:11 -05:00
Gerd Hoffmann	d5daff7d31	pcie: implement slot power control for pcie root ports With this patch hot-plugged pci devices will only be visible to the guest if the guests hotplug driver has enabled slot power. This should fix the hot-plug race which one can hit when hot-plugging a pci device at boot, while the guest is in the middle of the pci bus scan. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-3-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-15 11:10:11 -05:00
Peter Xu	2914fc61d5	pci: Export pci_for_each_device_under_bus*() They're actually more commonly used than the helper without _under_bus, because most callers do have the pci bus on hand. After exporting we can switch a lot of the call sites to use these two helpers. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20211028043129.38871-3-peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-01 19:36:11 -04:00
Julia Suvorova	3f3cbbb236	hw/pci/pcie: Do not set HPC flag if acpihp is used Instead of changing the hot-plug type in _OSC register, do not set the 'Hot-Plug Capable' flag. This way guest will choose ACPI hot-plug if it is preferred and leave the option to use SHPC with pcie-pci-bridge. The ability to control hot-plug for each downstream port is retained, while 'hotplug=off' on the port means all hot-plug types are disabled. Signed-off-by: Julia Suvorova <jusual@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20210713004205.775386-4-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-07-16 04:33:35 -04:00
Jason Wang	d83f46d189	virtio-pci: compat page aligned ATS Commit `4c70875372` ("pci: advertise a page aligned ATS") advertises the page aligned via ATS capability (RO) to unbrek recent Linux IOMMU drivers since 5.2. But it forgot the compat the capability which breaks the migration from old machine type: (qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x104 read: 0 device: 20 cmask: ff wmask: 0 w1cmask:0 This patch introduces a new parameter "x-ats-page-aligned" for virtio-pci device and turns it on for machine type which is newer than 5.1. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: qemu-stable@nongnu.org Fixes: `4c70875372` ("pci: advertise a page aligned ATS") Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20210406040330.11306-1-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-04-06 07:11:36 -04:00
Laurent Vivier	df72184ec1	pcie: don't set link state active if the slot is empty When the pcie slot is initialized, by default PCI_EXP_LNKSTA_DLLLA (Data Link Layer Link Active) is set in PCI_EXP_LNKSTA (Link Status) without checking if the slot is empty or not. This is confusing for the kernel because as it sees the link is up it tries to read the vendor ID and fails: (From https://bugzilla.kernel.org/show_bug.cgi?id=211691) [ 1.661105] pcieport 0000:00:02.2: pciehp: Slot Capabilities : 0x0002007b [ 1.661115] pcieport 0000:00:02.2: pciehp: Slot Status : 0x0010 [ 1.661123] pcieport 0000:00:02.2: pciehp: Slot Control : 0x07c0 [ 1.661138] pcieport 0000:00:02.2: pciehp: Slot #0 AttnBtn+ PwrCtrl+ MRL- AttnInd+ PwrInd+ HotPlug+ Surprise+ Interlock+ NoCompl- IbPresDis- LLActRep+ [ 1.662581] pcieport 0000:00:02.2: pciehp: pciehp_get_power_status: SLOTCTRL 6c value read 7c0 [ 1.662597] pcieport 0000:00:02.2: pciehp: pciehp_check_link_active: lnk_status = 2204 [ 1.662703] pcieport 0000:00:02.2: pciehp: pending interrupts 0x0010 from Slot Status [ 1.662706] pcieport 0000:00:02.2: pciehp: pcie_enable_notification: SLOTCTRL 6c write cmd 1031 [ 1.662730] pcieport 0000:00:02.2: pciehp: pciehp_check_link_active: lnk_status = 2204 [ 1.662748] pcieport 0000:00:02.2: pciehp: pciehp_check_link_active: lnk_status = 2204 [ 1.662750] pcieport 0000:00:02.2: pciehp: Slot(0-2): Link Up [ 2.896132] pcieport 0000:00:02.2: pciehp: pciehp_check_link_status: lnk_status = 2204 [ 2.896135] pcieport 0000:00:02.2: pciehp: Slot(0-2): No device found [ 2.896900] pcieport 0000:00:02.2: pciehp: pending interrupts 0x0010 from Slot Status [ 2.896903] pcieport 0000:00:02.2: pciehp: pciehp_power_off_slot: SLOTCTRL 6c write cmd 400 [ 3.656901] pcieport 0000:00:02.2: pciehp: pending interrupts 0x0009 from Slot Status This is really a problem with virtio-net failover that hotplugs a VFIO card during the boot process. The kernel can shutdown the slot while QEMU is hotplugging it, and this likely ends by an automatic unplug of the card. At the end of the boot sequence the card has disappeared. To fix that, don't set the "Link Active" state in the init function, but rely on the plug function to do it, as the mechanism has already been introduced by `2f2b18f60b`. Fixes: `2f2b18f60b` ("pcie: set link state inactive/active after hot unplug/plug") Cc: zhengxiang9@huawei.com Fixes: `3d67447fe7` ("pcie: Fill PCIESlot link fields to support higher speeds and widths") Cc: alex.williamson@redhat.com Fixes: `b2101eae63` ("pcie: Set the "link active" in the link status register") Cc: benh@kernel.crashing.org Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20210212135250.2738750-5-lvivier@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-02-23 10:06:55 -05:00
Jason Wang	4c70875372	pci: advertise a page aligned ATS After Linux kernel commit 61363c1474b1 ("iommu/vt-d: Enable ATS only if the device uses page aligned address."), ATS will be only enabled if device advertises a page aligned request. Unfortunately, vhost-net is the only user and we don't advertise the aligned request capability in the past since both vhost IOTLB and address_space_get_iotlb_entry() can support non page aligned request. Though it's not clear that if the above kernel commit makes sense. Let's advertise a page aligned ATS here to make vhost device IOTLB work with Intel IOMMU again. Note that in the future we may extend pcie_ats_init() to accept parameters like queue depth and page alignment. Cc: qemu-stable@nongnu.org Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20200909081731.24688-1-jasowang@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-10-30 06:48:53 -04:00
Markus Armbruster	9bc6bfdf67	qdev: Drop qbus_set_hotplug_handler() parameter @errp qbus_set_hotplug_handler() is a simple wrapper around object_property_set_link(). object_property_set_link() fails when the property doesn't exist, is not settable, or its .check() method fails. These are all programming errors here, so passing &error_abort to qbus_set_hotplug_handler() is appropriate. Most of its callers do. Exceptions: * pcie_cap_slot_init(), shpc_init(), spapr_phb_realize() pass NULL, i.e. they ignore errors. * spapr_machine_init() passes &error_fatal. * s390_pcihost_realize(), virtio_serial_device_realize(), s390_pcihost_plug() pass the error to their callers. The latter two keep going after the error, which looks wrong. Drop the @errp parameter, and instead pass &error_abort to object_property_set_link(). Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200630090351.1247703-15-armbru@redhat.com>	2020-07-02 06:25:29 +02:00
Markus Armbruster	981c3dcd94	qdev: Convert to qdev_unrealize() with Coccinelle For readability, and consistency with qbus_realize(). Coccinelle script: @ depends on !(file in "hw/core/qdev.c")@ typedef DeviceState; DeviceState *dev; symbol false, error_abort; @@ - object_property_set_bool(OBJECT(dev), false, "realized", &error_abort); + qdev_unrealize(dev); @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@ expression dev; symbol false, error_abort; @@ - object_property_set_bool(OBJECT(dev), false, "realized", &error_abort); + qdev_unrealize(DEVICE(dev)); Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200610053247.1583243-8-armbru@redhat.com>	2020-06-15 21:36:30 +02:00
Julia Suvorova	0dabc0f654	hw/pci/pcie: Move hot plug capability check to pre_plug callback Check for hot plug capability earlier to avoid removing devices attached during the initialization process. Run qemu with an unattached drive: -drive file=$FILE,if=none,id=drive0 \ -device pcie-root-port,id=rp0,slot=3,bus=pcie.0,hotplug=off Hotplug a block device: device_add virtio-blk-pci,id=blk0,drive=drive0,bus=rp0 If hotplug fails on plug_cb, drive0 will be deleted. Fixes: `0501e1aa1d` ("hw/pci/pcie: Forbid hot-plug if it's disabled on the slot") Acked-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20200604125947.881210-1-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-06-09 14:18:04 -04:00
Markus Armbruster	b69c3c21a5	qdev: Unrealize must not fail Devices may have component devices and buses. Device realization may fail. Realization is recursive: a device's realize() method realizes its components, and device_set_realized() realizes its buses (which should in turn realize the devices on that bus, except bus_set_realized() doesn't implement that, yet). When realization of a component or bus fails, we need to roll back: unrealize everything we realized so far. If any of these unrealizes failed, the device would be left in an inconsistent state. Must not happen. device_set_realized() lets it happen: it ignores errors in the roll back code starting at label child_realize_fail. Since realization is recursive, unrealization must be recursive, too. But how could a partly failed unrealize be rolled back? We'd have to re-realize, which can fail. This design is fundamentally broken. device_set_realized() does not roll back at all. Instead, it keeps unrealizing, ignoring further errors. It can screw up even for a device with no buses: if the lone dc->unrealize() fails, it still unregisters vmstate, and calls listeners' unrealize() callback. bus_set_realized() does not roll back either. Instead, it stops unrealizing. Fortunately, no unrealize method can fail, as we'll see below. To fix the design error, drop parameter @errp from all the unrealize methods. Any unrealize method that uses @errp now needs an update. This leads us to unrealize() methods that can fail. Merely passing it to another unrealize method cannot cause failure, though. Here are the ones that do other things with @errp: * virtio_serial_device_unrealize() Fails when qbus_set_hotplug_handler() fails, but still does all the other work. On failure, the device would stay realized with its resources completely gone. Oops. Can't happen, because qbus_set_hotplug_handler() can't actually fail here. Pass &error_abort to qbus_set_hotplug_handler() instead. * hw/ppc/spapr_drc.c's unrealize() Fails when object_property_del() fails, but all the other work is already done. On failure, the device would stay realized with its vmstate registration gone. Oops. Can't happen, because object_property_del() can't actually fail here. Pass &error_abort to object_property_del() instead. * spapr_phb_unrealize() Fails and bails out when remove_drcs() fails, but other work is already done. On failure, the device would stay realized with some of its resources gone. Oops. remove_drcs() fails only when chassis_from_bus()'s object_property_get_uint() fails, and it can't here. Pass &error_abort to remove_drcs() instead. Therefore, no unrealize method can fail before this patch. device_set_realized()'s recursive unrealization via bus uses object_property_set_bool(). Can't drop @errp there, so pass &error_abort. We similarly unrealize with object_property_set_bool() elsewhere, always ignoring errors. Pass &error_abort instead. Several unrealize methods no longer handle errors from other unrealize methods: virtio_9p_device_unrealize(), virtio_input_device_unrealize(), scsi_qdev_unrealize(), ... Much of the deleted error handling looks wrong anyway. One unrealize methods no longer ignore such errors: usb_ehci_pci_exit(). Several realize methods no longer ignore errors when rolling back: v9fs_device_realize_common(), pci_qdev_unrealize(), spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(), virtio_device_realize(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-17-armbru@redhat.com>	2020-05-15 07:08:14 +02:00
Julia Suvorova	6a1e073378	hw/pci/pcie: Replace PCI_DEVICE() casts with existing variable A little cleanup is possible because of hotplug_pdev introduction. Signed-off-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20200427182440.92433-3-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2020-05-04 10:25:02 -04:00
Julia Suvorova	0501e1aa1d	hw/pci/pcie: Forbid hot-plug if it's disabled on the slot Raise an error when trying to hot-plug/unplug a device through QMP to a device with disabled hot-plug capability. This makes the device behaviour more consistent and provides an explanation of the failure in the case of asynchronous unplug. Signed-off-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20200427182440.92433-2-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>	2020-05-04 10:25:02 -04:00
Julia Suvorova	530a096318	pcie_root_port: Add hotplug disabling option Make hot-plug/hot-unplug on PCIe Root Ports optional to allow libvirt manage it and restrict unplug for the whole machine. This is going to prevent user-initiated unplug in guests (Windows mostly). Hotplug is enabled by default. Usage: -device pcie-root-port,hotplug=off,... If you want to disable hot-unplug on some downstream ports of one switch, disable hot-unplug on PCIe Root Port connected to the upstream port as well as on the selected downstream ports. Discussion related: https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg00530.html Signed-off-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20200226174607.205941-1-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>	2020-03-08 09:18:29 -04:00
Jens Freimann	c000a9bd06	pci: mark device having guest unplug request pending Set pending_deleted_event in DeviceState for failover primary devices that were successfully unplugged by the Guest OS. Signed-off-by: Jens Freimann <jfreimann@redhat.com> Message-Id: <20191029114905.6856-5-jfreimann@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-10-29 18:55:26 -04:00
Jens Freimann	a99c4da9fc	pci: mark devices partially unplugged Only the guest unplug request was triggered. This is needed for the failover feature. In case of a failed migration we need to plug the device back to the guest. Signed-off-by: Jens Freimann <jfreimann@redhat.com> Message-Id: <20191029114905.6856-4-jfreimann@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-10-29 18:55:26 -04:00
Michael S. Tsirkin	d85d65cc29	pcie: minor cleanups for slot control/status Rename function arguments to make intent clearer. Better documentation for slot control logic. Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>	2019-07-01 09:17:30 -04:00
Michael S. Tsirkin	110c477c2e	pcie: work around for racy guest init During boot, linux guests tend to clear all bits in pcie slot status register which is used for hotplug. If they clear bits that weren't set this is racy and will lose events: not a big problem for manual hotplug on bare-metal, but a problem for us. For example, the following is broken ATM: /x86_64-softmmu/qemu-system-x86_64 -enable-kvm -S -machine q35 \ -device pcie-root-port,id=pcie_root_port_0,slot=2,chassis=2,addr=0x2,bus=pcie.0 \ -device virtio-balloon-pci,id=balloon,bus=pcie_root_port_0 \ -monitor stdio disk.qcow2 (qemu)device_del balloon (qemu)cont Balloon isn't deleted as it should. As a work-around, detect this attempt to clear slot status and revert status to what it was before the write. Note: in theory this can be detected as a duplicate button press which cancels the previous press. Does not seem to happen in practice as guests seem to only have this bug during init. Note2: the right thing to do is probably to fix Linux to read status before clearing it, and act on the bits that are set. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Tested-by: Igor Mammedov <imammedo@redhat.com>	2019-07-01 09:17:30 -04:00
Michael S. Tsirkin	2841ab435b	pcie: check that slt ctrl changed before deleting During boot, linux would sometimes overwrites control of a powered off slot before powering it on. Unfortunately QEMU interprets that as a power off request and ejects the device. For example: /x86_64-softmmu/qemu-system-x86_64 -enable-kvm -S -machine q35 \ -device pcie-root-port,id=pcie_root_port_0,slot=2,chassis=2,addr=0x2,bus=pcie.0 \ -monitor stdio disk.qcow2 (qemu)device_add virtio-balloon-pci,id=balloon,bus=pcie_root_port_0 (qemu)cont Balloon is deleted during guest boot. To fix, save control beforehand and check that power or led state actually change before ejecting. Note: this is more a hack than a solution, ideally we'd find a better way to detect ejects, or move away from ejects completely and instead monitor whether it's safe to delete device due to e.g. its power state. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Tested-by: Igor Mammedov <imammedo@redhat.com>	2019-07-01 09:17:30 -04:00
Michael S. Tsirkin	861dc73518	pcie: don't skip multi-mask events If we are trying to set multiple bits at once, testing that just one of them is already set gives a false positive. As a result we won't interrupt guest if e.g. presence detection change and attention button press are both set. This happens with multi-function device removal. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-07-01 09:11:02 -04:00
Markus Armbruster	a8d2532645	Include qemu-common.h exactly where needed No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]	2019-06-12 13:20:20 +02:00
Knut Omang	db891a9ba3	pcie: Add a simple PCIe ACS (Access Control Services) helper function Implementing an ACS capability on downstream ports and multifunction endpoints indicates isolation and IOMMU visibility to a finer granularity. This creates smaller IOMMU groups in the guest and thus more flexibility in assigning endpoints to guest userspace or an L2 guest. Signed-off-by: Knut Omang <knut.omang@oracle.com> Message-Id: <07489975121696f5573b0a92baaf3486ef51e35d.1550768238.git-series.knut.omang@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>	2019-03-12 22:31:21 -04:00
David Hildenbrand	07578b0ad6	qdev: Let the hotplug_handler_unplug() caller delete the device When unplugging a device, at one point the device will be destroyed via object_unparent(). This will, one the one hand, unrealize the removed device hierarchy, and on the other hand, destroy/free the device hierarchy. When chaining hotplug handlers, we want to overwrite a bus hotplug handler by the machine hotplug handler, to be able to perform some part of the plug/unplug and to forward the calls to the bus hotplug handler. For now, the bus hotplug handler would trigger an object_unparent(), not allowing us to perform some unplug action on a device after we forwarded the call to the bus hotplug handler. The device would be gone at that point. machine_unplug_handler(dev) /* eventually do unplug stuff / bus_unplug_handler(dev) / dev is gone, we can't do more unplug stuff / So move the object_unparent() to the original caller of the unplug. For now, keep the unrealize() at the original places of the object_unparent(). For implicitly chained hotplug handlers (e.g. pc code calling acpi hotplug handlers), the object_unparent() has to be done by the outermost caller. So when calling hotplug_handler_unplug() from inside an unplug handler, nothing is to be done. hotplug_handler_unplug(dev) -> calls machine_unplug_handler() machine_unplug_handler(dev) { / eventually do unplug stuff / bus_unplug_handler(dev) -> calls unrealize(dev) / we can do more unplug stuff but device already unrealized / } object_unparent(dev) In the long run, every unplug action should be factored out of the unrealize() function into the unplug handler (especially for PCI). Then we can get rid of the additonal unrealize() calls and object_unparent() will properly unrealize the device hierarchy after the device has been unplugged. hotplug_handler_unplug(dev) -> calls machine_unplug_handler() machine_unplug_handler(dev) { / eventually do unplug stuff / bus_unplug_handler(dev) -> only unplugs, does not unrealize / we can do more unplug stuff */ } object_unparent(dev) -> will unrealize The original approach was suggested by Igor Mammedov for the PCI part, but I extended it to all hotplug handlers. I consider this one step into the right direction. To summarize: - object_unparent() on synchronous unplugs is done by common code -- "Caller of hotplug_handler_unplug" - object_unparent() on asynchronous unplugs ("unplug requests") has to be done manually -- "Caller of hotplug_handler_unplug" Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190228122849.4296-2-david@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2019-03-06 11:51:08 -03:00
Alex Williamson	88c869198a	pci: Sanity test minimum downstream LNKSTA The entire link status register for SR-IOV VFs is defined as RsvdZ, reads simply return zero. Usually this is nothing more than lspci reporting inconsequentially broken values: LnkSta: Speed unknown, Width x0, ... However, now that we're using the downstream endpoint link status to fill in the value at the parent downstream port, invalid values become a problem. In particular, the PCIe hotplug driver in Linux looks for a valid negotiated link width and will fail to enumerate hot-added downstream endpoints without non-zero value here, ex: pciehp 0000:00:02.0:pcie004: Slot(0): Attention button pressed pciehp 0000:00:02.0:pcie004: Slot(0) Powering on due to button press pciehp 0000:00:02.0:pcie004: Slot(0): Card present pciehp 0000:00:02.0:pcie004: Slot(0): Link Up pciehp 0000:00:02.0:pcie004: link training error: status 0x2000 pciehp 0000:00:02.0:pcie004: Failed to check link status Resolve by using minimum width and speed values for the downstream port link status when the endpoint fails to provide valid values. Long term, we may want to implement emulation in the vfio-pci host driver to suppliment this field with the PF value as the SR-IOV spec seems to allow, but the solution here is compatible should that be implemented later. Fixes: `727b48661f` ("pci: Sync PCIe downstream port LNKSTA on read") Reported-by: Jens Freimann <jfreimann@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Message-Id: <155060310248.19547.14979269067689441201.stgit@gimli.home> Tested-by: Jens Freimann <jfreimann@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-02-22 10:51:31 -05:00
Michael Roth	94d1cc5f03	qdev: pass an Object * to qbus_set_hotplug_handler() Certain devices types, like memory/CPU, are now being handled using a hotplug interface provided by a top-level MachineClass. Hotpluggable host bridges are another such device where it makes sense to use a machine-level hotplug handler. However, unlike those devices, host-bridges have a parent bus (the main system bus), and devices with a parent bus use a different mechanism for registering their hotplug handlers: qbus_set_hotplug_handler(). This interface currently expects a handler to be a subclass of DeviceClass, but this is not the case for MachineClass, which derives directly from ObjectClass. Internally, the interface only requires an ObjectClass, so expose that in qbus_set_hotplug_handler(). Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <154999589921.690774.3640149277362188566.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-02-17 21:54:02 +11:00
David Hildenbrand	b9731850d7	pci/pcie: stop plug/unplug if the slot is locked We better stop right away. For now, errors would be partially ignored (so the guest might get informed or the device might get unplugged), although actual plug/unplug will be reported as failed to the user. While at it, properly move the check to the pre_plug handler for the plug case, as we can test the slot state before the device will be realized. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-01-14 19:31:04 -05:00
David Hildenbrand	a1952d01e7	pci/pcie: perform unplug via the hotplug handler Introduce and use the "unplug" callback. This is a preparation for multi-stage hotplug handlers, whereby the bus hotplug handler is overwritten by the machine hotplug handler. This handler will then pass control to the bus hotplug handler. So to get this running cleanly, we also have to make sure to go via the hotplug handler chain when actually unplugging a device after an unplug request. Lookup the hotplug handler and call "unplug". Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-12-20 11:19:12 -05:00
David Hildenbrand	5571727a63	pci/pcie: rename hotplug handler callbacks The callbacks are also called for cold plugged devices. Drop the "hot" to better match the actual callback names. While at it, also rename pcie_cap_slot_hotplug_common() to pcie_cap_slot_plug_common(). Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-12-20 11:19:12 -05:00
Alex Williamson	3d67447fe7	pcie: Fill PCIESlot link fields to support higher speeds and widths Make use of the PCIESlot speed and width fields to update link information beyond those configured in pcie_cap_v1_fill(). This is only called for devices supporting a version 2 capability and automatically skips any non-PCIESlot devices. Only devices with increased link values generate any visible config space differences. Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Tested-by: Geoffrey McRae <geoff@hostfission.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-12-19 16:48:16 -05:00
Alex Williamson	727b48661f	pci: Sync PCIe downstream port LNKSTA on read The PCIe link speed and width between a downstream device and its upstream port is negotiated on real hardware and susceptible to dynamic changes due to signal issues and power management. In the emulated device case there is no real hardware link, but we still might wish to have some consistency between endpoint and downstream port via a virtual negotiation. There is of course a real link for assigned devices and this same virtual negotiation allows the downstream port to match the endpoint, synchronizing on every read to support underlying physical hardware dynamically adjusting the link. This negotiation is intentionally unidirectional for compatibility. If the endpoint exceeds the capabilities of the downstream port or there is no endpoint device, the downstream port reports negotiation to its maximum speed and width, matching the previous case where negotiation was absent. De-tuning the endpoint to match a virtual link doesn't seem to benefit anyone and is a condition we've thus far reported without functional issues. Note that PCI_EXP_LNKSTA is already ignored for migration compatibility via pcie_cap_v1_fill(). Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Tested-by: Geoffrey McRae <geoff@hostfission.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-12-19 16:48:16 -05:00

1 2

96 Commits