mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Thomas Huth	2b85e0cda4	hw/intc: Extract the IRQ counting functions into a separate file These IRQ counting functions will soon be required in binaries that do not include the APIC code, too, so let's extract them into a separate file that can be linked independently of the APIC code. While we're at it, change the apic_* prefix into kvm_* since the functions are used from the i8259 PIC (i.e. not the APIC), too. Reviewed-by: Bernhard Beschow <shentey@gmail.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Message-Id: <20230110095351.611724-2-thuth@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-01-13 16:22:57 +01:00
Bernhard Beschow	f0712099a2	hw/usb/hcd-uhci: Introduce TYPE_ defines for device models Suggested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20221204190553.3274-7-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-01-13 16:22:57 +01:00
Bernhard Beschow	738c2eb47f	hw/isa/piix3: Decouple INTx-to-LNKx routing which is board-specific pci_map_irq_fn's in general seem to be board-specific. So move PIIX3's pci_slot_get_pirq() to board code to not have PIIX3 make assuptions about its board. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230109172347.1830-6-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-01-13 16:22:57 +01:00
Bernhard Beschow	f021f4e9d2	hw/pci/pci: Factor out pci_bus_map_irqs() from pci_bus_irqs() pci_bus_irqs() coupled together the assignment of pci_set_irq_fn and pci_map_irq_fn to a PCI bus. This coupling gets in the way when the pci_map_irq_fn is board-specific while the pci_set_irq_fn is device- specific. For example, both of QEMU's PIIX south bridge models have different pci_map_irq_fn implementations which are board-specific rather than device-specific. These implementations should therefore reside in board code. The pci_set_irq_fn's, however, should stay in the device models because they access memory internal to the model. Factoring out pci_bus_map_irqs() from pci_bus_irqs() allows the assignments to be decoupled, resolving the problem described above. Note also how pci_vpb_realize() which gets touched in this commit assigns different pci_map_irq_fn's depending on the board. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230109172347.1830-5-shentey@gmail.com> [PMD: Factor out in vfu_object_set_bus_irq()] Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-01-13 16:22:57 +01:00
Peter Maydell	d6271b6572	virtio,pc,pci: features, cleanups, fixes mostly vhost-vdpa: guest announce feature emulation when using shadow virtqueue support for configure interrupt startup speed ups an acpi change to only generate cluster node in PPTT when specified for arm misc fixes, cleanups Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmO6eGMPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpoUIIALqC3UtJcK3AuAMbeqVokxl5CPwoeXMyi+rT 0QuN8m8dpBtJFpy3Vyq0afixOFmlwvORW5ye4QI97OyIhtLJq00buzQsgHjNoPo3 zN2L0BDyofDmfFHgCxcEbv2aAO8TaqRSHmKffEFmf8JDMDL9Ev1QvPTWHhfm2eJf VKPHOtCA/3WXBD9JNfYJ0YuzCrrJaMhIO6/5tqv9yjMxWTfEFa1J2Sr2tWkRLuDk FPfApy7afjI705Guv6PllZ3JdOMwf7iZaoBK6mSdCDSyi1xciYM0VeWi8SLD4qbM N+9NkUQOIYS5ZC4BXrULy6HDUsECJ71I0pvHveX7nwbK6xPD4RQ= =0tPe -----END PGP SIGNATURE----- Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging virtio,pc,pci: features, cleanups, fixes mostly vhost-vdpa: guest announce feature emulation when using shadow virtqueue support for configure interrupt startup speed ups an acpi change to only generate cluster node in PPTT when specified for arm misc fixes, cleanups Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Sun 08 Jan 2023 08:01:39 GMT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (50 commits) vhost-scsi: fix memleak of vsc->inflight acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block tests: acpi: aarch64: Add .topology tables tests: acpi: aarch64: Add topology test for aarch64 tests: acpi: Add and whitelist .topology blobs tests: virt: Update expected ACPI tables for virt test hw/acpi/aml-build: Only generate cluster node in PPTT when specified tests: virt: Allow changes to PPTT test table virtio-pci: fix proxy->vector_irqfd leak in virtio_pci_set_guest_notifiers vdpa: commit all host notifier MRs in a single MR transaction vhost: configure all host notifiers in a single MR transaction vhost: simplify vhost_dev_enable_notifiers vdpa: harden the error path if get_iova_range failed vdpa-dev: get iova range explicitly docs/devel: Rules on #include in headers include: Include headers where needed include/hw/virtio: Break inclusion loop include/hw/cxl: Break inclusion loop cxl_pci.h and cxl_cdat_h include/hw/pci: Include hw/pci/pci.h where needed include/hw/pci: Split pci_device.h off pci.h ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2023-01-09 10:07:12 +00:00
Markus Armbruster	edf5ca5dbe	include/hw/pci: Split pci_device.h off pci.h PCIDeviceClass and PCIDevice are defined in pci.h. Many users of the header don't actually need them. Similar structs live in their own headers: PCIBusClass and PCIBus in pci_bus.h, PCIBridge in pci_bridge.h, PCIHostBridgeClass and PCIHostState in pci_host.h, PCIExpressHost in pcie_host.h, and PCIERootPortClass, PCIEPort, and PCIESlot in pcie_port.h. Move PCIDeviceClass and PCIDeviceClass to new pci_device.h, along with the code that needs them. Adjust include directives. This also enables the next commit. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221222100330.380143-6-armbru@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-01-08 01:54:22 -05:00
Markus Armbruster	674b0a5784	include/hw/pci: Break inclusion loop pci_bridge.h and cxl.h hw/pci/pci_bridge.h and hw/cxl/cxl.h include each other. Fortunately, breaking the loop is merely a matter of deleting unnecessary includes from headers, and adding them back in places where they are now missing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221222100330.380143-2-armbru@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-01-08 01:54:22 -05:00
Bernhard Beschow	fa5975f49e	i386, mips: Resolve redundant ACPI and APM dependencies Now that all ACPI controllers select the ACPI and APM dependencies themselves, these explicit dependencies became redundant. Remove them. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20221216130355.41667-7-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-01-08 01:54:21 -05:00
Bernhard Beschow	d9237edbb5	hw/acpi/Kconfig: Do not needlessly build TYPE_PIIX4_PM in non-PC/Malta machines TYPE_PIIX4_PM is only used in machines where PIIX chipsets are used which is currently PC and Malta. There is no point building it for the other ACPI_X86 machines. Note that this also removes unneeded ACPI_PIIX4 from PEGASOS2. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20221216130355.41667-4-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>	2023-01-08 01:54:21 -05:00
Paolo Bonzini	fb418b51b7	i386: SGX: remove deprecated member of SGXInfo Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-01-06 00:51:02 +01:00
Paolo Bonzini	eaaaf8abdc	KVM: remove support for kernel-irqchip=off -machine kernel-irqchip=off is broken for many guest OSes; kernel-irqchip=split is the replacement that works, so remove the deprecated support for the former. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-01-06 00:51:02 +01:00
Igor Mammedov	ad4942746c	pci: drop redundant PCIDeviceClass::is_bridge field and use cast to TYPE_PCI_BRIDGE instead. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20221129101341.185621-3-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org>	2022-12-21 07:32:24 -05:00
Daniel P. Berrangé	a6b6414f0c	hw/isa: enable TCO watchdog reboot pin strap by default The TCO watchdog implementation default behaviour from POV of the guest OS relies on the initial values for two I/O ports: * TCO1_CNT == 0x0 Since bit 11 (TCO Timer Halt) is clear, the watchdog state is considered to be initially running * GCS == 0x20 Since bit 5 (No Reboot) is set, the watchdog will not trigger when the timer expires This is a safe default, because the No Reboot bit will prevent the watchdog from triggering if the guest OS is unaware of its existance, or is slow in configuring it. When a Linux guest initializes the TCO watchdog, it will attempt to clear the "No Reboot" flag, and read the value back. If the clear was honoured, the driver will treat this as an indicator that the watchdog is functional and create the guest watchdog device. QEMU implements a second "no reboot" flag, however, via pin straps which overrides the behaviour of the guest controlled "no reboot" flag: commit `5add35bec1` Author: Paulo Alcantara <pcacjr@gmail.com> Date: Sun Jun 28 14:58:58 2015 -0300 ich9: implement strap SPKR pin logic This second 'noreboot' pin was defaulted to high, which also inhibits triggering of the requested watchdog actions, unless QEMU is launched with the magic flag "-global ICH9-LPC.noreboot=false". This is a bad default as we are exposing a watchdog to every guest OS using the q35 machine type, but preventing it from actually doing what it is designed to do. What is worse is that the guest OS and its apps have no way to know that the watchdog is never going to fire, due to this second 'noreboot' pin. If a guest OS had no watchdog device at all, then apps whose operation and/or data integrity relies on a watchdog can refuse to launch, and alert the administrator of the problematic deployment. With Q35 machines unconditionally exposing a watchdog though, apps will think their deployment is correct but in fact have no protection at all. This patch flips the default of the second 'no reboot' flag, so that configured watchdog actions will be honoured out of the box for the 7.2 Q35 machine type onwards, if the guest enables use of the watchdog. See also related bug reports https://bugzilla.redhat.com/show_bug.cgi?id=2080207 https://bugzilla.redhat.com/show_bug.cgi?id=2136889 https://bugzilla.redhat.com/show_bug.cgi?id=2137346 Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20221216125749.596075-5-berrange@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-12-21 06:35:28 -05:00
Cornelia Huck	5719a179e0	pc: clean up compat machines We can move setting default_cpu_version into the base machine options, and we need to unset alias and is_default only once. Suggested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20221212152145.124317-3-cohuck@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com>	2022-12-21 06:35:28 -05:00
Cornelia Huck	db723c80b1	hw: Add compat machines for 8.0 Add 8.0 machine types for arm/i440fx/m68k/q35/s390x/spapr. Reviewed-by: Cédric Le Goater <clg@kaod.org> [ppc] Reviewed-by: Thomas Huth <thuth@redhat.com> [s390x] Reviewed-by: Greg Kurz <groug@kaod.org> [ppc] Signed-off-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20221212152145.124317-2-cohuck@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org>	2022-12-21 06:35:28 -05:00
Peter Maydell	dfa6ba6bae	Replace use of qdev_reset_all() with device_cold_reset() The legacy function qdev_reset_all() performs a recursive reset, starting from a qdev. However, it does not permit any of the devices in the tree to use three-phase reset, because device reset goes through the device_legacy_reset() function that only calls the single DeviceClass::reset method. Switch to using the device_cold_reset() function instead. This also performs a recursive reset, where first the children are reset and then finally the parent, but it uses the new (...in 2020...) Resettable mechanism, which supports both the old style single-reset method and also the new 3-phase reset handling. This commit changes the five remaining uses of this function. Commit created with: sed -i -e 's/qdev_reset_all/device_cold_reset/g' hw/i386/xen/xen_platform.c hw/input/adb.c hw/remote/vfio-user-obj.c hw/s390x/s390-virtio-ccw.c hw/usb/dev-uas.c Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-12-16 15:55:32 +00:00
Markus Armbruster	d1c81c3496	qapi: Use returned bool to check for failure (again) Commit `012d4c96e2` changed the visitor functions taking Error ** to return bool instead of void, and the commits following it used the new return value to simplify error checking. Since then a few more uses in need of the same treatment crept in. Do that. All pretty mechanical except for * balloon_stats_get_all() This is basically the same transformation commit `012d4c96e2` applied to the virtual walk example in include/qapi/visitor.h. * set_max_queue_size() Additionally replace "goto end of function" by return. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221121085054.683122-10-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2022-12-14 16:19:35 +01:00
Stefan Hajnoczi	f21f1cfeb9	pci,pc,virtio: features, tests, fixes, cleanups lots of acpi rework first version of biosbits infrastructure ASID support in vhost-vdpa core_count2 support in smbios PCIe DOE emulation virtio vq reset HMAT support part of infrastructure for viommu support in vhost-vdpa VTD PASID support fixes, tests all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmNpXDkPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpD0AH/2G8ZPrgrxJC9y3uD5/5J6QRzO+TsDYbg5ut uBf4rKSHHzcu6zdyAfsrhbAKKzyD4HrEGNXZrBjnKM1xCiB/SGBcDIWntwrca2+s 5Dpbi4xvd4tg6tVD4b47XNDCcn2uUbeI0e2M5QIbtCmzdi/xKbFAfl5G8DQp431X Kmz79G4CdKWyjVlM0HoYmdCw/4FxkdjD02tE/Uc5YMrePNaEg5Bw4hjCHbx1b6ur 6gjeXAtncm9s4sO0l+sIdyiqlxiTry9FSr35WaQ0qPU+Og5zaf1EiWfdl8TRo4qU EAATw5A4hyw11GfOGp7oOVkTGvcNB/H7aIxD7emdWZV8+BMRPKo= =zTCn -----END PGP SIGNATURE----- Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging pci,pc,virtio: features, tests, fixes, cleanups lots of acpi rework first version of biosbits infrastructure ASID support in vhost-vdpa core_count2 support in smbios PCIe DOE emulation virtio vq reset HMAT support part of infrastructure for viommu support in vhost-vdpa VTD PASID support fixes, tests all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmNpXDkPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpD0AH/2G8ZPrgrxJC9y3uD5/5J6QRzO+TsDYbg5ut # uBf4rKSHHzcu6zdyAfsrhbAKKzyD4HrEGNXZrBjnKM1xCiB/SGBcDIWntwrca2+s # 5Dpbi4xvd4tg6tVD4b47XNDCcn2uUbeI0e2M5QIbtCmzdi/xKbFAfl5G8DQp431X # Kmz79G4CdKWyjVlM0HoYmdCw/4FxkdjD02tE/Uc5YMrePNaEg5Bw4hjCHbx1b6ur # 6gjeXAtncm9s4sO0l+sIdyiqlxiTry9FSr35WaQ0qPU+Og5zaf1EiWfdl8TRo4qU # EAATw5A4hyw11GfOGp7oOVkTGvcNB/H7aIxD7emdWZV8+BMRPKo= # =zTCn # -----END PGP SIGNATURE----- # gpg: Signature made Mon 07 Nov 2022 14:27:53 EST # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (83 commits) checkpatch: better pattern for inline comments hw/virtio: introduce virtio_device_should_start tests/acpi: update tables for new core count test bios-tables-test: add test for number of cores > 255 tests/acpi: allow changes for core_count2 test bios-tables-test: teach test to use smbios 3.0 tables hw/smbios: add core_count2 to smbios table type 4 vhost-user: Support vhost_dev_start vhost: Change the sequence of device start intel-iommu: PASID support intel-iommu: convert VTD_PE_GET_FPD_ERR() to be a function intel-iommu: drop VTDBus intel-iommu: don't warn guest errors when getting rid2pasid entry vfio: move implement of vfio_get_xlat_addr() to memory.c tests: virt: Update expected .acpihmatvirt tables tests: acpi: aarch64/virt: add a test for hmat nodes with no initiators hw/arm/virt: Enable HMAT on arm virt machine tests: Add HMAT AArch64/virt empty table files tests: acpi: q35: update expected blobs .hmat-noinitiators expected HMAT: tests: acpi: q35: add test for hmat nodes without initiators ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-11-07 18:43:56 -05:00
Jason Wang	1b2b12376c	intel-iommu: PASID support This patch introduce ECAP_PASID via "x-pasid-mode". Based on the existing support for scalable mode, we need to implement the following missing parts: 1) tag VTDAddressSpace with PASID and support IOMMU/DMA translation with PASID 2) tag IOTLB with PASID 3) PASID cache and its flush 4) PASID based IOTLB invalidation For simplicity PASID cache is not implemented so we can simply implement the PASID cache flush as a no and leave it to be implemented in the future. For PASID based IOTLB invalidation, since we haven't had L1 stage support, the PASID based IOTLB invalidation is not implemented yet. For PASID based device IOTLB invalidation, it requires the support for vhost so we forbid enabling device IOTLB when PASID is enabled now. Those work could be done in the future. Note that though PASID based IOMMU translation is ready but no device can issue PASID DMA right now. In this case, PCI_NO_PASID is used as PASID to identify the address without PASID. vtd_find_add_as() has been extended to provision address space with PASID which could be utilized by the future extension of PCI core to allow device model to use PASID based DMA translation. This feature would be useful for: 1) prototyping PASID support for devices like virtio 2) future vPASID work 3) future PRS and vSVA work Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20221028061436.30093-5-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-11-07 14:08:17 -05:00
Jason Wang	940e552786	intel-iommu: convert VTD_PE_GET_FPD_ERR() to be a function We used to have a macro for VTD_PE_GET_FPD_ERR() but it has an internal goto which prevents it from being reused. This patch convert that macro to a dedicated function and let the caller to decide what to do (e.g using goto or not). This makes sure it can be re-used for other function that requires fault reporting. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20221028061436.30093-4-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com>	2022-11-07 14:08:17 -05:00
Jason Wang	da8d439c80	intel-iommu: drop VTDBus We introduce VTDBus structure as an intermediate step for searching the address space. This works well with SID based matching/lookup. But when we want to support SID plus PASID based address space lookup, this intermediate steps turns out to be a burden. So the patch simply drops the VTDBus structure and use the PCIBus and devfn as the key for the g_hash_table(). This simplifies the codes and the future PASID extension. To prevent being slower for past vtd_find_as_from_bus_num() callers, a vtd_as cache indexed by the bus number is introduced to store the last recent search result of a vtd_as belongs to a specific bus. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20221028061436.30093-3-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com>	2022-11-07 14:08:17 -05:00
Jason Wang	fb1d084b44	intel-iommu: don't warn guest errors when getting rid2pasid entry We use to warn on wrong rid2pasid entry. But this error could be triggered by the guest and could happens during initialization. So let's don't warn in this case. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20221028061436.30093-2-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com>	2022-11-07 14:08:17 -05:00
Bernhard Beschow	b496a17d45	hw/i386/acpi-build: Resolve north rather than south bridges The code currently assumes Q35 iff ICH9 and i440fx iff PIIX. Now that more AML generation has been moved into the south bridges and since the machines define themselves primarily through their north bridges, let's switch to resolving the north bridges for AML generation instead. This also allows for easier experimentation with different south bridges in the "pc" machine, e.g. with PIIX4 and VT82xx. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Message-Id: <20221028103419.93398-4-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-11-07 14:08:17 -05:00
Bernhard Beschow	bbaa5c41fa	hw/i386/acpi-build: Resolve redundant attribute The is_piix4 attribute is set once in one location and read once in another. Doing both in one location allows for removing the attribute altogether. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20221026133110.91828-3-shentey@gmail.com> Message-Id: <20221028103419.93398-3-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-11-07 14:08:17 -05:00
Bernhard Beschow	6f56d6de99	hw/i386/acpi-build: Remove unused struct Ammends commit `b23046abe7` 'pc: acpi-build: simplify PCI bus tree generation'. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20221026133110.91828-2-shentey@gmail.com> Message-Id: <20221028103419.93398-2-shentey@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-11-07 14:08:17 -05:00
Gregory Price	2486dd0457	hw/i386/pc.c: CXL Fixed Memory Window should not reserve e820 in bios Early-boot e820 records will be inserted by the bios/efi/early boot software and be reported to the kernel via insert_resource. Later, when CXL drivers iterate through the regions again, they will insert another resource and make the RESERVED memory area a child. This RESERVED memory area causes the memory region to become unusable, and as a result attempting to create memory regions with `cxl create-region ...` Will fail due to the RESERVED area intersecting with the CXL window. During boot the following traceback is observed: 0xffffffff81101650 in insert_resource_expand_to_fit () 0xffffffff83d964c5 in e820__reserve_resources_late () 0xffffffff83e03210 in pcibios_resource_survey () 0xffffffff83e04f4a in pcibios_init () Which produces a call to reserve the CFMWS area: (gdb) p new $54 = {start = 0x290000000, end = 0x2cfffffff, name = "Reserved", flags = 0x200, desc = 0x7, parent = 0x0, sibling = 0x0, child = 0x0} Later the Kernel parses ACPI tables and reserves the exact same area as the CXL Fixed Memory Window: 0xffffffff811016a4 in insert_resource_conflict () insert_resource () 0xffffffff81a81389 in cxl_parse_cfmws () 0xffffffff818c4a81 in call_handler () acpi_parse_entries_array () (gdb) p/x new $59 = {start = 0x290000000, end = 0x2cfffffff, name = "CXL Window 0", flags = 0x200, desc = 0x0, parent = 0x0, sibling = 0x0, child = 0x0} This produces the following output in /proc/iomem: 590000000-68fffffff : CXL Window 0 590000000-68fffffff : Reserved This reserved area causes `get_free_mem_region()` to fail due to a check against `__region_intersects()`. Due to this reserved area, the intersect check will only ever return REGION_INTERSECTS, which causes `cxl create-region` to always fail. Signed-off-by: Gregory Price <gregory.price@memverge.com> Message-Id: <20221026205912.8579-1-gregory.price@memverge.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2022-11-07 14:08:17 -05:00
Igor Mammedov	d12dbd44e4	acpi: pc/35: sanitize _GPE declaration order Move _GPE block declaration before it gets referenced by other hotplug handlers. While at it move PCI hotplug (_E01) handler after PCI tree description to avoid forward reference to to not yet declared methods/devices. PS: Forward 'usage' usualy is fine as long as it's hidden within method, however 'iasl' may print warnings. So be nice to iasl/guest OS and do things in proper order. PS2: Also follow up patches will move some of hotplug code from PCI tree to _E01 and that also requires PCI Device nodes build first, before Scope can reuse that from global context. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20221017102146.2254096-11-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-11-07 14:08:17 -05:00
Igor Mammedov	6d2146147b	acpi: enumerate SMB bridge automatically along with other PCI devices to make that happen (bridge sits at _ADR: 0x001F0003), relax PCI enumeration logic to include devices with function > 0 if device has something to say about itself (i.e. has build_dev_aml callback set). Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20221017102146.2254096-8-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-11-07 14:08:17 -05:00
Igor Mammedov	47a373faa6	acpi: pc/q35: drop ad-hoc PCI-ISA bridge AML routines and let bus ennumeration generate AML PCI-ISA bridges that are built in PIIX/Q35 are building its own AML using AcpiDevAmlIf interface. Now build_append_pci_bus_devices() gained AcpiDevAmlIf interface support to get AML of devices atached to PCI slots. So drop ad-hoc build_q35_isa_bridge()/build_piix4_isa_bridge() and let PCI bus enumeration to include PCI-ISA bridge AML when it's enumerated by build_append_pci_bus_devices(). AML change is mostly contextual, which moves whole ISA hierarchy directly under PCI host bridge instead of it being described as separate \SB.PCI0.ISA block. Note: If bus/slot that hosts ISA bridge has BSEL set, it will gain new ASUN and _DMS entries (i.e. acpi-index support, but it should not cause any functional change and that is fine from PCI Firmware spec point of view), potentially it's possible to suppress that by adding a flag to PCIDevice but I don't see a reason to do that yet, I'd rather treat bridge just as any other PCI device if it's possible. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20221017102146.2254096-4-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-11-07 14:08:17 -05:00
Igor Mammedov	cfead31326	acpi: pc: vga: use AcpiDevAmlIf interface to build VGA device descriptors NB: We do not expect any functional change in any ACPI tables with this change. It's only a refactoring. NB2: Some targets (or1k) do not support acpi and CONFIG_ACPI is off for them. However, modules are reused between all architectures so CONFIG_ACPI is on. For those architectures, dummy stub function definitions help to resolve symbols. This change uses more of these and so it adds a couple of dummy stub definitions so that symbols for those can be resolved. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20221017102146.2254096-2-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Ani Sinha <ani@anisinha.ca> CC: Bernhard Beschow <shentey@gmail.com> Signed-off-by: Ani Sinha <ani@anisinha.ca> Message-Id: <20221107152744.868434-1-ani@anisinha.ca>	2022-11-07 14:00:29 -05:00
Stefan Hajnoczi	7f5acfcb66	* bug fixes * reduced memory footprint for IPI virtualization on Intel processors * asynchronous teardown support (Linux only) -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmNiVykUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroN0Swf/YxjphCtFgYYSO14WP+7jAnfRZLhm 0xWChWP8rco5I352OBFeFU64Av5XoLGNn6SZLl8lcg86lQ/G0D27jxu6wOcDDHgw 0yTDO1gevj51UKsbxoC66OWSZwKTEo398/BHPDcI2W41yOFycSdtrPgspOrFRVvf 7M3nNjuNPsQorZeuu8NGr3jakqbt99ZDXcyDEWbrEAcmy2JBRMbGgT0Kdnc6aZfW CvL+1ljxzldNwGeNBbQW2QgODbfHx5cFZcy4Daze35l5Ra7K/FrgAzr6o/HXptya 9fEs5LJQ1JWI6JtpaWwFy7fcIIOsJ0YW/hWWQZSDt9JdAJFE5/+vF+Kz5Q== =CgrO -----END PGP SIGNATURE----- Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging * bug fixes * reduced memory footprint for IPI virtualization on Intel processors * asynchronous teardown support (Linux only) # -----BEGIN PGP SIGNATURE----- # # iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmNiVykUHHBib256aW5p # QHJlZGhhdC5jb20ACgkQv/vSX3jHroN0Swf/YxjphCtFgYYSO14WP+7jAnfRZLhm # 0xWChWP8rco5I352OBFeFU64Av5XoLGNn6SZLl8lcg86lQ/G0D27jxu6wOcDDHgw # 0yTDO1gevj51UKsbxoC66OWSZwKTEo398/BHPDcI2W41yOFycSdtrPgspOrFRVvf # 7M3nNjuNPsQorZeuu8NGr3jakqbt99ZDXcyDEWbrEAcmy2JBRMbGgT0Kdnc6aZfW # CvL+1ljxzldNwGeNBbQW2QgODbfHx5cFZcy4Daze35l5Ra7K/FrgAzr6o/HXptya # 9fEs5LJQ1JWI6JtpaWwFy7fcIIOsJ0YW/hWWQZSDt9JdAJFE5/+vF+Kz5Q== # =CgrO # -----END PGP SIGNATURE----- # gpg: Signature made Wed 02 Nov 2022 07:40:25 EDT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream' of https://gitlab.com/bonzini/qemu: target/i386: Fix test for paging enabled util/log: Close per-thread log file on thread termination target/i386: Set maximum APIC ID to KVM prior to vCPU creation os-posix: asynchronous teardown for shutdown on Linux target/i386: Fix calculation of LOCK NEG eflags Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-11-03 10:54:37 -04:00
Ani Sinha	4ad08e8a57	hw/i386/e820: remove legacy reserved entries for e820 e820 reserved entries were used before the dynamic entries with fw config files were intoduced. Please see the following change: 7d67110f2d9a6("pc: add etc/e820 fw_cfg file") Identical support was introduced into seabios as well with the following commit: ce39bd4031820 ("Add support for etc/e820 fw_cfg file") Both the above commits are now quite old. QEMU machines 1.7 and newer no longer use the reserved entries. Seabios uses fw config files and dynamic e820 entries by default and only falls back to using reserved entries when it has to work with old qemu (versions earlier than 1.7). Please see functions qemu_cfg_e820() and qemu_early_e820(). It is safe to remove legacy FW_CFG_E820_TABLE and associated code now as QEMU 7.0 has deprecated i440fx machines 1.7 and older. It would be incredibly rare to run the latest qemu version with a very old version of seabios that did not support fw config files for e820. As far as I could see, edk2/ovfm never supported reserved entries and uses fw config files from the beginning. So there should be no incompatibilities with ovfm as well. CC: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Ani Sinha <ani@anisinha.ca> Acked-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20220831045311.33083-1-ani@anisinha.ca> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-11-02 06:56:31 -04:00
Bernhard Beschow	bb2e9b1d66	hw/ide/piix: Introduce TYPE_ macros for PIIX IDE controllers Suggested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20221022150508.26830-9-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2022-10-31 11:32:07 +01:00
Bernhard Beschow	503a35e7fd	hw/i386/pc: Create DMA controllers in south bridges Just like in the real hardware (and in PIIX4), create the DMA controllers in the south bridges. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20221022150508.26830-2-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2022-10-31 11:32:07 +01:00
Zeng Guang	19e2a9fb9d	target/i386: Set maximum APIC ID to KVM prior to vCPU creation Specify maximum possible APIC ID assigned for current VM session to KVM prior to the creation of vCPUs. By this setting, KVM can set up VM-scoped data structure indexed by the APIC ID, e.g. Posted-Interrupt Descriptor pointer table to support Intel IPI virtualization, with the most optimal memory footprint. It can be achieved by calling KVM_ENABLE_CAP for KVM_CAP_MAX_VCPU_ID capability once KVM has enabled it. Ignoring the return error if KVM doesn't support this capability yet. Signed-off-by: Zeng Guang <guang.zeng@intel.com> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220825025246.26618-1-guang.zeng@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-31 09:46:34 +01:00
Jason A. Donenfeld	14b29fea74	x86: do not re-randomize RNG seed on snapshot load Snapshot loading is supposed to be deterministic, so we shouldn't re-randomize the various seeds used. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Message-id: 20221025004327.568476-4-Jason@zx2c4.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:34:31 +01:00
Jason A. Donenfeld	7966d70f6f	reset: allow registering handlers that aren't called by snapshot loading Snapshot loading only expects to call deterministic handlers, not non-deterministic ones. So introduce a way of registering handlers that won't be called when reseting for snapshots. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Message-id: 20221025004327.568476-2-Jason@zx2c4.com [PMM: updated json doc comment with Markus' text; fixed checkpatch style nit] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-10-27 11:34:31 +01:00
Maciej S. Szmigiero	ec19444a53	hyperv: fix SynIC SINT assertion failure on guest reset Resetting a guest that has Hyper-V VMBus support enabled triggers a QEMU assertion failure: hw/hyperv/hyperv.c:131: synic_reset: Assertion `QLIST_EMPTY(&synic->sint_routes)' failed. This happens both on normal guest reboot or when using "system_reset" HMP command. The failing assertion was introduced by commit `64ddecc88b` ("hyperv: SControl is optional to enable SynIc") to catch dangling SINT routes on SynIC reset. The root cause of this problem is that the SynIC itself is reset before devices using SINT routes have chance to clean up these routes. Since there seems to be no existing mechanism to force reset callbacks (or methods) to be executed in specific order let's use a similar method that is already used to reset another interrupt controller (APIC) after devices have been reset - by invoking the SynIC reset from the machine reset handler via a new x86_cpu_after_reset() function co-located with the existing x86_cpu_reset() in target/i386/cpu.c. Opportunistically move the APIC reset handler there, too. Fixes: `64ddecc88b` ("hyperv: SControl is optional to enable SynIc") # exposed the bug Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <cb57cee2e29b20d06f81dce054cbcea8b5d497e8.1664552976.git.maciej.szmigiero@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-18 13:58:04 +02:00
Igor Mammedov	3216ab2acb	x86: pci: acpi: consolidate PCI slots creation No functional changes nor AML bytecode changes. Consolidate code that generates empty and populated slot descriptors. Besides eliminating duplication, it helps consolidate conditions for generating parts of Device{} desriptor in one place, which makes code more compact and easier to read. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20220701133515.137890-18-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-10-09 16:38:46 -04:00
Igor Mammedov	87debd32a8	x86: pci: acpi: reorder Device's _DSM method align _DSM method in empty slot descriptor with a populated slot position. Expected change: + Device (SE8) + { + Name (_ADR, 0x001D0000) // _ADR: Address + Name (ASUN, 0x1D) Method (_DSM, 4, Serialized) // _DSM: Device-Specific Method { Local0 = Package (0x02) { BSEL, ASUN } Return (PDSM (Arg0, Arg1, Arg2, Arg3, Local0)) } - } - Device (SE8) - { - Name (_ADR, 0x001D0000) // _ADR: Address - Name (ASUN, 0x1D) Name (_SUN, 0x1D) // _SUN: Slot User Number Method (_EJ0, 1, NotSerialized) // _EJx: Eject Device { PCEJ (BSEL, _SUN) } + } i.e. put _DSM right after ASUN, with _SUN/_EJ0 following it. that will eliminate contextual changes (causing test failures) when follow up patches merge code generating populated and empty slots descriptors. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20220701133515.137890-16-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-10-09 16:38:46 -04:00
Igor Mammedov	366047d625	x86: pci: acpi: reorder Device's _ADR and _SUN fields no functional change, align order of fields in empty slot descriptor with a populated slot ordering. Expected diff: - Name (_SUN, 0x0X) // _SUN: Slot User Number Name (_ADR, 0xY) // _ADR: Address ... + Name (_SUN, 0xX) // _SUN: Slot User Number that will eliminate contextual changes (causing test failures) when follow up patches merge code generating populated and empty slots descriptors. Put mandatory _ADR as the 1st field, then ASUN as it can be present for both pupulated and empty slots and only then _SUN which is present only when slot is hotpluggable. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20220701133515.137890-13-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-10-09 16:38:45 -04:00
Igor Mammedov	5840a16364	x86: acpi: cleanup PCI device _DSM duplication add ASUN variable to hotpluggable slots and use it instead of _SUN which has the same value to reuse _DMS code on both branches (hot- and non-hotpluggable). No functional change. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20220701133515.137890-10-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-10-09 16:38:45 -04:00
Igor Mammedov	467d099a29	x86: acpi: _DSM: use Package to pass parameters Numer of possible arguments to pass to a method is limited in ACPI. The following patches will need to pass over more parameters to PDSM method, will hit that limit. Prepare for this by passing structure (Package) to method, which let us workaround arguments limitation. Pass to PDSM all standard arguments of _DSM as is, and pack custom parameters into Package that is passed as the last argument to PDSM. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20220701133515.137890-7-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-10-09 16:38:45 -04:00
Igor Mammedov	a12cf6923c	acpi: x86: refactor PDSM method to reduce nesting .., it will help with code readability and make easier to extend method in followup patches Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20220701133515.137890-6-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-10-09 16:38:45 -04:00
Igor Mammedov	e05acc360e	acpi: x86: deduplicate HPET AML building HPET AML doesn't depend on piix4 nor q35, move code buiding it to common scope to avoid duplication. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20220701133515.137890-3-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-10-09 16:38:45 -04:00
Peter Xu	20ca47429e	Revert "intel_iommu: Fix irqchip / X2APIC configuration checks" It's true that when vcpus<=255 we don't require the length of 32bit APIC IDs. However here since we already have EIM=ON it means the hypervisor will declare the VM as x2apic supported (e.g. VT-d ECAP register will have EIM bit 4 set), so the guest should assume the APIC IDs are 32bits width even if vcpus<=255. In short, commit `77250171bd` breaks any simple cmdline that wants to boot a VM with >=9 but <=255 vcpus with: -device intel-iommu,intremap=on For anyone who does not want to enable x2apic, we can use eim=off in the intel-iommu parameters to skip enabling KVM x2apic. This partly reverts commit `77250171bd`, while keeping the valid bit on checking split irqchip, but revert the other change. One thing to mention is that this patch may break migration compatibility of such VM, however that's probably the best thing we can do, because the old behavior was simply wrong and not working for >8 vcpus. For <=8 vcpus, there could be a light guest ABI change (by enabling KVM x2apic after this patch), but logically it shouldn't affect the migration from working. Also, this is not the 1st commit to change x2apic behavior. Igor provided a full history of how this evolved for the past few years: https://lore.kernel.org/qemu-devel/20220922154617.57d1a1fb@redhat.com/ Relevant commits for reference: `fb506e701e` ("intel_iommu: reject broken EIM", 2016-10-17) `c1bb5418e3` ("target/i386: Support up to 32768 CPUs without IRQ remapping", 2020-12-10) `77250171bd` ("intel_iommu: Fix irqchip / X2APIC configuration checks", 2022-05-16) `dc89f32d92` ("target/i386: Fix sanity check on max APIC ID / X2APIC enablement", 2022-05-16) We may want to have this for stable too (mostly for 7.1.0 only). Adding a fixes tag. Cc: David Woodhouse <dwmw2@infradead.org> Cc: Claudio Fontana <cfontana@suse.de> Cc: Igor Mammedov <imammedo@redhat.com> Fixes: `77250171bd` ("intel_iommu: Fix irqchip / X2APIC configuration checks") Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220926153206.10881-1-peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>	2022-10-09 16:38:45 -04:00
Jason A. Donenfeld	cc63374a5a	x86: re-initialize RNG seed when selecting kernel We don't want it to be possible to re-read the RNG seed after ingesting it, because this ruins forward secrecy. Currently, however, the setup data section can just be re-read. Since the kernel is always read after the setup data, use the selection of the kernel as a trigger to re-initialize the RNG seed, just like we do on reboot, to preserve forward secrecy. Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Message-Id: <20220922152847.3670513-1-Jason@zx2c4.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-01 21:16:36 +02:00
Jason A. Donenfeld	ffe2d2382e	x86: re-enable rng seeding via SetupData This reverts `3824e25db1` ("x86: disable rng seeding via setup_data"), but for 7.2 rather than 7.1, now that modifying setup_data is safe to do. Cc: Laurent Vivier <laurent@vivier.eu> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Philippe Mathieu-Daudé <f4bug@amsat.org> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Ard Biesheuvel <ardb@kernel.org> Acked-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Message-Id: <20220921093134.2936487-4-Jason@zx2c4.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-27 11:30:59 +02:00
Jason A. Donenfeld	763a2828bf	x86: reinitialize RNG seed on system reboot Since this is read from fw_cfg on each boot, the kernel zeroing it out alone is insufficient to prevent it from being used twice. And indeed on reboot we always want a new seed, not the old one. So re-fill it in this circumstance. Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Message-Id: <20220921093134.2936487-3-Jason@zx2c4.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-27 11:30:59 +02:00
Jason A. Donenfeld	eebb38a563	x86: use typedef for SetupData struct The preferred style is SetupData as a typedef, not setup_data as a plain struct. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Ard Biesheuvel <ardb@kernel.org> Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Message-Id: <20220921093134.2936487-2-Jason@zx2c4.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-27 11:30:59 +02:00

1 2 3 4 5 ...

2202 Commits