mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
David Woodhouse	a78c54c4f9	i386/xen: Initialize Xen backends from pc_basic_device_init() for emulation Now that all the work is done to enable the PV backends to work without actual Xen, instantiate the bus from pc_basic_device_init() for emulated mode. This allows us finally to launch an emulated Xen guest with PV disk. qemu-system-x86_64 -serial mon:stdio -M q35 -cpu host -display none \ -m 1G -smp 2 -accel kvm,xen-version=0x4000a,kernel-irqchip=split \ -kernel bzImage -append "console=ttyS0 root=/dev/xvda1" \ -drive file=/var/lib/libvirt/images/fedora28.qcow2,if=none,id=disk \ -device xen-disk,drive=disk,vdev=xvda If we use -M pc instead of q35, we can even add an IDE disk and boot a guest image normally through grub. But q35 gives us AHCI and that isn't unplugged by the Xen magic, so the guests ends up seeing "both" disks. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
David Woodhouse	de26b26197	hw/xen: Implement soft reset for emulated gnttab This is only part of it; we will also need to get the PV back end drivers to tear down their own mappings (or do it for them, but they kind of need to stop using the pointers too). Some more work on the actual PV back ends and xen-bus code is going to be needed to really make soft reset and migration fully functional, and this part is the basis for that. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
David Woodhouse	d05864d23b	hw/xen: Map guest XENSTORE_PFN grant in emulated Xenstore We don't actually access the guest's page through the grant, because this isn't real Xen, and we can just use the page we gave it in the first place. Map the grant anyway, mostly for cosmetic purposes so it looks like it's in use in the guest-visible grant table. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
David Woodhouse	0324751272	hw/xen: Add emulated implementation of XenStore operations Now that we have an internal implementation of XenStore, we can populate the xenstore_backend_ops to allow PV backends to talk to it. Watches can't be processed with immediate callbacks because that would call back into XenBus code recursively. Defer them to a QEMUBH to be run as appropriate from the main loop. We use a QEMUBH per XS handle, and it walks all the watches (there shouldn't be many per handle) to fire any which have pending events. We could have done it differently but this allows us to use the same struct watch_event as we have for the guest side, and keeps things relatively simple. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
David Woodhouse	b08d88e30f	hw/xen: Add emulated implementation of grant table operations This is limited to mapping a single grant at a time, because under Xen the pages are mapped contiguously into qemu's address space, and that's very hard to do when those pages actually come from anonymous mappings in qemu in the first place. Eventually perhaps we can look at using shared mappings of actual objects for system RAM, and then we can make new mappings of the same backing store (be it deleted files, shmem, whatever). But for now let's stick to a page at a time. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
David Woodhouse	4dfd5fb178	hw/xen: Hook up emulated implementation for event channel operations We provided the backend-facing evtchn functions very early on as part of the core Xen platform support, since things like timers and xenstore need to use them. By what may or may not be an astonishing coincidence, those functions just happen all to have exactly the right function prototypes to slot into the evtchn_backend_ops table and be called by the PV backends. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
David Woodhouse	e2abfe5ec6	hw/xen: Rename xen_common.h to xen_native.h This header is now only for native Xen code, not PV backends that may be used in Xen emulation. Since the toolstack libraries may depend on the specific version of Xen headers that they pull in (and will set the __XEN_TOOLS__ macro to enable internal definitions that they depend on), the rule is that xen_native.h (and thus the toolstack library headers) must be included before any of the headers in include/hw/xen/interface. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
Paul Durrant	ba2a92db1f	hw/xen: Add xenstore operations to allow redirection to internal emulation Signed-off-by: Paul Durrant <pdurrant@amazon.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
David Woodhouse	b6cacfea0b	hw/xen: Add evtchn operations to allow redirection to internal emulation The existing implementation calling into the real libxenevtchn moves to a new file hw/xen/xen-operations.c, and is called via a function table which in a subsequent commit will also be able to invoke the emulated event channel support. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
Paul Durrant	831b0db8ab	hw/xen: Create initial XenStore nodes Signed-off-by: Paul Durrant <pdurrant@amazon.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
David Woodhouse	766804b101	hw/xen: Implement core serialize/deserialize methods for xenstore_impl This implements the basic migration support in the back end, with unit tests that give additional confidence in the node-counting already in the tree. However, the existing PV back ends like xen-disk don't support migration yet. They will reset the ring and fail to continue where they left off. We will fix that in future, but not in time for the 8.0 release. Since there's also an open question of whether we want to serialize the full XenStore or only the guest-owned nodes in /local/domain/${domid}, for now just mark the XenStore device as unmigratable. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
Paul Durrant	be1934dfef	hw/xen: Implement XenStore permissions Store perms as a GList of strings, check permissions. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
David Woodhouse	7cabbdb70d	hw/xen: Watches on XenStore transactions Firing watches on the nodes that still exist is relatively easy; just walk the tree and look at the nodes with refcount of one. Firing watches on deleted nodes is more fun. We add 'modified_in_tx' and 'deleted_in_tx' flags to each node. Nodes with those flags cannot be shared, as they will always be unique to the transaction in which they were created. When xs_node_walk would need to create a node as scaffolding and it encounters a deleted_in_tx node, it can resurrect it simply by clearing its deleted_in_tx flag. If that node originally had any data, they're gone, and the modified_in_tx flag will have been set when it was first deleted. We then attempt to send appropriate watches when the transaction is committed, properly delete the deleted_in_tx nodes, and remove the modified_in_tx flag from the others. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
David Woodhouse	7248b87cb0	hw/xen: Implement XenStore transactions Given that the whole thing supported copy on write from the beginning, transactions end up being fairly simple. On starting a transaction, just take a ref of the existing root; swap it back in on a successful commit. The main tree has a transaction ID too, and we keep a record of the last transaction ID given out. if the main tree is ever modified when it isn't the latest, it gets a new transaction ID. A commit can only succeed if the main tree hasn't moved on since it was forked. Strictly speaking, the XenStore protocol allows a transaction to succeed as long as nothing it read or wrote has changed in the interim, but no implementations do that; any change is sufficient to abort a transaction. This does not yet fire watches on the changed nodes on a commit. That bit is more fun and will come in a follow-on commit. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
David Woodhouse	6e1330090d	hw/xen: Implement XenStore watches Starts out fairly simple: a hash table of watches based on the path. Except there can be multiple watches on the same path, so the watch ends up being a simple linked list, and the head of that list is in the hash table. Which makes removal a bit of a PITA but it's not so bad; we just special-case "I had to remove the head of the list and now I have to replace it in / remove it from the hash table". And if we don't remove the head, it's a simple linked-list operation. We do need to fire watches on deleted nodes, so instead of just a simple xs_node_unref() on the topmost victim, we need to recurse down and fire watches on them all. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
David Woodhouse	3ef7ff83ca	hw/xen: Add basic XenStore tree walk and write/read/directory support This is a fairly simple implementation of a copy-on-write tree. The node walk function starts off at the root, with 'inplace == true'. If it ever encounters a node with a refcount greater than one (including the root node), then that node is shared with other trees, and cannot be modified in place, so the inplace flag is cleared and we copy on write from there on down. Xenstore write has 'mkdir -p' semantics and will create the intermediate nodes if they don't already exist, so in that case we flip the inplace flag back to true as we populate the newly-created nodes. We put a copy of the absolute path into the buffer in the struct walk_op, with two NUL terminators at the end. As xs_node_walk() goes down the tree, it replaces the next '/' separator with a NUL so that it can use the 'child name' in place. The next recursion down then puts the '/' back and repeats the exercise for the next path element... if it doesn't hit that second NUL termination which indicates the true end of the path. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
David Woodhouse	0254c4d19d	hw/xen: Add xenstore wire implementation and implementation stubs This implements the basic wire protocol for the XenStore commands, punting all the actual implementation to xs_impl_* functions which all just return errors for now. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-07 17:04:30 +00:00
Chuck Zmudzinski	4f67543bb8	xen/pt: reserve PCI slot 2 for Intel igd-passthru Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus, as noted in docs/igd-assign.txt in the Qemu source code. Currently, when the xl toolstack is used to configure a Xen HVM guest with Intel IGD passthrough to the guest with the Qemu upstream device model, a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy a different slot. This problem often prevents the guest from booting. The only available workarounds are not good: Configure Xen HVM guests to use the old and no longer maintained Qemu traditional device model available from xenbits.xen.org which does reserve slot 2 for the Intel IGD or use the "pc" machine type instead of the "xenfv" machine type and add the xen platform device at slot 3 using a command line option instead of patching qemu to fix the "xenfv" machine type directly. The second workaround causes some degredation in startup performance such as a longer boot time and reduced resolution of the grub menu that is displayed on the monitor. This patch avoids that reduced startup performance when using the Qemu upstream device model for Xen HVM guests configured with the igd-passthru=on option. To implement this feature in the Qemu upstream device model for Xen HVM guests, introduce the following new functions, types, and macros: * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS * typedef XenPTQdevRealize function pointer * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2 * xen_igd_reserve_slot and xen_igd_clear_slot functions Michael Tsirkin: * Introduce XEN_PCI_IGD_DOMAIN, XEN_PCI_IGD_BUS, XEN_PCI_IGD_DEV, and XEN_PCI_IGD_FN - use them to compute the value of XEN_PCI_IGD_SLOT_MASK The new xen_igd_reserve_slot function uses the existing slot_reserved_mask member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using the xl toolstack with the gfx_passthru option enabled, which sets the igd-passthru=on option to Qemu for the Xen HVM machine type. The new xen_igd_reserve_slot function also needs to be implemented in hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough, in which case it does nothing. The new xen_igd_clear_slot function overrides qdev->realize of the parent PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was created in hw/i386/pc_piix.c for the case when igd-passthru=on. Move the call to xen_host_pci_device_get, and the associated error handling, from xen_pt_realize to the new xen_igd_clear_slot function to initialize the device class and vendor values which enables the checks for the Intel IGD to succeed. The verification that the host device is an Intel IGD to be passed through is done by checking the domain, bus, slot, and function values as well as by checking that gfx_passthru is enabled, the device class is VGA, and the device vendor in Intel. Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Message-Id: <b1b4a21fe9a600b1322742dda55a40e9961daa57.1674346505.git.brchuckz@aol.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>	2023-03-06 11:27:37 +00:00
Peter Maydell	66577e9e1c	virtio,pc,pci: features, cleanups, fixes vhost-user support without ioeventfd word replacements in vhost user spec shpc improvements cleanups, fixes all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmQBO8QPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpMUMH/3/FVp4qaF4CDwCHn7xWFRJpOREIhX/iWfUu lGkwxnB7Lfyqdg7i4CAfgMf2emWKZchEE2DamfCo5bIX0IgRU3DWcOdR9ePvJ29J cKwIYpxZcB4RYSoWL5OUakQLCT3JOu4XWaXeVjyHABjQhf3lGpwN4KmIOBGOy/N6 0YHOQScW2eW62wIOwhAEuYQceMt6KU32Uw3tLnMbJliiBf3a/hPctVNM9TFY9pcd UYHGfBx/zD45owf1lTVEQFDg0eqPZKWW29g5haiOd5oAyXHHolzu+bt3bU7lH46b f7iP12LqDudyrgoF5YWv3NJ4HaGm5V3kPqNqLLF/mjF7alxG+N8= =hN3h -----END PGP SIGNATURE----- Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging virtio,pc,pci: features, cleanups, fixes vhost-user support without ioeventfd word replacements in vhost user spec shpc improvements cleanups, fixes all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmQBO8QPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpMUMH/3/FVp4qaF4CDwCHn7xWFRJpOREIhX/iWfUu # lGkwxnB7Lfyqdg7i4CAfgMf2emWKZchEE2DamfCo5bIX0IgRU3DWcOdR9ePvJ29J # cKwIYpxZcB4RYSoWL5OUakQLCT3JOu4XWaXeVjyHABjQhf3lGpwN4KmIOBGOy/N6 # 0YHOQScW2eW62wIOwhAEuYQceMt6KU32Uw3tLnMbJliiBf3a/hPctVNM9TFY9pcd # UYHGfBx/zD45owf1lTVEQFDg0eqPZKWW29g5haiOd5oAyXHHolzu+bt3bU7lH46b # f7iP12LqDudyrgoF5YWv3NJ4HaGm5V3kPqNqLLF/mjF7alxG+N8= # =hN3h # -----END PGP SIGNATURE----- # gpg: Signature made Fri 03 Mar 2023 00:13:56 GMT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (53 commits) tests/data/acpi/virt: drop (most) duplicate files. hw/cxl/mailbox: Use new UUID network order define for cel_uuid qemu/uuid: Add UUID static initializer qemu/bswap: Add const_le64() tests: acpi: Update q35/DSDT.cxl for removed duplicate UID hw/i386/acpi: Drop duplicate _UID entry for CXL root bridge tests/acpi: Allow update of q35/DSDT.cxl hw/cxl: Add CXL_CAPACITY_MULTIPLIER definition hw/cxl: set cxl-type3 device type to PCI_CLASS_MEMORY_CXL hw/pci-bridge/cxl_downstream: Fix type naming mismatch hw/mem/cxl_type3: Improve error handling in realize() MAINTAINERS: Add Fan Ni as Compute eXpress Link QEMU reviewer intel-iommu: send UNMAP notifications for domain or global inv desc smmu: switch to use memory_region_unmap_iommu_notifier_range() memory: introduce memory_region_unmap_iommu_notifier_range() intel-iommu: fail DEVIOTLB_UNMAP without dt mode intel-iommu: fail MAP notifier without caching mode memory: Optimize replay of guest mapping chardev/char-socket: set s->listener = NULL in char_socket_finalize hw/pci: Trace IRQ routing on PCI topology ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2023-03-03 13:35:54 +00:00
Jonathan Cameron	ab99a33d66	hw/i386/acpi: Drop duplicate _UID entry for CXL root bridge Noticed as this prevents iASL disasembling the DSDT table. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Tested-by: Gregory Price <gregory.price@memverge.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20230206172816.8201-7-Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-03-02 19:13:52 -05:00
Peter Xu	3e090e3489	intel-iommu: send UNMAP notifications for domain or global inv desc We don't send UNMAP notification upon domain or global invalidation which will lead the notifier can't work correctly. One example is to use vhost remote IOTLB without enabling device IOTLB. Fixing this by sending UNMAP notification. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230223065924.42503-6-jasowang@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-03-02 19:13:52 -05:00
Jason Wang	09adb0e021	intel-iommu: fail DEVIOTLB_UNMAP without dt mode Without dt mode, device IOTLB notifier won't work since guest won't send device IOTLB invalidation descriptor in this case. Let's fail early instead of misbehaving silently. Reviewed-by: Laurent Vivier <lvivier@redhat.com> Tested-by: Laurent Vivier <lvivier@redhat.com> Tested-by: Viktor Prutyanov <viktor@daynix.com> Buglink: https://bugzilla.redhat.com/2156876 Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230223065924.42503-3-jasowang@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-03-02 19:13:52 -05:00
Jason Wang	b8d78277c0	intel-iommu: fail MAP notifier without caching mode Without caching mode, MAP notifier won't work correctly since guest won't send IOTLB update event when it establishes new mappings in the I/O page tables. Let's fail the IOMMU notifiers early instead of misbehaving silently. Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Viktor Prutyanov <viktor@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230223065924.42503-2-jasowang@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-03-02 19:13:52 -05:00
Zhenzhong Duan	6da2434186	memory: Optimize replay of guest mapping On x86, there are two notifiers registered due to vtd-ir memory region splitting the whole address space. During replay of the address space for each notifier, the whole address space is scanned which is unnecessory. We only need to scan the space belong to notifier montiored space. Assert when notifier is used to monitor beyond iommu memory region's address space. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Message-Id: <20230215065238.713041-1-zhenzhong.duan@intel.com> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-03-02 19:13:52 -05:00
Michael S. Tsirkin	167f487358	Revert "hw/i386: pass RNG seed via setup_data entry" This reverts commit `67f7e426e5`. Additionally to the automatic revert, I went over the code and dropped all mentions of legacy_no_rng_seed manually, effectively reverting a combination of 2 additional commits: commit `ffe2d2382e` Author: Jason A. Donenfeld <Jason@zx2c4.com> Date: Wed Sep 21 11:31:34 2022 +0200 x86: re-enable rng seeding via SetupData commit `3824e25db1` Author: Gerd Hoffmann <kraxel@redhat.com> Date: Wed Aug 17 10:39:40 2022 +0200 x86: disable rng seeding via setup_data Fixes: `67f7e426e5` ("hw/i386: pass RNG seed via setup_data entry") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2023-03-02 03:10:46 -05:00
Michael S. Tsirkin	ae80d81cfa	Revert "x86: return modified setup_data only if read as memory, not as file" This reverts commit `e935b73508`. Fixes: `e935b73508` ("x86: return modified setup_data only if read as memory, not as file") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2023-03-02 03:10:46 -05:00
Michael S. Tsirkin	ea96a78477	Revert "x86: use typedef for SetupData struct" This reverts commit `eebb38a563`. Fixes: `eebb38a563` ("x86: use typedef for SetupData struct") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2023-03-02 03:10:46 -05:00
Michael S. Tsirkin	fdc27ced04	Revert "x86: reinitialize RNG seed on system reboot" This reverts commit `763a2828bf`. Fixes: `763a2828bf` ("x86: reinitialize RNG seed on system reboot") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2023-03-02 03:10:46 -05:00
Michael S. Tsirkin	b4bfa0a31d	Revert "x86: re-initialize RNG seed when selecting kernel" This reverts commit `cc63374a5a`. Fixes: `cc63374a5a` ("x86: re-initialize RNG seed when selecting kernel") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2023-03-02 03:10:46 -05:00
Michael S. Tsirkin	ef82d893de	Revert "x86: do not re-randomize RNG seed on snapshot load" This reverts commit `14b29fea74`. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Fixes: `14b29fea74` ("x86: do not re-randomize RNG seed on snapshot load") Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2023-03-02 03:10:46 -05:00
Michael S. Tsirkin	b34f2fd17e	Revert "x86: don't let decompressed kernel image clobber setup_data" This reverts commit `eac7a7791b`. Fixes: `eac7a7791b` ("x86: don't let decompressed kernel image clobber setup_data") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dov Murik <dovmurik@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2023-03-02 03:10:46 -05:00
David Woodhouse	79807f3e6b	hw/xen: Subsume xen_be_register_common() into xen_be_init() Every caller of xen_be_init() checks and exits on error, then calls xen_be_register_common(). Just make xen_be_init() abort for itself and return void, and register the common devices too. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:09:22 +00:00
David Woodhouse	e16aff4cc2	kvm/i386: Add xen-evtchn-max-pirq property The default number of PIRQs is set to 256 to avoid issues with 32-bit MSI devices. Allow it to be increased if the user desires. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:09:22 +00:00
David Woodhouse	6096cf7877	hw/xen: Support MSI mapping to PIRQ The way that Xen handles MSI PIRQs is kind of awful. There is a special MSI message which targets a PIRQ. The vector in the low bits of data must be zero. The low 8 bits of the PIRQ# are in the destination ID field, the extended destination ID field is unused, and instead the high bits of the PIRQ# are in the high 32 bits of the address. Using the high bits of the address means that we can't intercept and translate these messages in kvm_send_msi(), because they won't be caught by the APIC — addresses like 0x1000fee46000 aren't in the APIC's range. So we catch them in pci_msi_trigger() instead, and deliver the event channel directly. That isn't even the worst part. The worst part is that Xen snoops on writes to devices' MSI vectors while they are masked. When a MSI message is written which looks like it targets a PIRQ, it remembers the device and vector for later. When the guest makes a hypercall to bind that PIRQ# (snooped from a marked MSI vector) to an event channel port, Xen unmasks that MSI vector on the device. Xen guests using PIRQ delivery of MSI don't ever actually unmask the MSI for themselves. Now that this is working we can finally enable XENFEAT_hvm_pirqs and let the guest use it all. Tested with passthrough igb and emulated e1000e + AHCI. CPU0 CPU1 0: 65 0 IO-APIC 2-edge timer 1: 0 14 xen-pirq 1-ioapic-edge i8042 4: 0 846 xen-pirq 4-ioapic-edge ttyS0 8: 1 0 xen-pirq 8-ioapic-edge rtc0 9: 0 0 xen-pirq 9-ioapic-level acpi 12: 257 0 xen-pirq 12-ioapic-edge i8042 24: 9600 0 xen-percpu -virq timer0 25: 2758 0 xen-percpu -ipi resched0 26: 0 0 xen-percpu -ipi callfunc0 27: 0 0 xen-percpu -virq debug0 28: 1526 0 xen-percpu -ipi callfuncsingle0 29: 0 0 xen-percpu -ipi spinlock0 30: 0 8608 xen-percpu -virq timer1 31: 0 874 xen-percpu -ipi resched1 32: 0 0 xen-percpu -ipi callfunc1 33: 0 0 xen-percpu -virq debug1 34: 0 1617 xen-percpu -ipi callfuncsingle1 35: 0 0 xen-percpu -ipi spinlock1 36: 8 0 xen-dyn -event xenbus 37: 0 6046 xen-pirq -msi ahci[0000:00:03.0] 38: 1 0 xen-pirq -msi-x ens4 39: 0 73 xen-pirq -msi-x ens4-rx-0 40: 14 0 xen-pirq -msi-x ens4-rx-1 41: 0 32 xen-pirq -msi-x ens4-tx-0 42: 47 0 xen-pirq -msi-x ens4-tx-1 Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:09:22 +00:00
David Woodhouse	4f81baa33e	hw/xen: Support GSI mapping to PIRQ If I advertise XENFEAT_hvm_pirqs then a guest now boots successfully as long as I tell it 'pci=nomsi'. [root@localhost ~]# cat /proc/interrupts CPU0 0: 52 IO-APIC 2-edge timer 1: 16 xen-pirq 1-ioapic-edge i8042 4: 1534 xen-pirq 4-ioapic-edge ttyS0 8: 1 xen-pirq 8-ioapic-edge rtc0 9: 0 xen-pirq 9-ioapic-level acpi 11: 5648 xen-pirq 11-ioapic-level ahci[0000:00:04.0] 12: 257 xen-pirq 12-ioapic-edge i8042 ... Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:09:20 +00:00
David Woodhouse	aa98ee38a5	hw/xen: Implement emulated PIRQ hypercall support This wires up the basic infrastructure but the actual interrupts aren't there yet, so don't advertise it to the guest. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:09:01 +00:00
David Woodhouse	799c23548f	i386/xen: Implement HYPERVISOR_physdev_op Just hook up the basic hypercalls to stubs in xen_evtchn.c for now. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:08:26 +00:00
David Woodhouse	bdfdb74882	hw/xen: Automatically add xen-platform PCI device for emulated Xen guests It isn't strictly mandatory but Linux guests at least will only map their grant tables over the dummy BAR that it provides, and don't have sufficient wit to map them in any other unused part of their guest address space. So include it by default for minimal surprise factor. As I come to document "how to run a Xen guest in QEMU", this means one fewer thing to tell the user about, according to the mantra of "if it needs documenting, fix it first, then document what remains". Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:08:26 +00:00
David Woodhouse	f3341e7b91	hw/xen: Add basic ring handling to xenstore Extract requests, return ENOSYS to all of them. This is enough to allow older Linux guests to boot, as they need something back but it doesn't matter much what. A full implementation of a single-tentant internal XenStore copy-on-write tree with transactions and watches is waiting in the wings to be sent in a subsequent round of patches along with hooking up the actual PV disk back end in qemu, but this is enough to get guests booting for now. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:08:26 +00:00
David Woodhouse	c08f5d0e53	hw/xen: Add xen_xenstore device for xenstore emulation Just the basic shell, with the event channel hookup. It only dumps the buffer for now; a real ring implmentation will come in a subsequent patch. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:08:26 +00:00
David Woodhouse	794fba23a5	hw/xen: Add backend implementation of interdomain event channel support The provides the QEMU side of interdomain event channels, allowing events to be sent to/from the guest. The API mirrors libxenevtchn, and in time both this and the real Xen one will be available through ops structures so that the PV backend drivers can use the correct one as appropriate. For now, this implementation can be used directly by our XenStore which will be for emulated mode only. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:08:25 +00:00
Joao Martins	b746a77926	i386/xen: handle PV timer hypercalls Introduce support for one shot and periodic mode of Xen PV timers, whereby timer interrupts come through a special virq event channel with deadlines being set through: 1) set_timer_op hypercall (only oneshot) 2) vcpu_op hypercall for {set,stop}_{singleshot,periodic}_timer hypercalls Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:07:52 +00:00
David Woodhouse	b46f9745b1	hw/xen: Implement GNTTABOP_query_size Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:07:52 +00:00
David Woodhouse	28b7ae94a2	i386/xen: Implement HYPERVISOR_grant_table_op and GNTTABOP_[gs]et_verson Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:07:52 +00:00
David Woodhouse	e33cb789af	hw/xen: Support mapping grant frames Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:07:52 +00:00
David Woodhouse	a28b0fc034	hw/xen: Add xen_gnttab device for grant table emulation Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:07:52 +00:00
David Woodhouse	2aff696b10	hw/xen: Support HVM_PARAM_CALLBACK_TYPE_PCI_INTX callback The guest is permitted to specify an arbitrary domain/bus/device/function and INTX pin from which the callback IRQ shall appear to have come. In QEMU we can only easily do this for devices that actually exist, and even that requires us "knowing" that it's a PCMachine in order to find the PCI root bus — although that's OK really because it's always true. We also don't get to get notified of INTX routing changes, because we can't do that as a passive observer; if we try to register a notifier it will overwrite any existing notifier callback on the device. But in practice, guests using PCI_INTX will only ever use pin A on the Xen platform device, and won't swizzle the INTX routing after they set it up. So this is just fine. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:07:50 +00:00
David Woodhouse	ddf0fd9ae1	hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback The GSI callback (and later PCI_INTX) is a level triggered interrupt. It is asserted when an event channel is delivered to vCPU0, and is supposed to be cleared when the vcpu_info->evtchn_upcall_pending field for vCPU0 is cleared again. Thankfully, Xen does not assert the GSI if the guest sets its own evtchn_upcall_pending field; we only need to assert the GSI when we have delivered an event for ourselves. So that's the easy part, kind of. There's a slight complexity in that we need to hold the BQL before we can call qemu_set_irq(), and we definitely can't do that while holding our own port_lock (because we'll need to take that from the qemu-side functions that the PV backend drivers will call). So if we end up wanting to set the IRQ in a context where we don't already hold the BQL, defer to a BH. However, we do need to poll for the evtchn_upcall_pending flag being cleared. In an ideal world we would poll that when the EOI happens on the PIC/IOAPIC. That's how it works in the kernel with the VFIO eventfd pairs — one is used to trigger the interrupt, and the other works in the other direction to 'resample' on EOI, and trigger the first eventfd again if the line is still active. However, QEMU doesn't seem to do that. Even VFIO level interrupts seem to be supported by temporarily unmapping the device's BARs from the guest when an interrupt happens, then trapping all MMIO to the device and sending the 'resample' event on every MMIO access until the IRQ is cleared! Maybe in future we'll plumb the 'resample' concept through QEMU's irq framework but for now we'll do what Xen itself does: just check the flag on every vmexit if the upcall GSI is known to be asserted. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:06:44 +00:00
Joao Martins	507cb64d6e	i386/xen: add monitor commands to test event injection Specifically add listing, injection of event channels. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:50 +00:00
David Woodhouse	a15b10978f	hw/xen: Implement EVTCHNOP_reset Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:50 +00:00

1 2 3 4 5 ...

2306 Commits