mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Xiaoyao Li	a3b5376521	i386/cpuid: Remove subleaf constraint on CPUID leaf 1F No such constraint that subleaf index needs to be less than 64. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by:Yang Weijiang <weijiang.yang@intel.com> Message-ID: <20240125024016.2521244-3-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-02-16 13:56:08 +01:00
Xiaoyao Li	10f92799af	i386/cpuid: Decrease cpuid_i when skipping CPUID leaf 1F Existing code misses a decrement of cpuid_i when skip leaf 0x1F. There's a blank CPUID entry(with leaf, subleaf as 0, and all fields stuffed 0s) left in the CPUID array. It conflicts with correct CPUID leaf 0. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by:Yang Weijiang <weijiang.yang@intel.com> Message-ID: <20240125024016.2521244-2-xiaoyao.li@intel.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-02-16 13:56:08 +01:00
Stefan Hajnoczi	195801d700	system/cpus: rename qemu_mutex_lock_iothread() to bql_lock() The Big QEMU Lock (BQL) has many names and they are confusing. The actual QemuMutex variable is called qemu_global_mutex but it's commonly referred to as the BQL in discussions and some code comments. The locking APIs, however, are called qemu_mutex_lock_iothread() and qemu_mutex_unlock_iothread(). The "iothread" name is historic and comes from when the main thread was split into into KVM vcpu threads and the "iothread" (now called the main loop thread). I have contributed to the confusion myself by introducing a separate --object iothread, a separate concept unrelated to the BQL. The "iothread" name is no longer appropriate for the BQL. Rename the locking APIs to: - void bql_lock(void) - void bql_unlock(void) - bool bql_locked(void) There are more APIs with "iothread" in their names. Subsequent patches will rename them. There are also comments and documentation that will be updated in later patches. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paul Durrant <paul@xen.org> Acked-by: Fabiano Rosas <farosas@suse.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Cédric Le Goater <clg@kaod.org> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Acked-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-id: 20240102153529.486531-2-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2024-01-08 10:45:43 -05:00
Michael Roth	5746f70d68	i386/sev: Avoid SEV-ES crash due to missing MSR_EFER_LMA bit Commit `7191f24c7f` ("accel/kvm/kvm-all: Handle register access errors") added error checking for KVM_SET_SREGS/KVM_SET_SREGS2. In doing so, it exposed a long-running bug in current KVM support for SEV-ES where the kernel assumes that MSR_EFER_LMA will be set explicitly by the guest kernel, in which case EFER write traps would result in KVM eventually seeing MSR_EFER_LMA get set and recording it in such a way that it would be subsequently visible when accessing it via KVM_GET_SREGS/etc. However, guest kernels currently rely on MSR_EFER_LMA getting set automatically when MSR_EFER_LME is set and paging is enabled via CR0_PG_MASK. As a result, the EFER write traps don't actually expose the MSR_EFER_LMA bit, even though it is set internally, and when QEMU subsequently tries to pass this EFER value back to KVM via KVM_SET_SREGS* it will fail various sanity checks and return -EINVAL, which is now considered fatal due to the aforementioned QEMU commit. This can be addressed by inferring the MSR_EFER_LMA bit being set when paging is enabled and MSR_EFER_LME is set, and synthesizing it to ensure the expected bits are all present in subsequent handling on the host side. Ultimately, this handling will be implemented in the host kernel, but to avoid breaking QEMU's SEV-ES support when using older host kernels, the same handling can be done in QEMU just after fetching the register values via KVM_GET_SREGS*. Implement that here. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Akihiko Odaki <akihiko.odaki@daynix.com> Cc: Philippe Mathieu-Daudé <philmd@linaro.org> Cc: Lara Lazier <laramglazier@gmail.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Maxim Levitsky <mlevitsk@redhat.com> Cc: <kvm@vger.kernel.org> Fixes: `7191f24c7f` ("accel/kvm/kvm-all: Handle register access errors") Signed-off-by: Michael Roth <michael.roth@amd.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20231206155821.1194551-1-michael.roth@amd.com>	2023-12-06 14:34:11 -05:00
David Woodhouse	8473607bcf	i386/xen: advertise XEN_HVM_CPUID_UPCALL_VECTOR in CPUID This will allow Linux guests (since v6.0) to use the per-vCPU upcall vector delivered as MSI through the local APIC. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-11-07 08:54:20 +00:00
Paolo Bonzini	39dd3e1f55	kvm: i8254: require KVM_CAP_PIT2 and KVM_CAP_PIT_STATE2 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-10-25 19:53:38 +02:00
Paolo Bonzini	52b04ea49d	kvm: i386: require KVM_CAP_SET_IDENTITY_MAP_ADDR This was introduced in KVM in Linux 2.6.32, we can require it unconditionally. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-10-25 19:53:38 +02:00
Paolo Bonzini	700766ba60	kvm: i386: require KVM_CAP_ADJUST_CLOCK This was introduced in KVM in Linux 2.6.33, we can require it unconditionally. KVM_CLOCK_TSC_STABLE was only added in Linux 4.9, for now do not require it (though it would allow the removal of some pretty yucky code). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-10-25 19:53:38 +02:00
Paolo Bonzini	86f2438fc2	kvm: i386: require KVM_CAP_MCE This was introduced in KVM in Linux 2.6.34, we can require it unconditionally. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-10-25 19:53:38 +02:00
Paolo Bonzini	1a44a79ddf	kvm: i386: require KVM_CAP_SET_VCPU_EVENTS and KVM_CAP_X86_ROBUST_SINGLESTEP Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-10-25 19:53:30 +02:00
Paolo Bonzini	8bba0a3b76	kvm: i386: require KVM_CAP_XSAVE This was introduced in KVM in Linux 2.6.36, and could already be used at the time to save/restore FPU data even on older processor. We can require it unconditionally and stop using KVM_GET/SET_FPU. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-10-25 17:35:15 +02:00
Paolo Bonzini	f57a4dd311	kvm: i386: require KVM_CAP_DEBUGREGS This was introduced in KVM in Linux 2.6.35, we can require it unconditionally. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-10-25 17:35:15 +02:00
Paolo Bonzini	4b2991666c	kvm: i386: move KVM_CAP_IRQ_ROUTING detection to kvm_arch_required_capabilities Simple code cleanup. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-10-25 17:35:15 +02:00
Paolo Bonzini	cc5e719e2c	kvm: require KVM_CAP_SIGNAL_MSI This was introduced in KVM in Linux 3.5, we can require it unconditionally in kvm_irqchip_send_msi(). However, not all architectures have to implement it so check it only in x86, the only architecture that ever had MSI injection but not KVM_CAP_SIGNAL_MSI. ARM uses it to detect the presence of the ITS emulation in the kernel, introduced in Linux 4.8. Assume that it's there and possibly fail when realizing the arm-its-kvm device. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-10-25 17:35:14 +02:00
Steve Sistare	c8a7fc5179	migration: simplify blockers Modify migrate_add_blocker and migrate_del_blocker to take an Error reason. This allows migration to own the Error object, so that if an error occurs in migrate_add_blocker, migration code can free the Error and clear the client handle, simplifying client code. It also simplifies the migrate_del_blocker call site. In addition, this is a pre-requisite for a proposed future patch that would add a mode argument to migration requests to support live update, and maintain a list of blockers for each mode. A blocker may apply to a single mode or to multiple modes, and passing Error will allow one Error object to be registered for multiple modes. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Tested-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1697634216-84215-1-git-send-email-steven.sistare@oracle.com>	2023-10-20 08:51:41 +02:00
Xiaoyao Li	83629b1461	target/i386/cpu: Fix CPUID_HT exposure When explicitly booting a multiple vcpus vm with "-cpu +ht", it gets warning of warning: host doesn't support requested feature: CPUID.01H:EDX.ht [bit 28] Make CPUID_HT as supported unconditionally can resolve the warning. However it introduces another issue that it also expose CPUID_HT to guest when "-cpu host/max" with only 1 vcpu. To fix this, need mark CPUID_HT as the no_autoenable_flags. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20231010060539.210258-1-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-10-17 15:20:46 +02:00
Paolo Bonzini	e0c3ef715b	target/i386/kvm: eliminate shadowed local variables These are harmless are they die immediately after their use. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-09-26 16:41:49 +02:00
Michael Tokarev	bad5cfcd60	i386: spelling fixes Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>	2023-09-20 07:54:34 +03:00
Philippe Mathieu-Daudé	fc30abf846	sysemu/kvm: Restrict kvm_has_pit_state2() to x86 targets kvm_has_pit_state2() is only defined for x86 targets (in target/i386/kvm/kvm.c). Its declaration is pointless on all other targets. Have it return a boolean. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20230904124325.79040-13-philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-09-07 13:32:37 +02:00
Anton Johansson	b8a6eb1862	sysemu/kvm: Use vaddr for kvm_arch_[insert\|remove]_hw_breakpoint Changes the signature of the target-defined functions for inserting/removing kvm hw breakpoints. The address and length arguments are now of vaddr type, which both matches the type used internally in accel/kvm/kvm-all.c and makes the api target-agnostic. Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230807155706.9580-4-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-08-24 11:21:35 -07:00
Akihiko Odaki	5e0d65909c	kvm: Introduce kvm_arch_get_default_type hook kvm_arch_get_default_type() returns the default KVM type. This hook is particularly useful to derive a KVM type that is valid for "none" machine model, which is used by libvirt to probe the availability of KVM. For MIPS, the existing mips_kvm_type() is reused. This function ensures the availability of VZ which is mandatory to use KVM on the current QEMU. Cc: qemu-stable@nongnu.org Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-id: 20230727073134.134102-2-akihiko.odaki@daynix.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> [PMM: added doc comment for new function] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-08-22 17:31:02 +01:00
Tao Su	ba3709feaa	target/i386: Allow MCDT_NO if host supports MCDT_NO bit indicates HW contains the security fix and doesn't need to be mitigated to avoid data-dependent behaviour for certain instructions. It needs no hypervisor support. Treat it as supported regardless of what KVM reports. Signed-off-by: Tao Su <tao1.su@linux.intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20230706054949.66556-4-tao1.su@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-07-07 12:52:27 +02:00
Miroslav Rezanina	ddc7cb30f8	Fix build without CONFIG_XEN_EMU Upstream commit `ddf0fd9ae1` "hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback" added kvm_xen_maybe_deassert_callback usage to target/i386/kvm/kvm.c file without conditional preprocessing check. This breaks any build not using CONFIG_XEN_EMU. Protect call by conditional preprocessing to allow build without CONFIG_XEN_EMU. Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20230308130557.2420-1-mrezanin@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-03-15 11:52:24 +01:00
David Woodhouse	e16aff4cc2	kvm/i386: Add xen-evtchn-max-pirq property The default number of PIRQs is set to 256 to avoid issues with 32-bit MSI devices. Allow it to be increased if the user desires. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:09:22 +00:00
David Woodhouse	6096cf7877	hw/xen: Support MSI mapping to PIRQ The way that Xen handles MSI PIRQs is kind of awful. There is a special MSI message which targets a PIRQ. The vector in the low bits of data must be zero. The low 8 bits of the PIRQ# are in the destination ID field, the extended destination ID field is unused, and instead the high bits of the PIRQ# are in the high 32 bits of the address. Using the high bits of the address means that we can't intercept and translate these messages in kvm_send_msi(), because they won't be caught by the APIC — addresses like 0x1000fee46000 aren't in the APIC's range. So we catch them in pci_msi_trigger() instead, and deliver the event channel directly. That isn't even the worst part. The worst part is that Xen snoops on writes to devices' MSI vectors while they are masked. When a MSI message is written which looks like it targets a PIRQ, it remembers the device and vector for later. When the guest makes a hypercall to bind that PIRQ# (snooped from a marked MSI vector) to an event channel port, Xen unmasks that MSI vector on the device. Xen guests using PIRQ delivery of MSI don't ever actually unmask the MSI for themselves. Now that this is working we can finally enable XENFEAT_hvm_pirqs and let the guest use it all. Tested with passthrough igb and emulated e1000e + AHCI. CPU0 CPU1 0: 65 0 IO-APIC 2-edge timer 1: 0 14 xen-pirq 1-ioapic-edge i8042 4: 0 846 xen-pirq 4-ioapic-edge ttyS0 8: 1 0 xen-pirq 8-ioapic-edge rtc0 9: 0 0 xen-pirq 9-ioapic-level acpi 12: 257 0 xen-pirq 12-ioapic-edge i8042 24: 9600 0 xen-percpu -virq timer0 25: 2758 0 xen-percpu -ipi resched0 26: 0 0 xen-percpu -ipi callfunc0 27: 0 0 xen-percpu -virq debug0 28: 1526 0 xen-percpu -ipi callfuncsingle0 29: 0 0 xen-percpu -ipi spinlock0 30: 0 8608 xen-percpu -virq timer1 31: 0 874 xen-percpu -ipi resched1 32: 0 0 xen-percpu -ipi callfunc1 33: 0 0 xen-percpu -virq debug1 34: 0 1617 xen-percpu -ipi callfuncsingle1 35: 0 0 xen-percpu -ipi spinlock1 36: 8 0 xen-dyn -event xenbus 37: 0 6046 xen-pirq -msi ahci[0000:00:03.0] 38: 1 0 xen-pirq -msi-x ens4 39: 0 73 xen-pirq -msi-x ens4-rx-0 40: 14 0 xen-pirq -msi-x ens4-rx-1 41: 0 32 xen-pirq -msi-x ens4-tx-0 42: 47 0 xen-pirq -msi-x ens4-tx-1 Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:09:22 +00:00
David Woodhouse	6f43f2ee49	kvm/i386: Add xen-gnttab-max-frames property Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:07:52 +00:00
David Woodhouse	ddf0fd9ae1	hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback The GSI callback (and later PCI_INTX) is a level triggered interrupt. It is asserted when an event channel is delivered to vCPU0, and is supposed to be cleared when the vcpu_info->evtchn_upcall_pending field for vCPU0 is cleared again. Thankfully, Xen does not assert the GSI if the guest sets its own evtchn_upcall_pending field; we only need to assert the GSI when we have delivered an event for ourselves. So that's the easy part, kind of. There's a slight complexity in that we need to hold the BQL before we can call qemu_set_irq(), and we definitely can't do that while holding our own port_lock (because we'll need to take that from the qemu-side functions that the PV backend drivers will call). So if we end up wanting to set the IRQ in a context where we don't already hold the BQL, defer to a BH. However, we do need to poll for the evtchn_upcall_pending flag being cleared. In an ideal world we would poll that when the EOI happens on the PIC/IOAPIC. That's how it works in the kernel with the VFIO eventfd pairs — one is used to trigger the interrupt, and the other works in the other direction to 'resample' on EOI, and trigger the first eventfd again if the line is still active. However, QEMU doesn't seem to do that. Even VFIO level interrupts seem to be supported by temporarily unmapping the device's BARs from the guest when an interrupt happens, then trapping all MMIO to the device and sending the 'resample' event on every MMIO access until the IRQ is cleared! Maybe in future we'll plumb the 'resample' concept through QEMU's irq framework but for now we'll do what Xen itself does: just check the flag on every vmexit if the upcall GSI is known to be asserted. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:06:44 +00:00
Joao Martins	c345104cd1	i386/xen: handle VCPUOP_register_vcpu_info Handle the hypercall to set a per vcpu info, and also wire up the default vcpu_info in the shared_info page for the first 32 vCPUs. To avoid deadlock within KVM a vCPU thread must set its own vcpu_info rather than it being set from the context in which the hypercall is invoked. Add the vcpu_info (and default) GPA to the vmstate_x86_cpu for migration, and restore it in kvm_arch_put_registers() appropriately. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	55a3f666b4	i386/xen: handle guest hypercalls This means handling the new exit reason for Xen but still crashing on purpose. As we implement each of the hypercalls we will then return the right return code. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Add CPL to hypercall tracing, disallow hypercalls from CPL > 0] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	5e691a955a	i386/kvm: Set Xen vCPU ID in KVM There are (at least) three different vCPU ID number spaces. One is the internal KVM vCPU index, based purely on which vCPU was chronologically created in the kernel first. If userspace threads are all spawned and create their KVM vCPUs in essentially random order, then the KVM indices are basically random too. The second number space is the APIC ID space, which is consistent and useful for referencing vCPUs. MSIs will specify the target vCPU using the APIC ID, for example, and the KVM Xen APIs also take an APIC ID from userspace whenever a vCPU needs to be specified (as opposed to just using the appropriate vCPU fd). The third number space is not normally relevant to the kernel, and is the ACPI/MADT/Xen CPU number which corresponds to cs->cpu_index. But Xen timer hypercalls use it, and Xen timer hypercalls really want to be accelerated in the kernel rather than handled in userspace, so the kernel needs to be told. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	f66b8a83c5	i386/kvm: handle Xen HVM cpuid leaves Introduce support for emulating CPUID for Xen HVM guests. It doesn't make sense to advertise the KVM leaves to a Xen guest, so do Xen unconditionally when the xen-version machine property is set. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Obtain xen_version from KVM property, make it automatic] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	61491cf441	i386/kvm: Add xen-version KVM accelerator property and init KVM Xen support This just initializes the basic Xen support in KVM for now. Only permitted on TYPE_PC_MACHINE because that's where the sysbus devices for Xen heap overlay, event channel, grant tables and other stuff will exist. There's no point having the basic hypercall support if nothing else works. Provide sysemu/kvm_xen.h and a kvm_xen_get_caps() which will be used later by support devices. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Paolo Bonzini	3023c9b4d1	target/i386: KVM: allow fast string operations if host supports them These are just a flag that documents the performance characteristic of an instruction; it needs no hypervisor support. So include them even if KVM does not show them. In particular, FZRM/FSRS/FSRC have only been added very recently, but they are available on Sapphire Rapids processors. Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-02-27 18:44:57 +01:00
Markus Armbruster	e2c1c34f13	include/block: Untangle inclusion loops We have two inclusion loops: block/block.h -> block/block-global-state.h -> block/block-common.h -> block/blockjob.h -> block/block.h block/block.h -> block/block-io.h -> block/block-common.h -> block/blockjob.h -> block/block.h I believe these go back to Emanuele's reorganization of the block API, merged a few months ago in commit `d7e2fe4aac`. Fortunately, breaking them is merely a matter of deleting unnecessary includes from headers, and adding them back in places where they are now missing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221221133551.3967339-2-armbru@redhat.com>	2023-01-20 07:24:28 +01:00
Markus Armbruster	d1c81c3496	qapi: Use returned bool to check for failure (again) Commit `012d4c96e2` changed the visitor functions taking Error ** to return bool instead of void, and the commits following it used the new return value to simplify error checking. Since then a few more uses in need of the same treatment crept in. Do that. All pretty mechanical except for * balloon_stats_get_all() This is basically the same transformation commit `012d4c96e2` applied to the virtual walk example in include/qapi/visitor.h. * set_max_queue_size() Additionally replace "goto end of function" by return. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221121085054.683122-10-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2022-12-14 16:19:35 +01:00
Zeng Guang	19e2a9fb9d	target/i386: Set maximum APIC ID to KVM prior to vCPU creation Specify maximum possible APIC ID assigned for current VM session to KVM prior to the creation of vCPUs. By this setting, KVM can set up VM-scoped data structure indexed by the APIC ID, e.g. Posted-Interrupt Descriptor pointer table to support Intel IPI virtualization, with the most optimal memory footprint. It can be achieved by calling KVM_ENABLE_CAP for KVM_CAP_MAX_VCPU_ID capability once KVM has enabled it. Ignoring the return error if KVM doesn't support this capability yet. Signed-off-by: Zeng Guang <guang.zeng@intel.com> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220825025246.26618-1-guang.zeng@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-31 09:46:34 +01:00
Markus Armbruster	0a553c12c7	Drop useless casts from g_malloc() & friends to pointer These memory allocation functions return void *, and casting to another pointer type is useless clutter. Drop these casts. If you really want another pointer type, consider g_new(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20220923120025.448759-3-armbru@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2022-10-22 23:15:40 +02:00
Maciej S. Szmigiero	ec19444a53	hyperv: fix SynIC SINT assertion failure on guest reset Resetting a guest that has Hyper-V VMBus support enabled triggers a QEMU assertion failure: hw/hyperv/hyperv.c:131: synic_reset: Assertion `QLIST_EMPTY(&synic->sint_routes)' failed. This happens both on normal guest reboot or when using "system_reset" HMP command. The failing assertion was introduced by commit `64ddecc88b` ("hyperv: SControl is optional to enable SynIc") to catch dangling SINT routes on SynIC reset. The root cause of this problem is that the SynIC itself is reset before devices using SINT routes have chance to clean up these routes. Since there seems to be no existing mechanism to force reset callbacks (or methods) to be executed in specific order let's use a similar method that is already used to reset another interrupt controller (APIC) after devices have been reset - by invoking the SynIC reset from the machine reset handler via a new x86_cpu_after_reset() function co-located with the existing x86_cpu_reset() in target/i386/cpu.c. Opportunistically move the APIC reset handler there, too. Fixes: `64ddecc88b` ("hyperv: SControl is optional to enable SynIc") # exposed the bug Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <cb57cee2e29b20d06f81dce054cbcea8b5d497e8.1664552976.git.maciej.szmigiero@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-18 13:58:04 +02:00
Alexander Graf	37656470f6	KVM: x86: Implement MSR_CORE_THREAD_COUNT MSR The MSR_CORE_THREAD_COUNT MSR describes CPU package topology, such as number of threads and cores for a given package. This is information that QEMU has readily available and can provide through the new user space MSR deflection interface. This patch propagates the existing hvf logic from patch `027ac0cb51` ("target/i386/hvf: add rdmsr 35H MSR_CORE_THREAD_COUNT") to KVM. Signed-off-by: Alexander Graf <agraf@csgraf.de> Message-Id: <20221004225643.65036-4-agraf@csgraf.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-11 09:36:01 +02:00
Alexander Graf	860054d8ce	i386: kvm: Add support for MSR filtering KVM has grown support to deflect arbitrary MSRs to user space since Linux 5.10. For now we don't expect to make a lot of use of this feature, so let's expose it the easiest way possible: With up to 16 individually maskable MSRs. This patch adds a kvm_filter_msr() function that other code can call to install a hook on KVM MSR reads or writes. Signed-off-by: Alexander Graf <agraf@csgraf.de> Message-Id: <20221004225643.65036-3-agraf@csgraf.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-11 09:36:01 +02:00
Chenyi Qiang	e2e69f6bb9	i386: add notify VM exit support There are cases that malicious virtual machine can cause CPU stuck (due to event windows don't open up), e.g., infinite loop in microcode when nested #AC (CVE-2015-5307). No event window means no event (NMI, SMI and IRQ) can be delivered. It leads the CPU to be unavailable to host or other VMs. Notify VM exit is introduced to mitigate such kind of attacks, which will generate a VM exit if no event window occurs in VM non-root mode for a specified amount of time (notify window). A new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT is exposed to user space so that the user can query the capability and set the expected notify window when creating VMs. The format of the argument when enabling this capability is as follows: Bit 63:32 - notify window specified in qemu command Bit 31:0 - some flags (e.g. KVM_X86_NOTIFY_VMEXIT_ENABLED is set to enable the feature.) Users can configure the feature by a new (x86 only) accel property: qemu -accel kvm,notify-vmexit=run\|internal-error\|disable,notify-window=n The default option of notify-vmexit is run, which will enable the capability and do nothing if the exit happens. The internal-error option raises a KVM internal error if it happens. The disable option does not enable the capability. The default value of notify-window is 0. It is valid only when notify-vmexit is not disabled. The valid range of notify-window is non-negative. It is even safe to set it to zero since there's an internal hardware threshold to be added to ensure no false positive. Because a notify VM exit may happen with VM_CONTEXT_INVALID set in exit qualification (no cases are anticipated that would set this bit), which means VM context is corrupted. It would be reflected in the flags of KVM_EXIT_NOTIFY exit. If KVM_NOTIFY_CONTEXT_INVALID bit is set, raise a KVM internal error unconditionally. Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20220929072014.20705-5-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-11 09:36:00 +02:00
Paolo Bonzini	3dba0a335c	kvm: allow target-specific accelerator properties Several hypervisor capabilities in KVM are target-specific. When exposed to QEMU users as accelerator properties (i.e. -accel kvm,prop=value), they should not be available for all targets. Add a hook for targets to add their own properties to -accel kvm, for now no such property is defined. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220929072014.20705-3-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-10 09:23:16 +02:00
Chenyi Qiang	12f89a39cf	i386: kvm: extend kvm_{get, put}_vcpu_events to support pending triple fault For the direct triple faults, i.e. hardware detected and KVM morphed to VM-Exit, KVM will never lose them. But for triple faults sythesized by KVM, e.g. the RSM path, if KVM exits to userspace before the request is serviced, userspace could migrate the VM and lose the triple fault. A new flag KVM_VCPUEVENT_VALID_TRIPLE_FAULT is defined to signal that the event.triple_fault_pending field contains a valid state if the KVM_CAP_X86_TRIPLE_FAULT_EVENT capability is enabled. Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20220929072014.20705-2-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-10 09:23:16 +02:00
Stefan Hajnoczi	fafd35a6da	Pull request trivial patches branch 20220930-v2 -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmM7XoISHGxhdXJlbnRA dml2aWVyLmV1AAoJEPMMOL0/L748D/0QAKbYtTWjhFPeapjZVoTv13YrTvczWrcF omL6IZivVq0t7hun4iem0DwmvXJELMGexEOTvEJOzM19IIlvvwvOsI8xnxpcMnEY 6GKVbs53Ba0bg2yh7Dll2W9jkou9eX27DwUHMVF8KX7qqsbU+WyD/vdGZitgGt+T 8yna7kzVvNVsdB3+DbIatI5RzzHeu4OqeuH/WCtAyzCaLB64UYTcHprskxIp4+wp dR+EUSoDEr9Qx4PC+uVEsTFK1zZjyAYNoNIkh6fhlkRvDJ1uA75m3EJ57P8xPPqe VbVkPMKi0d4c52m6XvLsQhyYryLx/qLLUAkJWVpY66aHcapYbZAEAfZmNGTQLrOJ qIOJzIkOdU6l3pRgXVdVCgkHRc2HETwET2LyVbNkUz/vBlW2wOZQbZFbezComael bQ/gNBYqP+eOGnZzeWbKBGHr/9QDBClNufidIMC+sOiUw0iSifzjkFwvH7IElx6K EQCOSV6pOhKVlinTpmBbk1XD3xDkQ7ZidiLT9g+P1c8dExrXBhWOnfUHueISb8+s KKMozuxQ/6/3c/DP5hwI9cKPEWEbqJfq1kMuxIvEivKGwUIqX2yq4VJ+hSlYJ+CW nGjXZldtf4KwH+cTsxyPmdZRR5Q7+ODr5Xo7GNvEKBuDsHs7uUl1c3vvOykQgje9 +dyJR6TfbQWn =aK29 -----END PGP SIGNATURE----- Merge tag 'trivial-branch-for-7.2-pull-request' of https://gitlab.com/laurent_vivier/qemu into staging Pull request trivial patches branch 20220930-v2 # -----BEGIN PGP SIGNATURE----- # # iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmM7XoISHGxhdXJlbnRA # dml2aWVyLmV1AAoJEPMMOL0/L748D/0QAKbYtTWjhFPeapjZVoTv13YrTvczWrcF # omL6IZivVq0t7hun4iem0DwmvXJELMGexEOTvEJOzM19IIlvvwvOsI8xnxpcMnEY # 6GKVbs53Ba0bg2yh7Dll2W9jkou9eX27DwUHMVF8KX7qqsbU+WyD/vdGZitgGt+T # 8yna7kzVvNVsdB3+DbIatI5RzzHeu4OqeuH/WCtAyzCaLB64UYTcHprskxIp4+wp # dR+EUSoDEr9Qx4PC+uVEsTFK1zZjyAYNoNIkh6fhlkRvDJ1uA75m3EJ57P8xPPqe # VbVkPMKi0d4c52m6XvLsQhyYryLx/qLLUAkJWVpY66aHcapYbZAEAfZmNGTQLrOJ # qIOJzIkOdU6l3pRgXVdVCgkHRc2HETwET2LyVbNkUz/vBlW2wOZQbZFbezComael # bQ/gNBYqP+eOGnZzeWbKBGHr/9QDBClNufidIMC+sOiUw0iSifzjkFwvH7IElx6K # EQCOSV6pOhKVlinTpmBbk1XD3xDkQ7ZidiLT9g+P1c8dExrXBhWOnfUHueISb8+s # KKMozuxQ/6/3c/DP5hwI9cKPEWEbqJfq1kMuxIvEivKGwUIqX2yq4VJ+hSlYJ+CW # nGjXZldtf4KwH+cTsxyPmdZRR5Q7+ODr5Xo7GNvEKBuDsHs7uUl1c3vvOykQgje9 # +dyJR6TfbQWn # =aK29 # -----END PGP SIGNATURE----- # gpg: Signature made Mon 03 Oct 2022 18:13:22 EDT # gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C # gpg: issuer "laurent@vivier.eu" # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full] # gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full] # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full] # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * tag 'trivial-branch-for-7.2-pull-request' of https://gitlab.com/laurent_vivier/qemu: docs: Update TPM documentation for usage of a TPM 2 Use g_new() & friends where that makes obvious sense Drop superfluous conditionals around g_free() block/qcow2-bitmap: Add missing cast to silent GCC error checkpatch: ignore target/hexagon/imported/* files mem/cxl_type3: fix GPF DVSEC .gitignore: add .cache/ to .gitignore hw/virtio/vhost-shadow-virtqueue: Silence GCC error "maybe-uninitialized" Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-10-04 14:04:18 -04:00
Markus Armbruster	76eb88b12b	Drop superfluous conditionals around g_free() There is no need to guard g_free(P) with if (P): g_free(NULL) is safe. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220923090428.93529-1-armbru@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2022-10-04 00:10:11 +02:00
Ray Zhang	c4ef867f29	target/i386/kvm: fix kvmclock_current_nsec: Assertion `time.tsc_timestamp <= migration_tsc' failed New KVM_CLOCK flags were added in the kernel.(c68dc1b577eabd5605c6c7c08f3e07ae18d30d5d) ``` + #define KVM_CLOCK_VALID_FLAGS \ + (KVM_CLOCK_TSC_STABLE \| KVM_CLOCK_REALTIME \| KVM_CLOCK_HOST_TSC) case KVM_CAP_ADJUST_CLOCK: - r = KVM_CLOCK_TSC_STABLE; + r = KVM_CLOCK_VALID_FLAGS; ``` kvm_has_adjust_clock_stable needs to handle additional flags, so that s->clock_is_reliable can be true and kvmclock_current_nsec doesn't need to be called. Signed-off-by: Ray Zhang <zhanglei002@gmail.com> Message-Id: <20220922100523.2362205-1-zhanglei002@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-01 21:16:36 +02:00
Vitaly Kuznetsov	45ed68a1a3	i386: do kvm_put_msr_feature_control() first thing when vCPU is reset kvm_put_sregs2() fails to reset 'locked' CR4/CR0 bits upon vCPU reset when it is in VMX root operation. Do kvm_put_msr_feature_control() before kvm_put_sregs2() to (possibly) kick vCPU out of VMX root operation. It also seems logical to do kvm_put_msr_feature_control() before kvm_put_nested_state() and not after it, especially when 'real' nested state is set. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220818150113.479917-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-01 07:42:37 +02:00
Vitaly Kuznetsov	3cafdb6750	i386: reset KVM nested state upon CPU reset Make sure env->nested_state is cleaned up when a vCPU is reset, it may be stale after an incoming migration, kvm_arch_put_registers() may end up failing or putting vCPU in a weird state. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220818150113.479917-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-01 07:42:37 +02:00
Vitaly Kuznetsov	3aae0854b2	i386: Hyper-V Direct TLB flush hypercall Hyper-V TLFS allows for L0 and L1 hypervisors to collaborate on L2's TLB flush hypercalls handling. With the correct setup, L2's TLB flush hypercalls can be handled by L0 directly, without the need to exit to L1. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-6-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	aa6bb5fad5	i386: Hyper-V Support extended GVA ranges for TLB flush hypercalls KVM kind of supported "extended GVA ranges" (up to 4095 additional GFNs per hypercall) since the implementation of Hyper-V PV TLB flush feature (Linux-4.18) as regardless of the request, full TLB flush was always performed. "Extended GVA ranges for TLB flush hypercalls" feature bit wasn't exposed then. Now, as KVM gains support for fine-grained TLB flush handling, exposing this feature starts making sense. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00

1 2 3

124 Commits