mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Ankur Arora	5dbcd01a8d	i386/xen: implement HVMOP_set_param This is the hook for adding the HVM_PARAM_CALLBACK_IRQ parameter in a subsequent commit. Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Split out from another commit] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:50 +00:00
Ankur Arora	105b47fdf2	i386/xen: implement HVMOP_set_evtchn_upcall_vector The HVMOP_set_evtchn_upcall_vector hypercall sets the per-vCPU upcall vector, to be delivered to the local APIC just like an MSI (with an EOI). This takes precedence over the system-wide delivery method set by the HVMOP_set_param hypercall with HVM_PARAM_CALLBACK_IRQ. It's used by Windows and Xen (PV shim) guests but normally not by Linux. Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Rework for upstream kernel changes and split from HVMOP_set_param] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:50 +00:00
Joao Martins	3b06f29b24	i386/xen: implement HYPERVISOR_event_channel_op Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Ditch event_channel_op_compat which was never available to HVM guests] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:50 +00:00
Joao Martins	5092db87e4	i386/xen: handle VCPUOP_register_runstate_memory_area Allow guest to setup the vcpu runstates which is used as steal clock. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:50 +00:00
Joao Martins	f068930277	i386/xen: handle VCPUOP_register_vcpu_time_info In order to support Linux vdso in Xen. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	c345104cd1	i386/xen: handle VCPUOP_register_vcpu_info Handle the hypercall to set a per vcpu info, and also wire up the default vcpu_info in the shared_info page for the first 32 vCPUs. To avoid deadlock within KVM a vCPU thread must set its own vcpu_info rather than it being set from the context in which the hypercall is invoked. Add the vcpu_info (and default) GPA to the vmstate_x86_cpu for migration, and restore it in kvm_arch_put_registers() appropriately. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	d70bd6a485	i386/xen: implement HYPERVISOR_vcpu_op This is simply when guest tries to register a vcpu_info and since vcpu_info placement is optional in the minimum ABI therefore we can just fail with -ENOSYS Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	671bfdcd47	i386/xen: implement HYPERVISOR_hvm_op This is when guest queries for support for HVMOP_pagetable_dying. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	782a79601e	i386/xen: implement XENMEM_add_to_physmap_batch Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	fb0fd2ce38	i386/xen: implement HYPERVISOR_memory_op Specifically XENMEM_add_to_physmap with space XENMAPSPACE_shared_info to allow the guest to set its shared_info page. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Use the xen_overlay device, add compat support] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	110a0ea59f	i386/xen: manage and save/restore Xen guest long_mode setting Xen will "latch" the guest's 32-bit or 64-bit ("long mode") setting when the guest writes the MSR to fill in the hypercall page, or when the guest sets the event channel callback in HVM_PARAM_CALLBACK_IRQ. KVM handles the former and sets the kernel's long_mode flag accordingly. The latter will be handled in userspace. Keep them in sync by noticing when a hypercall is made in a mode that doesn't match qemu's idea of the guest mode, and resyncing from the kernel. Do that same sync right before serialization too, in case the guest has set the hypercall page but hasn't yet made a system call. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	c789b9ef5f	i386/xen: Implement SCHEDOP_poll and SCHEDOP_yield They both do the same thing and just call sched_yield. This is enough to stop the Linux guest panicking when running on a host kernel which doesn't intercept SCHEDOP_poll and lets it reach userspace. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	79b7067dc6	i386/xen: implement HYPERVISOR_sched_op, SCHEDOP_shutdown It allows to shutdown itself via hypercall with any of the 3 reasons: 1) self-reboot 2) shutdown 3) crash Implementing SCHEDOP_shutdown sub op let us handle crashes gracefully rather than leading to triple faults if it remains unimplemented. In addition, the SHUTDOWN_soft_reset reason is used for kexec, to reset Xen shared pages and other enlightenments and leave a clean slate for the new kernel without the hypervisor helpfully writing information at unexpected addresses. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Ditch sched_op_compat which was never available for HVM guests, Add SCHEDOP_soft_reset] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	bedcc13924	i386/xen: implement HYPERVISOR_xen_version This is just meant to serve as an example on how we can implement hypercalls. xen_version specifically since Qemu does all kind of feature controllability. So handling that here seems appropriate. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Implement kvm_gva_rw() safely] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	55a3f666b4	i386/xen: handle guest hypercalls This means handling the new exit reason for Xen but still crashing on purpose. As we implement each of the hypercalls we will then return the right return code. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Add CPL to hypercall tracing, disallow hypercalls from CPL > 0] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	5e691a955a	i386/kvm: Set Xen vCPU ID in KVM There are (at least) three different vCPU ID number spaces. One is the internal KVM vCPU index, based purely on which vCPU was chronologically created in the kernel first. If userspace threads are all spawned and create their KVM vCPUs in essentially random order, then the KVM indices are basically random too. The second number space is the APIC ID space, which is consistent and useful for referencing vCPUs. MSIs will specify the target vCPU using the APIC ID, for example, and the KVM Xen APIs also take an APIC ID from userspace whenever a vCPU needs to be specified (as opposed to just using the appropriate vCPU fd). The third number space is not normally relevant to the kernel, and is the ACPI/MADT/Xen CPU number which corresponds to cs->cpu_index. But Xen timer hypercalls use it, and Xen timer hypercalls really want to be accelerated in the kernel rather than handled in userspace, so the kernel needs to be told. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	f66b8a83c5	i386/kvm: handle Xen HVM cpuid leaves Introduce support for emulating CPUID for Xen HVM guests. It doesn't make sense to advertise the KVM leaves to a Xen guest, so do Xen unconditionally when the xen-version machine property is set. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Obtain xen_version from KVM property, make it automatic] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	61491cf441	i386/kvm: Add xen-version KVM accelerator property and init KVM Xen support This just initializes the basic Xen support in KVM for now. Only permitted on TYPE_PC_MACHINE because that's where the sysbus devices for Xen heap overlay, event channel, grant tables and other stuff will exist. There's no point having the basic hypercall support if nothing else works. Provide sysemu/kvm_xen.h and a kvm_xen_get_caps() which will be used later by support devices. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Paolo Bonzini	3023c9b4d1	target/i386: KVM: allow fast string operations if host supports them These are just a flag that documents the performance characteristic of an instruction; it needs no hypervisor support. So include them even if KVM does not show them. In particular, FZRM/FSRS/FSRC have only been added very recently, but they are available on Sapphire Rapids processors. Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-02-27 18:44:57 +01:00
Markus Armbruster	e2c1c34f13	include/block: Untangle inclusion loops We have two inclusion loops: block/block.h -> block/block-global-state.h -> block/block-common.h -> block/blockjob.h -> block/block.h block/block.h -> block/block-io.h -> block/block-common.h -> block/blockjob.h -> block/block.h I believe these go back to Emanuele's reorganization of the block API, merged a few months ago in commit `d7e2fe4aac`. Fortunately, breaking them is merely a matter of deleting unnecessary includes from headers, and adding them back in places where they are now missing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221221133551.3967339-2-armbru@redhat.com>	2023-01-20 07:24:28 +01:00
Markus Armbruster	d1c81c3496	qapi: Use returned bool to check for failure (again) Commit `012d4c96e2` changed the visitor functions taking Error ** to return bool instead of void, and the commits following it used the new return value to simplify error checking. Since then a few more uses in need of the same treatment crept in. Do that. All pretty mechanical except for * balloon_stats_get_all() This is basically the same transformation commit `012d4c96e2` applied to the virtual walk example in include/qapi/visitor.h. * set_max_queue_size() Additionally replace "goto end of function" by return. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221121085054.683122-10-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2022-12-14 16:19:35 +01:00
Zeng Guang	19e2a9fb9d	target/i386: Set maximum APIC ID to KVM prior to vCPU creation Specify maximum possible APIC ID assigned for current VM session to KVM prior to the creation of vCPUs. By this setting, KVM can set up VM-scoped data structure indexed by the APIC ID, e.g. Posted-Interrupt Descriptor pointer table to support Intel IPI virtualization, with the most optimal memory footprint. It can be achieved by calling KVM_ENABLE_CAP for KVM_CAP_MAX_VCPU_ID capability once KVM has enabled it. Ignoring the return error if KVM doesn't support this capability yet. Signed-off-by: Zeng Guang <guang.zeng@intel.com> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220825025246.26618-1-guang.zeng@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-31 09:46:34 +01:00
Markus Armbruster	0a553c12c7	Drop useless casts from g_malloc() & friends to pointer These memory allocation functions return void *, and casting to another pointer type is useless clutter. Drop these casts. If you really want another pointer type, consider g_new(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20220923120025.448759-3-armbru@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2022-10-22 23:15:40 +02:00
Maciej S. Szmigiero	ec19444a53	hyperv: fix SynIC SINT assertion failure on guest reset Resetting a guest that has Hyper-V VMBus support enabled triggers a QEMU assertion failure: hw/hyperv/hyperv.c:131: synic_reset: Assertion `QLIST_EMPTY(&synic->sint_routes)' failed. This happens both on normal guest reboot or when using "system_reset" HMP command. The failing assertion was introduced by commit `64ddecc88b` ("hyperv: SControl is optional to enable SynIc") to catch dangling SINT routes on SynIC reset. The root cause of this problem is that the SynIC itself is reset before devices using SINT routes have chance to clean up these routes. Since there seems to be no existing mechanism to force reset callbacks (or methods) to be executed in specific order let's use a similar method that is already used to reset another interrupt controller (APIC) after devices have been reset - by invoking the SynIC reset from the machine reset handler via a new x86_cpu_after_reset() function co-located with the existing x86_cpu_reset() in target/i386/cpu.c. Opportunistically move the APIC reset handler there, too. Fixes: `64ddecc88b` ("hyperv: SControl is optional to enable SynIc") # exposed the bug Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <cb57cee2e29b20d06f81dce054cbcea8b5d497e8.1664552976.git.maciej.szmigiero@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-18 13:58:04 +02:00
Alexander Graf	37656470f6	KVM: x86: Implement MSR_CORE_THREAD_COUNT MSR The MSR_CORE_THREAD_COUNT MSR describes CPU package topology, such as number of threads and cores for a given package. This is information that QEMU has readily available and can provide through the new user space MSR deflection interface. This patch propagates the existing hvf logic from patch `027ac0cb51` ("target/i386/hvf: add rdmsr 35H MSR_CORE_THREAD_COUNT") to KVM. Signed-off-by: Alexander Graf <agraf@csgraf.de> Message-Id: <20221004225643.65036-4-agraf@csgraf.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-11 09:36:01 +02:00
Alexander Graf	860054d8ce	i386: kvm: Add support for MSR filtering KVM has grown support to deflect arbitrary MSRs to user space since Linux 5.10. For now we don't expect to make a lot of use of this feature, so let's expose it the easiest way possible: With up to 16 individually maskable MSRs. This patch adds a kvm_filter_msr() function that other code can call to install a hook on KVM MSR reads or writes. Signed-off-by: Alexander Graf <agraf@csgraf.de> Message-Id: <20221004225643.65036-3-agraf@csgraf.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-11 09:36:01 +02:00
Chenyi Qiang	e2e69f6bb9	i386: add notify VM exit support There are cases that malicious virtual machine can cause CPU stuck (due to event windows don't open up), e.g., infinite loop in microcode when nested #AC (CVE-2015-5307). No event window means no event (NMI, SMI and IRQ) can be delivered. It leads the CPU to be unavailable to host or other VMs. Notify VM exit is introduced to mitigate such kind of attacks, which will generate a VM exit if no event window occurs in VM non-root mode for a specified amount of time (notify window). A new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT is exposed to user space so that the user can query the capability and set the expected notify window when creating VMs. The format of the argument when enabling this capability is as follows: Bit 63:32 - notify window specified in qemu command Bit 31:0 - some flags (e.g. KVM_X86_NOTIFY_VMEXIT_ENABLED is set to enable the feature.) Users can configure the feature by a new (x86 only) accel property: qemu -accel kvm,notify-vmexit=run\|internal-error\|disable,notify-window=n The default option of notify-vmexit is run, which will enable the capability and do nothing if the exit happens. The internal-error option raises a KVM internal error if it happens. The disable option does not enable the capability. The default value of notify-window is 0. It is valid only when notify-vmexit is not disabled. The valid range of notify-window is non-negative. It is even safe to set it to zero since there's an internal hardware threshold to be added to ensure no false positive. Because a notify VM exit may happen with VM_CONTEXT_INVALID set in exit qualification (no cases are anticipated that would set this bit), which means VM context is corrupted. It would be reflected in the flags of KVM_EXIT_NOTIFY exit. If KVM_NOTIFY_CONTEXT_INVALID bit is set, raise a KVM internal error unconditionally. Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20220929072014.20705-5-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-11 09:36:00 +02:00
Paolo Bonzini	3dba0a335c	kvm: allow target-specific accelerator properties Several hypervisor capabilities in KVM are target-specific. When exposed to QEMU users as accelerator properties (i.e. -accel kvm,prop=value), they should not be available for all targets. Add a hook for targets to add their own properties to -accel kvm, for now no such property is defined. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220929072014.20705-3-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-10 09:23:16 +02:00
Chenyi Qiang	12f89a39cf	i386: kvm: extend kvm_{get, put}_vcpu_events to support pending triple fault For the direct triple faults, i.e. hardware detected and KVM morphed to VM-Exit, KVM will never lose them. But for triple faults sythesized by KVM, e.g. the RSM path, if KVM exits to userspace before the request is serviced, userspace could migrate the VM and lose the triple fault. A new flag KVM_VCPUEVENT_VALID_TRIPLE_FAULT is defined to signal that the event.triple_fault_pending field contains a valid state if the KVM_CAP_X86_TRIPLE_FAULT_EVENT capability is enabled. Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20220929072014.20705-2-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-10 09:23:16 +02:00
Stefan Hajnoczi	fafd35a6da	Pull request trivial patches branch 20220930-v2 -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmM7XoISHGxhdXJlbnRA dml2aWVyLmV1AAoJEPMMOL0/L748D/0QAKbYtTWjhFPeapjZVoTv13YrTvczWrcF omL6IZivVq0t7hun4iem0DwmvXJELMGexEOTvEJOzM19IIlvvwvOsI8xnxpcMnEY 6GKVbs53Ba0bg2yh7Dll2W9jkou9eX27DwUHMVF8KX7qqsbU+WyD/vdGZitgGt+T 8yna7kzVvNVsdB3+DbIatI5RzzHeu4OqeuH/WCtAyzCaLB64UYTcHprskxIp4+wp dR+EUSoDEr9Qx4PC+uVEsTFK1zZjyAYNoNIkh6fhlkRvDJ1uA75m3EJ57P8xPPqe VbVkPMKi0d4c52m6XvLsQhyYryLx/qLLUAkJWVpY66aHcapYbZAEAfZmNGTQLrOJ qIOJzIkOdU6l3pRgXVdVCgkHRc2HETwET2LyVbNkUz/vBlW2wOZQbZFbezComael bQ/gNBYqP+eOGnZzeWbKBGHr/9QDBClNufidIMC+sOiUw0iSifzjkFwvH7IElx6K EQCOSV6pOhKVlinTpmBbk1XD3xDkQ7ZidiLT9g+P1c8dExrXBhWOnfUHueISb8+s KKMozuxQ/6/3c/DP5hwI9cKPEWEbqJfq1kMuxIvEivKGwUIqX2yq4VJ+hSlYJ+CW nGjXZldtf4KwH+cTsxyPmdZRR5Q7+ODr5Xo7GNvEKBuDsHs7uUl1c3vvOykQgje9 +dyJR6TfbQWn =aK29 -----END PGP SIGNATURE----- Merge tag 'trivial-branch-for-7.2-pull-request' of https://gitlab.com/laurent_vivier/qemu into staging Pull request trivial patches branch 20220930-v2 # -----BEGIN PGP SIGNATURE----- # # iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmM7XoISHGxhdXJlbnRA # dml2aWVyLmV1AAoJEPMMOL0/L748D/0QAKbYtTWjhFPeapjZVoTv13YrTvczWrcF # omL6IZivVq0t7hun4iem0DwmvXJELMGexEOTvEJOzM19IIlvvwvOsI8xnxpcMnEY # 6GKVbs53Ba0bg2yh7Dll2W9jkou9eX27DwUHMVF8KX7qqsbU+WyD/vdGZitgGt+T # 8yna7kzVvNVsdB3+DbIatI5RzzHeu4OqeuH/WCtAyzCaLB64UYTcHprskxIp4+wp # dR+EUSoDEr9Qx4PC+uVEsTFK1zZjyAYNoNIkh6fhlkRvDJ1uA75m3EJ57P8xPPqe # VbVkPMKi0d4c52m6XvLsQhyYryLx/qLLUAkJWVpY66aHcapYbZAEAfZmNGTQLrOJ # qIOJzIkOdU6l3pRgXVdVCgkHRc2HETwET2LyVbNkUz/vBlW2wOZQbZFbezComael # bQ/gNBYqP+eOGnZzeWbKBGHr/9QDBClNufidIMC+sOiUw0iSifzjkFwvH7IElx6K # EQCOSV6pOhKVlinTpmBbk1XD3xDkQ7ZidiLT9g+P1c8dExrXBhWOnfUHueISb8+s # KKMozuxQ/6/3c/DP5hwI9cKPEWEbqJfq1kMuxIvEivKGwUIqX2yq4VJ+hSlYJ+CW # nGjXZldtf4KwH+cTsxyPmdZRR5Q7+ODr5Xo7GNvEKBuDsHs7uUl1c3vvOykQgje9 # +dyJR6TfbQWn # =aK29 # -----END PGP SIGNATURE----- # gpg: Signature made Mon 03 Oct 2022 18:13:22 EDT # gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C # gpg: issuer "laurent@vivier.eu" # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full] # gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full] # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full] # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * tag 'trivial-branch-for-7.2-pull-request' of https://gitlab.com/laurent_vivier/qemu: docs: Update TPM documentation for usage of a TPM 2 Use g_new() & friends where that makes obvious sense Drop superfluous conditionals around g_free() block/qcow2-bitmap: Add missing cast to silent GCC error checkpatch: ignore target/hexagon/imported/* files mem/cxl_type3: fix GPF DVSEC .gitignore: add .cache/ to .gitignore hw/virtio/vhost-shadow-virtqueue: Silence GCC error "maybe-uninitialized" Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-10-04 14:04:18 -04:00
Markus Armbruster	76eb88b12b	Drop superfluous conditionals around g_free() There is no need to guard g_free(P) with if (P): g_free(NULL) is safe. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220923090428.93529-1-armbru@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2022-10-04 00:10:11 +02:00
Ray Zhang	c4ef867f29	target/i386/kvm: fix kvmclock_current_nsec: Assertion `time.tsc_timestamp <= migration_tsc' failed New KVM_CLOCK flags were added in the kernel.(c68dc1b577eabd5605c6c7c08f3e07ae18d30d5d) ``` + #define KVM_CLOCK_VALID_FLAGS \ + (KVM_CLOCK_TSC_STABLE \| KVM_CLOCK_REALTIME \| KVM_CLOCK_HOST_TSC) case KVM_CAP_ADJUST_CLOCK: - r = KVM_CLOCK_TSC_STABLE; + r = KVM_CLOCK_VALID_FLAGS; ``` kvm_has_adjust_clock_stable needs to handle additional flags, so that s->clock_is_reliable can be true and kvmclock_current_nsec doesn't need to be called. Signed-off-by: Ray Zhang <zhanglei002@gmail.com> Message-Id: <20220922100523.2362205-1-zhanglei002@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-01 21:16:36 +02:00
Vitaly Kuznetsov	45ed68a1a3	i386: do kvm_put_msr_feature_control() first thing when vCPU is reset kvm_put_sregs2() fails to reset 'locked' CR4/CR0 bits upon vCPU reset when it is in VMX root operation. Do kvm_put_msr_feature_control() before kvm_put_sregs2() to (possibly) kick vCPU out of VMX root operation. It also seems logical to do kvm_put_msr_feature_control() before kvm_put_nested_state() and not after it, especially when 'real' nested state is set. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220818150113.479917-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-01 07:42:37 +02:00
Vitaly Kuznetsov	3cafdb6750	i386: reset KVM nested state upon CPU reset Make sure env->nested_state is cleaned up when a vCPU is reset, it may be stale after an incoming migration, kvm_arch_put_registers() may end up failing or putting vCPU in a weird state. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220818150113.479917-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-01 07:42:37 +02:00
Vitaly Kuznetsov	3aae0854b2	i386: Hyper-V Direct TLB flush hypercall Hyper-V TLFS allows for L0 and L1 hypervisors to collaborate on L2's TLB flush hypercalls handling. With the correct setup, L2's TLB flush hypercalls can be handled by L0 directly, without the need to exit to L1. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-6-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	aa6bb5fad5	i386: Hyper-V Support extended GVA ranges for TLB flush hypercalls KVM kind of supported "extended GVA ranges" (up to 4095 additional GFNs per hypercall) since the implementation of Hyper-V PV TLB flush feature (Linux-4.18) as regardless of the request, full TLB flush was always performed. "Extended GVA ranges for TLB flush hypercalls" feature bit wasn't exposed then. Now, as KVM gains support for fine-grained TLB flush handling, exposing this feature starts making sense. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	9411e8b6fa	i386: Hyper-V XMM fast hypercall input feature Hyper-V specification allows to pass parameters for certain hypercalls using XMM registers ("XMM Fast Hypercall Input"). When the feature is in use, it allows for faster hypercalls processing as KVM can avoid reading guest's memory. KVM supports the feature since v5.14. Rename HV_HYPERCALL_{PARAMS_XMM_AVAILABLE -> XMM_INPUT_AVAILABLE} to comply with KVM. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-4-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	869840d26c	i386: Hyper-V Enlightened MSR bitmap feature The newly introduced enlightenment allow L0 (KVM) and L1 (Hyper-V) hypervisors to collaborate to avoid unnecessary updates to L2 MSR-Bitmap upon vmexits. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	7110fe56c1	i386: Use hv_build_cpuid_leaf() for HV_CPUID_NESTED_FEATURES Previously, HV_CPUID_NESTED_FEATURES.EAX CPUID leaf was handled differently as it was only used to encode the supported eVMCS version range. In fact, there are also feature (e.g. Enlightened MSR-Bitmap) bits there. In preparation to adding these features, move HV_CPUID_NESTED_FEATURES leaf handling to hv_build_cpuid_leaf() and drop now-unneeded 'hyperv_nested'. No functional change intended. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Yang Weijiang	3a7a27cffb	target/i386: Remove LBREn bit check when access Arch LBR MSRs Live migration can happen when Arch LBR LBREn bit is cleared, e.g., when migration happens after guest entered SMM mode. In this case, we still need to migrate Arch LBR MSRs. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220517155024.33270-1-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-23 10:56:01 +02:00
Richard Henderson	eec398119f	virtio,pc,pci: fixes,cleanups,features most of CXL support fixes, cleanups all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmKCuLIPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpdDUH/12SmWaAo+0+SdIHgWFFxsmg3t/EdcO38fgi MV+GpYdbp6TlU3jdQhrMZYmFdkVVydBdxk93ujCLbFS0ixTsKj31j0IbZMfdcGgv SLqnV+E3JdHqnGP39q9a9rdwYWyqhkgHoldxilIFW76ngOSapaZVvnwnOMAMkf77 1LieL4/Xq7N9Ho86Zrs3IczQcf0czdJRDaFaSIu8GaHl8ELyuPhlSm6CSqqrEEWR PA/COQsLDbLOMxbfCi5v88r5aaxmGNZcGbXQbiH9qVHw65nlHyLH9UkNTdJn1du1 f2GYwwa7eekfw/LCvvVwxO1znJrj02sfFai7aAtQYbXPvjvQiqA= =xdSk -----END PGP SIGNATURE----- Merge tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging virtio,pc,pci: fixes,cleanups,features most of CXL support fixes, cleanups all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmKCuLIPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpdDUH/12SmWaAo+0+SdIHgWFFxsmg3t/EdcO38fgi # MV+GpYdbp6TlU3jdQhrMZYmFdkVVydBdxk93ujCLbFS0ixTsKj31j0IbZMfdcGgv # SLqnV+E3JdHqnGP39q9a9rdwYWyqhkgHoldxilIFW76ngOSapaZVvnwnOMAMkf77 # 1LieL4/Xq7N9Ho86Zrs3IczQcf0czdJRDaFaSIu8GaHl8ELyuPhlSm6CSqqrEEWR # PA/COQsLDbLOMxbfCi5v88r5aaxmGNZcGbXQbiH9qVHw65nlHyLH9UkNTdJn1du1 # f2GYwwa7eekfw/LCvvVwxO1znJrj02sfFai7aAtQYbXPvjvQiqA= # =xdSk # -----END PGP SIGNATURE----- # gpg: Signature made Mon 16 May 2022 01:48:50 PM PDT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (86 commits) vhost-user-scsi: avoid unlink(NULL) with fd passing virtio-net: don't handle mq request in userspace handler for vhost-vdpa vhost-vdpa: change name and polarity for vhost_vdpa_one_time_request() vhost-vdpa: backend feature should set only once vhost-net: fix improper cleanup in vhost_net_start vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa virtio-net: setup vhost_dev and notifiers for cvq only when feature is negotiated hw/i386/amd_iommu: Fix IOMMU event log encoding errors hw/i386: Make pic a property of common x86 base machine type hw/i386: Make pit a property of common x86 base machine type include/hw/pci/pcie_host: Correct PCIE_MMCFG_SIZE_MAX include/hw/pci/pcie_host: Correct PCIE_MMCFG_BUS_MASK docs/vhost-user: Clarifications for VHOST_USER_ADD/REM_MEM_REG vhost-user: more master/slave things virtio: add vhost support for virtio devices virtio: drop name parameter for virtio_init() virtio/vhost-user: dynamically assign VhostUserHostNotifiers hw/virtio/vhost-user: don't suppress F_CONFIG when supported include/hw: start documenting the vhost API ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2022-05-16 16:31:01 -07:00
David Woodhouse	dc89f32d92	target/i386: Fix sanity check on max APIC ID / X2APIC enablement The check on x86ms->apic_id_limit in pc_machine_done() had two problems. Firstly, we need KVM to support the X2APIC API in order to allow IRQ delivery to APICs >= 255. So we need to call/check kvm_enable_x2apic(), which was done elsewhere in some cases but not all. Secondly, microvm needs the same check. So move it from pc_machine_done() to x86_cpus_init() where it will work for both. The check in kvm_cpu_instance_init() is now redundant and can be dropped. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Acked-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20220314142544.150555-1-dwmw2@infradead.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-16 04:38:39 -04:00
Yang Weijiang	12703d4e75	target/i386: Add MSR access interface for Arch LBR In the first generation of Arch LBR, the max support Arch LBR depth is 32, both host and guest use the value to set depth MSR. This can simplify the implementation of patch given the side-effect of mismatch of host/guest depth MSR: XRSTORS will reset all recording MSRs to 0s if the saved depth mismatches MSR_ARCH_LBR_DEPTH. In most of the cases Arch LBR is not in active status, so check the control bit before save/restore the big chunck of Arch LBR MSRs. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220215195258.29149-7-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-14 12:32:41 +02:00
Yang Weijiang	5a778a5f82	target/i386: Add kvm_get_one_msr helper When try to get one msr from KVM, I found there's no such kind of existing interface while kvm_put_one_msr() is there. So here comes the patch. It'll remove redundant preparation code before finally call KVM_GET_MSRS IOCTL. No functional change intended. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220215195258.29149-4-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-14 12:32:41 +02:00
Jon Doron	d8701185f4	hw: hyperv: Initial commit for Synthetic Debugging device Signed-off-by: Jon Doron <arilou@gmail.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220216102500.692781-5-arilou@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:56 +02:00
Jon Doron	73d2407407	hyperv: Add support to process syndbg commands SynDbg commands can come from two different flows: 1. Hypercalls, in this mode the data being sent is fully encapsulated network packets. 2. SynDbg specific MSRs, in this mode only the data that needs to be transfered is passed. Signed-off-by: Jon Doron <arilou@gmail.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220216102500.692781-4-arilou@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:56 +02:00
Jon Doron	ccbdf5e81b	hyperv: Add definitions for syndbg Add all required definitions for hyperv synthetic debugger interface. Signed-off-by: Jon Doron <arilou@gmail.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220216102500.692781-3-arilou@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:56 +02:00
Marc-André Lureau	0f9668e0c1	Remove qemu-common.h include from most units Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:55 +02:00
Paolo Bonzini	58f7db26f2	KVM: x86: workaround invalid CPUID[0xD,9] info on some AMD processors Some AMD processors expose the PKRU extended save state even if they do not have the related PKU feature in CPUID. Worse, when they do they report a size of 64, whereas the expected size of the PKRU extended save state is 8, therefore the esa->size == eax assertion does not hold. The state is already ignored by KVM_GET_SUPPORTED_CPUID because it was not enabled in the host XCR0. However, QEMU kvm_cpu_xsave_init() runs before QEMU invokes arch_prctl() to enable dynamically-enabled save states such as XTILEDATA, and KVM_GET_SUPPORTED_CPUID hides save states that have yet to be enabled. Therefore, kvm_cpu_xsave_init() needs to consult the host CPUID instead of KVM_GET_SUPPORTED_CPUID, and dies with an assertion failure. When setting up the ExtSaveArea array to match the host, ignore features that KVM does not report as supported. This will cause QEMU to skip the incorrect CPUID leaf instead of tripping the assertion. Closes: https://gitlab.com/qemu-project/qemu/-/issues/916 Reported-by: Daniel P. Berrangé <berrange@redhat.com> Analyzed-by: Yang Zhong <yang.zhong@intel.com> Reported-by: Peter Krempa <pkrempa@redhat.com> Tested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-23 14:13:58 +01:00
luofei	cb48748af7	i386: Set MCG_STATUS_RIPV bit for mce SRAR error In the physical machine environment, when a SRAR error occurs, the IA32_MCG_STATUS RIPV bit is set, but qemu does not set this bit. When qemu injects an SRAR error into virtual machine, the virtual machine kernel just call do_machine_check() to kill the current task, but not call memory_failure() to isolate the faulty page, which will cause the faulty page to be allocated and used repeatedly. If used by the virtual machine kernel, it will cause the virtual machine to crash Signed-off-by: luofei <luofei@unicloud.com> Message-Id: <20220120084634.131450-1-luofei@unicloud.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-23 12:22:25 +01:00
Philippe Mathieu-Daudé	dcebbb65b8	target/i386/kvm: Free xsave_buf when destroying vCPU Fix vCPU hot-unplug related leak reported by Valgrind: ==132362== 4,096 bytes in 1 blocks are definitely lost in loss record 8,440 of 8,549 ==132362== at 0x4C3B15F: memalign (vg_replace_malloc.c:1265) ==132362== by 0x4C3B288: posix_memalign (vg_replace_malloc.c:1429) ==132362== by 0xB41195: qemu_try_memalign (memalign.c:53) ==132362== by 0xB41204: qemu_memalign (memalign.c:73) ==132362== by 0x7131CB: kvm_init_xsave (kvm.c:1601) ==132362== by 0x7148ED: kvm_arch_init_vcpu (kvm.c:2031) ==132362== by 0x91D224: kvm_init_vcpu (kvm-all.c:516) ==132362== by 0x9242C9: kvm_vcpu_thread_fn (kvm-accel-ops.c:40) ==132362== by 0xB2EB26: qemu_thread_start (qemu-thread-posix.c:556) ==132362== by 0x7EB2159: start_thread (in /usr/lib64/libpthread-2.28.so) ==132362== by 0x9D45DD2: clone (in /usr/lib64/libc-2.28.so) Reported-by: Mark Kanda <mark.kanda@oracle.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Mark Kanda <mark.kanda@oracle.com> Message-Id: <20220322120522.26200-1-philippe.mathieu.daude@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-23 12:22:25 +01:00
Paolo Bonzini	3ec5ad4008	target/i386: kvm: do not access uninitialized variable on older kernels KVM support for AMX includes a new system attribute, KVM_X86_XCOMP_GUEST_SUPP. Commit `19db68ca68` ("x86: Grant AMX permission for guest", 2022-03-15) however did not fully consider the behavior on older kernels. First, it warns too aggressively. Second, it invokes the KVM_GET_DEVICE_ATTR ioctl unconditionally and then uses the "bitmask" variable, which remains uninitialized if the ioctl fails. Third, kvm_ioctl returns -errno rather than -1 on errors. While at it, explain why the ioctl is needed and KVM_GET_SUPPORTED_CPUID is not enough. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-20 20:38:52 +01:00
Zeng Guang	cdec2b753b	x86: Support XFD and AMX xsave data migration XFD(eXtended Feature Disable) allows to enable a feature on xsave state while preventing specific user threads from using the feature. Support save and restore XFD MSRs if CPUID.D.1.EAX[4] enumerate to be valid. Likewise migrate the MSRs and related xsave state necessarily. Signed-off-by: Zeng Guang <guang.zeng@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-8-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:50:50 +01:00
Jing Liu	e56dd3c70a	x86: add support for KVM_CAP_XSAVE2 and AMX state migration When dynamic xfeatures (e.g. AMX) are used by the guest, the xsave area would be larger than 4KB. KVM_GET_XSAVE2 and KVM_SET_XSAVE under KVM_CAP_XSAVE2 works with a xsave buffer larger than 4KB. Always use the new ioctls under KVM_CAP_XSAVE2 when KVM supports it. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Zeng Guang <guang.zeng@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-7-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:50:50 +01:00
Jing Liu	f21a48171c	x86: Add AMX CPUIDs enumeration Add AMX primary feature bits XFD and AMX_TILE to enumerate the CPU's AMX capability. Meanwhile, add AMX TILE and TMUL CPUID leaf and subleaves which exist when AMX TILE is present to provide the maximum capability of TILE and TMUL. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-6-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:50:50 +01:00
Yang Zhong	19db68ca68	x86: Grant AMX permission for guest Kernel allocates 4K xstate buffer by default. For XSAVE features which require large state component (e.g. AMX), Linux kernel dynamically expands the xstate buffer only after the process has acquired the necessary permissions. Those are called dynamically- enabled XSAVE features (or dynamic xfeatures). There are separate permissions for native tasks and guests. Qemu should request the guest permissions for dynamic xfeatures which will be exposed to the guest. This only needs to be done once before the first vcpu is created. KVM implemented one new ARCH_GET_XCOMP_SUPP system attribute API to get host side supported_xcr0 and Qemu can decide if it can request dynamically enabled XSAVE features permission. https://lore.kernel.org/all/20220126152210.3044876-1-pbonzini@redhat.com/ Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Message-Id: <20220217060434.52460-4-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:50:50 +01:00
Jing Liu	131266b756	x86: Fix the 64-byte boundary enumeration for extended state The extended state subleaves (EAX=0Dh, ECX=n, n>1).ECX[1] indicate whether the extended state component locates on the next 64-byte boundary following the preceding state component when the compacted format of an XSAVE area is used. Right now, they are all zero because no supported component needed the bit to be set, but the upcoming AMX feature will use it. Fix the subleaves value according to KVM's supported cpuid. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-2-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:50:50 +01:00
Longpeng(Mike)	def4c5570c	kvm/msi: do explicit commit when adding msi routes We invoke the kvm_irqchip_commit_routes() for each addition to MSI route table, which is not efficient if we are adding lots of routes in some cases. This patch lets callers invoke the kvm_irqchip_commit_routes(), so the callers can decide how to optimize. [1] https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg00967.html Signed-off-by: Longpeng <longpeng2@huawei.com> Message-Id: <20220222141116.2091-3-longpeng2@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:26:20 +01:00
Peter Maydell	5df022cf2e	osdep: Move memalign-related functions to their own header Move the various memalign-related functions out of osdep.h and into their own header, which we include only where they are used. While we're doing this, add some brief documentation comments. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org	2022-03-07 13:16:49 +00:00
Paolo Bonzini	1520f8bb67	KVM: x86: ignore interrupt_bitmap field of KVM_GET/SET_SREGS This is unnecessary, because the interrupt would be retrieved and queued anyway by KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS respectively, and it makes the flow more similar to the one for KVM_GET/SET_SREGS2. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-01-12 14:09:06 +01:00
Maxim Levitsky	8f515d3869	KVM: use KVM_{GET\|SET}_SREGS2 when supported. This allows to make PDPTRs part of the migration stream and thus not reload them after migration which is against X86 spec. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20211101132300.192584-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-01-12 14:09:06 +01:00
Philippe Mathieu-Daudé	dc7d6cafce	target/i386/kvm: Replace use of __u32 type QEMU coding style mandates to not use Linux kernel internal types for scalars types. Replace __u32 by uint32_t. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211116193955.2793171-1-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-12-17 10:40:51 +01:00
Maxim Levitsky	cabf9862e4	KVM: SVM: add migration support for nested TSC scaling Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20211101132300.192584-4-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-11-02 15:57:27 +01:00
Philippe Mathieu-Daudé	deae846f94	target/i386/sev: Declare system-specific functions in 'sev.h' "sysemu/sev.h" is only used from x86-specific files. Let's move it to include/hw/i386, and merge it with target/i386/sev.h. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211007161716.453984-16-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-13 10:47:49 +02:00
Philippe Mathieu-Daudé	93777de365	target/i386/sev: Rename sev_i386.h -> sev.h SEV is a x86 specific feature, and the "sev_i386.h" header is already in target/i386/. Rename it as "sev.h" to simplify. Patch created mechanically using: $ git mv target/i386/sev_i386.h target/i386/sev.h $ sed -i s/sev_i386.h/sev.h/ $(git grep -l sev_i386.h) Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20211007161716.453984-15-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-13 10:47:49 +02:00
Philippe Mathieu-Daudé	773ab6cb16	target/i386/kvm: Restrict SEV stubs to x86 architecture SEV is x86-specific, no need to add its stub to other architectures. Move the stub file to target/i386/kvm/. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211007161716.453984-5-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-13 10:47:49 +02:00
Philippe Mathieu-Daudé	02310f3a91	target/i386/kvm: Introduce i386_softmmu_kvm Meson source set Introduce the i386_softmmu_kvm Meson source set to be able to add features dependent on CONFIG_KVM. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211007161716.453984-4-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-13 10:47:49 +02:00
Vitaly Kuznetsov	af7228b88d	i386: Make Hyper-V version id configurable Currently, we hardcode Hyper-V version id (CPUID 0x40000002) to WS2008R2 and it is known that certain tools in Windows check this. It seems useful to provide some flexibility by making it possible to change this info at will. CPUID information is defined in TLFS as: EAX: Build Number EBX Bits 31-16: Major Version Bits 15-0: Minor Version ECX Service Pack EDX Bits 31-24: Service Branch Bits 23-0: Service Number Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-8-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 19:04:45 +02:00
Vitaly Kuznetsov	e1f9a8e8c9	i386: Implement pseudo 'hv-avic' ('hv-apicv') enlightenment The enlightenment allows to use Hyper-V SynIC with hardware APICv/AVIC enabled. Normally, Hyper-V SynIC disables these hardware features and suggests the guest to use paravirtualized AutoEOI feature. Linux-4.15 gains support for conditional APICv/AVIC disablement, the feature stays on until the guest tries to use AutoEOI feature with SynIC. With 'HV_DEPRECATING_AEOI_RECOMMENDED' bit exposed, modern enough Windows/ Hyper-V versions should follow the recommendation and not use the (unwanted) feature. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-7-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 19:04:45 +02:00
Vitaly Kuznetsov	050716292a	i386: Move HV_APIC_ACCESS_RECOMMENDED bit setting to hyperv_fill_cpuids() In preparation to enabling Hyper-V + APICv/AVIC move HV_APIC_ACCESS_RECOMMENDED setting out of kvm_hyperv_properties[]: the 'real' feature bit for the vAPIC features is HV_APIC_ACCESS_AVAILABLE, HV_APIC_ACCESS_RECOMMENDED is a recommendation to use the feature which we may not always want to give. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-6-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 19:04:45 +02:00
Vitaly Kuznetsov	70367f0917	i386: Support KVM_CAP_HYPERV_ENFORCE_CPUID By default, KVM allows the guest to use all currently supported Hyper-V enlightenments when Hyper-V CPUID interface was exposed, regardless of if some features were not announced in guest visible CPUIDs. hv-enforce-cpuid feature alters this behavior and only allows the guest to use exposed Hyper-V enlightenments. The feature is supported by Linux >= 5.14 and is not enabled by default in QEMU. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 19:04:45 +02:00
Vitaly Kuznetsov	988f7b8bfe	i386: Support KVM_CAP_ENFORCE_PV_FEATURE_CPUID By default, KVM allows the guest to use all currently supported PV features even when they were not announced in guest visible CPUIDs. Introduce a new "kvm-pv-enforce-cpuid" flag to limit the supported feature set to the exposed features. The feature is supported by Linux >= 5.10 and is not enabled by default in QEMU. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-4-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 19:04:45 +02:00
Peter Xu	142518bda5	memory: Name all the memory listeners Provide a name field for all the memory listeners. It can be used to identify which memory listener is which. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20210817013553.30584-2-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 15:30:24 +02:00
Sean Christopherson	b9edbadefb	i386: Propagate SGX CPUID sub-leafs to KVM The SGX sub-leafs are enumerated at CPUID 0x12. Indices 0 and 1 are always present when SGX is supported, and enumerate SGX features and capabilities. Indices >=2 are directly correlated with the platform's EPC sections. Because the number of EPC sections is dynamic and user defined, the number of SGX sub-leafs is "NULL" terminated. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-15-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 14:50:20 +02:00
Sean Christopherson	c22f546785	i386: kvm: Add support for exposing PROVISIONKEY to guest If the guest want to fully use SGX, the guest needs to be able to access provisioning key. Add a new KVM_CAP_SGX_ATTRIBUTE to KVM to support provisioning key to KVM guests. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-14-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 14:50:20 +02:00
Sean Christopherson	a04835414b	i386: Add feature control MSR dependency when SGX is enabled SGX adds multiple flags to FEATURE_CONTROL to enable SGX and Flexible Launch Control. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-12-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 14:50:20 +02:00
Sean Christopherson	db88806523	i386: Add get/set/migrate support for SGX_LEPUBKEYHASH MSRs On real hardware, on systems that supports SGX Launch Control, those MSRs are initialized to digest of Intel's signing key; on systems that don't support SGX Launch Control, those MSRs are not available but hardware always uses digest of Intel's signing key in EINIT. KVM advertises SGX LC via CPUID if and only if the MSRs are writable. Unconditionally initialize those MSRs to digest of Intel's signing key when CPU is realized and reset to reflect the fact. This avoids potential bug in case kvm_arch_put_registers() is called before kvm_arch_get_registers() is called, in which case guest's virtual SGX_LEPUBKEYHASH MSRs will be set to 0, although KVM initializes those to digest of Intel's signing key by default, since KVM allows those MSRs to be updated by Qemu to support live migration. Save/restore the SGX Launch Enclave Public Key Hash MSRs if SGX Launch Control (LC) is exposed to the guest. Likewise, migrate the MSRs if they are writable by the guest. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-11-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 14:50:20 +02:00
Markus Armbruster	436c831a28	migration: Unify failure check for migrate_add_blocker() Most callers check the return value. Some check whether it set an error. Functionally equivalent, but the former tends to be easier on the eyes, so do that everywhere. Prior art: commit `c6ecec43b2` "qemu-option: Check return value instead of @err where convenient". Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-10-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-26 17:15:28 +02:00
Markus Armbruster	a5c051b2cf	i386: Never free migration blocker objects instead of sometimes invtsc_mig_blocker has static storage duration. When a CPU with certain features is initialized, and invtsc_mig_blocker is still null, we add a migration blocker and store it in invtsc_mig_blocker. The object is freed when migrate_add_blocker() fails, leaving invtsc_mig_blocker dangling. It is not freed on later failures. Same for hv_passthrough_mig_blocker and hv_no_nonarch_cs_mig_blocker. All failures are actually fatal, so whether we free or not doesn't really matter, except as bad examples to be copied / imitated. Clean this up in a minimal way: never free these blocker objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-7-armbru@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-26 17:15:28 +02:00
Vitaly Kuznetsov	e4adb09f79	i386: assert 'cs->kvm_state' is not null Coverity reports potential NULL pointer dereference in get_supported_hv_cpuid_legacy() when 'cs->kvm_state' is NULL. While 'cs->kvm_state' can indeed be NULL in hv_cpuid_get_host(), kvm_hyperv_expand_features() makes sure that it only happens when KVM_CAP_SYS_HYPERV_CPUID is supported and KVM_CAP_SYS_HYPERV_CPUID implies KVM_CAP_HYPERV_CPUID so get_supported_hv_cpuid_legacy() is never really called. Add asserts to strengthen the protection against broken KVM behavior. Coverity: CID 1458243 Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210716115852.418293-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-07-29 10:15:51 +02:00
Claudio Fontana	5b8978d804	i386: do not call cpudef-only models functions for max, host, base Some cpu properties have to be set only for cpu models in builtin_x86_defs, registered with x86_register_cpu_model_type, and not for cpu models "base", "max", and the subclass "host". These properties are the ones set by function x86_cpu_apply_props, (also including kvm_default_props, tcg_default_props), and the "vendor" property for the KVM and HVF accelerators. After recent refactoring of cpu, which also affected these properties, they were instead set unconditionally for all x86 cpus. This has been detected as a bug with Nested on AMD with cpu "host", as svm was not turned on by default, due to the wrongful setting of kvm_default_props via x86_cpu_apply_props, which set svm to "off". Rectify the bug introduced in commit "i386: split cpu accelerators" and document the functions that are builtin_x86_defs-only. Signed-off-by: Claudio Fontana <cfontana@suse.de> Tested-by: Alexander Bulekov <alxndr@bu.edu> Fixes: `f5cc5a5c` ("i386: split cpu accelerators from cpu.c,"...) Resolves: https://gitlab.com/qemu-project/qemu/-/issues/477 Message-Id: <20210723112921.12637-1-cfontana@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-07-23 15:47:13 +02:00
Vitaly Kuznetsov	cce087f628	i386: Hyper-V SynIC requires POST_MESSAGES/SIGNAL_EVENTS privileges When Hyper-V SynIC is enabled, we may need to allow Windows guests to make hypercalls (POST_MESSAGES/SIGNAL_EVENTS). No issue is currently observed because KVM is very permissive, allowing these hypercalls regarding of guest visible CPUid bits. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-9-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-13 09:13:29 -04:00
Vitaly Kuznetsov	b26f68c36b	i386: HV_HYPERCALL_AVAILABLE privilege bit is always needed According to TLFS, Hyper-V guest is supposed to check HV_HYPERCALL_AVAILABLE privilege bit before accessing HV_X64_MSR_GUEST_OS_ID/HV_X64_MSR_HYPERCALL MSRs but at least some Windows versions ignore that. As KVM is very permissive and allows accessing these MSRs unconditionally, no issue is observed. We may, however, want to tighten the checks eventually. Conforming to the spec is probably also a good idea. Enable HV_HYPERCALL_AVAILABLE bit unconditionally. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-8-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-13 09:13:29 -04:00
Vitaly Kuznetsov	5ce48fa354	i386: kill off hv_cpuid_check_and_set() hv_cpuid_check_and_set() does too much: - Checks if the feature is supported by KVM; - Checks if all dependencies are enabled; - Sets the feature bit in cpu->hyperv_features for 'passthrough' mode. To reduce the complexity, move all the logic except for dependencies check out of it. Also, in 'passthrough' mode we don't really need to check dependencies because KVM is supposed to provide a consistent set anyway. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-7-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-13 09:13:29 -04:00
Vitaly Kuznetsov	071ce4b03b	i386: expand Hyper-V features during CPU feature expansion time To make Hyper-V features appear in e.g. QMP query-cpu-model-expansion we need to expand and set the corresponding CPUID leaves early. Modify x86_cpu_get_supported_feature_word() to call newly intoduced Hyper-V specific kvm_hv_get_supported_cpuid() instead of kvm_arch_get_supported_cpuid(). We can't use kvm_arch_get_supported_cpuid() as Hyper-V specific CPUID leaves intersect with KVM's. Note, early expansion will only happen when KVM supports system wide KVM_GET_SUPPORTED_HV_CPUID ioctl (KVM_CAP_SYS_HYPERV_CPUID). Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-6-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-13 09:13:29 -04:00
Vitaly Kuznetsov	d7652b772f	i386: make hyperv_expand_features() return bool Return 'false' when hyperv_expand_features() sets an error. No functional change intended. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-5-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-13 09:13:29 -04:00
Vitaly Kuznetsov	07454e2ea8	i386: hardcode supported eVMCS version to '1' Currently, the only eVMCS version, supported by KVM (and described in TLFS) is '1'. When Enlightened VMCS feature is enabled, QEMU takes the supported eVMCS version range (from KVM_CAP_HYPERV_ENLIGHTENED_VMCS enablement) and puts it to guest visible CPUIDs. When (and if) eVMCS ver.2 appears a problem on migration is expected: it doesn't seem to be possible to migrate from a host supporting eVMCS ver.2 to a host, which only support eVMCS ver.1. Hardcode eVMCS ver.1 as the result of 'hv-evmcs' enablement for now. Newer eVMCS versions will have to have their own enablement options (e.g. 'hv-evmcs=2'). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20210608120817.1325125-4-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-13 09:13:29 -04:00
David Edmondson	fea4500841	target/i386: Populate x86_ext_save_areas offsets using cpuid where possible Rather than relying on the X86XSaveArea structure definition, determine the offset of XSAVE state areas using CPUID leaf 0xd where possible (KVM and HVF). Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210705104632.2902400-8-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-07-06 08:33:48 +02:00
David Edmondson	c0198c5f87	target/i386: Pass buffer and length to XSAVE helper In preparation for removing assumptions about XSAVE area offsets, pass a buffer pointer and buffer length to the XSAVE helper functions. Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210705104632.2902400-5-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-07-06 07:54:53 +02:00
David Edmondson	436463b84b	target/i386: Consolidate the X86XSaveArea offset checks Rather than having similar but different checks in cpu.h and kvm.c, move them all to cpu.h. Message-Id: <20210705104632.2902400-3-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-07-06 07:54:53 +02:00
Paolo Bonzini	9ce8af4d92	target/i386: kvm: add support for TSC scaling Linux 5.14 will add support for nested TSC scaling. Add the corresponding feature in QEMU; to keep support for existing kernels, do not add it to any processor yet. The handling of the VMCS enumeration MSR is ugly; once we have more than one case, we may want to add a table to check VMX features against. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-25 10:53:46 +02:00
Chenyi Qiang	035d1ef265	i386: Add ratelimit for bus locks acquired in guest A bus lock is acquired through either split locked access to writeback (WB) memory or any locked access to non-WB memory. It is typically >1000 cycles slower than an atomic operation within a cache and can also disrupts performance on other cores. Virtual Machines can exploit bus locks to degrade the performance of system. To address this kind of performance DOS attack coming from the VMs, bus lock VM exit is introduced in KVM and it can report the bus locks detected in guest. If enabled in KVM, it would exit to the userspace to let the user enforce throttling policies once bus locks acquired in VMs. The availability of bus lock VM exit can be detected through the KVM_CAP_X86_BUS_LOCK_EXIT. The returned bitmap contains the potential policies supported by KVM. The field KVM_BUS_LOCK_DETECTION_EXIT in bitmap is the only supported strategy at present. It indicates that KVM will exit to userspace to handle the bus locks. This patch adds a ratelimit on the bus locks acquired in guest as a mitigation policy. Introduce a new field "bus_lock_ratelimit" to record the limited speed of bus locks in the target VM. The user can specify it through the "bus-lock-ratelimit" as a machine property. In current implementation, the default value of the speed is 0 per second, which means no restrictions on the bus locks. As for ratelimit on detected bus locks, simply set the ratelimit interval to 1s and restrict the quota of bus lock occurence to the value of "bus_lock_ratelimit". A potential alternative is to introduce the time slice as a property which can help the user achieve more precise control. The detail of bus lock VM exit can be found in spec: https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20210521043820.29678-1-chenyi.qiang@intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-06-17 14:11:06 -04:00
Claudio Fontana	662175b91f	i386: reorder call to cpu_exec_realizefn i386 realizefn code is sensitive to ordering, and recent commits aimed at refactoring it, splitting accelerator-specific code, broke assumptions which need to be fixed. We need to: * process hyper-v enlightements first, as they assume features not to be expanded * only then, expand features * after expanding features, attempt to check them and modify them in the accel-specific realizefn code called by cpu_exec_realizefn(). * after the framework has been called via cpu_exec_realizefn, the code can check for what has or hasn't been set by accel-specific code, or extend its results, ie: - check and evenually set code_urev default - modify cpu->mwait after potentially being set from host CPUID. - finally check for phys_bits assuming all user and accel-specific adjustments have already been taken into account. Fixes: `f5cc5a5c` ("i386: split cpu accelerators from cpu.c"...) Fixes: `30565f10` ("cpu: call AccelCPUClass::cpu_realizefn in"...) Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20210603123001.17843-2-cfontana@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-04 13:47:08 +02:00
Peter Maydell	8c345b3e6a	* Update the references to some doc files (use .rst instead of .txt) * Bump minimum versions of some requirements after removing CentOS 7 support -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmC3L1IRHHRodXRoQHJl ZGhhdC5jb20ACgkQLtnXdP5wLbU8wA//aRAIn0Qa3xoZdsT434P99GHauCZ5ePq3 meY4co69c+TfkQZ/b0xlyvT3+7bd9Ni92CQL7n/LtX5bs3pIhRjHNIsm50e4x6Da 0jb422cKWfRIltXlCfdL/dnFRtJluH83M0seRGMvmhveuWPZ19oSIasEiQjeg//e KMNK5z4HEnJ1czb3Bf8p38bmmY/O/QEAA5wAqk7iJkJHz6T/GlqImLYYwpFPlHj6 JQttm0aWsHrsqEXxnuV0/DT1yHyXDB6S4iuAvABZWhv/M/nCaXo0ib0gW5NPtRPo Yf3HO163F9/fewJCc1AUsBe1C/2UwmSWRhEtxpr9uuW2Mv9qEl3hkJwd4k6sEOvh U4i+GONC4eElPcmECKUfHA9EP+7faDs6xnM6Ev/PIEp+cPJ2QRfklZv4qpMUWYtb 3KkADchOyVZAsdB8cGmnznDEVmno1Dt0adVUq8CF6uW6MwD3pb4838arrfwIfwOp g4yTI1AQQykkxxOaR7GSNoxWRti7TH4fzLhx/xXDd9TKIOWuOepiyuhB7+Q48WAJ 6EW/JUIzOe7k3GsI0iBsk+y67ED2tpATiRKWWw4QS1BwIhNEMUqgZkucFfKE6Ze1 B1Xw+Di+1CzrLHV6lCjtdLVmqWqcGcO43HqhSK4sodE87BDGGwy9UxE6oBVt2C1x HPQ6/omDlFk= =RAkI -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/thuth-gitlab/tags/pull-request-2021-06-02' into staging * Update the references to some doc files (use .rst instead of .txt) * Bump minimum versions of some requirements after removing CentOS 7 support # gpg: Signature made Wed 02 Jun 2021 08:12:18 BST # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "thuth@redhat.com" # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full] # gpg: aka "Thomas Huth <thuth@redhat.com>" [full] # gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full] # gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown] # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * remotes/thuth-gitlab/tags/pull-request-2021-06-02: configure: bump min required CLang to 6.0 / XCode 10.0 configure: bump min required GCC to 7.5.0 configure: bump min required glib version to 2.56 tests/docker: drop CentOS 7 container tests/vm: convert centos VM recipe to CentOS 8 crypto: drop used conditional check crypto: bump min gnutls to 3.5.18, dropping RHEL-7 support crypto: bump min gcrypt to 1.8.0, dropping RHEL-7 support crypto: drop back compatibility typedefs for nettle crypto: bump min nettle to 3.4, dropping RHEL-7 support patchew: move quick build job from CentOS 7 to CentOS 8 container block/ssh: Bump minimum libssh version to 0.8.7 docs: fix references to docs/devel/s390-dasd-ipl.rst docs: fix references to docs/specs/tpm.rst docs: fix references to docs/devel/build-system.rst docs: fix references to docs/devel/atomics.rst docs: fix references to docs/devel/tracing.rst Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-06-02 17:08:11 +01:00
Stefano Garzarella	d0fb9657a3	docs: fix references to docs/devel/tracing.rst Commit `e50caf4a5c` ("tracing: convert documentation to rST") converted docs/devel/tracing.txt to docs/devel/tracing.rst. We still have several references to the old file, so let's fix them with the following command: sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt) Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210517151702.109066-2-sgarzare@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2021-06-02 06:51:09 +02:00
Vitaly Kuznetsov	5aa9ef5e4b	i386: use global kvm_state in hyperv_enabled() check There is no need to use vCPU-specific kvm state in hyperv_enabled() check and we need to do that when feature expansion happens early, before vCPU specific KVM state is created. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20210422161130.652779-15-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-05-31 15:53:03 -04:00
Vitaly Kuznetsov	2e905438cf	i386: prefer system KVM_GET_SUPPORTED_HV_CPUID ioctl over vCPU's one KVM_GET_SUPPORTED_HV_CPUID was made a system wide ioctl which can be called prior to creating vCPUs and we are going to use that to expand Hyper-V cpu features early. Use it when it is supported by KVM. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20210422161130.652779-14-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-05-31 15:53:03 -04:00
Vitaly Kuznetsov	05c900ce73	i386: adjust the expected KVM_GET_SUPPORTED_HV_CPUID array size SYNDBG leaves were recently (Linux-5.8) added to KVM but we haven't updated the expected size of KVM_GET_SUPPORTED_HV_CPUID output in KVM so we now make serveral tries before succeeding. Update the default. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20210422161130.652779-13-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-05-31 15:53:03 -04:00
Vitaly Kuznetsov	f4a62495be	i386: switch hyperv_expand_features() to using error_setg() Use standard error_setg() mechanism in hyperv_expand_features(). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20210422161130.652779-12-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-05-31 15:53:03 -04:00
Vitaly Kuznetsov	decb4f2013	i386: move eVMCS enablement to hyperv_init_vcpu() hyperv_expand_features() will be called before we create vCPU so evmcs enablement should go away. hyperv_init_vcpu() looks like the right place. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20210422161130.652779-11-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-05-31 15:53:03 -04:00

1 2 3 4

175 Commits