mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Joao Martins	c345104cd1	i386/xen: handle VCPUOP_register_vcpu_info Handle the hypercall to set a per vcpu info, and also wire up the default vcpu_info in the shared_info page for the first 32 vCPUs. To avoid deadlock within KVM a vCPU thread must set its own vcpu_info rather than it being set from the context in which the hypercall is invoked. Add the vcpu_info (and default) GPA to the vmstate_x86_cpu for migration, and restore it in kvm_arch_put_registers() appropriately. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	55a3f666b4	i386/xen: handle guest hypercalls This means handling the new exit reason for Xen but still crashing on purpose. As we implement each of the hypercalls we will then return the right return code. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Add CPL to hypercall tracing, disallow hypercalls from CPL > 0] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	5e691a955a	i386/kvm: Set Xen vCPU ID in KVM There are (at least) three different vCPU ID number spaces. One is the internal KVM vCPU index, based purely on which vCPU was chronologically created in the kernel first. If userspace threads are all spawned and create their KVM vCPUs in essentially random order, then the KVM indices are basically random too. The second number space is the APIC ID space, which is consistent and useful for referencing vCPUs. MSIs will specify the target vCPU using the APIC ID, for example, and the KVM Xen APIs also take an APIC ID from userspace whenever a vCPU needs to be specified (as opposed to just using the appropriate vCPU fd). The third number space is not normally relevant to the kernel, and is the ACPI/MADT/Xen CPU number which corresponds to cs->cpu_index. But Xen timer hypercalls use it, and Xen timer hypercalls really want to be accelerated in the kernel rather than handled in userspace, so the kernel needs to be told. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	f66b8a83c5	i386/kvm: handle Xen HVM cpuid leaves Introduce support for emulating CPUID for Xen HVM guests. It doesn't make sense to advertise the KVM leaves to a Xen guest, so do Xen unconditionally when the xen-version machine property is set. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Obtain xen_version from KVM property, make it automatic] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	61491cf441	i386/kvm: Add xen-version KVM accelerator property and init KVM Xen support This just initializes the basic Xen support in KVM for now. Only permitted on TYPE_PC_MACHINE because that's where the sysbus devices for Xen heap overlay, event channel, grant tables and other stuff will exist. There's no point having the basic hypercall support if nothing else works. Provide sysemu/kvm_xen.h and a kvm_xen_get_caps() which will be used later by support devices. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Paolo Bonzini	3023c9b4d1	target/i386: KVM: allow fast string operations if host supports them These are just a flag that documents the performance characteristic of an instruction; it needs no hypervisor support. So include them even if KVM does not show them. In particular, FZRM/FSRS/FSRC have only been added very recently, but they are available on Sapphire Rapids processors. Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-02-27 18:44:57 +01:00
Markus Armbruster	e2c1c34f13	include/block: Untangle inclusion loops We have two inclusion loops: block/block.h -> block/block-global-state.h -> block/block-common.h -> block/blockjob.h -> block/block.h block/block.h -> block/block-io.h -> block/block-common.h -> block/blockjob.h -> block/block.h I believe these go back to Emanuele's reorganization of the block API, merged a few months ago in commit `d7e2fe4aac`. Fortunately, breaking them is merely a matter of deleting unnecessary includes from headers, and adding them back in places where they are now missing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221221133551.3967339-2-armbru@redhat.com>	2023-01-20 07:24:28 +01:00
Markus Armbruster	d1c81c3496	qapi: Use returned bool to check for failure (again) Commit `012d4c96e2` changed the visitor functions taking Error ** to return bool instead of void, and the commits following it used the new return value to simplify error checking. Since then a few more uses in need of the same treatment crept in. Do that. All pretty mechanical except for * balloon_stats_get_all() This is basically the same transformation commit `012d4c96e2` applied to the virtual walk example in include/qapi/visitor.h. * set_max_queue_size() Additionally replace "goto end of function" by return. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221121085054.683122-10-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2022-12-14 16:19:35 +01:00
Zeng Guang	19e2a9fb9d	target/i386: Set maximum APIC ID to KVM prior to vCPU creation Specify maximum possible APIC ID assigned for current VM session to KVM prior to the creation of vCPUs. By this setting, KVM can set up VM-scoped data structure indexed by the APIC ID, e.g. Posted-Interrupt Descriptor pointer table to support Intel IPI virtualization, with the most optimal memory footprint. It can be achieved by calling KVM_ENABLE_CAP for KVM_CAP_MAX_VCPU_ID capability once KVM has enabled it. Ignoring the return error if KVM doesn't support this capability yet. Signed-off-by: Zeng Guang <guang.zeng@intel.com> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220825025246.26618-1-guang.zeng@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-31 09:46:34 +01:00
Markus Armbruster	0a553c12c7	Drop useless casts from g_malloc() & friends to pointer These memory allocation functions return void *, and casting to another pointer type is useless clutter. Drop these casts. If you really want another pointer type, consider g_new(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <20220923120025.448759-3-armbru@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2022-10-22 23:15:40 +02:00
Maciej S. Szmigiero	ec19444a53	hyperv: fix SynIC SINT assertion failure on guest reset Resetting a guest that has Hyper-V VMBus support enabled triggers a QEMU assertion failure: hw/hyperv/hyperv.c:131: synic_reset: Assertion `QLIST_EMPTY(&synic->sint_routes)' failed. This happens both on normal guest reboot or when using "system_reset" HMP command. The failing assertion was introduced by commit `64ddecc88b` ("hyperv: SControl is optional to enable SynIc") to catch dangling SINT routes on SynIC reset. The root cause of this problem is that the SynIC itself is reset before devices using SINT routes have chance to clean up these routes. Since there seems to be no existing mechanism to force reset callbacks (or methods) to be executed in specific order let's use a similar method that is already used to reset another interrupt controller (APIC) after devices have been reset - by invoking the SynIC reset from the machine reset handler via a new x86_cpu_after_reset() function co-located with the existing x86_cpu_reset() in target/i386/cpu.c. Opportunistically move the APIC reset handler there, too. Fixes: `64ddecc88b` ("hyperv: SControl is optional to enable SynIc") # exposed the bug Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <cb57cee2e29b20d06f81dce054cbcea8b5d497e8.1664552976.git.maciej.szmigiero@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-18 13:58:04 +02:00
Alexander Graf	37656470f6	KVM: x86: Implement MSR_CORE_THREAD_COUNT MSR The MSR_CORE_THREAD_COUNT MSR describes CPU package topology, such as number of threads and cores for a given package. This is information that QEMU has readily available and can provide through the new user space MSR deflection interface. This patch propagates the existing hvf logic from patch `027ac0cb51` ("target/i386/hvf: add rdmsr 35H MSR_CORE_THREAD_COUNT") to KVM. Signed-off-by: Alexander Graf <agraf@csgraf.de> Message-Id: <20221004225643.65036-4-agraf@csgraf.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-11 09:36:01 +02:00
Alexander Graf	860054d8ce	i386: kvm: Add support for MSR filtering KVM has grown support to deflect arbitrary MSRs to user space since Linux 5.10. For now we don't expect to make a lot of use of this feature, so let's expose it the easiest way possible: With up to 16 individually maskable MSRs. This patch adds a kvm_filter_msr() function that other code can call to install a hook on KVM MSR reads or writes. Signed-off-by: Alexander Graf <agraf@csgraf.de> Message-Id: <20221004225643.65036-3-agraf@csgraf.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-11 09:36:01 +02:00
Chenyi Qiang	e2e69f6bb9	i386: add notify VM exit support There are cases that malicious virtual machine can cause CPU stuck (due to event windows don't open up), e.g., infinite loop in microcode when nested #AC (CVE-2015-5307). No event window means no event (NMI, SMI and IRQ) can be delivered. It leads the CPU to be unavailable to host or other VMs. Notify VM exit is introduced to mitigate such kind of attacks, which will generate a VM exit if no event window occurs in VM non-root mode for a specified amount of time (notify window). A new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT is exposed to user space so that the user can query the capability and set the expected notify window when creating VMs. The format of the argument when enabling this capability is as follows: Bit 63:32 - notify window specified in qemu command Bit 31:0 - some flags (e.g. KVM_X86_NOTIFY_VMEXIT_ENABLED is set to enable the feature.) Users can configure the feature by a new (x86 only) accel property: qemu -accel kvm,notify-vmexit=run\|internal-error\|disable,notify-window=n The default option of notify-vmexit is run, which will enable the capability and do nothing if the exit happens. The internal-error option raises a KVM internal error if it happens. The disable option does not enable the capability. The default value of notify-window is 0. It is valid only when notify-vmexit is not disabled. The valid range of notify-window is non-negative. It is even safe to set it to zero since there's an internal hardware threshold to be added to ensure no false positive. Because a notify VM exit may happen with VM_CONTEXT_INVALID set in exit qualification (no cases are anticipated that would set this bit), which means VM context is corrupted. It would be reflected in the flags of KVM_EXIT_NOTIFY exit. If KVM_NOTIFY_CONTEXT_INVALID bit is set, raise a KVM internal error unconditionally. Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20220929072014.20705-5-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-11 09:36:00 +02:00
Paolo Bonzini	3dba0a335c	kvm: allow target-specific accelerator properties Several hypervisor capabilities in KVM are target-specific. When exposed to QEMU users as accelerator properties (i.e. -accel kvm,prop=value), they should not be available for all targets. Add a hook for targets to add their own properties to -accel kvm, for now no such property is defined. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220929072014.20705-3-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-10 09:23:16 +02:00
Chenyi Qiang	12f89a39cf	i386: kvm: extend kvm_{get, put}_vcpu_events to support pending triple fault For the direct triple faults, i.e. hardware detected and KVM morphed to VM-Exit, KVM will never lose them. But for triple faults sythesized by KVM, e.g. the RSM path, if KVM exits to userspace before the request is serviced, userspace could migrate the VM and lose the triple fault. A new flag KVM_VCPUEVENT_VALID_TRIPLE_FAULT is defined to signal that the event.triple_fault_pending field contains a valid state if the KVM_CAP_X86_TRIPLE_FAULT_EVENT capability is enabled. Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20220929072014.20705-2-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-10 09:23:16 +02:00
Stefan Hajnoczi	fafd35a6da	Pull request trivial patches branch 20220930-v2 -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmM7XoISHGxhdXJlbnRA dml2aWVyLmV1AAoJEPMMOL0/L748D/0QAKbYtTWjhFPeapjZVoTv13YrTvczWrcF omL6IZivVq0t7hun4iem0DwmvXJELMGexEOTvEJOzM19IIlvvwvOsI8xnxpcMnEY 6GKVbs53Ba0bg2yh7Dll2W9jkou9eX27DwUHMVF8KX7qqsbU+WyD/vdGZitgGt+T 8yna7kzVvNVsdB3+DbIatI5RzzHeu4OqeuH/WCtAyzCaLB64UYTcHprskxIp4+wp dR+EUSoDEr9Qx4PC+uVEsTFK1zZjyAYNoNIkh6fhlkRvDJ1uA75m3EJ57P8xPPqe VbVkPMKi0d4c52m6XvLsQhyYryLx/qLLUAkJWVpY66aHcapYbZAEAfZmNGTQLrOJ qIOJzIkOdU6l3pRgXVdVCgkHRc2HETwET2LyVbNkUz/vBlW2wOZQbZFbezComael bQ/gNBYqP+eOGnZzeWbKBGHr/9QDBClNufidIMC+sOiUw0iSifzjkFwvH7IElx6K EQCOSV6pOhKVlinTpmBbk1XD3xDkQ7ZidiLT9g+P1c8dExrXBhWOnfUHueISb8+s KKMozuxQ/6/3c/DP5hwI9cKPEWEbqJfq1kMuxIvEivKGwUIqX2yq4VJ+hSlYJ+CW nGjXZldtf4KwH+cTsxyPmdZRR5Q7+ODr5Xo7GNvEKBuDsHs7uUl1c3vvOykQgje9 +dyJR6TfbQWn =aK29 -----END PGP SIGNATURE----- Merge tag 'trivial-branch-for-7.2-pull-request' of https://gitlab.com/laurent_vivier/qemu into staging Pull request trivial patches branch 20220930-v2 # -----BEGIN PGP SIGNATURE----- # # iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmM7XoISHGxhdXJlbnRA # dml2aWVyLmV1AAoJEPMMOL0/L748D/0QAKbYtTWjhFPeapjZVoTv13YrTvczWrcF # omL6IZivVq0t7hun4iem0DwmvXJELMGexEOTvEJOzM19IIlvvwvOsI8xnxpcMnEY # 6GKVbs53Ba0bg2yh7Dll2W9jkou9eX27DwUHMVF8KX7qqsbU+WyD/vdGZitgGt+T # 8yna7kzVvNVsdB3+DbIatI5RzzHeu4OqeuH/WCtAyzCaLB64UYTcHprskxIp4+wp # dR+EUSoDEr9Qx4PC+uVEsTFK1zZjyAYNoNIkh6fhlkRvDJ1uA75m3EJ57P8xPPqe # VbVkPMKi0d4c52m6XvLsQhyYryLx/qLLUAkJWVpY66aHcapYbZAEAfZmNGTQLrOJ # qIOJzIkOdU6l3pRgXVdVCgkHRc2HETwET2LyVbNkUz/vBlW2wOZQbZFbezComael # bQ/gNBYqP+eOGnZzeWbKBGHr/9QDBClNufidIMC+sOiUw0iSifzjkFwvH7IElx6K # EQCOSV6pOhKVlinTpmBbk1XD3xDkQ7ZidiLT9g+P1c8dExrXBhWOnfUHueISb8+s # KKMozuxQ/6/3c/DP5hwI9cKPEWEbqJfq1kMuxIvEivKGwUIqX2yq4VJ+hSlYJ+CW # nGjXZldtf4KwH+cTsxyPmdZRR5Q7+ODr5Xo7GNvEKBuDsHs7uUl1c3vvOykQgje9 # +dyJR6TfbQWn # =aK29 # -----END PGP SIGNATURE----- # gpg: Signature made Mon 03 Oct 2022 18:13:22 EDT # gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C # gpg: issuer "laurent@vivier.eu" # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full] # gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full] # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full] # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * tag 'trivial-branch-for-7.2-pull-request' of https://gitlab.com/laurent_vivier/qemu: docs: Update TPM documentation for usage of a TPM 2 Use g_new() & friends where that makes obvious sense Drop superfluous conditionals around g_free() block/qcow2-bitmap: Add missing cast to silent GCC error checkpatch: ignore target/hexagon/imported/* files mem/cxl_type3: fix GPF DVSEC .gitignore: add .cache/ to .gitignore hw/virtio/vhost-shadow-virtqueue: Silence GCC error "maybe-uninitialized" Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-10-04 14:04:18 -04:00
Markus Armbruster	76eb88b12b	Drop superfluous conditionals around g_free() There is no need to guard g_free(P) with if (P): g_free(NULL) is safe. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220923090428.93529-1-armbru@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2022-10-04 00:10:11 +02:00
Ray Zhang	c4ef867f29	target/i386/kvm: fix kvmclock_current_nsec: Assertion `time.tsc_timestamp <= migration_tsc' failed New KVM_CLOCK flags were added in the kernel.(c68dc1b577eabd5605c6c7c08f3e07ae18d30d5d) ``` + #define KVM_CLOCK_VALID_FLAGS \ + (KVM_CLOCK_TSC_STABLE \| KVM_CLOCK_REALTIME \| KVM_CLOCK_HOST_TSC) case KVM_CAP_ADJUST_CLOCK: - r = KVM_CLOCK_TSC_STABLE; + r = KVM_CLOCK_VALID_FLAGS; ``` kvm_has_adjust_clock_stable needs to handle additional flags, so that s->clock_is_reliable can be true and kvmclock_current_nsec doesn't need to be called. Signed-off-by: Ray Zhang <zhanglei002@gmail.com> Message-Id: <20220922100523.2362205-1-zhanglei002@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-01 21:16:36 +02:00
Vitaly Kuznetsov	45ed68a1a3	i386: do kvm_put_msr_feature_control() first thing when vCPU is reset kvm_put_sregs2() fails to reset 'locked' CR4/CR0 bits upon vCPU reset when it is in VMX root operation. Do kvm_put_msr_feature_control() before kvm_put_sregs2() to (possibly) kick vCPU out of VMX root operation. It also seems logical to do kvm_put_msr_feature_control() before kvm_put_nested_state() and not after it, especially when 'real' nested state is set. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220818150113.479917-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-01 07:42:37 +02:00
Vitaly Kuznetsov	3cafdb6750	i386: reset KVM nested state upon CPU reset Make sure env->nested_state is cleaned up when a vCPU is reset, it may be stale after an incoming migration, kvm_arch_put_registers() may end up failing or putting vCPU in a weird state. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220818150113.479917-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-01 07:42:37 +02:00
Vitaly Kuznetsov	3aae0854b2	i386: Hyper-V Direct TLB flush hypercall Hyper-V TLFS allows for L0 and L1 hypervisors to collaborate on L2's TLB flush hypercalls handling. With the correct setup, L2's TLB flush hypercalls can be handled by L0 directly, without the need to exit to L1. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-6-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	aa6bb5fad5	i386: Hyper-V Support extended GVA ranges for TLB flush hypercalls KVM kind of supported "extended GVA ranges" (up to 4095 additional GFNs per hypercall) since the implementation of Hyper-V PV TLB flush feature (Linux-4.18) as regardless of the request, full TLB flush was always performed. "Extended GVA ranges for TLB flush hypercalls" feature bit wasn't exposed then. Now, as KVM gains support for fine-grained TLB flush handling, exposing this feature starts making sense. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	9411e8b6fa	i386: Hyper-V XMM fast hypercall input feature Hyper-V specification allows to pass parameters for certain hypercalls using XMM registers ("XMM Fast Hypercall Input"). When the feature is in use, it allows for faster hypercalls processing as KVM can avoid reading guest's memory. KVM supports the feature since v5.14. Rename HV_HYPERCALL_{PARAMS_XMM_AVAILABLE -> XMM_INPUT_AVAILABLE} to comply with KVM. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-4-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	869840d26c	i386: Hyper-V Enlightened MSR bitmap feature The newly introduced enlightenment allow L0 (KVM) and L1 (Hyper-V) hypervisors to collaborate to avoid unnecessary updates to L2 MSR-Bitmap upon vmexits. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	7110fe56c1	i386: Use hv_build_cpuid_leaf() for HV_CPUID_NESTED_FEATURES Previously, HV_CPUID_NESTED_FEATURES.EAX CPUID leaf was handled differently as it was only used to encode the supported eVMCS version range. In fact, there are also feature (e.g. Enlightened MSR-Bitmap) bits there. In preparation to adding these features, move HV_CPUID_NESTED_FEATURES leaf handling to hv_build_cpuid_leaf() and drop now-unneeded 'hyperv_nested'. No functional change intended. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Yang Weijiang	3a7a27cffb	target/i386: Remove LBREn bit check when access Arch LBR MSRs Live migration can happen when Arch LBR LBREn bit is cleared, e.g., when migration happens after guest entered SMM mode. In this case, we still need to migrate Arch LBR MSRs. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220517155024.33270-1-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-23 10:56:01 +02:00
Yang Weijiang	12703d4e75	target/i386: Add MSR access interface for Arch LBR In the first generation of Arch LBR, the max support Arch LBR depth is 32, both host and guest use the value to set depth MSR. This can simplify the implementation of patch given the side-effect of mismatch of host/guest depth MSR: XRSTORS will reset all recording MSRs to 0s if the saved depth mismatches MSR_ARCH_LBR_DEPTH. In most of the cases Arch LBR is not in active status, so check the control bit before save/restore the big chunck of Arch LBR MSRs. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220215195258.29149-7-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-14 12:32:41 +02:00
Yang Weijiang	5a778a5f82	target/i386: Add kvm_get_one_msr helper When try to get one msr from KVM, I found there's no such kind of existing interface while kvm_put_one_msr() is there. So here comes the patch. It'll remove redundant preparation code before finally call KVM_GET_MSRS IOCTL. No functional change intended. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220215195258.29149-4-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-14 12:32:41 +02:00
Jon Doron	d8701185f4	hw: hyperv: Initial commit for Synthetic Debugging device Signed-off-by: Jon Doron <arilou@gmail.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220216102500.692781-5-arilou@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:56 +02:00
Jon Doron	73d2407407	hyperv: Add support to process syndbg commands SynDbg commands can come from two different flows: 1. Hypercalls, in this mode the data being sent is fully encapsulated network packets. 2. SynDbg specific MSRs, in this mode only the data that needs to be transfered is passed. Signed-off-by: Jon Doron <arilou@gmail.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220216102500.692781-4-arilou@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:56 +02:00
luofei	cb48748af7	i386: Set MCG_STATUS_RIPV bit for mce SRAR error In the physical machine environment, when a SRAR error occurs, the IA32_MCG_STATUS RIPV bit is set, but qemu does not set this bit. When qemu injects an SRAR error into virtual machine, the virtual machine kernel just call do_machine_check() to kill the current task, but not call memory_failure() to isolate the faulty page, which will cause the faulty page to be allocated and used repeatedly. If used by the virtual machine kernel, it will cause the virtual machine to crash Signed-off-by: luofei <luofei@unicloud.com> Message-Id: <20220120084634.131450-1-luofei@unicloud.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-23 12:22:25 +01:00
Philippe Mathieu-Daudé	dcebbb65b8	target/i386/kvm: Free xsave_buf when destroying vCPU Fix vCPU hot-unplug related leak reported by Valgrind: ==132362== 4,096 bytes in 1 blocks are definitely lost in loss record 8,440 of 8,549 ==132362== at 0x4C3B15F: memalign (vg_replace_malloc.c:1265) ==132362== by 0x4C3B288: posix_memalign (vg_replace_malloc.c:1429) ==132362== by 0xB41195: qemu_try_memalign (memalign.c:53) ==132362== by 0xB41204: qemu_memalign (memalign.c:73) ==132362== by 0x7131CB: kvm_init_xsave (kvm.c:1601) ==132362== by 0x7148ED: kvm_arch_init_vcpu (kvm.c:2031) ==132362== by 0x91D224: kvm_init_vcpu (kvm-all.c:516) ==132362== by 0x9242C9: kvm_vcpu_thread_fn (kvm-accel-ops.c:40) ==132362== by 0xB2EB26: qemu_thread_start (qemu-thread-posix.c:556) ==132362== by 0x7EB2159: start_thread (in /usr/lib64/libpthread-2.28.so) ==132362== by 0x9D45DD2: clone (in /usr/lib64/libc-2.28.so) Reported-by: Mark Kanda <mark.kanda@oracle.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Mark Kanda <mark.kanda@oracle.com> Message-Id: <20220322120522.26200-1-philippe.mathieu.daude@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-23 12:22:25 +01:00
Paolo Bonzini	3ec5ad4008	target/i386: kvm: do not access uninitialized variable on older kernels KVM support for AMX includes a new system attribute, KVM_X86_XCOMP_GUEST_SUPP. Commit `19db68ca68` ("x86: Grant AMX permission for guest", 2022-03-15) however did not fully consider the behavior on older kernels. First, it warns too aggressively. Second, it invokes the KVM_GET_DEVICE_ATTR ioctl unconditionally and then uses the "bitmask" variable, which remains uninitialized if the ioctl fails. Third, kvm_ioctl returns -errno rather than -1 on errors. While at it, explain why the ioctl is needed and KVM_GET_SUPPORTED_CPUID is not enough. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-20 20:38:52 +01:00
Zeng Guang	cdec2b753b	x86: Support XFD and AMX xsave data migration XFD(eXtended Feature Disable) allows to enable a feature on xsave state while preventing specific user threads from using the feature. Support save and restore XFD MSRs if CPUID.D.1.EAX[4] enumerate to be valid. Likewise migrate the MSRs and related xsave state necessarily. Signed-off-by: Zeng Guang <guang.zeng@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-8-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:50:50 +01:00
Jing Liu	e56dd3c70a	x86: add support for KVM_CAP_XSAVE2 and AMX state migration When dynamic xfeatures (e.g. AMX) are used by the guest, the xsave area would be larger than 4KB. KVM_GET_XSAVE2 and KVM_SET_XSAVE under KVM_CAP_XSAVE2 works with a xsave buffer larger than 4KB. Always use the new ioctls under KVM_CAP_XSAVE2 when KVM supports it. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Zeng Guang <guang.zeng@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-7-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:50:50 +01:00
Jing Liu	f21a48171c	x86: Add AMX CPUIDs enumeration Add AMX primary feature bits XFD and AMX_TILE to enumerate the CPU's AMX capability. Meanwhile, add AMX TILE and TMUL CPUID leaf and subleaves which exist when AMX TILE is present to provide the maximum capability of TILE and TMUL. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-6-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:50:50 +01:00
Yang Zhong	19db68ca68	x86: Grant AMX permission for guest Kernel allocates 4K xstate buffer by default. For XSAVE features which require large state component (e.g. AMX), Linux kernel dynamically expands the xstate buffer only after the process has acquired the necessary permissions. Those are called dynamically- enabled XSAVE features (or dynamic xfeatures). There are separate permissions for native tasks and guests. Qemu should request the guest permissions for dynamic xfeatures which will be exposed to the guest. This only needs to be done once before the first vcpu is created. KVM implemented one new ARCH_GET_XCOMP_SUPP system attribute API to get host side supported_xcr0 and Qemu can decide if it can request dynamically enabled XSAVE features permission. https://lore.kernel.org/all/20220126152210.3044876-1-pbonzini@redhat.com/ Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Message-Id: <20220217060434.52460-4-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:50:50 +01:00
Longpeng(Mike)	def4c5570c	kvm/msi: do explicit commit when adding msi routes We invoke the kvm_irqchip_commit_routes() for each addition to MSI route table, which is not efficient if we are adding lots of routes in some cases. This patch lets callers invoke the kvm_irqchip_commit_routes(), so the callers can decide how to optimize. [1] https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg00967.html Signed-off-by: Longpeng <longpeng2@huawei.com> Message-Id: <20220222141116.2091-3-longpeng2@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:26:20 +01:00
Peter Maydell	5df022cf2e	osdep: Move memalign-related functions to their own header Move the various memalign-related functions out of osdep.h and into their own header, which we include only where they are used. While we're doing this, add some brief documentation comments. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org	2022-03-07 13:16:49 +00:00
Paolo Bonzini	1520f8bb67	KVM: x86: ignore interrupt_bitmap field of KVM_GET/SET_SREGS This is unnecessary, because the interrupt would be retrieved and queued anyway by KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS respectively, and it makes the flow more similar to the one for KVM_GET/SET_SREGS2. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-01-12 14:09:06 +01:00
Maxim Levitsky	8f515d3869	KVM: use KVM_{GET\|SET}_SREGS2 when supported. This allows to make PDPTRs part of the migration stream and thus not reload them after migration which is against X86 spec. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20211101132300.192584-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-01-12 14:09:06 +01:00
Philippe Mathieu-Daudé	dc7d6cafce	target/i386/kvm: Replace use of __u32 type QEMU coding style mandates to not use Linux kernel internal types for scalars types. Replace __u32 by uint32_t. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211116193955.2793171-1-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-12-17 10:40:51 +01:00
Maxim Levitsky	cabf9862e4	KVM: SVM: add migration support for nested TSC scaling Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20211101132300.192584-4-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-11-02 15:57:27 +01:00
Philippe Mathieu-Daudé	deae846f94	target/i386/sev: Declare system-specific functions in 'sev.h' "sysemu/sev.h" is only used from x86-specific files. Let's move it to include/hw/i386, and merge it with target/i386/sev.h. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211007161716.453984-16-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-13 10:47:49 +02:00
Philippe Mathieu-Daudé	93777de365	target/i386/sev: Rename sev_i386.h -> sev.h SEV is a x86 specific feature, and the "sev_i386.h" header is already in target/i386/. Rename it as "sev.h" to simplify. Patch created mechanically using: $ git mv target/i386/sev_i386.h target/i386/sev.h $ sed -i s/sev_i386.h/sev.h/ $(git grep -l sev_i386.h) Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20211007161716.453984-15-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-13 10:47:49 +02:00
Vitaly Kuznetsov	af7228b88d	i386: Make Hyper-V version id configurable Currently, we hardcode Hyper-V version id (CPUID 0x40000002) to WS2008R2 and it is known that certain tools in Windows check this. It seems useful to provide some flexibility by making it possible to change this info at will. CPUID information is defined in TLFS as: EAX: Build Number EBX Bits 31-16: Major Version Bits 15-0: Minor Version ECX Service Pack EDX Bits 31-24: Service Branch Bits 23-0: Service Number Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-8-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 19:04:45 +02:00
Vitaly Kuznetsov	e1f9a8e8c9	i386: Implement pseudo 'hv-avic' ('hv-apicv') enlightenment The enlightenment allows to use Hyper-V SynIC with hardware APICv/AVIC enabled. Normally, Hyper-V SynIC disables these hardware features and suggests the guest to use paravirtualized AutoEOI feature. Linux-4.15 gains support for conditional APICv/AVIC disablement, the feature stays on until the guest tries to use AutoEOI feature with SynIC. With 'HV_DEPRECATING_AEOI_RECOMMENDED' bit exposed, modern enough Windows/ Hyper-V versions should follow the recommendation and not use the (unwanted) feature. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-7-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 19:04:45 +02:00
Vitaly Kuznetsov	050716292a	i386: Move HV_APIC_ACCESS_RECOMMENDED bit setting to hyperv_fill_cpuids() In preparation to enabling Hyper-V + APICv/AVIC move HV_APIC_ACCESS_RECOMMENDED setting out of kvm_hyperv_properties[]: the 'real' feature bit for the vAPIC features is HV_APIC_ACCESS_AVAILABLE, HV_APIC_ACCESS_RECOMMENDED is a recommendation to use the feature which we may not always want to give. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-6-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 19:04:45 +02:00
Vitaly Kuznetsov	70367f0917	i386: Support KVM_CAP_HYPERV_ENFORCE_CPUID By default, KVM allows the guest to use all currently supported Hyper-V enlightenments when Hyper-V CPUID interface was exposed, regardless of if some features were not announced in guest visible CPUIDs. hv-enforce-cpuid feature alters this behavior and only allows the guest to use exposed Hyper-V enlightenments. The feature is supported by Linux >= 5.14 and is not enabled by default in QEMU. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 19:04:45 +02:00

1 2

97 Commits