mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
David Woodhouse	c1bb5418e3	target/i386: Support up to 32768 CPUs without IRQ remapping The IOAPIC has an 'Extended Destination ID' field in its RTE, which maps to bits 11-4 of the MSI address. Since those address bits fall within a given 4KiB page they were historically non-trivial to use on real hardware. The Intel IOMMU uses the lowest bit to indicate a remappable format MSI, and then the remaining 7 bits are part of the index. Where the remappable format bit isn't set, we can actually use the other seven to allow external (IOAPIC and MSI) interrupts to reach up to 32768 CPUs instead of just the 255 permitted on bare metal. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <78097f9218300e63e751e077a0a5ca029b56ba46.camel@infradead.org> [Fix UBSAN warning. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2020-12-10 12:15:00 -05:00
Tom Lendacky	2654ace151	kvm/i386: Set proper nested state format for SVM Currently, the nested state format is hardcoded to VMX. This will result in kvm_put_nested_state() returning an error because the KVM SVM support checks for the nested state to be KVM_STATE_NESTED_FORMAT_SVM. As a result, kvm_arch_put_registers() errors out early. Update the setting of the format based on the virtualization feature: VMX - KVM_STATE_NESTED_FORMAT_VMX SVM - KVM_STATE_NESTED_FORMAT_SVM Also, fix the code formatting while at it. Fixes: `b16c0e20c7` ("KVM: add support for AMD nested live migration") Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <fe53d00fe0d884e812960781284cd48ae9206acc.1605546140.git.thomas.lendacky@amd.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-11-16 13:22:16 -05:00
Eduardo Habkost	eafa08683f	i386/kvm: Delete kvm_allows_irq0_override() As IRQ routing is always available on x86, kvm_allows_irq0_override() will always return true, so we don't need the function anymore. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200922201922.2153598-4-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-10-14 15:28:54 -04:00
Eduardo Habkost	8981bae23e	i386/kvm: Remove IRQ routing support checks KVM_CAP_IRQ_ROUTING is always available on x86, so replace checks for kvm_has_gsi_routing() and KVM_CAP_IRQ_ROUTING with asserts. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200922201922.2153598-3-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-10-14 15:28:54 -04:00
Eduardo Habkost	1a6dff5fa8	i386/kvm: Require KVM_CAP_IRQ_ROUTING KVM_CAP_IRQ_ROUTING is available since 2009 (Linux v2.6.30), so it's safe to just make it a requirement on x86. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200922201922.2153598-2-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-10-14 15:28:54 -04:00
Zhenyu Wang	1a7655d53d	i386/kvm: fix FEATURE_HYPERV_EDX value in hyperv_passthrough case Fix typo to use correct edx value for FEATURE_HYPERV_EDX when hyperv_passthrough is enabled. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Message-Id: <20190820103030.12515-1-zhenyuw@linux.intel.com> Fixes: `e48ddcc6ce` ("i386/kvm: implement 'hv-passthrough' mode") Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-10-14 15:28:54 -04:00
zhenwei pi	8efc4e5150	target-i386: post memory failure event to QMP Post memory failure event through QMP to handle hardware memory corrupted event. Rather than simply printing to the log, QEMU could report more effective message to the client. For example, if a guest receives an MCE, evacuating the host could be a good idea. Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20200930100440.1060708-4-pizhenwei@bytedance.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-10-04 18:36:23 +02:00
Paolo Bonzini	c3dfce9afd	target/i386: kvm: do not use kvm_check_extension to find paravirtual capabilities Paravirtualized features have been listed in KVM_GET_SUPPORTED_CPUID since Linux 2.6.35 (commit 84478c829d0f, "KVM: x86: export paravirtual cpuid flags in KVM_GET_SUPPORTED_CPUID", 2010-05-19). It has been more than 10 years, so remove the fallback code. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-09-30 19:11:36 +02:00
Vitaly Kuznetsov	8700a98443	target/i386: always create kvmclock device QEMU's kvmclock device is only created when KVM PV feature bits for kvmclock (KVM_FEATURE_CLOCKSOURCE/KVM_FEATURE_CLOCKSOURCE2) are exposed to the guest. With 'kvm=off' cpu flag the device is not created and we don't call KVM_GET_CLOCK/KVM_SET_CLOCK upon migration. It was reported that without these call at least Hyper-V TSC page clocksouce (which can be enabled independently) gets broken after migration. Switch to creating kvmclock QEMU device unconditionally, it seems to always make sense to call KVM_GET_CLOCK/KVM_SET_CLOCK on migration. Use KVM_CAP_ADJUST_CLOCK check instead of CPUID feature bits. Reported-by: Antoine Damhet <antoine.damhet@blade-group.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200922151934.899555-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-09-30 19:11:36 +02:00
Vitaly Kuznetsov	6615be072d	target/i386: Fix VM migration when interrupt based APF is enabled VM with interrupt based APF enabled fails to migrate: qemu-system-x86_64: error: failed to set MSR 0x4b564d02 to 0xf3 We have two issues: 1) There is a typo in kvm_put_msrs() and we write async_pf_int_msr to MSR_KVM_ASYNC_PF_EN (instead of MSR_KVM_ASYNC_PF_INT) 2) We restore MSR_KVM_ASYNC_PF_EN before MSR_KVM_ASYNC_PF_INT is set and this violates the check in KVM. Re-order MSR_KVM_ASYNC_PF_EN/MSR_KVM_ASYNC_PF_INT setting (and kvm_get_msrs() for consistency) and fix the typo. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200917102316.814804-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-09-30 19:11:36 +02:00
Vitaly Kuznetsov	db5daafab2	target/i386: support KVM_FEATURE_ASYNC_PF_INT Linux-5.8 introduced interrupt based mechanism for 'page ready' events delivery and disabled the old, #PF based one (see commit 2635b5c4a0e4 "KVM: x86: interrupt based APF 'page ready' event delivery"). Linux guest switches to using in in 5.9 (see commit b1d405751cd5 "KVM: x86: Switch KVM guest to using interrupts for page ready APF delivery"). The feature has a new KVM_FEATURE_ASYNC_PF_INT bit assigned and the interrupt vector is set in MSR_KVM_ASYNC_PF_INT MSR. Support this in QEMU. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200908141206.357450-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-09-30 19:09:19 +02:00
Vitaly Kuznetsov	f701c082e7	i386/kvm: correct the meaning of '0xffffffff' value for hv-spinlocks Hyper-V TLFS prior to version 6.0 had a mistake in it: special value '0xffffffff' for CPUID 0x40000004.EBX was called 'never to retry', this looked weird (like why it's not '0' which supposedly have the same effect?) but nobody raised the question. In TLFS version 6.0 the mistake was corrected to 'never notify' which sounds logical. Fix QEMU accordingly. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200515114847.74523-1-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-18 13:49:54 -04:00
Thomas Huth	8821e21414	target/i386/kvm: Add missing fallthrough comment Let's make this file compilable with -Werror=implicit-fallthrough : Looking at the code, it seems like the fallthrough is intended here, so we should add the corresponding "/* fallthrough */" comment here. Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200911125301.413081-1-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2020-09-11 21:25:59 +02:00
Philippe Mathieu-Daudé	b199c682f1	target/i386/kvm: Rename host_tsx_blacklisted() as host_tsx_broken() In order to use inclusive terminology, rename host_tsx_blacklisted() as host_tsx_broken(). Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200910070131.435543-7-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2020-09-10 16:20:49 +02:00
Eduardo Habkost	30c60f77a8	x86-iommu: Rename QOM type macros Some QOM macros were using a X86_IOMMU_DEVICE prefix, and others were using a X86_IOMMU prefix. Rename all of them to use the same X86_IOMMU_DEVICE prefix. This will make future conversion to OBJECT_DECLARE* easier. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20200825192110.3528606-47-ehabkost@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-02 07:29:25 -04:00
Vitaly Kuznetsov	0baa4b445e	KVM: fix CPU reset wrt HF2_GIF_MASK HF2_GIF_MASK is set in env->hflags2 unconditionally on CPU reset (see x86_cpu_reset()) but when calling KVM_SET_NESTED_STATE, KVM_STATE_NESTED_GIF_SET is only valid for nSVM as e.g. nVMX code looks like if (kvm_state->hdr.vmx.vmxon_pa == -1ull) { if (kvm_state->flags & ~KVM_STATE_NESTED_EVMCS) return -EINVAL; } Also, when adjusting the environment after KVM_GET_NESTED_STATE we need not reset HF2_GIF_MASK on VMX as e.g. x86_cpu_pending_interrupt() expects it to be set. Alternatively, we could've made env->hflags2 SVM-only. Reported-by: Jan Kiszka <jan.kiszka@siemens.com> Fixes: `b16c0e20c7` ("KVM: add support for AMD nested live migration") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200723142701.2521161-1-vkuznets@redhat.com> Tested-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-07-23 15:03:54 -04:00
Paolo Bonzini	e1e43813e7	KVM: x86: believe what KVM says about WAITPKG Currently, QEMU is overriding KVM_GET_SUPPORTED_CPUID's answer for the WAITPKG bit depending on the "-overcommit cpu-pm" setting. This is a bad idea because it does not even check if the host supports it, but it can be done in x86_cpu_realizefn just like we do for the MONITOR bit. This patch moves it there, while making it conditional on host support for the related UMWAIT MSR. Cc: qemu-stable@nongnu.org Reported-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-07-10 18:02:22 -04:00
Paolo Bonzini	b16c0e20c7	KVM: add support for AMD nested live migration Support for nested guest live migration is part of Linux 5.8, add the corresponding code to QEMU. The migration format consists of a few flags, is an opaque 4k blob. The blob is in VMCB format (the control area represents the L1 VMCB control fields, the save area represents the pre-vmentry state; KVM does not use the host save area since the AMD manual allows that) but QEMU does not really care about that. However, the flags need to be copied to hflags/hflags2 and back. In addition, support for retrieving and setting the AMD nested virtualization states allows the L1 guest to be reset while running a nested guest, but a small bug in CPU reset needs to be fixed for that to work. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-07-10 18:02:17 -04:00
Marcelo Tosatti	74aaddc628	kvm: i386: allow TSC to differ by NTP correction bounds without TSC scaling The Linux TSC calibration procedure is subject to small variations (its common to see +-1 kHz difference between reboots on a given CPU, for example). So migrating a guest between two hosts with identical processor can fail, in case of a small variation in calibrated TSC between them. Allow a conservative 250ppm error between host TSC and VM TSC frequencies, rather than requiring an exact match. NTP daemon in the guest can correct this difference. Also change migration to accept this bound. KVM_SET_TSC_KHZ depends on a kernel interface change. Without this change, the behaviour remains the same: in case of a different frequency between host and VM, KVM_SET_TSC_KHZ will fail and QEMU will exit. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Message-Id: <20200616165805.GA324612@fuller.cnet> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-26 09:39:40 -04:00
Like Xu	ea39f9b643	target/i386: define a new MSR based feature word - FEAT_PERF_CAPABILITIES The Perfmon and Debug Capability MSR named IA32_PERF_CAPABILITIES is a feature-enumerating MSR, which only enumerates the feature full-width write (via bit 13) by now which indicates the processor supports IA32_A_PMCx interface for updating bits 32 and above of IA32_PMCx. The existence of MSR IA32_PERF_CAPABILITIES is enumerated by CPUID.1:ECX[15]. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: qemu-devel@nongnu.org Signed-off-by: Like Xu <like.xu@linux.intel.com> Message-Id: <20200529074347.124619-5-like.xu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-10 12:10:47 -04:00
Pan Nengyuan	2a69314258	i386/kvm: fix a use-after-free when vcpu plug/unplug When we hotplug vcpus, cpu_update_state is added to vm_change_state_head in kvm_arch_init_vcpu(). But it forgot to delete in kvm_arch_destroy_vcpu() after unplug. Then it will cause a use-after-free access. This patch delete it in kvm_arch_destroy_vcpu() to fix that. Reproducer: virsh setvcpus vm1 4 --live virsh setvcpus vm1 2 --live virsh suspend vm1 virsh resume vm1 The UAF stack: ==qemu-system-x86_64==28233==ERROR: AddressSanitizer: heap-use-after-free on address 0x62e00002e798 at pc 0x5573c6917d9e bp 0x7fff07139e50 sp 0x7fff07139e40 WRITE of size 1 at 0x62e00002e798 thread T0 #0 0x5573c6917d9d in cpu_update_state /mnt/sdb/qemu/target/i386/kvm.c:742 #1 0x5573c699121a in vm_state_notify /mnt/sdb/qemu/vl.c:1290 #2 0x5573c636287e in vm_prepare_start /mnt/sdb/qemu/cpus.c:2144 #3 0x5573c6362927 in vm_start /mnt/sdb/qemu/cpus.c:2150 #4 0x5573c71e8304 in qmp_cont /mnt/sdb/qemu/monitor/qmp-cmds.c:173 #5 0x5573c727cb1e in qmp_marshal_cont qapi/qapi-commands-misc.c:835 #6 0x5573c7694c7a in do_qmp_dispatch /mnt/sdb/qemu/qapi/qmp-dispatch.c:132 #7 0x5573c7694c7a in qmp_dispatch /mnt/sdb/qemu/qapi/qmp-dispatch.c:175 #8 0x5573c71d9110 in monitor_qmp_dispatch /mnt/sdb/qemu/monitor/qmp.c:145 #9 0x5573c71dad4f in monitor_qmp_bh_dispatcher /mnt/sdb/qemu/monitor/qmp.c:234 Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200513132630.13412-1-pannengyuan@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-10 12:10:01 -04:00
Liran Alon	73b994f6d7	i386/cpu: Store LAPIC bus frequency in CPU structure No functional change. This information will be used by following patches. Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Message-Id: <20200312165431.82118-15-liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-10 12:09:52 -04:00
Dongjiu Geng	6b552b9bc8	KVM: Move hwpoison page related functions into kvm-all.c kvm_hwpoison_page_add() and kvm_unpoison_all() will both be used by X86 and ARM platforms, so moving them into "accel/kvm/kvm-all.c" to avoid duplicate code. For architectures that don't use the poison-list functionality the reset handler will harmlessly do nothing, so let's register the kvm_unpoison_all() function in the generic kvm_init() function. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com> Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com> Acked-by: Xiang Zheng <zhengxiang9@huawei.com> Message-id: 20200512030609.19593-8-gengdongjiu@huawei.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-05-14 15:03:09 +01:00
Vitaly Kuznetsov	4a910e1f6a	target/i386: do not set unsupported VMX secondary execution controls Commit `048c95163b` ("target/i386: work around KVM_GET_MSRS bug for secondary execution controls") added a workaround for KVM pre-dating commit 6defc591846d ("KVM: nVMX: include conditional controls in /dev/kvm KVM_GET_MSRS") which wasn't setting certain available controls. The workaround uses generic CPUID feature bits to set missing VMX controls. It was found that in some cases it is possible to observe hosts which have certain CPUID features but lack the corresponding VMX control. In particular, it was reported that Azure VMs have RDSEED but lack VMX_SECONDARY_EXEC_RDSEED_EXITING; attempts to enable this feature bit result in QEMU abort. Resolve the issue but not applying the workaround when we don't have to. As there is no good way to find out if KVM has the fix itself, use 95c5c7c77c ("KVM: nVMX: list VMX MSRs in KVM_GET_MSR_INDEX_LIST") instead as these [are supposed to] come together. Fixes: `048c95163b` ("target/i386: work around KVM_GET_MSRS bug for secondary execution controls") Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200331162752.1209928-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-04-02 14:55:45 -04:00
Paolo Bonzini	6702514814	target/i386: check for availability of MSR_IA32_UCODE_REV as an emulated MSR Even though MSR_IA32_UCODE_REV has been available long before Linux 5.6, which added it to the emulated MSR list, a bug caused the microcode version to revert to 0x100000000 on INIT. As a result, processors other than the bootstrap processor would not see the host microcode revision; some Windows version complain loudly about this and crash with a fairly explicit MICROCODE REVISION MISMATCH error. [If running 5.6 prereleases, the kernel fix "KVM: x86: do not reset microcode version on INIT or RESET" should also be applied.] Reported-by: Alex Williamson <alex.williamson@redhat.com> Message-id: <20200211175516.10716-1-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-02-12 16:29:40 +01:00
Philippe Mathieu-Daudé	4f7f589381	accel: Replace current_machine->accelerator by current_accel() wrapper We actually want to access the accelerator, not the machine, so use the current_accel() wrapper instead. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200121110349.25842-10-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-01-24 20:59:11 +01:00
Paolo Bonzini	32c87d70ff	target/i386: kvm: initialize microcode revision from KVM KVM can return the host microcode revision as a feature MSR. Use it as the default value for -cpu host. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1579544504-3616-4-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-01-24 20:59:10 +01:00
Paolo Bonzini	420ae1fc51	target/i386: kvm: initialize feature MSRs very early Some read-only MSRs affect the behavior of ioctls such as KVM_SET_NESTED_STATE. We can initialize them once and for all right after the CPU is realized, since they will never be modified by the guest. Reported-by: Qingua Cheng <qcheng@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1579544504-3616-2-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-01-24 20:59:09 +01:00
Michal Privoznik	8f54bbd0b4	x86: Check for machine state object class before typecasting it In `ed9e923c3c` ("x86: move SMM property to X86MachineState", 2019-12-17) In v4.2.0-246-ged9e923c3c the SMM property was moved from PC machine class to x86 machine class. Makes sense, but the change was too aggressive: in target/i386/kvm.c:kvm_arch_init() it altered check which sets SMRAM if given machine has SMM enabled. The line that detects whether given machine object is class of PC_MACHINE was removed from the check. This makes qemu try to enable SMRAM for all machine types, which is not what we want. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Fixes: `ed9e923c3c` ("x86: move SMM property to X86MachineState", 2019-12-17) Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <7cc91bab3191bfd7e071bdd3fdf7fe2a2991deb0.1577692822.git.mprivozn@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-01-07 12:08:38 +01:00
Eiichi Tsukata	7529a79607	target/i386: remove unused pci-assign codes Legacy PCI device assignment has been already removed in commit `ab37bfc7d6` ("pci-assign: Remove"), but some codes remain unused. CC: qemu-trivial@nongnu.org Signed-off-by: Eiichi Tsukata <devel@etsukata.com> Message-Id: <20191209072932.313056-1-devel@etsukata.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-18 02:34:11 +01:00
Paolo Bonzini	89a289c7e9	x86: move more x86-generic functions out of PC files These are needed by microvm too, so move them outside of PC-specific files. With this patch, microvm.c need not include pc.h anymore. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:33:50 +01:00
Paolo Bonzini	ed9e923c3c	x86: move SMM property to X86MachineState Add it to microvm as well, it is a generic property of the x86 architecture. Suggested-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:33:50 +01:00
Paolo Bonzini	4376c40ded	kvm: introduce kvm_kernel_irqchip_* functions The KVMState struct is opaque, so provide accessors for the fields that will be moved from current_machine to the accelerator. For now they just forward to the machine object, but this will change. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:32:45 +01:00
Paolo Bonzini	23b0898e44	kvm: convert "-machine kvm_shadow_mem" to an accelerator property Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:32:27 +01:00
Yang Zhong	2605188240	target/i386: disable VMX features if nested=0 If kvm does not support VMX feature by nested=0, the kvm_vmx_basic can't get the right value from MSR_IA32_VMX_BASIC register, which make qemu coredump when qemu do KVM_SET_MSRS. The coredump info: error: failed to set MSR 0x480 to 0x0 kvm_put_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20191206071111.12128-1-yang.zhong@intel.com> Reported-by: Catherine Ho <catherine.hecx@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-06 12:35:40 +01:00
Paolo Bonzini	2a9758c51e	target/i386: add support for MSR_IA32_TSX_CTRL The MSR_IA32_TSX_CTRL MSR can be used to hide TSX (also known as the Trusty Side-channel Extension). By virtualizing the MSR, KVM guests can disable TSX and avoid paying the price of mitigating TSX-based attacks on microarchitectural side channels. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-11-21 16:35:05 +01:00
Tao Xu	6508799707	target/i386: Add support for save/load IA32_UMWAIT_CONTROL MSR UMWAIT and TPAUSE instructions use 32bits IA32_UMWAIT_CONTROL at MSR index E1H to determines the maximum time in TSC-quanta that the processor can reside in either C0.1 or C0.2. This patch is to Add support for save/load IA32_UMWAIT_CONTROL MSR in guest. Co-developed-by: Jingqi Liu <jingqi.liu@intel.com> Signed-off-by: Jingqi Liu <jingqi.liu@intel.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20191011074103.30393-3-tao3.xu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-23 17:50:27 +02:00
Tao Xu	67192a298f	x86/cpu: Add support for UMONITOR/UMWAIT/TPAUSE UMONITOR, UMWAIT and TPAUSE are a set of user wait instructions. This patch adds support for user wait instructions in KVM. Availability of the user wait instructions is indicated by the presence of the CPUID feature flag WAITPKG CPUID.0x07.0x0:ECX[5]. User wait instructions may be executed at any privilege level, and use IA32_UMWAIT_CONTROL MSR to set the maximum time. The patch enable the umonitor, umwait and tpause features in KVM. Because umwait and tpause can put a (psysical) CPU into a power saving state, by default we dont't expose it to kvm and enable it only when guest CPUID has it. And use QEMU command-line "-overcommit cpu-pm=on" (enable_cpu_pm is enabled), a VM can use UMONITOR, UMWAIT and TPAUSE instructions. If the instruction causes a delay, the amount of time delayed is called here the physical delay. The physical delay is first computed by determining the virtual delay (the time to delay relative to the VMâ€™s timestamp counter). Otherwise, UMONITOR, UMWAIT and TPAUSE cause an invalid-opcode exception(#UD). The release document ref below link: https://software.intel.com/sites/default/files/\ managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf Co-developed-by: Jingqi Liu <jingqi.liu@intel.com> Signed-off-by: Jingqi Liu <jingqi.liu@intel.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Message-Id: <20191011074103.30393-2-tao3.xu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-23 17:50:27 +02:00
Vitaly Kuznetsov	30d6ff662d	i386/kvm: add NoNonArchitecturalCoreSharing Hyper-V enlightenment Hyper-V TLFS specifies this enlightenment as: "NoNonArchitecturalCoreSharing - Indicates that a virtual processor will never share a physical core with another virtual processor, except for virtual processors that are reported as sibling SMT threads. This can be used as an optimization to avoid the performance overhead of STIBP". However, STIBP is not the only implication. It was found that Hyper-V on KVM doesn't pass MD_CLEAR bit to its guests if it doesn't see NoNonArchitecturalCoreSharing bit. KVM reports NoNonArchitecturalCoreSharing in KVM_GET_SUPPORTED_HV_CPUID to indicate that SMT on the host is impossible (not supported of forcefully disabled). Implement NoNonArchitecturalCoreSharing support in QEMU as tristate: 'off' - the feature is disabled (default) 'on' - the feature is enabled. This is only safe if vCPUS are properly pinned and correct topology is exposed. As CPU pinning is done outside of QEMU the enablement decision will be made on a higher level. 'auto' - copy KVM setting. As during live migration SMT settings on the source and destination host may differ this requires us to add a migration blocker. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20191018163908.10246-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-22 09:38:42 +02:00
Mario Smarduch	73284563dc	target/i386: log MCE guest and host addresses Patch logs MCE AO, AR messages injected to guest or taken by QEMU itself. We print the QEMU address for guest MCEs, helps on hypervisors that have another source of MCE logging like mce log, and when they go missing. For example we found these QEMU logs: September 26th 2019, 17:36:02.309 Droplet-153258224: Guest MCE Memory Error at qemu addr 0x7f8ce14f5000 and guest 3d6f5000 addr of type BUS_MCEERR_AR injected qemu-system-x86_64 amsN ams3nodeNNNN September 27th 2019, 06:25:03.234 Droplet-153258224: Guest MCE Memory Error at qemu addr 0x7f8ce14f5000 and guest 3d6f5000 addr of type BUS_MCEERR_AR injected qemu-system-x86_64 amsN ams3nodeNNNN The first log had a corresponding mce log entry, the second didnt (logging thresholds) we can infer from second entry same PA and mce type. Signed-off-by: Mario Smarduch <msmarduch@digitalocean.com> Message-Id: <20191009164459.8209-3-msmarduch@digitalocean.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-22 09:38:38 +02:00
Eduardo Habkost	af95cafb87	i386: Omit all-zeroes entries from KVM CPUID table KVM has a 80-entry limit at KVM_SET_CPUID2. With the introduction of CPUID[0x1F], it is now possible to hit this limit with unusual CPU configurations, e.g.: $ ./x86_64-softmmu/qemu-system-x86_64 \ -smp 1,dies=2,maxcpus=2 \ -cpu EPYC,check=off,enforce=off \ -machine accel=kvm qemu-system-x86_64: kvm_init_vcpu failed: Argument list too long This happens because QEMU adds a lot of all-zeroes CPUID entries for unused CPUID leaves. In the example above, we end up creating 48 all-zeroes CPUID entries. KVM already returns all-zeroes when emulating the CPUID instruction if an entry is missing, so the all-zeroes entries are redundant. Skip those entries. This reduces the CPUID table size by half while keeping CPUID output unchanged. Reported-by: Yumei Huang <yuhuang@redhat.com> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1741508 Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190822225210.32541-1-ehabkost@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2019-10-15 18:34:44 -03:00
Thomas Huth	a1834d975f	target/i386/kvm: Silence warning from Valgrind about uninitialized bytes When I run QEMU with KVM under Valgrind, I currently get this warning: Syscall param ioctl(generic) points to uninitialised byte(s) at 0x95BA45B: ioctl (in /usr/lib64/libc-2.28.so) by 0x429DC3: kvm_ioctl (kvm-all.c:2365) by 0x51B249: kvm_arch_get_supported_msr_feature (kvm.c:469) by 0x4C2A49: x86_cpu_get_supported_feature_word (cpu.c:3765) by 0x4C4116: x86_cpu_expand_features (cpu.c:5065) by 0x4C7F8D: x86_cpu_realizefn (cpu.c:5242) by 0x5961F3: device_set_realized (qdev.c:835) by 0x7038F6: property_set_bool (object.c:2080) by 0x707EFE: object_property_set_qobject (qom-qobject.c:26) by 0x705814: object_property_set_bool (object.c:1338) by 0x498435: pc_new_cpu (pc.c:1549) by 0x49C67D: pc_cpus_init (pc.c:1681) Address 0x1ffeffee74 is on thread 1's stack in frame #2, created by kvm_arch_get_supported_msr_feature (kvm.c:445) It's harmless, but a little bit annoying, so silence it by properly initializing the whole structure with zeroes. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-04 18:49:20 +02:00
Paolo Bonzini	048c95163b	target/i386: work around KVM_GET_MSRS bug for secondary execution controls Some secondary controls are automatically enabled/disabled based on the CPUID values that are set for the guest. However, they are still available at a global level and therefore should be present when KVM_GET_MSRS is sent to /dev/kvm. Unfortunately KVM forgot to include those, so fix that. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-04 18:49:20 +02:00
Paolo Bonzini	20a78b02d3	target/i386: add VMX features Add code to convert the VMX feature words back into MSR values, allowing the user to enable/disable VMX features as they wish. The same infrastructure enables support for limiting VMX features in named CPU models. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-04 18:49:20 +02:00
Paolo Bonzini	ede146c2e7	target/i386: expand feature words to 64 bits VMX requires 64-bit feature words for the IA32_VMX_EPT_VPID_CAP and IA32_VMX_BASIC MSRs. (The VMX control MSRs are 64-bit wide but actually have only 32 bits of information). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-04 18:49:19 +02:00
Philippe Mathieu-Daudé	d6d059ca07	hw/i386/pc: Extract e820 memory layout code Suggested-by: Samuel Ortiz <sameo@linux.intel.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190818225414.22590-3-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-09-16 17:13:07 +02:00
Wanpeng Li	d38d201f0e	i386/kvm: support guest access CORE cstate Allow guest reads CORE cstate when exposing host CPU power management capabilities to the guest. PKG cstate is restricted to avoid a guest to get the whole package information in multi-tenant scenario. Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1563154124-18579-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-09-16 12:32:20 +02:00
Jing Liu	80db491da4	x86: Intel AVX512_BF16 feature enabling Intel CooperLake cpu adds AVX512_BF16 instruction, defining as CPUID.(EAX=7,ECX=1):EAX[bit 05]. The patch adds a property for setting the subleaf of CPUID leaf 7 in case that people would like to specify it. The release spec link as follows, https://software.intel.com/sites/default/files/managed/c5/15/\ architecture-instruction-set-extensions-programming-reference.pdf Signed-off-by: Jing Liu <jing2.liu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-08-20 20:00:52 +02:00
Andrey Shinkevich	1f670a95b3	i386/kvm: initialize struct at full before ioctl call Not the whole structure is initialized before passing it to the KVM. Reduce the number of Valgrind reports. Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Message-Id: <1564502498-805893-4-git-send-email-andrey.shinkevich@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-08-20 17:26:19 +02:00
Li Qiang	de428cead6	target-i386: kvm: 'kvm_get_supported_msrs' cleanup Function 'kvm_get_supported_msrs' is only called once now, get rid of the static variable 'kvm_supported_msrs'. Signed-off-by: Li Qiang <liq3ea@163.com> Message-Id: <20190725151639.21693-1-liq3ea@163.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-08-20 17:26:19 +02:00

1 2 3 4

156 Commits