Define Arch LBR bit in XSS and save/restore structure
for XSAVE area size calculation.
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Message-Id: <20220215195258.29149-6-weijiang.yang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There're some new features, including Arch LBR, depending
on XSAVES/XRSTORS support, the new instructions will
save/restore data based on feature bits enabled in XCR0 | XSS.
This patch adds the basic support for related CPUID enumeration
and meanwhile changes the name from FEAT_XSAVE_COMP_{LO|HI} to
FEAT_XSAVE_XCR0_{LO|HI} to differentiate clearly the feature
bits in XCR0 and those in XSS.
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Message-Id: <20220215195258.29149-5-weijiang.yang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The Last Branch Recording (LBR) is a performance monitor unit (PMU)
feature on Intel processors which records a running trace of the most
recent branches taken by the processor in the LBR stack. This option
indicates the LBR format to enable for guest perf.
The LBR feature is enabled if below conditions are met:
1) KVM is enabled and the PMU is enabled.
2) msr-based-feature IA32_PERF_CAPABILITIES is supporterd on KVM.
3) Supported returned value for lbr_fmt from above msr is non-zero.
4) Guest vcpu model does support FEAT_1_ECX.CPUID_EXT_PDCM.
5) User-provided lbr-fmt value doesn't violate its bitmask (0x3f).
6) Target guest LBR format matches that of host.
Co-developed-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Message-Id: <20220215195258.29149-3-weijiang.yang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Icelake, is the codename for Intel 3rd generation Xeon Scalable server
processors. There isn't ever client variants. This "Icelake-Client" CPU
model was added wrongly and imaginarily.
It has been deprecated since v5.2, now it's time to remove it completely
from code.
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1647247859-4947-1-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When cache_info_passthrough is requested, QEMU passes the host values
of the cache information CPUID leaves down to the guest. However,
it blindly assumes that the CPUID leaf exists on the host, and this
cannot be guaranteed: for example, KVM has recently started to
synthesize AMD leaves up to 0x80000021 in order to provide accurate
CPU bug information to guests.
Querying a nonexistent host leaf fills the output arguments of
host_cpuid with data that (albeit deterministic) is nonsensical
as cache information, namely the data in the highest Intel CPUID
leaf. If said highest leaf is not ECX-dependent, this can even
cause an infinite loop when kvm_arch_init_vcpu prepares the input
to KVM_SET_CPUID2. The infinite loop is only terminated by an
abort() when the array gets full.
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Capstone should be superior to the old libopcode disassembler,
so we can drop the old file nowadays.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220412165836.355850-4-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
SynDbg commands can come from two different flows:
1. Hypercalls, in this mode the data being sent is fully
encapsulated network packets.
2. SynDbg specific MSRs, in this mode only the data that needs to be
transfered is passed.
Signed-off-by: Jon Doron <arilou@gmail.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220216102500.692781-4-arilou@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some versions of Windows hang on reboot if their TSC value is greater
than 2^54. The calibration of the Hyper-V reference time overflows
and fails; as a result the processors' clock sources are out of sync.
The issue is that the TSC _should_ be reset to 0 on CPU reset and
QEMU tries to do that. However, KVM special cases writing 0 to the
TSC and thinks that QEMU is trying to hot-plug a CPU, which is
correct the first time through but not later. Thwart this valiant
effort and reset the TSC to 1 instead, but only if the CPU has been
run once.
For this to work, env->tsc has to be moved to the part of CPUArchState
that is not zeroed at the beginning of x86_cpu_reset.
Reported-by: Vadim Rozenfeld <vrozenfe@redhat.com>
Supersedes: <20220324082346.72180-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some AMD processors expose the PKRU extended save state even if they do not have
the related PKU feature in CPUID. Worse, when they do they report a size of
64, whereas the expected size of the PKRU extended save state is 8, therefore
the esa->size == eax assertion does not hold.
The state is already ignored by KVM_GET_SUPPORTED_CPUID because it
was not enabled in the host XCR0. However, QEMU kvm_cpu_xsave_init()
runs before QEMU invokes arch_prctl() to enable dynamically-enabled
save states such as XTILEDATA, and KVM_GET_SUPPORTED_CPUID hides save
states that have yet to be enabled. Therefore, kvm_cpu_xsave_init()
needs to consult the host CPUID instead of KVM_GET_SUPPORTED_CPUID,
and dies with an assertion failure.
When setting up the ExtSaveArea array to match the host, ignore features that
KVM does not report as supported. This will cause QEMU to skip the incorrect
CPUID leaf instead of tripping the assertion.
Closes: https://gitlab.com/qemu-project/qemu/-/issues/916
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Analyzed-by: Yang Zhong <yang.zhong@intel.com>
Reported-by: Peter Krempa <pkrempa@redhat.com>
Tested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Even when the feature is not supported in guest CPUID,
still set the msr to the default value which will
be the only value KVM will accept in this case
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220223115824.319821-1-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Windows 11 with WSL2 enabled (Hyper-V) fails to boot with Icelake-Server
{-v5} CPU model but boots well with '-cpu host'. Apparently, it expects
5-level paging and 5-level EPT support to come in pair but QEMU's
Icelake-Server CPU model lacks the later. Introduce 'Icelake-Server-v6'
CPU model with 'vmx-page-walk-5' enabled by default.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20220221145316.576138-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add AMX primary feature bits XFD and AMX_TILE to
enumerate the CPU's AMX capability. Meanwhile, add
AMX TILE and TMUL CPUID leaf and subleaves which
exist when AMX TILE is present to provide the maximum
capability of TILE and TMUL.
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20220217060434.52460-6-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Intel introduces XFD faulting mechanism for extended
XSAVE features to dynamically enable the features in
runtime. If CPUID (EAX=0Dh, ECX=n, n>1).ECX[2] is set
as 1, it indicates support for XFD faulting of this
state component.
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20220217060434.52460-5-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Kernel allocates 4K xstate buffer by default. For XSAVE features
which require large state component (e.g. AMX), Linux kernel
dynamically expands the xstate buffer only after the process has
acquired the necessary permissions. Those are called dynamically-
enabled XSAVE features (or dynamic xfeatures).
There are separate permissions for native tasks and guests.
Qemu should request the guest permissions for dynamic xfeatures
which will be exposed to the guest. This only needs to be done
once before the first vcpu is created.
KVM implemented one new ARCH_GET_XCOMP_SUPP system attribute API to
get host side supported_xcr0 and Qemu can decide if it can request
dynamically enabled XSAVE features permission.
https://lore.kernel.org/all/20220126152210.3044876-1-pbonzini@redhat.com/
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Message-Id: <20220217060434.52460-4-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The AMX TILECFG register and the TMMx tile data registers are
saved/restored via XSAVE, respectively in state component 17
(64 bytes) and state component 18 (8192 bytes).
Add AMX feature bits to x86_ext_save_areas array to set
up AMX components. Add structs that define the layout of
AMX XSAVE areas and use QEMU_BUILD_BUG_ON to validate the
structs sizes.
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20220217060434.52460-3-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The extended state subleaves (EAX=0Dh, ECX=n, n>1).ECX[1]
indicate whether the extended state component locates
on the next 64-byte boundary following the preceding state
component when the compacted format of an XSAVE area is
used.
Right now, they are all zero because no supported component
needed the bit to be set, but the upcoming AMX feature will
use it. Fix the subleaves value according to KVM's supported
cpuid.
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20220217060434.52460-2-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The "hardware version" machinery (qemu_set_hw_version(),
qemu_hw_version(), and the QEMU_HW_VERSION define) is used by fewer
than 10 files. Move it out from osdep.h into a new
qemu/hw-version.h.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220208200856.3558249-6-peter.maydell@linaro.org
* DMA support in the multiboot option ROM
* Rename default-bus-bypass-iommu
* Deprecate -watchdog and cleanup -watchdog-action
* HVF fix for <PAGE_SIZE regions
* Support TSC scaling for AMD nested virtualization
* Fix for ESP fuzzing bug
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGBUeEUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOh+Qf+OMRhRiv6dYjbK/5zXrx81AgxYAY3
dBUSr8v16LyrMl1U3DZWzhD+MzQsC83m/Xsh4lGxlHDWtkK9QQA5xDG95JZdY26i
MGCbbjnFHISbyBQV9Y724gPfPjOOODuoFbzafSx6VLITOcyv1ye0cm7TOjOPB+tt
E4c3JqTZ7g8a5yMe8ItkVhz5pPY+oVw8dxMNRp6Sup5Dbfx0DjacIwLasLsHfPL7
qBADfqB20ovHUzLjXu7oWgEd4KxJ6kiSCaJJu/KD36hg0wB8+WVP1o43j4PkczHT
QjU7eZaeaTrN5Cf34ttPge6QReMi5SFNCaA9O9/HLqrQgdEtt/diZWuqjQ==
=a2mC
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Build system fixes and cleanups
* DMA support in the multiboot option ROM
* Rename default-bus-bypass-iommu
* Deprecate -watchdog and cleanup -watchdog-action
* HVF fix for <PAGE_SIZE regions
* Support TSC scaling for AMD nested virtualization
* Fix for ESP fuzzing bug
# gpg: Signature made Tue 02 Nov 2021 10:57:37 AM EDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* remotes/bonzini/tags/for-upstream: (27 commits)
configure: fix --audio-drv-list help message
configure: Remove the check for the __thread keyword
Move the l2tpv3 test from configure to meson.build
meson: remove unnecessary coreaudio test program
meson: remove pointless warnings
meson.build: Allow to disable OSS again
meson: bump submodule to 0.59.3
qtest/am53c974-test: add test for cancelling in-flight requests
esp: ensure in-flight SCSI requests are always cancelled
KVM: SVM: add migration support for nested TSC scaling
hw/i386: fix vmmouse registration
watchdog: remove select_watchdog_action
vl: deprecate -watchdog
watchdog: add information from -watchdog help to -device help
hw/i386: Rename default_bus_bypass_iommu
hvf: Avoid mapping regions < PAGE_SIZE as ram
configure: do not duplicate CPU_CFLAGS into QEMU_LDFLAGS
configure: remove useless NPTL probe
target/i386: use DMA-enabled multiboot ROM for new-enough QEMU machine types
optionrom: add a DMA-enabled multiboot ROM
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
I noticed -cpu help printing enough trailing spaces to make the output
at least 84 characters wide. Looks ugly unless the terminal is wider.
Ugly or not, trailing spaces are stupid.
The culprit is this line in x86_cpu_list_entry():
qemu_printf("x86 %-20s %-58s\n", name, desc);
This prints a string with minimum field left-justified right before a
newline. Change it to
qemu_printf("x86 %-20s %s\n", name, desc);
which avoids the trailing spaces and is simpler to boot.
A search for the pattern with "git-grep -E '%-[0-9]+s\\n'" found a few
more instances. Change them similarly.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Message-Id: <20211009152401.2982862-1-armbru@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Because core-capability releated features are model-specific and KVM
won't support it, remove the core-capability in CPU model to avoid the
warning message.
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20210827064818.4698-3-chenyi.qiang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Only declare sev_enabled() and sev_es_enabled() when CONFIG_SEV is
set, to allow the compiler to elide unused code. Remove unnecessary
stubs.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20211007161716.453984-17-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SEV is a x86 specific feature, and the "sev_i386.h" header
is already in target/i386/. Rename it as "sev.h" to simplify.
Patch created mechanically using:
$ git mv target/i386/sev_i386.h target/i386/sev.h
$ sed -i s/sev_i386.h/sev.h/ $(git grep -l sev_i386.h)
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20211007161716.453984-15-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 00b8105324 ("target-i386: Remove assert_no_error usage")
forgot to add the "qapi/error.h" for &error_abort, add it now.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Connor Kuehl <ckuehl@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211007161716.453984-8-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM implements some Hyper-V 2016 functions so providing WS2008R2 version
is somewhat incorrect. While generally guests shouldn't care about it
and always check feature bits, it is known that some tools in Windows
actually check version info.
For compatibility reasons make the change for 6.2 machine types only.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210902093530.345756-9-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, we hardcode Hyper-V version id (CPUID 0x40000002) to
WS2008R2 and it is known that certain tools in Windows check this. It
seems useful to provide some flexibility by making it possible to change
this info at will. CPUID information is defined in TLFS as:
EAX: Build Number
EBX Bits 31-16: Major Version
Bits 15-0: Minor Version
ECX Service Pack
EDX Bits 31-24: Service Branch
Bits 23-0: Service Number
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210902093530.345756-8-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The enlightenment allows to use Hyper-V SynIC with hardware APICv/AVIC
enabled. Normally, Hyper-V SynIC disables these hardware features and
suggests the guest to use paravirtualized AutoEOI feature. Linux-4.15
gains support for conditional APICv/AVIC disablement, the feature
stays on until the guest tries to use AutoEOI feature with SynIC. With
'HV_DEPRECATING_AEOI_RECOMMENDED' bit exposed, modern enough Windows/
Hyper-V versions should follow the recommendation and not use the
(unwanted) feature.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210902093530.345756-7-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
By default, KVM allows the guest to use all currently supported Hyper-V
enlightenments when Hyper-V CPUID interface was exposed, regardless of if
some features were not announced in guest visible CPUIDs. hv-enforce-cpuid
feature alters this behavior and only allows the guest to use exposed
Hyper-V enlightenments. The feature is supported by Linux >= 5.14 and is
not enabled by default in QEMU.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210902093530.345756-5-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
By default, KVM allows the guest to use all currently supported PV features
even when they were not announced in guest visible CPUIDs. Introduce a new
"kvm-pv-enforce-cpuid" flag to limit the supported feature set to the
exposed features. The feature is supported by Linux >= 5.10 and is not
enabled by default in QEMU.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210902093530.345756-4-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SGX capabilities are enumerated through CPUID_0x12.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-16-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If the guest want to fully use SGX, the guest needs to be able to
access provisioning key. Add a new KVM_CAP_SGX_ATTRIBUTE to KVM to
support provisioning key to KVM guests.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-14-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Expose SGX to the guest if and only if KVM is enabled and supports
virtualization of SGX. While the majority of ENCLS can be emulated to
some degree, because SGX uses a hardware-based root of trust, the
attestation aspects of SGX cannot be emulated in software, i.e.
ultimately emulation will fail as software cannot generate a valid
quote/report. The complexity of partially emulating SGX in Qemu far
outweighs the value added, e.g. an SGX specific simulator for userspace
applications can emulate SGX for development and testing purposes.
Note, access to the PROVISIONKEY is not yet advertised to the guest as
KVM blocks access to the PROVISIONKEY by default and requires userspace
to provide additional credentials (via ioctl()) to expose PROVISIONKEY.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-13-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On real hardware, on systems that supports SGX Launch Control, those
MSRs are initialized to digest of Intel's signing key; on systems that
don't support SGX Launch Control, those MSRs are not available but
hardware always uses digest of Intel's signing key in EINIT.
KVM advertises SGX LC via CPUID if and only if the MSRs are writable.
Unconditionally initialize those MSRs to digest of Intel's signing key
when CPU is realized and reset to reflect the fact. This avoids
potential bug in case kvm_arch_put_registers() is called before
kvm_arch_get_registers() is called, in which case guest's virtual
SGX_LEPUBKEYHASH MSRs will be set to 0, although KVM initializes those
to digest of Intel's signing key by default, since KVM allows those MSRs
to be updated by Qemu to support live migration.
Save/restore the SGX Launch Enclave Public Key Hash MSRs if SGX Launch
Control (LC) is exposed to the guest. Likewise, migrate the MSRs if they
are writable by the guest.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-11-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID leaf 12_1_EAX is an Intel-defined feature bits leaf enumerating
the platform's SGX capabilities that may be utilized by an enclave, e.g.
whether or not an enclave can gain access to the provision key.
Currently there are six capabilities:
- INIT: set when the enclave has has been initialized by EINIT. Cannot
be set by software, i.e. forced to zero in CPUID.
- DEBUG: permits a debugger to read/write into the enclave.
- MODE64BIT: the enclave runs in 64-bit mode
- PROVISIONKEY: grants has access to the provision key
- EINITTOKENKEY: grants access to the EINIT token key, i.e. the
enclave can generate EINIT tokens
- KSS: Key Separation and Sharing enabled for the enclave.
Note that the entirety of CPUID.0x12.0x1, i.e. all registers, enumerates
the allowed ATTRIBUTES (128 bits), but only bits 31:0 are directly
exposed to the user (via FEAT_12_1_EAX). Bits 63:32 are currently all
reserved and bits 127:64 correspond to the allowed XSAVE Feature Request
Mask, which is calculated based on other CPU features, e.g. XSAVE, MPX,
AVX, etc... and is not exposed to the user.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-10-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID leaf 12_0_EBX is an Intel-defined feature bits leaf enumerating
the platform's SGX extended capabilities. Currently there is a single
capabilitiy:
- EXINFO: record information about #PFs and #GPs in the enclave's SSA
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-9-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID leaf 12_0_EAX is an Intel-defined feature bits leaf enumerating
the CPU's SGX capabilities, e.g. supported SGX instruction sets.
Currently there are four enumerated capabilities:
- SGX1 instruction set, i.e. "base" SGX
- SGX2 instruction set for dynamic EPC management
- ENCLV instruction set for VMM oversubscription of EPC
- ENCLS-C instruction set for thread safe variants of ENCLS
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-8-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add CPUID defines for SGX and SGX Launch Control (LC), as well as
defines for their associated FEATURE_CONTROL MSR bits. Define the
Launch Enclave Public Key Hash MSRs (LE Hash MSRs), which exist
when SGX LC is present (in CPUID), and are writable when SGX LC is
enabled (in FEATURE_CONTROL).
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-7-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
VGIF provides masking capability for when virtual interrupts
are taken. (APM2)
Signed-off-by: Lara Lazier <laramglazier@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Moved int_ctl into the CPUX86State structure. It removes some
unnecessary stores and loads, and prepares for tracking the vIRQ
state even when it is masked due to vGIF.
Signed-off-by: Lara Lazier <laramglazier@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
VGIF allows STGI and CLGI to execute in guest mode and control virtual
interrupts in guest mode.
When the VGIF feature is enabled then:
* executing STGI in the guest sets bit 9 of the VMCB offset 60h.
* executing CLGI in the guest clears bit 9 of the VMCB offset 60h.
Signed-off-by: Lara Lazier <laramglazier@gmail.com>
Message-Id: <20210730070742.9674-1-laramglazier@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
APM2 requires that VMRUN and VMLOAD canonicalize (sign extend to 63
from 48/57) all base addresses in the segment registers that have been
respectively loaded.
Signed-off-by: Lara Lazier <laramglazier@gmail.com>
Message-Id: <20210804113058.45186-1-laramglazier@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The AVX_VNNI feature is not in Cooperlake platform, remove it
from cpu model.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210820054611.84303-1-yang.zhong@intel.com>
Fixes: c1826ea6a0 ("i386/cpu: Expose AVX_VNNI instruction to guest")
Cc: qemu-stable@nongnu.org
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
At present, there's no mechanism intelligent enough to virtualize split
lock detection correctly. Remove it in Snowridge CPU model to avoid the
feature exposure.
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20210630012053.10098-1-chenyi.qiang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Missed in commit f3478392 "docs: Move deprecation, build
and license info out of system/"
Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210723065828.1336760-1-maozhongyi@cmss.chinamobile.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some cpu properties have to be set only for cpu models in builtin_x86_defs,
registered with x86_register_cpu_model_type, and not for
cpu models "base", "max", and the subclass "host".
These properties are the ones set by function x86_cpu_apply_props,
(also including kvm_default_props, tcg_default_props),
and the "vendor" property for the KVM and HVF accelerators.
After recent refactoring of cpu, which also affected these properties,
they were instead set unconditionally for all x86 cpus.
This has been detected as a bug with Nested on AMD with cpu "host",
as svm was not turned on by default, due to the wrongful setting of
kvm_default_props via x86_cpu_apply_props, which set svm to "off".
Rectify the bug introduced in commit "i386: split cpu accelerators"
and document the functions that are builtin_x86_defs-only.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: f5cc5a5c ("i386: split cpu accelerators from cpu.c,"...)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/477
Message-Id: <20210723112921.12637-1-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A AMD server typically has cpuid level 0x10(test on Rome/Milan), it
should not be changed to 0x1f in multi-dies case.
* to maintain compatibility with older machine types, only implement
this change when the CPU's "x-vendor-cpuid-only" property is false
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: zhenwei pi <pizhenwei@bytedance.com>
Fixes: a94e142899 (target/i386: Add CPUID.1F generation support for multi-dies PCMachine)
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20210708170641.49410-1-michael.roth@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently all built-in CPUs report cache information via CPUID leaves 2
and 4, but these have never been defined for AMD. In the case of
SEV-SNP this can cause issues with CPUID enforcement. Address this by
allowing CPU types to suppress these via a new "x-vendor-cpuid-only"
CPU property, which is true by default, but switched off for older
machine types to maintain compatibility.
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: zhenwei pi <pizhenwei@bytedance.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20210708003623.18665-1-michael.roth@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
To make Hyper-V features appear in e.g. QMP query-cpu-model-expansion we
need to expand and set the corresponding CPUID leaves early. Modify
x86_cpu_get_supported_feature_word() to call newly intoduced Hyper-V
specific kvm_hv_get_supported_cpuid() instead of
kvm_arch_get_supported_cpuid(). We can't use kvm_arch_get_supported_cpuid()
as Hyper-V specific CPUID leaves intersect with KVM's.
Note, early expansion will only happen when KVM supports system wide
KVM_GET_SUPPORTED_HV_CPUID ioctl (KVM_CAP_SYS_HYPERV_CPUID).
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210608120817.1325125-6-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Rather than relying on the X86XSaveArea structure definition,
determine the offset of XSAVE state areas using CPUID leaf 0xd where
possible (KVM and HVF).
Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20210705104632.2902400-8-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Provide visibility of the x86_ext_save_areas array and associated type
outside of cpu.c.
Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20210705104632.2902400-6-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linux 5.14 will add support for nested TSC scaling. Add the
corresponding feature in QEMU; to keep support for existing kernels,
do not add it to any processor yet.
The handling of the VMCS enumeration MSR is ugly; once we have more than
one case, we may want to add a table to check VMX features against.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes host and max cpu initialization, by running the accel cpu
initialization only after all instance init functions are called for all
X86 cpu subclasses.
The bug this is fixing is related to the "max" and "host" i386 cpu
subclasses, which set cpu->max_features, which is then used at cpu
realization time.
In order to properly split the accel-specific max features code that
needs to be executed at cpu instance initialization time,
we cannot call the accel cpu initialization at the end of the x86 base
class initialization, or we will have no way to specialize
"max features" cpu behavior, overriding the "max" cpu class defaults,
and checking for the "max features" flag itself.
This patch moves the accel-specific cpu instance initialization to after
all x86 cpu instance code has been executed, including subclasses,
so that proper initialization of cpu "host" and "max" can be restored.
Fixes: f5cc5a5c ("i386: split cpu accelerators from cpu.c,"...)
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20210603123001.17843-3-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
i386 realizefn code is sensitive to ordering, and recent commits
aimed at refactoring it, splitting accelerator-specific code,
broke assumptions which need to be fixed.
We need to:
* process hyper-v enlightements first, as they assume features
not to be expanded
* only then, expand features
* after expanding features, attempt to check them and modify them in the
accel-specific realizefn code called by cpu_exec_realizefn().
* after the framework has been called via cpu_exec_realizefn,
the code can check for what has or hasn't been set by accel-specific
code, or extend its results, ie:
- check and evenually set code_urev default
- modify cpu->mwait after potentially being set from host CPUID.
- finally check for phys_bits assuming all user and accel-specific
adjustments have already been taken into account.
Fixes: f5cc5a5c ("i386: split cpu accelerators from cpu.c"...)
Fixes: 30565f10 ("cpu: call AccelCPUClass::cpu_realizefn in"...)
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20210603123001.17843-2-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hyper-V feature leaves are weird. We have some of them in
feature_word_info[] array but we don't use feature_word_info
magic to enable them. Neither do we use feature_dependencies[]
mechanism to validate the configuration as it doesn't allign
well with Hyper-V's many-to-many dependency chains. Some of
the feature leaves hold not only feature bits, but also values.
E.g. FEAT_HV_NESTED_EAX contains both features and the supported
Enlightened VMCS range.
Hyper-V features are already represented in 'struct X86CPU' with
uint64_t hyperv_features so duplicating them in env->features adds
little (or zero) benefits. THe other half of Hyper-V emulation features
is also stored with values in hyperv_vendor_id[], hyperv_limits[],...
so env->features[] is already incomplete.
Remove Hyper-V feature leaves from env->features[] completely.
kvm_hyperv_properties[] is converted to using raw CPUID func/reg
pairs for features, this allows us to get rid of hv_cpuid_get_fw()
conversion.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20210422161130.652779-8-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When cpu->hyperv_vendor is not set manually we default to "Microsoft Hv"
and in 'hv_passthrough' mode we get the information from the host. This
information is stored in cpu->hyperv_vendor_id[] array but we don't update
cpu->hyperv_vendor string so e.g. QMP's query-cpu-model-expansion output
is incorrect.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20210422161130.652779-2-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The 'max' CPU under TCG currently reports a family/model/stepping that
approximately corresponds to an AMD K7 vintage architecture.
The K7 series predates the introduction of 64-bit support by AMD
in the K8 series. This has been reported to lead to LLVM complaints
about generating 64-bit code for a 32-bit CPU target
LLVM ERROR: 64-bit code requested on a subtarget that doesn't support it!
It appears LLVM looks at the family/model/stepping, despite qemu64
reporting it is 64-bit capable.
This patch changes 'max' to report a CPUID with the family, model
and stepping taken from a
AMD Athlon(tm) 64 X2 Dual Core Processor 4000+
which is one of the first 64-bit AMD CPUs.
Closes https://gitlab.com/qemu-project/qemu/-/issues/191
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210507133650.645526-3-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The 'qemu64' CPUID currently reports a family/model/stepping that
approximately corresponds to an AMD K7 vintage architecture.
The K7 series predates the introduction of 64-bit support by AMD
in the K8 series. This has been reported to lead to LLVM complaints
about generating 64-bit code for a 32-bit CPU target
LLVM ERROR: 64-bit code requested on a subtarget that doesn't support it!
It appears LLVM looks at the family/model/stepping, despite qemu64
reporting it is 64-bit capable.
This patch changes 'qemu64' to report a CPUID with the family, model
and stepping taken from a
AMD Athlon(tm) 64 X2 Dual Core Processor 4000+
which is one of the first 64-bit AMD CPUs.
Closes https://gitlab.com/qemu-project/qemu/-/issues/191
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20210507133650.645526-2-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Hyper-V 2016 refuses to boot on Skylake+ CPU models because they lack
'xsaves'/'vmx-xsaves' features and this diverges from real hardware. The
same issue emerges with AMD "EPYC" CPU model prior to version 3 which got
'xsaves' added. EPYC-Rome/EPYC-Milan CPU models have 'xsaves' enabled from
the very beginning so the comment blaming KVM to explain why other CPUs
lack 'xsaves' is likely outdated.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210412073952.860944-1-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210517105140.1062037-23-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210517105140.1062037-22-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210517105140.1062037-21-f4bug@amsat.org>
[rth: Drop declaration movement from target/*/cpu.h]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210517105140.1062037-20-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The write_elf*() handlers are used to dump vmcore images.
This feature is only meaningful for system emulation.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210517105140.1062037-19-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
cpu_get_crash_info() is called on GUEST_PANICKED events,
which only occur in system emulation.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210517105140.1062037-18-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Migration is specific to system emulation.
- Move the CPUClass::vmsd field to SysemuCPUOps,
- restrict VMSTATE_CPU() macro to sysemu,
- vmstate_dummy is now unused, remove it.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210517105140.1062037-16-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Introduce a structure to hold handler specific to sysemu.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210517105140.1062037-15-f4bug@amsat.org>
[rth: Squash "restrict hw/core/sysemu-cpu-ops.h" patch]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Quoting Peter Maydell [*]:
There are two ways to handle migration for
a CPU object:
(1) like any other device, so it has a dc->vmsd that covers
migration for the whole object. As usual for objects that are a
subclass of a parent that has state, the first entry in the
VMStateDescription field list is VMSTATE_CPU(), which migrates
the cpu_common fields, followed by whatever the CPU's own migration
fields are.
(2) a backwards-compatible mechanism for CPUs that were
originally migrated using manual "write fields to the migration
stream structures". The on-the-wire migration format
for those is based on the 'env' pointer (which isn't a QOM object),
and the cpu_common part of the migration data is elsewhere.
cpu_exec_realizefn() handles both possibilities:
* for type 1, dc->vmsd is set and cc->vmsd is not,
so cpu_exec_realizefn() does nothing, and the standard
"register dc->vmsd for a device" code does everything needed
* for type 2, dc->vmsd is NULL and so we register the
vmstate_cpu_common directly to handle the cpu-common fields,
and the cc->vmsd to handle the per-CPU stuff
You can't change a CPU from one type to the other without breaking
migration compatibility, which is why some guest architectures
are stuck on the cc->vmsd form. New targets should use dc->vmsd.
To avoid new targets to start using type (2), rename cc->vmsd as
cc->legacy_vmsd. The correct field to implement is dc->vmsd (the
DeviceClass one).
See also commit b170fce3dd ("cpu: Register VMStateDescription
through CPUState") for historic background.
[*] https://www.mail-archive.com/qemu-devel@nongnu.org/msg800849.html
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210517105140.1062037-13-f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Expose AVX (VEX-encoded) versions of the Vector Neural Network
Instructions to guest.
The bit definition:
CPUID.(EAX=7,ECX=1):EAX[bit 4] AVX_VNNI
The following instructions are available when this feature is
present in the guest.
1. VPDPBUS: Multiply and Add Unsigned and Signed Bytes
2. VPDPBUSDS: Multiply and Add Unsigned and Signed Bytes with Saturation
3. VPDPWSSD: Multiply and Add Signed Word Integers
4. VPDPWSSDS: Multiply and Add Signed Integers with Saturation
As for the kvm related code, please reference Linux commit id 1085a6b585d7.
The release document ref below link:
https://software.intel.com/content/www/us/en/develop/download/\
intel-architecture-instruction-set-extensions-programming-reference.html
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210407015609.22936-1-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
avoid open coding the accesses to cpu->accel_cpu interfaces,
and instead introduce:
accel_cpu_instance_init,
accel_cpu_realizefn
to be used by the targets/ initfn code,
and by cpu_exec_realizefn respectively.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210322132800.7470-7-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
move the call to accel_cpu->cpu_realizefn to the general
cpu_exec_realizefn from target/i386, so it does not need to be
called for every target explicitly as we enable more targets.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210322132800.7470-6-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
i386 is the first user of AccelCPUClass, allowing to split
cpu.c into:
cpu.c cpuid and common x86 cpu functionality
host-cpu.c host x86 cpu functions and "host" cpu type
kvm/kvm-cpu.c KVM x86 AccelCPUClass
hvf/hvf-cpu.c HVF x86 AccelCPUClass
tcg/tcg-cpu.c TCG x86 AccelCPUClass
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[claudio]:
Rebased on commit b8184135 ("target/i386: allow modifying TCG phys-addr-bits")
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20210322132800.7470-5-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Found the following cpu feature bits missing from EPYC-Rome model.
ibrs : Indirect Branch Restricted Speculation
ssbd : Speculative Store Bypass Disable
These new features will be added in EPYC-Rome-v2. The -cpu help output
after the change.
x86 EPYC-Rome (alias configured by machine type)
x86 EPYC-Rome-v1 AMD EPYC-Rome Processor
x86 EPYC-Rome-v2 AMD EPYC-Rome Processor
Reported-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <161478622280.16275.6399866734509127420.stgit@bmoger-ubuntu>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
An assorted set of spelling fixes in various places.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210309111510.79495-1-mjt@msgid.tls.msk.ru>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Bus lock debug exception is a feature that can notify the kernel by
generate an #DB trap after the instruction acquires a bus lock when
CPL>0. This allows the kernel to enforce user application throttling or
mitigations.
This feature is enumerated via CPUID.(EAX=7,ECX=0).ECX[bit 24].
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20210202090224.13274-1-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The preferred syntax is to use "foo=on|off", rather than a bare
"+foo" or "-foo"
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210216191027.595031-11-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Adds the support for AMD 3rd generation processors. The model
display for the new processor will be EPYC-Milan.
Adds the following new feature bits on top of the feature bits from
the first and second generation EPYC models.
pcid : Process context identifiers support
ibrs : Indirect Branch Restricted Speculation
ssbd : Speculative Store Bypass Disable
erms : Enhanced REP MOVSB/STOSB support
fsrm : Fast Short REP MOVSB support
invpcid : Invalidate processor context ID
pku : Protection keys support
svme-addr-chk : SVM instructions address check for #GP handling
Depends on the following kernel commits:
14c2bf81fcd2 ("KVM: SVM: Fix #GP handling for doubly-nested virtualization")
3b9c723ed7cf ("KVM: SVM: Add support for SVM instruction address check change")
4aa2691dcbd3 ("8ce1c461188799d863398dd2865d KVM: x86: Factor out x86 instruction emulation with decoding")
4407a797e941 ("KVM: SVM: Enable INVPCID feature on AMD")
9715092f8d7e ("KVM: X86: Move handling of INVPCID types to x86")
3f3393b3ce38 ("KVM: X86: Rename and move the function vmx_handle_memory_failure to x86.c")
830bd71f2c06 ("KVM: SVM: Remove set_cr_intercept, clr_cr_intercept and is_cr_intercept")
4c44e8d6c193 ("KVM: SVM: Add new intercept word in vmcb_control_area")
c62e2e94b9d4 ("KVM: SVM: Modify 64 bit intercept field to two 32 bit vectors")
9780d51dc2af ("KVM: SVM: Modify intercept_exceptions to generic intercepts")
30abaa88382c ("KVM: SVM: Change intercept_dr to generic intercepts")
03bfeeb988a9 ("KVM: SVM: Change intercept_cr to generic intercepts")
c45ad7229d13 ("KVM: SVM: Introduce vmcb_(set_intercept/clr_intercept/_is_intercept)")
a90c1ed9f11d ("(pcid) KVM: nSVM: Remove unused field")
fa44b82eb831 ("KVM: x86: Move MPK feature detection to common code")
38f3e775e9c2 ("x86/Kconfig: Update config and kernel doc for MPK feature on AMD")
37486135d3a7 ("KVM: x86: Fix pkru save/restore when guest CR4.PKE=0, move it to x86.c")
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <161290460478.11352.8933244555799318236.stgit@bmoger-ubuntu>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Provide initial support for SEV-ES. This includes creating a function to
indicate the guest is an SEV-ES guest (which will return false until all
support is in place), performing the proper SEV initialization and
ensuring that the guest CPU state is measured as part of the launch.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Co-developed-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Message-Id: <2e6386cbc1ddeaf701547dd5677adf5ddab2b6bd.1611682609.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Expose the VMX exit/entry load pkrs control bits in
VMX_TRUE_EXIT_CTLS/VMX_TRUE_ENTRY_CTLS MSRs to guest, which supports the
PKS in nested VM.
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20210205083325.13880-3-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Protection Keys for Supervisor-mode pages is a simple extension of
the PKU feature that QEMU already implements. For supervisor-mode
pages, protection key restrictions come from a new MSR. The MSR
has no XSAVE state associated to it.
PKS is only respected in long mode. However, in principle it is
possible to set the MSR even outside long mode, and in fact
even the XSAVE state for PKRU could be set outside long mode
using XRSTOR. So do not limit the migration subsections for
PKRU and PKRS to long mode.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Newer AMD CPUs will add CPUID_0x8000000A_EDX[28] bit, which indicates
that SVM instructions (VMRUN/VMSAVE/VMLOAD) will trigger #VMEXIT before
CPU checking their EAX against reserved memory regions. This change will
allow the hypervisor to avoid intercepting #GP and emulating SVM
instructions. KVM turns on this CPUID bit for nested VMs. In order to
support it, let us populate this bit, along with other SVM feature bits,
in FEAT_SVM.
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Message-Id: <20210126202456.589932-1-wei.huang2@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
32-bit targets by definition do not support long mode; therefore, the
bit must be masked in the features supported by the accelerator.
As a side effect, this avoids setting up the 0x80000008 CPUID leaf
for
qemu-system-i386 -cpu host
which since commit 5a140b255d ("x86/cpu: Use max host physical address
if -cpu max option is applied") would have printed this error:
qemu-system-i386: phys-bits should be between 32 and 36 (but is 48)
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The easiest spots to use QAPI_LIST_APPEND are where we already have an
obvious pointer to the tail of a list. While at it, consistently use
the variable name 'tail' for that purpose.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210113221013.390592-5-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
QEMU option -cpu max(max_features) means "Enables all features supported by
the accelerator in the current host", this looks true for all the features
except guest max physical address width, so add this patch to enable it.
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Message-Id: <20210113090430.26394-1-weijiang.yang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Anywhere we create a list of just one item or by prepending items
(typically because order doesn't matter), we can use
QAPI_LIST_PREPEND(). But places where we must keep the list in order
by appending remain open-coded until later patches.
Note that as a side effect, this also performs a cleanup of two minor
issues in qga/commands-posix.c: the old code was performing
new = g_malloc0(sizeof(*ret));
which 1) is confusing because you have to verify whether 'new' and
'ret' are variables with the same type, and 2) would conflict with C++
compilation (not an actual problem for this file, but makes
copy-and-paste harder).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201113011340.463563-5-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
[Straightforward conflicts due to commit a8aa94b5f8 "qga: update
schema for guest-get-disks 'dependents' field" and commit a10b453a52
"target/mips: Move mips_cpu_add_definition() from helper.c to cpu.c"
resolved. Commit message tweaked.]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
to do this, we need to take code out of cpu.c and helper.c,
and also move some prototypes from cpu.h, for code that is
needed in tcg/xxx_helper.c, and which in turn is part of the
callbacks registered by the class initialization.
Therefore, do some shuffling of the parts of cpu.h that
are only relevant for tcg/, and put them in tcg/helper-tcg.h
For FT0 and similar macros, put them in tcg/fpu-helper.c
since they are used only there.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20201212155530.23098-8-cfontana@suse.de>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
AVX512 Half-precision floating point (FP16) has better performance
compared to FP32 if the presicion or magnitude requirements are met.
It's defined as CPUID.(EAX=7,ECX=0):EDX[bit 23].
Refer to
https://software.intel.com/content/www/us/en/develop/download/\
intel-architecture-instruction-set-extensions-programming-reference.html
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Message-Id: <20201216224002.32677-1-cathy.zhang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
As a preparation to expanding Hyper-V CPU features early, move
hyperv_limits initialization to x86_cpu_realizefn().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20201119103221.1665171-5-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
As a preparation to expanding Hyper-V CPU features early, move
hyperv_version_id initialization to x86_cpu_realizefn().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20201119103221.1665171-4-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
As a preparation to expanding Hyper-V CPU features early, move
hyperv_interface_id initialization to x86_cpu_realizefn().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20201119103221.1665171-3-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
As a preparation to expanding Hyper-V CPU features early, move
hyperv_vendor_id initialization to x86_cpu_realizefn(). Introduce
x86_cpu_hyperv_realize() to not not pollute x86_cpu_realizefn()
itself.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20201119103221.1665171-2-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Class properties make QOM introspection simpler and easier, as
they don't require an object to be instantiated.
Also, the hundreds of instance properties were having an impact
on QMP commands that create temporary CPU objects. On my
machine, run time of qmp_query_cpu_definitions() changed
from ~200ms to ~16ms after applying this patch.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20201111183823.283752-5-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The current implementation will disable the guest Intel PT feature
if the Intel PT LIP feature is supported on the host, but the LIP
feature is comming soon(e.g. SnowRidge and later).
This patch will make the guest LIP feature configurable and Intel
PT feature can be enabled in guest when the guest LIP status same
with the host.
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <20201202101042.11967-1-luwei.kang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This patch adds support the kernel-irqchip option for
WHPX with on or off value. 'split' value is not supported
for the option. The option only works for the latest version
of Windows (ones that are coming out on Insiders). The
change maintains backward compatibility on older version of
Windows where this option is not supported.
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Message-Id: <SN4PR2101MB0880B13258DA9251F8459F4DC0170@SN4PR2101MB0880.namprd21.prod.outlook.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The IOAPIC has an 'Extended Destination ID' field in its RTE, which maps
to bits 11-4 of the MSI address. Since those address bits fall within a
given 4KiB page they were historically non-trivial to use on real hardware.
The Intel IOMMU uses the lowest bit to indicate a remappable format MSI,
and then the remaining 7 bits are part of the index.
Where the remappable format bit isn't set, we can actually use the other
seven to allow external (IOAPIC and MSI) interrupts to reach up to 32768
CPUs instead of just the 255 permitted on bare metal.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <78097f9218300e63e751e077a0a5ca029b56ba46.camel@infradead.org>
[Fix UBSAN warning. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
There is no "version 2" of the "Lesser" General Public License.
It is either "GPL version 2.0" or "Lesser GPL version 2.1".
This patch replaces all occurrences of "Lesser GPL version 2" with
"Lesser GPL version 2.1" in comment section.
Signed-off-by: Chetan Pant <chetan4windows@gmail.com>
Message-Id: <20201023122801.19514-1-chetan4windows@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Icelake-Client CPU models will be removed in the future.
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1600758855-80046-2-git-send-email-robert.hu@linux.intel.com>
[ehabkost: reword deprecation note, fix version in doc]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Implement the ability of marking some versions deprecated. When
that CPU model is chosen, print a warning. The warning message
can be customized, e.g. suggesting an alternative CPU model to be
used instead.
The deprecation message will be printed by x86_cpu_list_entry(),
e.g. '-cpu help'.
QMP command 'query-cpu-definitions' will return a bool value
indicating the deprecation status.
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1600758855-80046-1-git-send-email-robert.hu@linux.intel.com>
[ehabkost: reword commit message]
[ehabkost: Handle NULL cpu_type]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
With x2apic enabled, configurations can have more that 255 cores.
Noticed the device add test is hitting an assert when during cpu
hotplug with core_id > 255. This is due to assert check in the
CPUID 0x8000001E.
Remove the assert check and fix the problem.
Fixes the bug:
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1834200
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <160072824160.9666.8890355282135970684.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We only use x86_cpu_get_supported_feature_word() after its implementation,
no forward declaration needed.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200904145431.196885-3-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Class properties make QOM introspection simpler and easier, as
they don't require an object to be instantiated.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200921221045.699690-14-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Missed in commit 41fba1618b "docs/system: convert the documentation of
deprecated features to rST."
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200929075824.1517969-3-armbru@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Per Intel SDM vol 1, 13.2, if CPUID.1:ECX.XSAVE[bit 26] is 0, the
processor provides no further enumeration through CPUID function 0DH.
QEMU does not do this for "-cpu host,-xsave".
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200716082019.215316-2-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linux-5.8 introduced interrupt based mechanism for 'page ready' events
delivery and disabled the old, #PF based one (see commit 2635b5c4a0e4
"KVM: x86: interrupt based APF 'page ready' event delivery"). Linux
guest switches to using in in 5.9 (see commit b1d405751cd5 "KVM: x86:
Switch KVM guest to using interrupts for page ready APF delivery").
The feature has a new KVM_FEATURE_ASYNC_PF_INT bit assigned and
the interrupt vector is set in MSR_KVM_ASYNC_PF_INT MSR. Support this
in QEMU.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200908141206.357450-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When debugging QEMU it is often useful to put a breakpoint on the
error_setg_internal method impl.
Unfortunately the object_property_add / object_class_property_add
methods call object_property_find / object_class_property_find methods
to check if a property exists already before adding the new property.
As a result there are a huge number of calls to error_setg_internal
on startup of most QEMU commands, making it very painful to set a
breakpoint on this method.
Most callers of object_find_property and object_class_find_property,
however, pass in a NULL for the Error parameter. This simplifies the
methods to remove the Error parameter entirely, and then adds some
new wrapper methods that are able to raise an Error when needed.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200914135617.1493072-1-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
apic_id contains all the information required to build
CPUID_8000_001E. core_id and node_id is already part of
apic_id generated by x86_topo_ids_from_apicid.
Also remove the restriction on number bits on core_id and
node_id.
Remove all the hardcoded values and replace with generalized
fields.
Refer the Processor Programming Reference (PPR) documentation
available from the bugzilla Link below.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Message-Id: <159897585257.30750.5815593918927986935.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Remove all the hardcoded values and replace with generalized
fields.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <159897584649.30750.3939159632943292252.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Hyper-V TLFS prior to version 6.0 had a mistake in it: special value
'0xffffffff' for CPUID 0x40000004.EBX was called 'never to retry', this
looked weird (like why it's not '0' which supposedly have the same effect?)
but nobody raised the question. In TLFS version 6.0 the mistake was
corrected to 'never notify' which sounds logical. Fix QEMU accordingly.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200515114847.74523-1-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This reverts commit c24a41bb53.
Remove the EPYC specific apicid decoding and use the generic
default decoding.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <159889937478.21294.4192291354416942986.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This reverts commit dd08ef0318.
Remove the EPYC specific apicid decoding and use the generic
default decoding.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <159889936257.21294.1786224705357428082.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This reverts commit 0c1538cb1a.
Remove the EPYC specific apicid decoding and use the generic
default decoding.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <159889935015.21294.1425332462852607813.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This reverts commit 247b18c593.
Remove the EPYC specific apicid decoding and use the generic
default decoding.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <159889933756.21294.13999336052652073520.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This reverts commit 7b225762c8.
Remove the EPYC specific apicid decoding and use the generic
default decoding.
Also fix all the references of pkg_offset.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <159889933119.21294.8112825730577505757.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add the missing vmx features in Skylake-Server and Cascadelake-Server
CPU models based on the output of Paolo's script.
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20200714084148.26690-4-chenyi.qiang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add the missing features(sha_ni, avx512ifma, rdpid, fsrm,
vmx-rdseed-exit, vmx-pml, vmx-eptp-switching) and change the model
number to 106 in the Icelake-Server-v4 CPU model.
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20200714084148.26690-3-chenyi.qiang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
For CPUs support fast short REP MOV[CPUID.(EAX=7,ECX=0):EDX(bit4)], e.g
Icelake and Tigerlake, expose it to the guest VM.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20200714084148.26690-2-chenyi.qiang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Features unavailable due to absent of their dependent features should
not be added to env->user_features. env->user_features only contains the
feature explicity specified with -feature/+feature by user.
Fixes: 99e24dbdaa ("target/i386: introduce generic feature dependency mechanism")
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200713174436.41070-3-xiaoyao.li@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Features defined in versioned CPU model are recorded in env->user_features
since they are updated as property. It's unwated because they are not
user specified.
Simply clear env->user_features as a fix. It won't clear user specified
features because user specified features are filled to
env->user_features later in x86_cpu_expand_features().
Cc: Chenyi Qiang <chenyi.qiang@intel.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200713174436.41070-2-xiaoyao.li@intel.com>
[ehabkost: fix coding style]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This instruction aims to give a way to choose which memory accesses
do not need to be tracked in the TSX read set, which is defined as
CPUID.(EAX=7,ECX=0):EDX[bit 16].
The release spec link is as follows:
https://software.intel.com/content/dam/develop/public/us/en/documents/\
architecture-instruction-set-extensions-programming-reference.pdf
The associated kvm patch link is as follows:
https://lore.kernel.org/patchwork/patch/1268026/
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Message-Id: <1593991036-12183-3-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The availability of the SERIALIZATION instruction is indicated
by the presence of the CPUID feature flag SERIALIZE, which is
defined as CPUID.(EAX=7,ECX=0):ECX[bit 14].
The release spec link is as follows:
https://software.intel.com/content/dam/develop/public/us/en/documents/\
architecture-instruction-set-extensions-programming-reference.pdf
The associated kvm patch link is as follows:
https://lore.kernel.org/patchwork/patch/1268025/
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Message-Id: <1593991036-12183-2-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The CPUID level need to be set to 0x14 manually on old
machine-type if Intel PT is enabled in guest. E.g. the
CPUID[0].EAX(level)=7 and CPUID[7].EBX[25](intel-pt)=1 when the
Qemu with "-machine pc-i440fx-3.1 -cpu qemu64,+intel-pt" parameter.
This patch corrects the warning message of the previous
submission(ddc2fc9).
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1593499113-4768-1-git-send-email-luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, QEMU is overriding KVM_GET_SUPPORTED_CPUID's answer for
the WAITPKG bit depending on the "-overcommit cpu-pm" setting. This is a
bad idea because it does not even check if the host supports it, but it
can be done in x86_cpu_realizefn just like we do for the MONITOR bit.
This patch moves it there, while making it conditional on host
support for the related UMWAIT MSR.
Cc: qemu-stable@nongnu.org
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
hvf_reset_vcpu() duplicates actions performed by x86_cpu_reset(). The
difference is that hvf_reset_vcpu() stores initial values directly to
VMCS while x86_cpu_reset() stores it in CPUX86State and then
cpu_synchronize_all_post_init() or cpu_synchronize_all_post_reset()
flushes CPUX86State into VMCS. That makes hvf_reset_vcpu() a kind of
no-op.
Here's the trace of CPU state modifications during VM start:
hvf_reset_vcpu (resets VMCS)
cpu_synchronize_all_post_init (overwrites VMCS fields written by
hvf_reset_vcpu())
cpu_synchronize_all_states
hvf_reset_vcpu (resets VMCS)
cpu_synchronize_all_post_reset (overwrites VMCS fields written by
hvf_reset_vcpu())
General purpose registers, system registers, segment descriptors, flags
and IP are set by hvf_put_segments() in post-init and post-reset,
therefore it's safe to remove them from hvf_reset_vcpu().
PDPTE initialization can be dropped because Intel SDM (26.3.1.6 Checks
on Guest Page-Directory-Pointer-Table Entries) doesn't require PDPTE to
be clear unless PAE is used: "A VM entry to a guest that does not use
PAE paging does not check the validity of any PDPTEs."
And if PAE is used, PDPTE's are initialized from CR3 in macvm_set_cr0().
Cc: Cameron Esfahani <dirty@apple.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20200630102824.77604-8-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Support for nested guest live migration is part of Linux 5.8, add the
corresponding code to QEMU. The migration format consists of a few
flags, is an opaque 4k blob.
The blob is in VMCB format (the control area represents the L1 VMCB
control fields, the save area represents the pre-vmentry state; KVM does
not use the host save area since the AMD manual allows that) but QEMU
does not really care about that. However, the flags need to be
copied to hflags/hflags2 and back.
In addition, support for retrieving and setting the AMD nested virtualization
states allows the L1 guest to be reset while running a nested guest, but
a small bug in CPU reset needs to be fixed for that to work.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When all we do with an Error we receive into a local variable is
propagating to somewhere else, we can just as well receive it there
right away. The previous two commits did that for sufficiently simple
cases with Coccinelle. Do it for several more manually.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-37-armbru@redhat.com>
When all we do with an Error we receive into a local variable is
propagating to somewhere else, we can just as well receive it there
right away. Convert
if (!foo(..., &err)) {
...
error_propagate(errp, err);
...
return ...
}
to
if (!foo(..., errp)) {
...
...
return ...
}
where nothing else needs @err. Coccinelle script:
@rule1 forall@
identifier fun, err, errp, lbl;
expression list args, args2;
binary operator op;
constant c1, c2;
symbol false;
@@
if (
(
- fun(args, &err, args2)
+ fun(args, errp, args2)
|
- !fun(args, &err, args2)
+ !fun(args, errp, args2)
|
- fun(args, &err, args2) op c1
+ fun(args, errp, args2) op c1
)
)
{
... when != err
when != lbl:
when strict
- error_propagate(errp, err);
... when != err
(
return;
|
return c2;
|
return false;
)
}
@rule2 forall@
identifier fun, err, errp, lbl;
expression list args, args2;
expression var;
binary operator op;
constant c1, c2;
symbol false;
@@
- var = fun(args, &err, args2);
+ var = fun(args, errp, args2);
... when != err
if (
(
var
|
!var
|
var op c1
)
)
{
... when != err
when != lbl:
when strict
- error_propagate(errp, err);
... when != err
(
return;
|
return c2;
|
return false;
|
return var;
)
}
@depends on rule1 || rule2@
identifier err;
@@
- Error *err = NULL;
... when != err
Not exactly elegant, I'm afraid.
The "when != lbl:" is necessary to avoid transforming
if (fun(args, &err)) {
goto out
}
...
out:
error_propagate(errp, err);
even though other paths to label out still need the error_propagate().
For an actual example, see sclp_realize().
Without the "when strict", Coccinelle transforms vfio_msix_setup(),
incorrectly. I don't know what exactly "when strict" does, only that
it helps here.
The match of return is narrower than what I want, but I can't figure
out how to express "return where the operand doesn't use @err". For
an example where it's too narrow, see vfio_intx_enable().
Silently fails to convert hw/arm/armsse.c, because Coccinelle gets
confused by ARMSSE being used both as typedef and function-like macro
there. Converted manually.
Line breaks tidied up manually. One nested declaration of @local_err
deleted manually. Preexisting unwanted blank line dropped in
hw/riscv/sifive_e.c.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-35-armbru@redhat.com>
The previous commit enables conversion of
foo(..., &err);
if (err) {
...
}
to
if (!foo(..., errp)) {
...
}
for QOM functions that now return true / false on success / error.
Coccinelle script:
@@
identifier fun = {
object_apply_global_props, object_initialize_child_with_props,
object_initialize_child_with_propsv, object_property_get,
object_property_get_bool, object_property_parse, object_property_set,
object_property_set_bool, object_property_set_int,
object_property_set_link, object_property_set_qobject,
object_property_set_str, object_property_set_uint, object_set_props,
object_set_propv, user_creatable_add_dict,
user_creatable_complete, user_creatable_del
};
expression list args, args2;
typedef Error;
Error *err;
@@
- fun(args, &err, args2);
- if (err)
+ if (!fun(args, &err, args2))
{
...
}
Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by
ARMSSE being used both as typedef and function-like macro there.
Convert manually.
Line breaks tidied up manually.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-29-armbru@redhat.com>
The object_property_set_FOO() setters take property name and value in
an unusual order:
void object_property_set_FOO(Object *obj, FOO_TYPE value,
const char *name, Error **errp)
Having to pass value before name feels grating. Swap them.
Same for object_property_set(), object_property_get(), and
object_property_parse().
Convert callers with this Coccinelle script:
@@
identifier fun = {
object_property_get, object_property_parse, object_property_set_str,
object_property_set_link, object_property_set_bool,
object_property_set_int, object_property_set_uint, object_property_set,
object_property_set_qobject
};
expression obj, v, name, errp;
@@
- fun(obj, v, name, errp)
+ fun(obj, name, v, errp)
Chokes on hw/arm/musicpal.c's lcd_refresh() with the unhelpful error
message "no position information". Convert that one manually.
Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by
ARMSSE being used both as typedef and function-like macro there.
Convert manually.
Fails to convert hw/rx/rx-gdbsim.c, because Coccinelle gets confused
by RXCPU being used both as typedef and function-like macro there.
Convert manually. The other files using RXCPU that way don't need
conversion.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-27-armbru@redhat.com>
[Straightforwad conflict with commit 2336172d9b "audio: set default
value for pcspk.iobase property" resolved]
The previous commit enables conversion of
visit_foo(..., &err);
if (err) {
...
}
to
if (!visit_foo(..., errp)) {
...
}
for visitor functions that now return true / false on success / error.
Coccinelle script:
@@
identifier fun =~ "check_list|input_type_enum|lv_start_struct|lv_type_bool|lv_type_int64|lv_type_str|lv_type_uint64|output_type_enum|parse_type_bool|parse_type_int64|parse_type_null|parse_type_number|parse_type_size|parse_type_str|parse_type_uint64|print_type_bool|print_type_int64|print_type_null|print_type_number|print_type_size|print_type_str|print_type_uint64|qapi_clone_start_alternate|qapi_clone_start_list|qapi_clone_start_struct|qapi_clone_type_bool|qapi_clone_type_int64|qapi_clone_type_null|qapi_clone_type_number|qapi_clone_type_str|qapi_clone_type_uint64|qapi_dealloc_start_list|qapi_dealloc_start_struct|qapi_dealloc_type_anything|qapi_dealloc_type_bool|qapi_dealloc_type_int64|qapi_dealloc_type_null|qapi_dealloc_type_number|qapi_dealloc_type_str|qapi_dealloc_type_uint64|qobject_input_check_list|qobject_input_check_struct|qobject_input_start_alternate|qobject_input_start_list|qobject_input_start_struct|qobject_input_type_any|qobject_input_type_bool|qobject_input_type_bool_keyval|qobject_input_type_int64|qobject_input_type_int64_keyval|qobject_input_type_null|qobject_input_type_number|qobject_input_type_number_keyval|qobject_input_type_size_keyval|qobject_input_type_str|qobject_input_type_str_keyval|qobject_input_type_uint64|qobject_input_type_uint64_keyval|qobject_output_start_list|qobject_output_start_struct|qobject_output_type_any|qobject_output_type_bool|qobject_output_type_int64|qobject_output_type_null|qobject_output_type_number|qobject_output_type_str|qobject_output_type_uint64|start_list|visit_check_list|visit_check_struct|visit_start_alternate|visit_start_list|visit_start_struct|visit_type_.*";
expression list args;
typedef Error;
Error *err;
@@
- fun(args, &err);
- if (err)
+ if (!fun(args, &err))
{
...
}
A few line breaks tidied up manually.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-19-armbru@redhat.com>
QEMU incorrectly validates FEAT_SVM feature flags against
GET_SUPPORTED_CPUID even if SVM features are being masked out by
cpu_x86_cpuid(). This can make QEMU print warnings on most AMD
CPU models, even when SVM nesting is disabled (which is the
default).
This bug was never detected before because of a Linux KVM bug:
until Linux v5.6, KVM was not filtering out SVM features in
GET_SUPPORTED_CPUID when nested was disabled. This KVM bug was
fixed in Linux v5.7-rc1, on Linux commit a50718cc3f43 ("KVM:
nSVM: Expose SVM features to L1 iff nested is enabled").
Fix the problem by adding a CPUID_EXT3_SVM dependency to all
FEAT_SVM feature flags in the feature_dependencies table.
Reported-by: Yanan Fu <yfu@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20200623230116.277409-1-ehabkost@redhat.com>
[Fix testcase. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add which features are added or removed in this version.
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20200324051034.30541-1-tao3.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All remaining conversions to qdev_realize() are for bus-less devices.
Coccinelle script:
// only correct for bus-less @dev!
@@
expression errp;
expression dev;
@@
- qdev_init_nofail(dev);
+ qdev_realize(dev, NULL, &error_fatal);
@ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@
expression errp;
expression dev;
symbol true;
@@
- object_property_set_bool(OBJECT(dev), true, "realized", errp);
+ qdev_realize(DEVICE(dev), NULL, errp);
@ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@
expression errp;
expression dev;
symbol true;
@@
- object_property_set_bool(dev, true, "realized", errp);
+ qdev_realize(DEVICE(dev), NULL, errp);
Note that Coccinelle chokes on ARMSSE typedef vs. macro in
hw/arm/armsse.c. Worked around by temporarily renaming the macro for
the spatch run.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200610053247.1583243-57-armbru@redhat.com>
The Perfmon and Debug Capability MSR named IA32_PERF_CAPABILITIES is
a feature-enumerating MSR, which only enumerates the feature full-width
write (via bit 13) by now which indicates the processor supports IA32_A_PMCx
interface for updating bits 32 and above of IA32_PMCx.
The existence of MSR IA32_PERF_CAPABILITIES is enumerated by CPUID.1:ECX[15].
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: qemu-devel@nongnu.org
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20200529074347.124619-5-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
AVX512_VP2INTERSECT compute vector pair intersection to a pair
of mask registers, which is introduced with intel Tiger Lake,
defining as CPUID.(EAX=7,ECX=0):EDX[bit 08].
Refer to the following release spec:
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Message-Id: <1586760758-13638-1-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This code is not related to hardware emulation.
Move it under accel/ with the other hypervisors.
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200508100222.7112-1-philmd@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID leaf CPUID_Fn80000008_ECX provides information about the
number of threads supported by the processor. It was found that
the field ApicIdSize(bits 15-12) was not set correctly.
ApicIdSize is defined as the number of bits required to represent
all the ApicId values within a package.
Valid Values: Value Description
3h-0h Reserved.
4h up to 16 threads.
5h up to 32 threads.
6h up to 64 threads.
7h up to 128 threads.
Fh-8h Reserved.
Fix the bit appropriately.
This came up during following thread.
https://lore.kernel.org/qemu-devel/158643709116.17430.15995069125716778943.malonedeb@wampee.canonical.com/#t
Refer the Processor Programming Reference (PPR) for AMD Family 17h
Model 01h, Revision B1 Processors. The documentation is available
from the bugzilla Link below.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Reported-by: Philipp Eppelt <1871842@bugs.launchpad.net>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <20200417215345.64800.73351.stgit@localhost.localdomain>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
IEC binary prefixes ease code review: the unit is explicit.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200601142930.29408-9-f4bug@amsat.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL. Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call.
x86_cpu_load_model() is wrong that way. Harmless, because its @errp
is always &error_abort. To fix, cut out the @errp middleman.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200505101908.6207-11-armbru@redhat.com>
Devices may have component devices and buses.
Device realization may fail. Realization is recursive: a device's
realize() method realizes its components, and device_set_realized()
realizes its buses (which should in turn realize the devices on that
bus, except bus_set_realized() doesn't implement that, yet).
When realization of a component or bus fails, we need to roll back:
unrealize everything we realized so far. If any of these unrealizes
failed, the device would be left in an inconsistent state. Must not
happen.
device_set_realized() lets it happen: it ignores errors in the roll
back code starting at label child_realize_fail.
Since realization is recursive, unrealization must be recursive, too.
But how could a partly failed unrealize be rolled back? We'd have to
re-realize, which can fail. This design is fundamentally broken.
device_set_realized() does not roll back at all. Instead, it keeps
unrealizing, ignoring further errors.
It can screw up even for a device with no buses: if the lone
dc->unrealize() fails, it still unregisters vmstate, and calls
listeners' unrealize() callback.
bus_set_realized() does not roll back either. Instead, it stops
unrealizing.
Fortunately, no unrealize method can fail, as we'll see below.
To fix the design error, drop parameter @errp from all the unrealize
methods.
Any unrealize method that uses @errp now needs an update. This leads
us to unrealize() methods that can fail. Merely passing it to another
unrealize method cannot cause failure, though. Here are the ones that
do other things with @errp:
* virtio_serial_device_unrealize()
Fails when qbus_set_hotplug_handler() fails, but still does all the
other work. On failure, the device would stay realized with its
resources completely gone. Oops. Can't happen, because
qbus_set_hotplug_handler() can't actually fail here. Pass
&error_abort to qbus_set_hotplug_handler() instead.
* hw/ppc/spapr_drc.c's unrealize()
Fails when object_property_del() fails, but all the other work is
already done. On failure, the device would stay realized with its
vmstate registration gone. Oops. Can't happen, because
object_property_del() can't actually fail here. Pass &error_abort
to object_property_del() instead.
* spapr_phb_unrealize()
Fails and bails out when remove_drcs() fails, but other work is
already done. On failure, the device would stay realized with some
of its resources gone. Oops. remove_drcs() fails only when
chassis_from_bus()'s object_property_get_uint() fails, and it can't
here. Pass &error_abort to remove_drcs() instead.
Therefore, no unrealize method can fail before this patch.
device_set_realized()'s recursive unrealization via bus uses
object_property_set_bool(). Can't drop @errp there, so pass
&error_abort.
We similarly unrealize with object_property_set_bool() elsewhere,
always ignoring errors. Pass &error_abort instead.
Several unrealize methods no longer handle errors from other unrealize
methods: virtio_9p_device_unrealize(),
virtio_input_device_unrealize(), scsi_qdev_unrealize(), ...
Much of the deleted error handling looks wrong anyway.
One unrealize methods no longer ignore such errors:
usb_ehci_pci_exit().
Several realize methods no longer ignore errors when rolling back:
v9fs_device_realize_common(), pci_qdev_unrealize(),
spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(),
virtio_device_realize().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-17-armbru@redhat.com>
The only way object_property_add() can fail is when a property with
the same name already exists. Since our property names are all
hardcoded, failure is a programming error, and the appropriate way to
handle it is passing &error_abort.
Same for its variants, except for object_property_add_child(), which
additionally fails when the child already has a parent. Parentage is
also under program control, so this is a programming error, too.
We have a bit over 500 callers. Almost half of them pass
&error_abort, slightly fewer ignore errors, one test case handles
errors, and the remaining few callers pass them to their own callers.
The previous few commits demonstrated once again that ignoring
programming errors is a bad idea.
Of the few ones that pass on errors, several violate the Error API.
The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL. Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call. ich9_pm_add_properties(), sparc32_ledma_realize(),
sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize()
are wrong that way.
When the one appropriate choice of argument is &error_abort, letting
users pick the argument is a bad idea.
Drop parameter @errp and assert the preconditions instead.
There's one exception to "duplicate property name is a programming
error": the way object_property_add() implements the magic (and
undocumented) "automatic arrayification". Don't drop @errp there.
Instead, rename object_property_add() to object_property_try_add(),
and add the obvious wrapper object_property_add().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-15-armbru@redhat.com>
[Two semantic rebase conflicts resolved]
Fixes the following coccinelle warnings:
$ spatch --sp-file --verbose-parsing ... \
scripts/coccinelle/remove_local_err.cocci
...
SUSPICIOUS: a \ character appears outside of a #define at ./target/ppc/translate_init.inc.c:5213
SUSPICIOUS: a \ character appears outside of a #define at ./target/ppc/translate_init.inc.c:5261
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:166
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:167
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:169
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:170
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:171
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:172
SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:173
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5787
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5789
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5800
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5801
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5802
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5804
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5805
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5806
SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:6329
SUSPICIOUS: a \ character appears outside of a #define at ./hw/sd/sdhci.c:1133
SUSPICIOUS: a \ character appears outside of a #define at ./hw/scsi/scsi-disk.c:3081
SUSPICIOUS: a \ character appears outside of a #define at ./hw/net/virtio-net.c:1529
SUSPICIOUS: a \ character appears outside of a #define at ./hw/riscv/sifive_u.c:468
SUSPICIOUS: a \ character appears outside of a #define at ./dump/dump.c:1895
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2209
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2215
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2221
SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2222
SUSPICIOUS: a \ character appears outside of a #define at ./block/replication.c:172
SUSPICIOUS: a \ character appears outside of a #define at ./block/replication.c:173
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200412223619.11284-2-f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Current Icelake-Server CPU model lacks all the features enumerated by
MSR_IA32_ARCH_CAPABILITIES.
Add them, so that guest of "Icelake-Server" can see all of them.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200316095605.12318-1-xiaoyao.li@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The CPUID level need to be set to 0x14 manually on old
machine-type if Intel PT is enabled in guest. E.g. the
CPUID[0].EAX(level)=7 and CPUID[7].EBX[25](intel-pt)=1 when the
Qemu with "-machine pc-i440fx-3.1 -cpu qemu64,+intel-pt" parameter.
Some Intel PT capabilities are exposed by leaf 0x14 and the
missing capabilities will cause some MSRs access failed.
This patch add a warning message to inform the user to extend
the CPUID level.
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1584031686-16444-1-git-send-email-luwei.kang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
If the system is numa configured the pkg_offset needs
to be adjusted for EPYC cpu models. Fix it calling the
model specific handler.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <158396725589.58170.16424607815207074485.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The APIC ID is decoded based on the sequence sockets->dies->cores->threads.
This works fine for most standard AMD and other vendors' configurations,
but this decoding sequence does not follow that of AMD's APIC ID enumeration
strictly. In some cases this can cause CPU topology inconsistency.
When booting a guest VM, the kernel tries to validate the topology, and finds
it inconsistent with the enumeration of EPYC cpu models. The more details are
in the bug https://bugzilla.redhat.com/show_bug.cgi?id=1728166.
To fix the problem we need to build the topology as per the Processor
Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
Processors. The documentation is available from the bugzilla Link below.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
It is also available at
https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip
Here is the text from the PPR.
Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
number of least significant bits in the Initial APIC ID that indicate core ID
within a processor, in constructing per-core CPUID masks.
Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
(MNC) that the processor could theoretically support, not the actual number of
cores that are actually implemented or enabled on the processor, as indicated
by Core::X86::Cpuid::SizeId[NC].
Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
• ApicId[6] = Socket ID.
• ApicId[5:4] = Node ID.
• ApicId[3] = Logical CCX L3 complex ID
• ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}
The new apic id encoding is enabled for EPYC and EPYC-Rome models.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <158396724913.58170.3539083528095710811.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add a boolean variable use_epyc_apic_id_encoding in X86CPUDefinition.
This will be set if this cpu model needs to use new EPYC based
apic id encoding.
Override the handlers with EPYC based handlers if use_epyc_apic_id_encoding
is set. This will be done in x86_cpus_init.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <158396723514.58170.14825482171652019765.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Use the new functions from topology.h and delete the unused code. Given the
sockets, nodes, cores and threads, the new functions generate apic id for EPYC
mode. Removes all the hardcoded values.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <158396722151.58170.8031705769621392927.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Update structures X86CPUTopoIDs and CPUX86State to hold the number of
nodes per package. This is required to build EPYC mode topology.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <158396720035.58170.1973738805301006456.stgit@naples-babu.amd.com>
Now that we have all the parameters in X86CPUTopoInfo, we can just
pass the structure to calculate the offsets and width.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <158396717953.58170.5628042059144117669.stgit@naples-babu.amd.com>
The CPUClass has a 'reset' method. This is a legacy from when
TYPE_CPU used not to inherit from TYPE_DEVICE. We don't need it any
more, as we can simply use the TYPE_DEVICE reset. The 'cpu_reset()'
function is kept as the API which most places use to reset a CPU; it
is now a wrapper which calls device_cold_reset() and then the
tracepoint function.
This change should not cause CPU objects to be reset more often
than they are at the moment, because:
* nobody is directly calling device_cold_reset() or
qdev_reset_all() on CPU objects
* no CPU object is on a qbus, so they will not be reset either
by somebody calling qbus_reset_all()/bus_cold_reset(), or
by the main "reset sysbus and everything in the qbus tree"
reset that most devices are reset by
Note that this does not change the need for each machine or whatever
to use qemu_register_reset() to arrange to call cpu_reset() -- that
is necessary because CPU objects are not on any qbus, so they don't
get reset when the qbus tree rooted at the sysbus bus is reset, and
this isn't being changed here.
All the changes to the files under target/ were made using the
included Coccinelle script, except:
(1) the deletion of the now-inaccurate and not terribly useful
"CPUClass::reset" comments was done with a perl one-liner afterwards:
perl -n -i -e '/ CPUClass::reset/ or print' target/*/*.c
(2) this bit of the s390 change was done by hand, because the
Coccinelle script is not sophisticated enough to handle the
parent_reset call being inside another function:
| @@ -96,8 +96,9 @@ static void s390_cpu_reset(CPUState *s, cpu_reset_type type)
| S390CPU *cpu = S390_CPU(s);
| S390CPUClass *scc = S390_CPU_GET_CLASS(cpu);
| CPUS390XState *env = &cpu->env;
|+ DeviceState *dev = DEVICE(s);
|
|- scc->parent_reset(s);
|+ scc->parent_reset(dev);
| cpu->env.sigp_order = 0;
| s390_cpu_set_state(S390_CPU_STATE_STOPPED, cpu);
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20200303100511.5498-1-peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Adds the support for 2nd Gen AMD EPYC Processors. The model display
name will be EPYC-Rome.
Adds the following new feature bits on top of the feature bits from the
first generation EPYC models.
perfctr-core : core performance counter extensions support. Enables the VM to
use extended performance counter support. It enables six
programmable counters instead of four counters.
clzero : instruction zeroes out the 64 byte cache line specified in RAX.
xsaveerptr : XSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES always save error
pointers and FXRSTOR, XRSTOR, XRSTORS always restore error
pointers.
wbnoinvd : Write back and do not invalidate cache
ibpb : Indirect Branch Prediction Barrier
amd-stibp : Single Thread Indirect Branch Predictor
clwb : Cache Line Write Back and Retain
xsaves : XSAVES, XRSTORS and IA32_XSS support
rdpid : Read Processor ID instruction support
umip : User-Mode Instruction Prevention support
The Reference documents are available at
https://developer.amd.com/wp-content/resources/55803_0.54-PUB.pdfhttps://www.amd.com/system/files/TechDocs/24594.pdf
Depends on following kernel commits:
40bc47b08b6e ("kvm: x86: Enumerate support for CLZERO instruction")
504ce1954fba ("KVM: x86: Expose XSAVEERPTR to the guest")
6d61e3c32248 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
52297436199d ("kvm: svm: Update svm_xsaves_supported")
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <157314966312.23828.17684821666338093910.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Adds the following missing CPUID bits:
perfctr-core : core performance counter extensions support. Enables the VM
to use extended performance counter support. It enables six
programmable counters instead of 4 counters.
clzero : instruction zeroes out the 64 byte cache line specified in RAX.
xsaveerptr : XSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES always save error
pointers and FXRSTOR, XRSTOR, XRSTORS always restore error
pointers.
ibpb : Indirect Branch Prediction Barrie.
xsaves : XSAVES, XRSTORS and IA32_XSS supported.
Depends on following kernel commits:
40bc47b08b6e ("kvm: x86: Enumerate support for CLZERO instruction")
504ce1954fba ("KVM: x86: Expose XSAVEERPTR to the guest")
52297436199d ("kvm: svm: Update svm_xsaves_supported")
These new features will be added in EPYC-v3. The -cpu help output after the change.
x86 EPYC-v1 AMD EPYC Processor
x86 EPYC-v2 AMD EPYC Processor (with IBPB)
x86 EPYC-v3 AMD EPYC Processor
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <157314965662.23828.3063243729449408327.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add additional information for -cpu help to indicate the changes in this
version of CPU model.
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20200212081328.7385-4-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Because MPX is being removed from the linux kernel, remove MPX feature
from Denverton.
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20200212081328.7385-2-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
These two features were incorrectly tied to host_cpuid_required rather than
cpu->max_features. As a result, -cpu max was not enabling either MONITOR
features or ucode revision.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes a confusion in the help output. (Although, if you squint
long enough at the '-cpu help' output, you _do_ notice that
"Skylake-Client-noTSX-IBRS" is an alias of "Skylake-Client-v3";
similarly for Skylake-Server-v3.)
Without this patch:
$ qemu-system-x86 -cpu help
...
x86 Skylake-Client-v1 Intel Core Processor (Skylake)
x86 Skylake-Client-v2 Intel Core Processor (Skylake, IBRS)
x86 Skylake-Client-v3 Intel Core Processor (Skylake, IBRS)
...
x86 Skylake-Server-v1 Intel Xeon Processor (Skylake)
x86 Skylake-Server-v2 Intel Xeon Processor (Skylake, IBRS)
x86 Skylake-Server-v3 Intel Xeon Processor (Skylake, IBRS)
...
With this patch:
$ ./qemu-system-x86 -cpu help
...
x86 Skylake-Client-v1 Intel Core Processor (Skylake)
x86 Skylake-Client-v2 Intel Core Processor (Skylake, IBRS)
x86 Skylake-Client-v3 Intel Core Processor (Skylake, IBRS, no TSX)
...
x86 Skylake-Server-v1 Intel Xeon Processor (Skylake)
x86 Skylake-Server-v2 Intel Xeon Processor (Skylake, IBRS)
x86 Skylake-Server-v3 Intel Xeon Processor (Skylake, IBRS, no TSX)
...
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Message-Id: <20200123090116.14409-1-kchamart@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM can return the host microcode revision as a feature MSR.
Use it as the default value for -cpu host.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1579544504-3616-4-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the property and plumb it in TCG and HVF (the latter of which
tried to support returning a constant value but used the wrong MSR).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1579544504-3616-3-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Convert all targets to use cpu_class_set_parent_reset() with the following
coccinelle script:
@@
type CPUParentClass;
CPUParentClass *pcc;
CPUClass *cc;
identifier parent_fn;
identifier child_fn;
@@
+cpu_class_set_parent_reset(cc, child_fn, &pcc->parent_fn);
-pcc->parent_fn = cc->reset;
...
-cc->reset = child_fn;
Signed-off-by: Greg Kurz <groug@kaod.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Message-Id: <157650847817.354886.7047137349018460524.stgit@bahia.lan>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It lacks VMX features and two security feature bits (disclosed recently) in
MSR_IA32_ARCH_CAPABILITIES in current Cooperlake CPU model, so add them.
Fixes: 22a866b616 ("i386: Add new CPU model Cooperlake")
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20191225063018.20038-3-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When using `query-cpu-definitions` using `-machine none`,
QEMU is resolving all CPU models to their latest versions. The
actual CPU model version being used by another machine type (e.g.
`pc-q35-4.0`) might be different.
In theory, this was OK because the correct CPU model
version is returned when using the correct `-machine` argument.
Except that in practice, this breaks libvirt expectations:
libvirt always use `-machine none` when checking if a CPU model
is runnable, because runnability is not expected to be affected
when the machine type is changed.
For example, when running on a Haswell host without TSX,
Haswell-v4 is runnable, but Haswell-v1 is not. On those hosts,
`query-cpu-definitions` says Haswell is runnable if using
`-machine none`, but Haswell is actually not runnable using any
of the `pc-*` machine types (because they resolve Haswell to
Haswell-v1). In other words, we're breaking the "runnability
guarantee" we promised to not break for a few releases (see
qemu-deprecated.texi).
To address this issue, change the default CPU model version to v1
on all machine types, so we make `query-cpu-definitions` output
when using `-machine none` match the results when using `pc-*`.
This will change in the future (the plan is to always return the
latest CPU model version if using `-machine none`), but only
after giving libvirt the opportunity to adapt.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1779078
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20191205223339.764534-1-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Similar to CPU and machine classes, "-accel" class names are mangled,
so we have to first get a class via accel_find and then instantiate it.
Provide a new function to instantiate a class without going through
object_class_get_name, and use it for CPUs and machines already.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cooper Lake is intel's successor to Cascade Lake, the new
CPU model inherits features from Cascadelake-Server, while
add one platform associated new feature: AVX512_BF16. Meanwhile,
add STIBP for speculative execution.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-4-git-send-email-cathy.zhang@intel.com>
Reviewed-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
They are present in client (Core) Skylake but pasted wrong into the server
SKUs.
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We have been trying to avoid adding new aliases for CPU model
versions, but in the case of changes in defaults introduced by
the TAA mitigation patches, the aliases might help avoid user
confusion when applying host software updates.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The MSR_IA32_TSX_CTRL MSR can be used to hide TSX (also known as the
Trusty Side-channel Extension). By virtualizing the MSR, KVM guests
can disable TSX and avoid paying the price of mitigating TSX-based
attacks on microarchitectural side channels.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
TSX Async Abort (TAA) is a side channel attack on internal buffers in
some Intel processors similar to Microachitectural Data Sampling (MDS).
Some future Intel processors will use the ARCH_CAP_TAA_NO bit in the
IA32_ARCH_CAPABILITIES MSR to report that they are not vulnerable to
TAA. Make this bit available to guests.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Denverton is the Atom Processor of Intel Harrisonville platform.
For more information:
https://ark.intel.com/content/www/us/en/ark/products/\
codename/63508/denverton.html
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190718073405.28301-1-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
UMONITOR, UMWAIT and TPAUSE are a set of user wait instructions.
This patch adds support for user wait instructions in KVM. Availability
of the user wait instructions is indicated by the presence of the CPUID
feature flag WAITPKG CPUID.0x07.0x0:ECX[5]. User wait instructions may
be executed at any privilege level, and use IA32_UMWAIT_CONTROL MSR to
set the maximum time.
The patch enable the umonitor, umwait and tpause features in KVM.
Because umwait and tpause can put a (psysical) CPU into a power saving
state, by default we dont't expose it to kvm and enable it only when
guest CPUID has it. And use QEMU command-line "-overcommit cpu-pm=on"
(enable_cpu_pm is enabled), a VM can use UMONITOR, UMWAIT and TPAUSE
instructions. If the instruction causes a delay, the amount of time
delayed is called here the physical delay. The physical delay is first
computed by determining the virtual delay (the time to delay relative to
the VM’s timestamp counter). Otherwise, UMONITOR, UMWAIT and TPAUSE cause
an invalid-opcode exception(#UD).
The release document ref below link:
https://software.intel.com/sites/default/files/\
managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf
Co-developed-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191011074103.30393-2-tao3.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hyper-V TLFS specifies this enlightenment as:
"NoNonArchitecturalCoreSharing - Indicates that a virtual processor will never
share a physical core with another virtual processor, except for virtual
processors that are reported as sibling SMT threads. This can be used as an
optimization to avoid the performance overhead of STIBP".
However, STIBP is not the only implication. It was found that Hyper-V on
KVM doesn't pass MD_CLEAR bit to its guests if it doesn't see
NoNonArchitecturalCoreSharing bit.
KVM reports NoNonArchitecturalCoreSharing in KVM_GET_SUPPORTED_HV_CPUID to
indicate that SMT on the host is impossible (not supported of forcefully
disabled).
Implement NoNonArchitecturalCoreSharing support in QEMU as tristate:
'off' - the feature is disabled (default)
'on' - the feature is enabled. This is only safe if vCPUS are properly
pinned and correct topology is exposed. As CPU pinning is done outside
of QEMU the enablement decision will be made on a higher level.
'auto' - copy KVM setting. As during live migration SMT settings on the
source and destination host may differ this requires us to add a migration
blocker.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20191018163908.10246-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add new version of Snowridge CPU model that removes MPX feature.
MPX support is being phased out by Intel. GCC has dropped it, Linux kernel
and KVM are also going to do that in the future.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20191012024748.127135-1-xiaoyao.li@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Drop the duplicated definition of cpuid AVX512_VBMI macro and rename
it as CPUID_7_0_ECX_AVX512_VBMI. Rename CPUID_7_0_ECX_VBMI2 as
CPUID_7_0_ECX_AVX512_VBMI2.
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190926021055.6970-3-tao3.xu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add code to convert the VMX feature words back into MSR values,
allowing the user to enable/disable VMX features as they wish. The same
infrastructure enables support for limiting VMX features in named
CPU models.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
VMX requires 64-bit feature words for the IA32_VMX_EPT_VPID_CAP
and IA32_VMX_BASIC MSRs. (The VMX control MSRs are 64-bit wide but
actually have only 32 bits of information).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sometimes a CPU feature does not make sense unless another is
present. In the case of VMX features, KVM does not even allow
setting the VMX controls to some invalid combinations.
Therefore, this patch adds a generic mechanism that looks for bits
that the user explicitly cleared, and uses them to remove other bits
from the expanded CPU definition. If these dependent bits were also
explicitly *set* by the user, this will be a warning for "-cpu check"
and an error for "-cpu enforce". If not, then the dependent bits are
cleared silently, for convenience.
With VMX features, this will be used so that for example
"-cpu host,-rdrand" will also hide support for RDRAND exiting.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The next patch will add a different reason for filtering features, unrelated
to host feature support. Extract a new function that takes care of disabling
the features and optionally reporting them.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is a problem, that you don't have access to the data using cpu_memory_rw_debug() function when in SMM. You can't remotely debug SMM mode program because of that for example.
Likely attrs version of get_phys_page_debug should be used to get correct asidx at the end to handle access properly.
Here the patch to fix it.
Signed-off-by: Dmitry Poletaev <poletaev@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The CPUID bits CLZERO and XSAVEERPTR are availble on AMD's ZEN platform
and could be passed to the guest.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>