Linux-5.8 introduced interrupt based mechanism for 'page ready' events
delivery and disabled the old, #PF based one (see commit 2635b5c4a0e4
"KVM: x86: interrupt based APF 'page ready' event delivery"). Linux
guest switches to using in in 5.9 (see commit b1d405751cd5 "KVM: x86:
Switch KVM guest to using interrupts for page ready APF delivery").
The feature has a new KVM_FEATURE_ASYNC_PF_INT bit assigned and
the interrupt vector is set in MSR_KVM_ASYNC_PF_INT MSR. Support this
in QEMU.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200908141206.357450-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hyper-V TLFS prior to version 6.0 had a mistake in it: special value
'0xffffffff' for CPUID 0x40000004.EBX was called 'never to retry', this
looked weird (like why it's not '0' which supposedly have the same effect?)
but nobody raised the question. In TLFS version 6.0 the mistake was
corrected to 'never notify' which sounds logical. Fix QEMU accordingly.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200515114847.74523-1-vkuznets@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This reverts commit c24a41bb53.
Remove the EPYC specific apicid decoding and use the generic
default decoding.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <159889937478.21294.4192291354416942986.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This reverts commit 0c1538cb1a.
Remove the EPYC specific apicid decoding and use the generic
default decoding.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <159889935015.21294.1425332462852607813.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This reverts commit 7b225762c8.
Remove the EPYC specific apicid decoding and use the generic
default decoding.
Also fix all the references of pkg_offset.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <159889933119.21294.8112825730577505757.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
For CPUs support fast short REP MOV[CPUID.(EAX=7,ECX=0):EDX(bit4)], e.g
Icelake and Tigerlake, expose it to the guest VM.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20200714084148.26690-2-chenyi.qiang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This instruction aims to give a way to choose which memory accesses
do not need to be tracked in the TSX read set, which is defined as
CPUID.(EAX=7,ECX=0):EDX[bit 16].
The release spec link is as follows:
https://software.intel.com/content/dam/develop/public/us/en/documents/\
architecture-instruction-set-extensions-programming-reference.pdf
The associated kvm patch link is as follows:
https://lore.kernel.org/patchwork/patch/1268026/
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Message-Id: <1593991036-12183-3-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The availability of the SERIALIZATION instruction is indicated
by the presence of the CPUID feature flag SERIALIZE, which is
defined as CPUID.(EAX=7,ECX=0):ECX[bit 14].
The release spec link is as follows:
https://software.intel.com/content/dam/develop/public/us/en/documents/\
architecture-instruction-set-extensions-programming-reference.pdf
The associated kvm patch link is as follows:
https://lore.kernel.org/patchwork/patch/1268025/
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Message-Id: <1593991036-12183-2-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Support for nested guest live migration is part of Linux 5.8, add the
corresponding code to QEMU. The migration format consists of a few
flags, is an opaque 4k blob.
The blob is in VMCB format (the control area represents the L1 VMCB
control fields, the save area represents the pre-vmentry state; KVM does
not use the host save area since the AMD manual allows that) but QEMU
does not really care about that. However, the flags need to be
copied to hflags/hflags2 and back.
In addition, support for retrieving and setting the AMD nested virtualization
states allows the L1 guest to be reset while running a nested guest, but
a small bug in CPU reset needs to be fixed for that to work.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The SSE instruction implementations all fail to raise the expected
IEEE floating-point exceptions because they do nothing to convert the
exception state from the softfloat machinery into the exception flags
in MXCSR.
Fix this by adding such conversions. Unlike for x87, emulated SSE
floating-point operations might be optimized using hardware floating
point on the host, and so a different approach is taken that is
compatible with such optimizations. The required invariant is that
all exceptions set in env->sse_status (other than "denormal operand",
for which the SSE semantics are different from those in the softfloat
code) are ones that are set in the MXCSR; the emulated MXCSR is
updated lazily when code reads MXCSR, while when code sets MXCSR, the
exceptions in env->sse_status are set accordingly.
A few instructions do not raise all the exceptions that would be
raised by the softfloat code, and those instructions are made to save
and restore the softfloat exception state accordingly.
Nothing is done about "denormal operand"; setting that (only for the
case when input denormals are *not* flushed to zero, the opposite of
the logic in the softfloat code for such an exception) will require
custom code for relevant instructions, or else architecture-specific
conditionals in the softfloat code for when to set such an exception
together with custom code for various SSE conversion and rounding
instructions that do not set that exception.
Nothing is done about trapping exceptions (for which there is minimal
and largely broken support in QEMU's emulation in the x87 case and no
support at all in the SSE case).
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.21.2006252358000.3832@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20200528193758.51454-14-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There's no similar field in CPUX86State, but it's needed for MMIO traps.
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20200528193758.51454-13-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The lazy flags are still needed for instruction decoder.
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20200528193758.51454-12-r.bolshakov@yadro.com>
[Move struct to target/i386/cpu.h - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The Perfmon and Debug Capability MSR named IA32_PERF_CAPABILITIES is
a feature-enumerating MSR, which only enumerates the feature full-width
write (via bit 13) by now which indicates the processor supports IA32_A_PMCx
interface for updating bits 32 and above of IA32_PMCx.
The existence of MSR IA32_PERF_CAPABILITIES is enumerated by CPUID.1:ECX[15].
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: qemu-devel@nongnu.org
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20200529074347.124619-5-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
AVX512_VP2INTERSECT compute vector pair intersection to a pair
of mask registers, which is introduced with intel Tiger Lake,
defining as CPUID.(EAX=7,ECX=0):EDX[bit 08].
Refer to the following release spec:
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Message-Id: <1586760758-13638-1-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When we hotplug vcpus, cpu_update_state is added to vm_change_state_head
in kvm_arch_init_vcpu(). But it forgot to delete in kvm_arch_destroy_vcpu() after
unplug. Then it will cause a use-after-free access. This patch delete it in
kvm_arch_destroy_vcpu() to fix that.
Reproducer:
virsh setvcpus vm1 4 --live
virsh setvcpus vm1 2 --live
virsh suspend vm1
virsh resume vm1
The UAF stack:
==qemu-system-x86_64==28233==ERROR: AddressSanitizer: heap-use-after-free on address 0x62e00002e798 at pc 0x5573c6917d9e bp 0x7fff07139e50 sp 0x7fff07139e40
WRITE of size 1 at 0x62e00002e798 thread T0
#0 0x5573c6917d9d in cpu_update_state /mnt/sdb/qemu/target/i386/kvm.c:742
#1 0x5573c699121a in vm_state_notify /mnt/sdb/qemu/vl.c:1290
#2 0x5573c636287e in vm_prepare_start /mnt/sdb/qemu/cpus.c:2144
#3 0x5573c6362927 in vm_start /mnt/sdb/qemu/cpus.c:2150
#4 0x5573c71e8304 in qmp_cont /mnt/sdb/qemu/monitor/qmp-cmds.c:173
#5 0x5573c727cb1e in qmp_marshal_cont qapi/qapi-commands-misc.c:835
#6 0x5573c7694c7a in do_qmp_dispatch /mnt/sdb/qemu/qapi/qmp-dispatch.c:132
#7 0x5573c7694c7a in qmp_dispatch /mnt/sdb/qemu/qapi/qmp-dispatch.c:175
#8 0x5573c71d9110 in monitor_qmp_dispatch /mnt/sdb/qemu/monitor/qmp.c:145
#9 0x5573c71dad4f in monitor_qmp_bh_dispatcher /mnt/sdb/qemu/monitor/qmp.c:234
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200513132630.13412-1-pannengyuan@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
No functional change.
This information will be used by following patches.
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20200312165431.82118-15-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a SIGBUS signal handler. In this handler, it checks the SIGBUS type,
translates the host VA delivered by host to guest PA, then fills this PA
to guest APEI GHES memory, then notifies guest according to the SIGBUS
type.
When guest accesses the poisoned memory, it will generate a Synchronous
External Abort(SEA). Then host kernel gets an APEI notification and calls
memory_failure() to unmapped the affected page in stage 2, finally
returns to guest.
Guest continues to access the PG_hwpoison page, it will trap to KVM as
stage2 fault, then a SIGBUS_MCEERR_AR synchronous signal is delivered to
Qemu, Qemu records this error address into guest APEI GHES memory and
notifes guest using Synchronous-External-Abort(SEA).
In order to inject a vSEA, we introduce the kvm_inject_arm_sea() function
in which we can setup the type of exception and the syndrome information.
When switching to guest, the target vcpu will jump to the synchronous
external abort vector table entry.
The ESR_ELx.DFSC is set to synchronous external abort(0x10), and the
ESR_ELx.FnV is set to not valid(0x1), which will tell guest that FAR is
not valid and hold an UNKNOWN value. These values will be set to KVM
register structures through KVM_SET_ONE_REG IOCTL.
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Xiang Zheng <zhengxiang9@huawei.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20200512030609.19593-10-gengdongjiu@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If the system is numa configured the pkg_offset needs
to be adjusted for EPYC cpu models. Fix it calling the
model specific handler.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <158396725589.58170.16424607815207074485.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add a boolean variable use_epyc_apic_id_encoding in X86CPUDefinition.
This will be set if this cpu model needs to use new EPYC based
apic id encoding.
Override the handlers with EPYC based handlers if use_epyc_apic_id_encoding
is set. This will be done in x86_cpus_init.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <158396723514.58170.14825482171652019765.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Notice the magic page during translate, much like we already
do for the arm32 commpage. At runtime, raise an exception to
return cpu_loop for emulation.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200213032223.14643-4-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
We are not short of numbers for EXCP_*. There is no need to confuse things
by having EXCP_VMEXIT and EXCP_SYSCALL overlap, even though the former is
only used for system mode and the latter is only used for user mode.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200213032223.14643-2-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Bug fixes:
* memory encryption: Disable mem merge
(Dr. David Alan Gilbert)
Features:
* New EPYC CPU definitions (Babu Moger)
* Denventon-v2 CPU model (Tao Xu)
* New 'note' field on versioned CPU models (Tao Xu)
Cleanups:
* x86 CPU topology cleanups (Babu Moger)
* cpu: Use DeviceClass reset instead of a special CPUClass reset
(Peter Maydell)
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEWjIv1avE09usz9GqKAeTb5hNxaYFAl5xdnsUHGVoYWJrb3N0
QHJlZGhhdC5jb20ACgkQKAeTb5hNxaYkGA/9Fn1tCdW/74CEREPbcKNOf8twmCr2
L4qykix7mFcZXstFhEQuoNJQMz8mEPJngOfUSQY1c9w4psf0AXE6q3wbdNcxxdj1
1/+cPbaRuoF8EKw63MgR3AaReuWtAV+sGS4+eKBMJTMUbl03pOYARE+irCWJU6rd
YdP0t6CX0NWF4afv+2wMeeZVr+IcKEo81jCCCSjmM0YLkwvu0Vs5ng3jE7vtFKPj
MQHMyqD/lz0FwyksBiOLwjOCbnmIydWc/8VV68UH5ulxka96jk8CwmI0+A9v2UMQ
4PjQ84UeQclJTbec+h/Qy8DoCP3qiqijFMRau2wo1UWCsAjMcaRIJjIe5CSOJFRu
3FrP2FEJCZiWjh11b/x3jIyjK6MDjv3Y1oky1j5VkCnFUNLHbXUA2KY3jaZ/pf+1
BDqa6lNDYJBN+FQQt0yXDWAdGLUxxP87S9jmU9RULzwAwCic0FxVR/a5zk9EUDi0
mA+WL0ekfhIEVACdHYuCTxujGq8QnGiCppr1Wgx3t+GgveR8AjXdd/KclcKskYiw
ozbujtBPQUImuq3xi6FTkRHXuEW+zc+IFbhZ3Zq5OhmJmpdgmSHryFcKAdvNJH/z
VllKAsLg1hffm+PjlpuZLBucC4PBrvHbS7htHhMaemEiJHO9V5EfGDWQdELNRM8p
sKymFNs5XjzQcGE=
=9fEL
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-and-machine-pull-request' into staging
x86 and machine queue for 5.0 soft freeze
Bug fixes:
* memory encryption: Disable mem merge
(Dr. David Alan Gilbert)
Features:
* New EPYC CPU definitions (Babu Moger)
* Denventon-v2 CPU model (Tao Xu)
* New 'note' field on versioned CPU models (Tao Xu)
Cleanups:
* x86 CPU topology cleanups (Babu Moger)
* cpu: Use DeviceClass reset instead of a special CPUClass reset
(Peter Maydell)
# gpg: Signature made Wed 18 Mar 2020 01:16:43 GMT
# gpg: using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6
# gpg: issuer "ehabkost@redhat.com"
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full]
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-and-machine-pull-request:
hw/i386: Rename apicid_from_topo_ids to x86_apicid_from_topo_ids
hw/i386: Update structures to save the number of nodes per package
hw/i386: Remove unnecessary initialization in x86_cpu_new
machine: Add SMP Sockets in CpuTopology
hw/i386: Consolidate topology functions
hw/i386: Introduce X86CPUTopoInfo to contain topology info
cpu: Use DeviceClass reset instead of a special CPUClass reset
machine/memory encryption: Disable mem merge
hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs
i386: Add 2nd Generation AMD EPYC processors
i386: Add missing cpu feature bits in EPYC model
target/i386: Add new property note to versioned CPU models
target/i386: Add Denverton-v2 (no MPX) CPU model
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Update structures X86CPUTopoIDs and CPUX86State to hold the number of
nodes per package. This is required to build EPYC mode topology.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <158396720035.58170.1973738805301006456.stgit@naples-babu.amd.com>
Adds the support for 2nd Gen AMD EPYC Processors. The model display
name will be EPYC-Rome.
Adds the following new feature bits on top of the feature bits from the
first generation EPYC models.
perfctr-core : core performance counter extensions support. Enables the VM to
use extended performance counter support. It enables six
programmable counters instead of four counters.
clzero : instruction zeroes out the 64 byte cache line specified in RAX.
xsaveerptr : XSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES always save error
pointers and FXRSTOR, XRSTOR, XRSTORS always restore error
pointers.
wbnoinvd : Write back and do not invalidate cache
ibpb : Indirect Branch Prediction Barrier
amd-stibp : Single Thread Indirect Branch Predictor
clwb : Cache Line Write Back and Retain
xsaves : XSAVES, XRSTORS and IA32_XSS support
rdpid : Read Processor ID instruction support
umip : User-Mode Instruction Prevention support
The Reference documents are available at
https://developer.amd.com/wp-content/resources/55803_0.54-PUB.pdfhttps://www.amd.com/system/files/TechDocs/24594.pdf
Depends on following kernel commits:
40bc47b08b6e ("kvm: x86: Enumerate support for CLZERO instruction")
504ce1954fba ("KVM: x86: Expose XSAVEERPTR to the guest")
6d61e3c32248 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
52297436199d ("kvm: svm: Update svm_xsaves_supported")
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <157314966312.23828.17684821666338093910.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Instead of passing a pointer to memory now just extend the GByteArray
to all the read register helpers. They can then safely append their
data through the normal way. We don't bother with this abstraction for
write registers as we have already ensured the buffer being copied
from is the correct size.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Message-Id: <20200316172155.971-15-alex.bennee@linaro.org>
Add the property and plumb it in TCG and HVF (the latter of which
tried to support returning a constant value but used the wrong MSR).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1579544504-3616-3-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The functions generated by these macros are unused.
Cc: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The bit 6, 7 and 8 of MSR_IA32_ARCH_CAPABILITIES are recently disclosed
for some security issues. Add the definitions for them to be used by named
CPU models.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20191225063018.20038-2-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
stibp feature is already added through the following commit.
0e89165829
Add a macro for it to allow CPU models to report it when host supports.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-3-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Define MSR_ARCH_CAP_MDS_NO in the IA32_ARCH_CAPABILITIES MSR to allow
CPU models to report the feature when host supports it.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-2-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The MSR_IA32_TSX_CTRL MSR can be used to hide TSX (also known as the
Trusty Side-channel Extension). By virtualizing the MSR, KVM guests
can disable TSX and avoid paying the price of mitigating TSX-based
attacks on microarchitectural side channels.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Change the handling of port F0h writes and FPU exceptions to implement IGNNE.
The implementation mixes a bit what the chipset and processor do in real
hardware, but the effect is the same as what happens with actual FERR#
and IGNNE# pins: writing to port F0h asserts IGNNE# in addition to lowering
FP_IRQ; while clearing the SE bit in the FPU status word deasserts IGNNE#.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move it out of pc.c since it is strictly tied to TCG. This is
almost exclusively code movement, the next patch will implement
IGNNE.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
UMWAIT and TPAUSE instructions use 32bits IA32_UMWAIT_CONTROL at MSR
index E1H to determines the maximum time in TSC-quanta that the processor
can reside in either C0.1 or C0.2.
This patch is to Add support for save/load IA32_UMWAIT_CONTROL MSR in
guest.
Co-developed-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191011074103.30393-3-tao3.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
UMONITOR, UMWAIT and TPAUSE are a set of user wait instructions.
This patch adds support for user wait instructions in KVM. Availability
of the user wait instructions is indicated by the presence of the CPUID
feature flag WAITPKG CPUID.0x07.0x0:ECX[5]. User wait instructions may
be executed at any privilege level, and use IA32_UMWAIT_CONTROL MSR to
set the maximum time.
The patch enable the umonitor, umwait and tpause features in KVM.
Because umwait and tpause can put a (psysical) CPU into a power saving
state, by default we dont't expose it to kvm and enable it only when
guest CPUID has it. And use QEMU command-line "-overcommit cpu-pm=on"
(enable_cpu_pm is enabled), a VM can use UMONITOR, UMWAIT and TPAUSE
instructions. If the instruction causes a delay, the amount of time
delayed is called here the physical delay. The physical delay is first
computed by determining the virtual delay (the time to delay relative to
the VM’s timestamp counter). Otherwise, UMONITOR, UMWAIT and TPAUSE cause
an invalid-opcode exception(#UD).
The release document ref below link:
https://software.intel.com/sites/default/files/\
managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf
Co-developed-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191011074103.30393-2-tao3.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hyper-V TLFS specifies this enlightenment as:
"NoNonArchitecturalCoreSharing - Indicates that a virtual processor will never
share a physical core with another virtual processor, except for virtual
processors that are reported as sibling SMT threads. This can be used as an
optimization to avoid the performance overhead of STIBP".
However, STIBP is not the only implication. It was found that Hyper-V on
KVM doesn't pass MD_CLEAR bit to its guests if it doesn't see
NoNonArchitecturalCoreSharing bit.
KVM reports NoNonArchitecturalCoreSharing in KVM_GET_SUPPORTED_HV_CPUID to
indicate that SMT on the host is impossible (not supported of forcefully
disabled).
Implement NoNonArchitecturalCoreSharing support in QEMU as tristate:
'off' - the feature is disabled (default)
'on' - the feature is enabled. This is only safe if vCPUS are properly
pinned and correct topology is exposed. As CPU pinning is done outside
of QEMU the enablement decision will be made on a higher level.
'auto' - copy KVM setting. As during live migration SMT settings on the
source and destination host may differ this requires us to add a migration
blocker.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20191018163908.10246-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Drop the duplicated definition of cpuid AVX512_VBMI macro and rename
it as CPUID_7_0_ECX_AVX512_VBMI. Rename CPUID_7_0_ECX_VBMI2 as
CPUID_7_0_ECX_AVX512_VBMI2.
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190926021055.6970-3-tao3.xu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add some comments, clean up comments over 80 chars per line. And there
is an extra line in comment of CPUID_8000_0008_EBX_WBNOINVD, remove
the extra enter and spaces.
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190926021055.6970-2-tao3.xu@intel.com>
[ehabkost: rebase to latest git master]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add code to convert the VMX feature words back into MSR values,
allowing the user to enable/disable VMX features as they wish. The same
infrastructure enables support for limiting VMX features in named
CPU models.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These will be used to compile the list of VMX features for named
CPU models, and/or by the code that sets up the VMX MSRs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
VMX requires 64-bit feature words for the IA32_VMX_EPT_VPID_CAP
and IA32_VMX_BASIC MSRs. (The VMX control MSRs are 64-bit wide but
actually have only 32 bits of information).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is a problem, that you don't have access to the data using cpu_memory_rw_debug() function when in SMM. You can't remotely debug SMM mode program because of that for example.
Likely attrs version of get_phys_page_debug should be used to get correct asidx at the end to handle access properly.
Here the patch to fix it.
Signed-off-by: Dmitry Poletaev <poletaev@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The CPUID bits CLZERO and XSAVEERPTR are availble on AMD's ZEN platform
and could be passed to the guest.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Intel CooperLake cpu adds AVX512_BF16 instruction, defining as
CPUID.(EAX=7,ECX=1):EAX[bit 05].
The patch adds a property for setting the subleaf of CPUID leaf 7 in
case that people would like to specify it.
The release spec link as follows,
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf
Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add support for halt poll control MSR: save/restore, migration
and new feature name.
The purpose of this MSR is to allow the guest to disable
host halt poll.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Message-Id: <20190603230408.GA7938@amt.cnet>
[Do not enable by default, as pointed out by Mark Kanda. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We declare incomplete struct VMStateDescription in a couple of places
so we don't have to include migration/vmstate.h for the typedef.
That's fine with me. However, the next commit will drop
migration/vmstate.h from a massive number of compiles. Move the
typedef to qemu/typedefs.h now, so I don't have to insert struct in
front of VMStateDescription all over the place then.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-15-armbru@redhat.com>
Previous to this change, a vCPU exposed with VMX running on a kernel
without KVM_CAP_NESTED_STATE or KVM_CAP_EXCEPTION_PAYLOAD resulted in
adding a migration blocker. This was because when the code was written
it was thought there is no way to reliably know if a vCPU is utilising
VMX or not at runtime. However, it turns out that this can be known to
some extent:
In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
Since it was set, CR4.VMXE must remain set as long as the vCPU is in
VMX operation. This is because CR4.VMXE is one of the bits set
in MSR_IA32_VMX_CR4_FIXED1.
There is one exception to the above statement when vCPU enters SMM mode.
When a vCPU enters SMM mode, it temporarily exits VMX operation and
may also reset CR4.VMXE during execution in SMM mode.
When the vCPU exits SMM mode, vCPU state is restored to be in VMX operation
and CR4.VMXE is restored to its original state of being set.
Therefore, when the vCPU is not in SMM mode, we can infer whether
VMX is being used by examining CR4.VMXE. Otherwise, we cannot
know for certain but assume the worse that vCPU may utilise VMX.
Summaring all the above, a vCPU may have enabled VMX in case
CR4.VMXE is set or vCPU is in SMM mode.
Therefore, remove migration blocker and check before migration
(cpu_pre_save()) if the vCPU may have enabled VMX. If true, only then
require relevant kernel capabilities.
While at it, demand KVM_CAP_EXCEPTION_PAYLOAD only when the vCPU is in
guest-mode and there is a pending/injected exception. Otherwise, this
kernel capability is not required for proper migration.
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Tested-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will make unversioned CPU models behavior depend on the
machine type:
* "pc-*-4.0" and older will not report them as aliases.
This is done to keep compatibility with older QEMU versions
after management software starts translating aliases.
* "pc-*-4.1" will translate unversioned CPU models to -v1.
This is done to keep compatibility with existing management
software, that still relies on CPU model runnability promises.
* "none" will translate unversioned CPU models to their latest
version. This is planned become the default in future machine
types (probably in pc-*-4.3).
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-8-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add support for registration of multiple versions of CPU models.
The existing CPU models will be registered with a "-v1" suffix.
The -noTSX, -IBRS, and -IBPB CPU model variants will become
versions of the original models in a separate patch, so
make sure we register no versions for them.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-5-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add a new option that can be used to disable feature flag
filtering. This will allow CPU model compatibility test cases to
work without host hardware dependencies.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-3-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The CPUID.1F as Intel V2 Extended Topology Enumeration Leaf would be
exposed if guests want to emulate multiple software-visible die within
each package. Per Intel's SDM, the 0x1f is a superset of 0xb, thus they
can be generated by almost same code as 0xb except die_offset setting.
If the number of dies per package is greater than 1, the cpuid_min_level
would be adjusted to 0x1f regardless of whether the host supports CPUID.1F.
Likewise, the CPUID.1F wouldn't be exposed if env->nr_dies < 2.
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190620054525.37188-2-like.xu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Use the same definition as features/user_features in CPUX86State.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190620023746.9869-1-richardw.yang@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The current default value for hv-spinlocks is 0xFFFFFFFF (meaning
"never retry"). However, the value is stored as a signed
integer, making the getter of the hv-spinlocks QOM property
return -1 instead of 0xFFFFFFFF.
Fix this by changing the type of X86CPU::hyperv_spinlock_attempts
to uint32_t. This has no visible effect to guest operating
systems, affecting just the behavior of the QOM getter.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190615200505.31348-1-ehabkost@redhat.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The field die_id (default as 0) and has_die_id are introduced to X86CPU.
Following the legacy smp check rules, the die_id validity is added to
the same contexts as leagcy smp variables such as hmp_hotpluggable_cpus(),
machine_set_cpu_numa_node(), cpu_slot_to_string() and pc_cpu_pre_plug().
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190612084104.34984-4-like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The die-level as the first PC-specific cpu topology is added to the leagcy
cpu topology model, which has one die per package implicitly and only the
numbers of sockets/cores/threads are configurable.
In the new model with die-level support, the total number of logical
processors (including offline) on board will be calculated as:
#cpus = #sockets * #dies * #cores * #threads
and considering compatibility, the default value for #dies would be
initialized to one in x86_cpu_initfn() and pc_machine_initfn().
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190612084104.34984-2-like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Kernel commit c4f55198c7c2 ("kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD")
introduced a new KVM capability which allows userspace to correctly
distinguish between pending and injected exceptions.
This distinguish is important in case of nested virtualization scenarios
because a L2 pending exception can still be intercepted by the L1 hypervisor
while a L2 injected exception cannot.
Furthermore, when an exception is attempted to be injected by QEMU,
QEMU should specify the exception payload (CR2 in case of #PF or
DR6 in case of #DB) instead of having the payload already delivered in
the respective vCPU register. Because in case exception is injected to
L2 guest and is intercepted by L1 hypervisor, then payload needs to be
reported to L1 intercept (VMExit handler) while still preserving
respective vCPU register unchanged.
This commit adds support for QEMU to properly utilise this new KVM
capability (KVM_CAP_EXCEPTION_PAYLOAD).
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-10-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Kernel commit 8fcc4b5923af ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE")
introduced new IOCTLs to extract and restore vCPU state related to
Intel VMX & AMD SVM.
Utilize these IOCTLs to add support for migration of VMs which are
running nested hypervisors.
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Tested-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-9-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit d98f26073b ("target/i386: kvm: add VMX migration blocker")
added a migration blocker for vCPU exposed with Intel VMX.
However, migration should also be blocked for vCPU exposed with
AMD SVM.
Both cases should be blocked because QEMU should extract additional
vCPU state from KVM that should be migrated as part of vCPU VMState.
E.g. Whether vCPU is running in guest-mode or host-mode.
Fixes: d98f26073b ("target/i386: kvm: add VMX migration blocker")
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-6-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MSR IA32_CORE_CAPABILITY is a feature-enumerating MSR, which only
enumerates the feature split lock detection (via bit 5) by now.
The existence of MSR IA32_CORE_CAPABILITY is enumerated by CPUID.7_0:EDX[30].
The latest kernel patches about them can be found here:
https://lkml.org/lkml/2019/4/24/1909
Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
Message-Id: <20190617153654.916-1-xiaoyao.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hyper-V on KVM can only use Synthetic timers with Direct Mode (opting for
an interrupt instead of VMBus message). This new capability is only
announced in KVM_GET_SUPPORTED_HV_CPUID.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20190517141924.19024-10-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In many case we just want to give Windows guests all currently supported
Hyper-V enlightenments and that's where this new mode may come handy. We
pass through what was returned by KVM_GET_SUPPORTED_HV_CPUID.
hv_cpuid_check_and_set() is modified to also set cpu->hyperv_* flags as
we may want to check them later (and we actually do for hv_runtime,
hv_synic,...).
'hv-passthrough' is a development only feature, a migration blocker is
added to prevent issues while migrating between hosts with different
feature sets.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20190517141924.19024-6-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Representing Hyper-V properties as bits will allow us to check features
and dependencies between them in a natural way.
Suggested-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20190517141924.19024-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Other accelerators have their own headers: sysemu/hax.h, sysemu/hvf.h,
sysemu/kvm.h, sysemu/whpx.h. Only tcg_enabled() & friends sit in
qemu-common.h. This necessitates inclusion of qemu-common.h into
headers, which is against the rules spelled out in qemu-common.h's
file comment.
Move tcg_enabled() & friends into their own header sysemu/tcg.h, and
adjust #include directives.
Cc: Richard Henderson <rth@twiddle.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-2-armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[Rebased with conflicts resolved automatically, except for
accel/tcg/tcg-all.c]
This macro is now always empty, so remove it. This leaves the
entire contents of CPUArchState under the control of the guest
architecture.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Nothing in there so far, but all of the plumbing done
within the target ArchCPU state.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we have ArchCPU, we can define this generically,
in the one place that needs it.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Cleanup in the boilerplate that each target must define.
Replace x86_env_get_cpu with env_archcpu. The combination
CPU(x86_env_get_cpu) should have used ENV_GET_CPU to begin;
use env_cpu now.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we have both ArchCPU and CPUArchState, we can define
this generically instead of via macro in each target's cpu.h.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For all targets, do this just before including exec/cpu-all.h.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For all targets, do this just before including exec/cpu-all.h.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For all targets, into this new file move TARGET_LONG_BITS,
TARGET_PAGE_BITS, TARGET_PHYS_ADDR_SPACE_BITS,
TARGET_VIRT_ADDR_SPACE_BITS, and NB_MMU_MODES.
Include this new file from exec/cpu-defs.h.
This now removes the somewhat odd requirement that target/arch/cpu.h
defines TARGET_LONG_BITS before including exec/cpu-defs.h, so push the
bulk of the includes within target/arch/cpu.h to the top.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The CPUID.01H:ECX[bit 3] ought to mirror the value of the MSR
IA32_MISC_ENABLE MWAIT bit and as userspace has control of them
both, it is userspace's job to configure both bits to match on
the initial setup.
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1557813999-9175-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We do not support probing, but we do not need it yet either.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add a new base CPU model called 'Dhyana' to model processors from Hygon
Dhyana(family 18h), which derived from AMD EPYC(family 17h).
The following features bits have been removed compare to AMD EPYC:
aes, pclmulqdq, sha_ni
The Hygon Dhyana support to KVM in Linux is already accepted upstream[1].
So add Hygon Dhyana support to Qemu is necessary to create Hygon's own
CPU model.
Reference:
[1] https://git.kernel.org/tip/fec98069fb72fb656304a3e52265e0c2fc9adf87
Signed-off-by: Pu Wen <puwen@hygon.cn>
Message-Id: <1555416373-28690-1-git-send-email-puwen@hygon.cn>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
CPUClass method dump_statistics() takes an fprintf()-like callback and
a FILE * to pass to it. Most callers pass fprintf() and stderr.
log_cpu_state() passes fprintf() and qemu_log_file.
hmp_info_registers() passes monitor_fprintf() and the current monitor
cast to FILE *. monitor_fprintf() casts it right back, and is
otherwise identical to monitor_printf().
The callback gets passed around a lot, which is tiresome. The
type-punning around monitor_fprintf() is ugly.
Drop the callback, and call qemu_fprintf() instead. Also gets rid of
the type-punning, since qemu_fprintf() takes NULL instead of the
current monitor cast to FILE *.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190417191805.28198-15-armbru@redhat.com>
x86_cpu_dump_local_apic_state() takes an fprintf()-like callback and a
FILE * to pass to it, and so do its helper functions.
Its only caller hmp_info_local_apic() passes monitor_fprintf() and the
current monitor cast to FILE *. monitor_fprintf() casts it right
back, and is otherwise identical to monitor_printf(). The
type-punning is ugly.
Drop the callback, and call qemu_printf() instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190417191805.28198-12-armbru@redhat.com>
The various TARGET_cpu_list() take an fprintf()-like callback and a
FILE * to pass to it. Their callers (vl.c's main() via list_cpus(),
bsd-user/main.c's main(), linux-user/main.c's main()) all pass
fprintf() and stdout. Thus, the flexibility provided by the (rather
tiresome) indirection isn't actually used.
Drop the callback, and call qemu_printf() instead.
Calling printf() would also work, but would make the code unsuitable
for monitor context without making it simpler.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190417191805.28198-10-armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Intel Processor Trace required CPUID[0x14] but the cpuid_level
have no change when create a kvm guest with
e.g. "-cpu qemu64,+intel-pt".
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1548805979-12321-1-git-send-email-luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit 5131dc433d.
For new instruction 'PCONFIG' will not be exposed to guest.
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1545227081-213696-3-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some downstream distributions of QEMU set host-phys-bits=on by
default. This worked very well for most use cases, because
phys-bits really didn't have huge consequences. The only
difference was on the CPUID data seen by guests, and on the
handling of reserved bits.
This changed in KVM commit 855feb673640 ("KVM: MMU: Add 5 level
EPT & Shadow page table support"). Now choosing a large
phys-bits value for a VM has bigger impact: it will make KVM use
5-level EPT even when it's not really necessary. This means
using the host phys-bits value may not be the best choice.
Management software could address this problem by manually
configuring phys-bits depending on the size of the VM and the
amount of MMIO address space required for hotplug. But this is
not trivial to implement.
However, there's another workaround that would work for most
cases: keep using the host phys-bits value, but only if it's
smaller than 48. This patch makes this possible by introducing a
new "-cpu" option: "host-phys-bits-limit". Management software
or users can make sure they will always use 4-level EPT using:
"host-phys-bits=on,host-phys-bits-limit=48".
This behavior is still not enabled by default because QEMU
doesn't enable host-phys-bits=on by default. But users,
management software, or downstream distributions may choose to
change their defaults using the new option.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20181211192527.13254-1-ehabkost@redhat.com>
[ehabkost: removed test code while some issues are addressed]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
It was found that QMP users of QEMU (e.g. libvirt) may need
HV_CPUID_ENLIGHTMENT_INFO.EAX/HV_CPUID_NESTED_FEATURES.EAX information. In
particular, 'hv_tlbflush' and 'hv_evmcs' enlightenments are only exposed in
HV_CPUID_ENLIGHTMENT_INFO.EAX.
HV_CPUID_NESTED_FEATURES.EAX is exposed for two reasons: convenience
(we don't need to export it from hyperv_handle_properties() and as
future-proof for Enlightened MSR-Bitmap, PV EPT invalidation and
direct virtual flush features.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20181126135958.20956-1-vkuznets@redhat.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
MOVDIR64B moves 64-bytes as direct-store with 64-bytes write atomicity.
Direct store is implemented by using write combining (WC) for writing
data directly into memory without caching the data.
The bit definition:
CPUID.(EAX=7,ECX=0):ECX[bit 28] MOVDIR64B
The release document ref below link:
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf
Cc: Xu Tao <tao3.xu@intel.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Message-Id: <1541488407-17045-3-git-send-email-jingqi.liu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
MOVDIRI moves doubleword or quadword from register to memory through
direct store which is implemented by using write combining (WC) for
writing data directly into memory without caching the data.
The bit definition:
CPUID.(EAX=7,ECX=0):ECX[bit 27] MOVDIRI
The release document ref below link:
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf
Cc: Xu Tao <tao3.xu@intel.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Message-Id: <1541488407-17045-2-git-send-email-jingqi.liu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Adds a new CPU flag to enable the Enlightened VMCS KVM feature.
QEMU enables KVM_CAP_HYPERV_ENLIGHTENED_VMCS and gets back the
version to be advertised in lower 16 bits of CPUID.0x4000000A:EAX.
Suggested-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20181022165506.30332-3-vkuznets@redhat.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Note RSBA is specially treated -- no matter host support it or not, qemu
pretends it is supported.
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1539578845-37944-4-git-send-email-robert.hu@linux.intel.com>
[ehabkost: removed automatic enabling of RSBA]
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Certain configurations do not allow SynIC to be used in QEMU. In
particular,
- when hyperv_vpindex is off, SINT routes can't be used as they refer to
the destination vCPU by vp_index
- older KVM (which doesn't expose KVM_CAP_HYPERV_SYNIC2) zeroes out
SynIC message and event pages on every msr load, breaking migration
OTOH in-KVM users of SynIC -- SynIC timers -- do work in those
configurations, and we shouldn't stop the guest from using them.
To cover both scenarios, introduce an X86CPU property that makes CPU
init code to skip creation of the SynIC object (and thus disables any
SynIC use in QEMU) but keeps the KVM part of the SynIC working.
The property is clear by default but is set via compat logic for older
machine types.
As a result, when hv_synic and a modern machine type are specified, QEMU
will refuse to run unless vp_index is on and the kernel is recent
enough. OTOH with an older machine type QEMU will run fine with
hv_synic=on against an older kernel and/or without vp_index enabled but
will disallow the in-QEMU uses of SynIC (in e.g. VMBus).
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-4-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hyper-V PV IPI support is merged to KVM, enable the feature in Qemu. When
enabled, this allows Windows guests to send IPIs to other vCPUs with a
single hypercall even when there are >64 vCPUs in the request.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20181009130853.6412-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While at it, also rename var to indicate it is not used only in KVM.
Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
Reviewed-by: Patrick Colp <patrick.colp@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20180914003827.124570-2-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This flag will be used for KVM's nested VMX migration; the HF_GUEST_MASK name
is already used in KVM, adopt it in QEMU as well.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Interrupt handling depends on various flags in env->hflags or env->hflags2,
and the exact detail were not exactly replicated between x86_cpu_has_work
and x86_cpu_exec_interrupt. Create a new function that extracts the
highest-priority non-masked interrupt, and use it in both functions.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
WBNOINVD: Write back and do not invalidate cache, enumerated by
CPUID.(EAX=80000008H, ECX=0):EBX[bit 9].
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1530781798-183214-5-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Support of IA32_PRED_CMD MSR already be enumerated by same CPUID bit as
SPEC_CTRL.
At present, mark CPUID_7_0_EDX_ARCH_CAPABILITIES unmigratable, per Paolo's
comment.
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1530781798-183214-3-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
IA32_PRED_CMD MSR gives software a way to issue commands that affect the state
of indirect branch predictors. Enumerated by CPUID.(EAX=7H,ECX=0):EDX[26].
IA32_ARCH_CAPABILITIES MSR enumerates architectural features of RDCL_NO and
IBRS_ALL. Enumerated by CPUID.(EAX=07H, ECX=0):EDX[29].
https://software.intel.com/sites/default/files/managed/c5/63/336996-Speculative-Execution-Side-Channel-Mitigations.pdf
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1530781798-183214-2-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
MSR_SMI_COUNT started being migrated in QEMU 2.12. Do not migrate it
on older machine types, or the subsection causes a load failure for
guests that use SMM.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rename DCACHE to DATA_CACHE and ICACHE to INSTRUCTION_CACHE.
This avoids conflict with Linux asm/cachectl.h macros and fixes
build failure on mips hosts.
Reported-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180717194010.30096-1-ehabkost@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This implements NPT suport for SVM by hooking into
x86_cpu_handle_mmu_fault where it reads the stage-1 page table. Whether
we need to perform this 2nd stage translation, and how, is decided
during vmrun and stored in hflags2, along with nested_cr3 and
nested_pg_mode.
As get_hphys performs a direct cpu_vmexit in case of NPT faults, we need
retaddr in that function. To avoid changing the signature of
cpu_handle_mmu_fault, this passes the value from tlb_fill to get_hphys
via the CPU state.
This was tested successfully via the Jailhouse hypervisor.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <567473a0-6005-5843-4c73-951f476085ca@web.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add support for Hyper-V TLB flush which recently got added to KVM.
Just like regular Hyper-V we announce HV_EX_PROCESSOR_MASKS_RECOMMENDED
regardless of how many vCPUs we have. Windows is 'smart' and uses less
expensive non-EX Hypercall whenever possible (when it wants to flush TLB
for all vCPUs or the maximum vCPU index in the vCPU set requires flushing
is less than 64).
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20180610184927.19309-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When guest CPU PM is enabled, and with -cpu host, expose the host CPU
MWAIT leaf in the CPUID so guest can make good PM decisions.
Note: the result is 100% CPU utilization reported by host as host
no longer knows that the CPU is halted.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180622192148.178309-3-mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It calls cpu_loop_exit in system emulation mode (and should never be
called in user emulation mode).
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <6f4d44ffde55d074cbceb48309c1678600abad2f.1522769774.git.jan.kiszka@web.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Always initialize CPUCaches structs with cache information, even
if legacy_cache=true. Use different CPUCaches struct for
CPUID[2], CPUID[4], and the AMD CPUID leaves.
This will simplify a lot the logic inside cpu_x86_cpuid().
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <1527176614-26271-2-git-send-email-babu.moger@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Switch to the header we imported from Linux,
this allows us to drop a hack in kvm_i386.h.
More code will be dropped in the next patch.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
"Some AMD processors only support a non-architectural means of enabling
speculative store bypass disable (SSBD). To allow a simplified view of
this to a guest, an architectural definition has been created through a new
CPUID bit, 0x80000008_EBX[25], and a new MSR, 0xc001011f. With this, a
hypervisor can virtualize the existence of this definition and provide an
architectural method for using SSBD to a guest.
Add the new CPUID feature, the new MSR and update the existing SSBD
support to use this MSR when present." (from x86/speculation: Add virtualized
speculative store bypass disable support in Linux).
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180521215424.13520-4-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
New microcode introduces the "Speculative Store Bypass Disable"
CPUID feature bit. This needs to be exposed to guest OS to allow
them to protect against CVE-2018-3639.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Message-Id: <20180521215424.13520-2-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The property legacy-cache will be used to control the cache information.
If user passes "-cpu legacy-cache" then older information will
be displayed even if the hardware supports new information. Otherwise
use the statically loaded cache definitions if available.
Renamed the previous cache structures to legacy_*. If there is any change in
the cache information, then it needs to be initialized in builtin_x86_defs.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Tested-by: Geoffrey McRae <geoff@hostfission.com>
Message-Id: <20180514164156.27034-3-babu.moger@amd.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Instead of having a collection of macros that need to be used in
complex expressions to build CPUID data, define a CPUCacheInfo
struct that can hold information about a given cache. Helper
functions will take a CPUCacheInfo struct as input to encode
CPUID leaves for a cache.
This will help us ensure consistency between cache information
CPUID leaves, and make the existing inconsistencies in CPUID info
more visible.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Tested-by: Geoffrey McRae <geoff@hostfission.com>
Message-Id: <20180510204148.11687-2-babu.moger@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The CLDEMOTE instruction hints to hardware that the cache line that
contains the linear address should be moved("demoted") from
the cache(s) closest to the processor core to a level more distant
from the processor core. This may accelerate subsequent accesses
to the line by other cores in the same coherence domain,
especially if the line was written by the core that demotes the line.
Intel Snow Ridge has added new cpu feature, CLDEMOTE.
The new cpu feature needs to be exposed to guest VM.
The bit definition:
CPUID.(EAX=7,ECX=0):ECX[bit 25] CLDEMOTE
The release document ref below link:
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Message-Id: <1525406253-54846-1-git-send-email-jingqi.liu@intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
KVM recently gained support for Hyper-V Reenlightenment MSRs which are
required to make KVM-on-Hyper-V enable TSC page clocksource to its guests
when INVTSC is not passed to it (and it is not passed by default in Qemu
as it effectively blocks migration).
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20180411115036.31832-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In order to guarantee compatibility on migration, QEMU should have
complete control over the features it announces to the guest via CPUID.
However, the availability of Hyper-V frequency MSRs
(HV_X64_MSR_TSC_FREQUENCY and HV_X64_MSR_APIC_FREQUENCY) depends solely
on the support for them in the underlying KVM.
Introduce "hv-frequencies" cpu property (off by default) which gives
QEMU full control over whether these MSRs are announced.
While at this, drop the redundant check of the cpu tsc frequency, and
decouple this feature from hv-time.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180330170209.20627-2-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
cpu_init(cpu_model) were replaced by cpu_create(cpu_type) so
no users are left, remove it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc)
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1518000027-274608-6-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
it will be used for providing to cpu name resolving class for
parsing cpu model for system and user emulation code.
Along with change add target to null-machine tests, so
that when switch to CPU_RESOLVING_TYPE happens,
it would ensure that null-machine usecase still works.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu> (m68k)
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc)
Acked-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> (tricore)
Message-Id: <1518000027-274608-4-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
[ehabkost: Added macro to riscv too]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
* Update kernel headers (Gerd, myself)
* SEV support (Brijesh)
I have not tested non-x86 compilation, but I reordered the SEV patches
so that all non-x86-specific changes go first to catch any possible
issues (which weren't there anyway :)).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJap/4yAAoJEL/70l94x66DmPoH/igfzYkxFyIHFqzb/hQEut3e
IJA05u9DBSqqdSvL0UeLdUgyJTeDM3S5kKZqZ38BPHIudwOGtydoIM2utWtPSejf
Z+mS77+dSgchEMgf1gxmD0oZ5TrO/2pdOYfaZZuQuGmGLruKsDgz6vH3F87cfk8b
yJSJkoZkFc8C9SpwQERWYuhXn2fYFxSBFgEMc9xSFN+zqQUFqeIfOJhwZ+txjAUl
y1EKlhhVyjkxTLR++SkzhKIJ8D5cycpcY/H19gw3ghHviY/tGwNLot3bLRPbwCM6
QvrXDf4rhvFHTmmOfliCI5y6Xgj0u7IZv2fVoKXEtKk1qyfyD4ZnouYTaqP/U9I=
=Q4/y
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream-sev' into staging
* Migrate MSR_SMI_COUNT (Liran)
* Update kernel headers (Gerd, myself)
* SEV support (Brijesh)
I have not tested non-x86 compilation, but I reordered the SEV patches
so that all non-x86-specific changes go first to catch any possible
issues (which weren't there anyway :)).
# gpg: Signature made Tue 13 Mar 2018 16:37:06 GMT
# gpg: using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream-sev: (22 commits)
sev/i386: add sev_get_capabilities()
sev/i386: qmp: add query-sev-capabilities command
sev/i386: qmp: add query-sev-launch-measure command
sev/i386: hmp: add 'info sev' command
cpu/i386: populate CPUID 0x8000_001F when SEV is active
sev/i386: add migration blocker
sev/i386: finalize the SEV guest launch flow
sev/i386: add support to LAUNCH_MEASURE command
target/i386: encrypt bios rom
sev/i386: add command to encrypt guest memory region
sev/i386: add command to create launch memory encryption context
sev/i386: register the guest memory range which may contain encrypted data
sev/i386: add command to initialize the memory encryption context
include: add psp-sev.h header file
sev/i386: qmp: add query-sev command
target/i386: add Secure Encrypted Virtualization (SEV) object
kvm: introduce memory encryption APIs
kvm: add memory encryption context
docs: add AMD Secure Encrypted Virtualization (SEV)
machine: add memory-encryption option
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This MSR returns the number of #SMIs that occurred on
CPU since boot.
KVM commit 52797bf9a875 ("KVM: x86: Add emulation of MSR_SMI_COUNT")
introduced support for emulating this MSR.
This commit adds support for QEMU to save/load this
MSR for migration purposes.
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add Intel Processor Trace related definition. It also add
corresponding part to kvm_get/set_msr and vmstate.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1520182116-16485-2-git-send-email-luwei.kang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Expose Intel Processor Trace feature to guest.
To make Intel PT live migration safe and get same CPUID information
with same CPU model on diffrent host. CPUID[14] is constant in this
patch. Intel PT use EPT is first supported in IceLake, the CPUID[14]
get on this machine as default value. Intel PT would be disabled
if any machine don't support this minial feature list.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1520182116-16485-1-git-send-email-luwei.kang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add KVM_HINTS_DEDICATED performance hint, guest checks this feature bit
to determine if they run on dedicated vCPUs, allowing optimizations such
as usage of qspinlocks.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1518185725-69559-1-git-send-email-wanpengli@tencent.com>
[ehabkost: Renamed property to kvm-hint-dedicated]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
As cpu.h is another typically widely included file which doesn't need
full access to the softfloat API we can remove the includes from here
as well. Where they do need types it's typically for float_status and
the rounding modes so we move that to softfloat-types.h as well.
As a result of not having softfloat in every cpu.h call we now need to
add it to various helpers that do need the full softfloat.h
definitions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[For PPC parts]
Acked-by: David Gibson <david@gibson.dropbear.id.au>
The MC68040 MMU provides the size of the access that
triggers the page fault.
This size is set in the Special Status Word which
is written in the stack frame of the access fault
exception.
So we need the size in m68k_cpu_unassigned_access() and
m68k_cpu_handle_mmu_fault().
To be able to do that, this patch modifies the prototype of
handle_mmu_fault handler, tlb_fill() and probe_write().
do_unassigned_access() already includes a size parameter.
This patch also updates handle_mmu_fault handlers and
tlb_fill() of all targets (only parameter, no code change).
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180118193846.24953-2-laurent@vivier.eu>
Add the new feature word and the "ibpb" feature flag.
Based on a patch by Paolo Bonzini.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-5-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add the feature name and a CPUID_7_0_EDX_SPEC_CTRL macro.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-4-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We will share the same code for hax/kvm.
Signed-off-by: Tao Wu <lepton@google.com>
Message-Id: <20180110195056.85403-1-lepton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This file begins tracking the files that will be the code base for HVF
support in QEMU. This code base is part of Google's QEMU version of
their Android emulator, and can be found at
https://android.googlesource.com/platform/external/qemu/+/emu-master-dev
This code is based on Veertu Inc's vdhh (Veertu Desktop Hosted
Hypervisor), found at https://github.com/veertuinc/vdhh. Everything is
appropriately licensed under GPL v2-or-later, except for the code inside
x86_task.c and x86_task.h, which, deriving from KVM (the Linux kernel),
is licensed GPL v2-only.
This code base already implements a very great deal of functionality,
although Google's version removed from Vertuu's the support for APIC
page and hyperv-related stuff. According to the Android Emulator Release
Notes, Revision 26.1.3 (August 2017), "Hypervisor.framework is now
enabled by default on macOS for 32-bit x86 images to improve performance
and macOS compatibility", although we better use with caution for, as the
same Revision warns us, "If you experience issues with it specifically,
please file a bug report...". The code hasn't seen much update in the
last 5 months, so I think that we can further develop the code with
occasional visiting Google's repository to see if there has been any
update.
On top of Google's code, the following changes were made:
- add code to the configure script to support the --enable-hvf argument.
If the OS is Darwin, it checks for presence of HVF in the system. The
patch also adds strings related to HVF in the file qemu-options.hx.
QEMU will only support the modern syntax style '-M accel=hvf' no enable
hvf; the legacy '-enable-hvf' will not be supported.
- fix styling issues
- add glue code to cpus.c
- move HVFX86EmulatorState field to CPUX86State, changing the
the emulation functions to have a parameter with signature 'CPUX86State *'
instead of 'CPUState *' so we don't have to get the 'env'.
Signed-off-by: Sergio Andres Gomez Del Real <Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-2-Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-3-Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-5-Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-6-Sergio.G.DelReal@gmail.com>
Message-Id: <20170905035457.3753-7-Sergio.G.DelReal@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The value of HV_X64_MSR_SVERSION is initialized once at vcpu init, and
is reset to zero on vcpu reset, which is wrong.
It is supposed to be a constant, so drop the field from X86CPU, set the
msr with the constant value, and don't bother getting it.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20171122181418.14180-4-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hyper-V has a notion of partition-wide MSRs. Those MSRs are read and
written as usual on each VCPU, however the hypervisor maintains a single
global value for all VCPUs. Thus writing such an MSR from any single
VCPU affects the global value that is read by all other VCPUs.
This leads to an issue during VCPU hotplug: the zero-initialzied values
of those MSRs get synced into KVM and override the global values as has
already been set by the guest.
This change makes the partition-wide MSRs only be synchronized on the
first vcpu.
Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20171122181418.14180-2-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Intel IceLake cpu has added new cpu features,AVX512_VBMI2/GFNI/
VAES/VPCLMULQDQ/AVX512_VNNI/AVX512_BITALG. Those new cpu features
need expose to guest VM.
The bit definition:
CPUID.(EAX=7,ECX=0):ECX[bit 06] AVX512_VBMI2
CPUID.(EAX=7,ECX=0):ECX[bit 08] GFNI
CPUID.(EAX=7,ECX=0):ECX[bit 09] VAES
CPUID.(EAX=7,ECX=0):ECX[bit 10] VPCLMULQDQ
CPUID.(EAX=7,ECX=0):ECX[bit 11] AVX512_VNNI
CPUID.(EAX=7,ECX=0):ECX[bit 12] AVX512_BITALG
The release document ref below link:
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <1511335676-20797-1-git-send-email-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJZwXs9AAoJECgHk2+YTcWmS18P/1OceEmetatwBQ6YWURA0VfU
CB+nCHxijlWXU8UiwoTVDOXc11P00V5VO0mMFouUbv+o+/80qyAfNl2DhDVq7ZXo
nhIFHKXUR1BX3YbcqfIonH8xCMGtrAexghhhaGPQbx450PqmGyyjBsZsXttNge1S
Yt+jzgD9drsZPYw4ZLJjT/tnWI/+8kDPn37jVujzRzApTD9/fq+77ZZq7q25RzQH
ISa5OXOQk6pq+tPHvaIFXfOqfhILcpM7u/X7MYVwiA1oBqcLXTDCjDX3cKavKade
+i3yrH7ahNgUO1DyT2bMZX6NAFoZDBrwlYgpkw8n+Yf+EUcdPDOHHEcEeaMpdTGx
wgWbQrs+xzIg/ulRb8Qqe9FwdXGQbehfFGofd+gnGQ0XuxekT82in+ucMOivQO3x
W/azGnzoz6D83stJFIZ93S69SRswqBuj2R8mu821yzqx1EUSNXgfKXz9OPwwFed5
El9YN127F/VAyp0av1CsOg1XgqIujMUGRxGf7eQBfkh1R3C/g2XNPTvR3yaY9L5B
zuMJfWLF6r6zL53ymt7/9UVEim295Lia3mNGS5/Min5QGms7edphsDsrubzXsZGq
2owWfAU/KeDH9gNVNNkZdLcEcS5TEz+2oGPR5oeDeB/QlzVdNQ3FeTVlzFavxNQa
8nrzeFcw7VNrIx2gdvsY
=LQ8U
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging
Machine/CPU/NUMA queue, 2017-09-19
# gpg: Signature made Tue 19 Sep 2017 21:17:01 BST
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/machine-next-pull-request:
MAINTAINERS: Update git URLs for my trees
hw/acpi-build: Fix SRAT memory building in case of node 0 without RAM
NUMA: Replace MAX_NODES with nb_numa_nodes in for loop
numa: cpu: calculate/set default node-ids after all -numa CLI options are parsed
arm: drop intermediate cpu_model -> cpu type parsing and use cpu type directly
pc: use generic cpu_model parsing
vl.c: convert cpu_model to cpu type and set of global properties before machine_init()
cpu: make cpu_generic_init() abort QEMU on error
qom: cpus: split cpu_generic_init() on feature parsing and cpu creation parts
hostmem-file: Add "discard-data" option
osdep: Define QEMU_MADV_REMOVE
vl: Clean up user-creatable objects when exiting
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The definitions for Hyper-V emulation are currently taken from a header
imported from the Linux kernel.
However, as these describe a third-party protocol rather than a kernel
API, it probably wasn't a good idea to publish it in the kernel uapi.
This patch introduces a header that provides all the necessary
definitions, superseding the one coming from the kernel.
The new header supports (temporary) coexistence with the kernel one.
The constants explicitly named in the Hyper-V specification (e.g. msr
numbers) are defined in a non-conflicting way. Other constants and
types have got new names.
While at this, the protocol data structures are defined in a more
conventional way, without bitfields, enums, and excessive unions.
The code using this stuff is adjusted, too; it can now be built both
with and without the kernel header in the tree.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20170713201522.13765-2-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Starting with Windows Server 2012 and Windows 8, if
CPUID.40000005.EAX contains a value of -1, Windows assumes specific
limit to the number of VPs. In this case, Windows Server 2012
guest VMs may use more than 64 VPs, up to the maximum supported
number of processors applicable to the specific Windows
version being used.
https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs
For compatibility, Let's introduce a new property for X86CPU,
named "x-hv-max-vps" as Eduardo's suggestion, and set it
to 0x40 before machine 2.10.
(The "x-" prefix indicates that the property is not supposed to
be a stable user interface.)
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1505143227-14324-1-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
define default CPU type in generic way in pc_machine_class_init()
and let common machine code to handle cpu_model parsing
Patch also introduces TARGET_DEFAULT_CPU_TYPE define for 2 purposes:
* make foo_machine_class_init() look uniform on every target
* use define in [bsd|linux]-user targets to pick default
cpu type
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1505318697-77161-5-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
it's just a wrapper, drop it and use cpu_generic_init() directly
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1503592308-93913-21-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently when running KVM, we expose "KVMKVMKVM\0\0\0" in
the 0x40000000 CPUID leaf. Other hypervisors (VMWare,
HyperV, Xen, BHyve) all do the same thing, which leaves
TCG as the odd one out.
The CPUID signature is used by software to detect which
virtual environment they are running in and (potentially)
change behaviour in certain ways. For example, systemd
supports a ConditionVirtualization= setting in unit files.
The virt-what command can also report the virt type it is
running on
Currently both these apps have to resort to custom hacks
like looking for 'fw-cfg' entry in the /proc/device-tree
file to identify TCG.
This change thus proposes a signature "TCGTCGTCGTCG" to be
reported when running under TCG.
To hide this, the -cpu option tcg-cpuid=off can be used.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20170509132736.10071-3-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add the tcg_enabled() where the x86 target needs to disable
TCG-specific code.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Split the cpu_set_mxcsr() and make cpu_set_fpuc() inline with specific
tcg code.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move cpu_get_fp80()/cpu_set_fp80() from fpu_helper.c to
machine.c because fpu_helper.c will be disabled if tcg is
disabled in the build.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch pulls out of kvm.c and into the new files the implementation
for the xsave and xrstor instructions. This so they can be shared by
kvm and hvf.
Signed-off-by: Sergio Andres Gomez Del Real <Sergio.G.DelReal@gmail.com>
Message-Id: <20170626200832.11058-1-Sergio.G.DelReal@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sergio Andres Gomez Del Real <sergio.g.delreal@gmail.com>
This speeds up SMM switches. Later on it may remove the need to take
the BQL, and it may also allow to reuse code between TCG and KVM.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ignore env->a20_mask when running in system management mode.
Reported-by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1494502528-12670-1-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move vcpu's associated numa_node field out of generic CPUState
into inherited classes that actually care about cpu<->numa mapping,
i.e: ARMCPU, PowerPCCPU, X86CPU.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1496161442-96665-6-git-send-email-imammedo@redhat.com>
[ehabkost: s/CPU is belonging to/CPU belongs to/ on comments]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The existing code for "host" and "max" CPU models overrides every
single feature in the CPU object at realize time, even the ones
that were explicitly enabled or disabled by the user using
"feat=on" or "feat=off", while features set using +feat/-feat are
kept.
This means "-cpu host,+invtsc" works as expected, while
"-cpu host,invtsc=on" doesn't.
This was a known bug, already documented in a comment inside
x86_cpu_expand_features(). What makes this bug worse now is that
libvirt 3.0.0 and newer now use "feat=on|off" instead of
+feat/-feat when it detects a QEMU version that supports it (see
libvirt commit d47db7b16dd5422c7e487c8c8ee5b181a2f9cd66).
Change the feature property getter/setter to set a
env->user_features field, to keep track of features that were
explicitly changed using QOM properties. Then make the
max_features code not override user features when handling "-cpu
host" and "-cpu max".
This will also allow us to remove the plus_features/minus_features
hack in the future, but I plan to do that after 2.9.0 is
released.
Reported-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170327144815.8043-3-ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Helper function for code that needs to check the host CPU
vendor/family/model/stepping values.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170309181212.18864-2-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Paths through the softmmu code during code generation now need to be audited
to check for double locking of tb_lock. In particular, VMEXIT can take tb_lock
through cpu_vmexit -> cpu_x86_update_cr4 -> tlb_flush.
To avoid this, split VMEXIT delivery in two parts, similar to what is done with
exceptions. cpu_vmexit only records the VMEXIT exit code and information, and
cc->do_interrupt can then deliver it when it is safe to take the lock.
Reported-by: Alexander Boettcher <alexander.boettcher@genode-labs.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Tested-by: Alexander Boettcher <alexander.boettcher@genode-labs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
This suppresses the incorrect warning when forcing MTTCG for x86
guests on x86 hosts. A future patch will still warn when
TARGET_SUPPORT_MTTCG hasn't been defined for the guest (which is still
pending for x86).
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Migration from a 2.3.0 qemu results in a reboot on the receiving QEMU
due to a disagreement about SM (System management) interrupts.
2.3.0 didn't have much SMI support, but it did set CPU_INTERRUPT_SMI
and this gets into the migration stream, but on 2.3.0 it
never got delivered.
~2.4.0 SMI interrupt support was added but was broken - so
that when a 2.3.0 stream was received it cleared the CPU_INTERRUPT_SMI
but never actually caused an interrupt.
The SMI delivery was recently fixed by 68c6efe07a, but the
effect now is that an incoming 2.3.0 stream takes the interrupt it
had flagged but it's bios can't actually handle it(I think
partly due to the original interrupt not being taken during boot?).
The consequence is a triple(?) fault and a reboot.
Tested from:
2.3.1 -M 2.3.0
2.7.0 -M 2.3.0
2.8.0 -M 2.3.0
2.8.0 -M 2.8.0
This corresponds to RH bugzilla entry 1420679.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170223133441.16010-1-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
"-cpu max" and query-cpu-model-expansion support for x86. This
should be the last x86 pull request before 2.9 soft freeze.
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJYtFKvAAoJECgHk2+YTcWmyCwQAJEBM8NdOmN14yA3qTiTs0Kd
D/3VWSxtu9t41g39+yho70c+ZnpqGW28/WbY7E4ovAPRIoUI/pmKACY42k+WrTmK
MIBMesp1YnkxhwrFW9CwgqiUV8nr5ZMlzW/pQU/GaXbcH+7KfObeI93iGhtGjWvi
4nNvrK7b5mz2wPU6s1j+Bz2mp0CMd/sktmiH93tyWU+KgU7NXvMDPInVkfvBMvZN
5D6JLIeKxrndbaaUgvGbR4SUUmRs8TZFYfEbOdkkjIqh7MAKVKCCipFaxWIEfndr
bs1MDmw6uIUaI55JuWaXb//BkS+jai1dmn+ZEzMoisetlheSwR8cEStFJBxcm/7n
WQfxUd6TuNJ9PC1FIvb/OHUCGvzb+vtbEYAMmvCv8BVMnMivN7WNliu3rNVRgWBK
OHClBPdgwoIx2cGt6ic1rvxHxcpjeJ/YXBzL/JbBkblckpxbNRcW1NRTZKHIe3vr
JcPMEoP8g5d9ZHOG0WqBhKtJ3vUrxF3xqBKuR1Ha7QWpyKe9YF+RKrIA9dZkhLy0
Jh0fPZn2PmQrbLuZTC1u7Sgp22Duy7RcfJ7SikR+uhMLtkvToqu3ywLteW6Ta1by
oinb2UYMazwpAKKDcab4GdNJHPOuhnDw58osBVTyBiiN1tJjH+BhnVV4bYJVpaHI
MJIx5QvwqocSO0qoDZxo
=KPbC
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging
x86 queue, 2017-02-27
"-cpu max" and query-cpu-model-expansion support for x86. This
should be the last x86 pull request before 2.9 soft freeze.
# gpg: Signature made Mon 27 Feb 2017 16:24:15 GMT
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-pull-request:
i386: Improve query-cpu-model-expansion full mode
i386: Implement query-cpu-model-expansion QMP command
i386: Define static "base" CPU model
i386: Don't set CPUClass::cpu_def on "max" model
i386: Make "max" model not use any host CPUID info on TCG
i386: Create "max" CPU model
qapi-schema: Comment about full expansion of non-migration-safe models
i386: Reorganize and document CPUID initialization steps
i386: Rename X86CPU::host_features to X86CPU::max_features
i386: Add ordering field to CPUClass
i386: Unset cannot_destroy_with_object_finalize_yet on "host" model
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Note that x86_64 has only _rt signal handlers. This implementation
attempts to share code with the x86_32 implementation.
CC: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Allan Wirth <awirth@akamai.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20170226165345.8757-1-bobby.prani@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Rename the field and add a small comment to make its purpose
clearer.
Message-Id: <20170119210449.11991-4-ehabkost@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Commit 2afbdf8 ("target-i386: exception handling for memory helpers",
2015-09-15) changed tlb_fill's cpu_restore_state+raise_exception_err
to raise_exception_err_ra. After this change, the cpu_restore_state
and raise_exception_err's cpu_loop_exit are merged into
raise_exception_err_ra's cpu_loop_exit_restore.
This actually fixed some bugs, but when SVM is enabled there is a
second path from raise_exception_err_ra to cpu_loop_exit. This is
the VMEXIT path, and now cpu_vmexit is called without a
cpu_restore_state before.
The fix is to pass the retaddr to cpu_vmexit (via
cpu_svm_check_intercept_param). All helpers can now use GETPC() to pass
the correct retaddr, too.
Cc: qemu-stable@nongnu.org
Fixes: 2afbdf8480
Reported-by: Alexander Boettcher <alexander.boettcher@genode-labs.com>
Tested-by: Alexander Boettcher <alexander.boettcher@genode-labs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes timekeeping of x86-64 Darwin/OS X/macOS guests when using KVM.
Darwin/OS X/macOS for x86-64 uses the TSC for timekeeping; it normally calibrates this by querying various clock frequency scaling MSRs. Details depend on the exact CPU model detected. The local APIC timer frequency is extracted from (EFI) firmware.
This is problematic in the presence of virtualisation, as the MSRs in question are typically not handled by the hypervisor. VMWare (Fusion) advertises TSC and APIC frequency via a custom 0x40000010 CPUID leaf, in the eax and ebx registers respectively. This is documented at https://lwn.net/Articles/301888/ among other places.
Darwin/OS X/macOS looks for the generic 0x40000000 hypervisor leaf, and if this indicates via eax that leaf 0x40000010 might be available, that is in turn queried for the two frequencies.
This adds a CPU option "vmware-cpuid-freq" to enable the same behaviour when running Qemu with KVM acceleration, if the KVM TSC frequency can be determined, and it is stable. (invtsc or user-specified) The virtualised APIC bus cycle is hardcoded to 1GHz in KVM, so ebx of the CPUID leaf is also hardcoded to this value.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-Id: <1484921496-11257-2-git-send-email-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is a common thing amongst the various cpu reset functions want to
flush the SoftMMU's TLB entries. This is done either by calling
tlb_flush directly or by way of a general memset of the CPU
structure (sometimes both).
This moves the tlb_flush call to the common reset function and
additionally ensures it is only done for the CONFIG_SOFTMMU case and
when tcg is enabled.
In some target cases we add an empty end_of_reset_fields structure to the
target vCPU structure so have a clear end point for any memset which
is resetting value in the structure before CPU_COMMON (where the TLB
structures are).
While this is a nice clean-up in general it is also a precursor for
changes coming to cputlb for MTTCG where the clearing of entries
can't be done arbitrarily across vCPUs. Currently the cpu_reset
function is usually called from the context of another vCPU as the
architectural power up sequence is run. By using the cputlb API
functions we can ensure the right behaviour in the future.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
The new paging more is extension of IA32e mode with more additional page
table level.
It brings support of 57-bit vitrual address space (128PB) and 52-bit
physical address space (4PB).
The structure of new page table level is identical to pml4.
The feature is enumerated with CPUID.(EAX=07H, ECX=0):ECX[bit 16].
CR4.LA57[bit 12] need to be set when pageing enables to activate 5-level
paging mode.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Message-Id: <20161215001305.146807-1-kirill.shutemov@linux.intel.com>
[Drop changes to target-i386/translate.c. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We've currently got 18 architectures in QEMU, and thus 18 target-xxx
folders in the root folder of the QEMU source tree. More architectures
(e.g. RISC-V, AVR) are likely to be included soon, too, so the main
folder of the QEMU sources slowly gets quite overcrowded with the
target-xxx folders.
To disburden the main folder a little bit, let's move the target-xxx
folders into a dedicated target/ folder, so that target-xxx/ simply
becomes target/xxx/ instead.
Acked-by: Laurent Vivier <laurent@vivier.eu> [m68k part]
Acked-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> [tricore part]
Acked-by: Michael Walle <michael@walle.cc> [lm32 part]
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> [s390x part]
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> [s390x part]
Acked-by: Eduardo Habkost <ehabkost@redhat.com> [i386 part]
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com> [sparc part]
Acked-by: Richard Henderson <rth@twiddle.net> [alpha part]
Acked-by: Max Filippov <jcmvbkbc@gmail.com> [xtensa part]
Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [ppc part]
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> [crisµblaze part]
Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn> [unicore32 part]
Signed-off-by: Thomas Huth <thuth@redhat.com>