Put a bit more consistency into handling KVM_CAP_HYPERV_SYNIC capability,
by checking its availability and determining the feasibility of hv-synic
property first, and enabling it later.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-2-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will allow to build slightly leaner QEMU that supports some HyperV
features of KVM (e.g. SynIC timers, PV spinlocks, APIC assists, etc.)
but nothing else on the QEMU side.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082041.29380-6-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A significant part of hyperv.c is not actually tied to x86, and can
be moved to hw/.
This will allow to maintain most of Hyper-V and VMBus
target-independent, and to avoid conflicts with inclusion of
arch-specific headers down the road in VMBus implementation.
Also this stuff can now be opt-out with CONFIG_HYPERV.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082041.29380-4-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Also make the inverse function, hyperv_find_vcpu, static as it's not
used outside hyperv.c
This paves the way to making hyperv.c built optionally.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082041.29380-3-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some parts of the Hyper-V hypervisor-guest interface appear to be
target-independent, so move them into a proper header.
Not that Hyper-V ARM64 emulation is around the corner but it seems more
conveninent to have most of Hyper-V and VMBus target-independent, and
allows to avoid conflicts with inclusion of arch-specific headers down
the road in VMBus implementation.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082041.29380-2-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There's nothing kvm-specific in it so follow the suite and replace
"kvm_hv" prefix with "hyperv".
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-9-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Multiple entities (e.g. VMBus devices) can use the same SINT route. To
make their lives easier in maintaining SINT route ownership, make it
reference-counted. Adjust the respective API names accordingly.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-8-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use X86CPU pointer to refer to the respective HvSintRoute instead of
vp_index. This is more convenient and also paves the way for future
enhancements.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-7-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make sint ack callback accept an opaque pointer, that is stored on
sint_route at creation time.
This allows for more convenient interaction with the callback.
Besides, nothing outside hyperv.c should need to know the layout of
HvSintRoute fields any more so its declaration can be removed from the
header.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-6-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There's no point setting up an sint ack notifier if no callback is
specified.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-5-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-4-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
accel_init_machine sets *(acc->allowed) to true if acc->init_machine(ms)
succeeds. There's no need to have both hvf_allowed and hvf_disabled.
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20181018143051.48508-1-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
According to Intel(R)64 and IA-32 Architectures Software Developer's
Manual, the following one-byte registers should be fetched when REX
prefix is present (sorted by reg encoding index):
AL, CL, DL, BL, SPL, BPL, SIL, DIL, R8L - R15L
The first 8 are fetched if REX.R is zero, the last 8 if non-zero.
The following registers should be fetched for instructions without REX
prefix (also sorted by reg encoding index):
AL, CL, DL, BL, AH, CH, DH, BH
Current emulation code doesn't handle accesses to SPL, BPL, SIL, DIL
when REX is present, thefore an instruction 40883e "mov %dil,(%rsi)" is
decoded as "mov %bh,(%rsi)".
That caused an infinite loop in vp_reset:
https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg03293.html
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20181018134401.44471-1-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hyper-V PV IPI support is merged to KVM, enable the feature in Qemu. When
enabled, this allows Windows guests to send IPIs to other vCPUs with a
single hypercall even when there are >64 vCPUs in the request.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20181009130853.6412-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The exception.pad field is going to be renamed to pending in an upcoming
header file update. Remove the unnecessary initialization; it was
introduced to please valgrind (commit 7e680753cf) but they were later
rendered unnecessary by commit 076796f8fd, which added the "= {}"
initializer to the declaration of "events". Therefore the patch does
not change behavior in any way.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch fixes the checking of boundary crossing instructions.
In icount mode only first instruction of the block may cross
the page boundary to keep the translation deterministic.
These conditions already existed, but compared the wrong variable.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <20180920071702.22477.43980.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While at it, also rename var to indicate it is not used only in KVM.
Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
Reviewed-by: Patrick Colp <patrick.colp@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20180914003827.124570-2-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This flag will be used for KVM's nested VMX migration; the HF_GUEST_MASK name
is already used in KVM, adopt it in QEMU as well.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Interrupt handling depends on various flags in env->hflags or env->hflags2,
and the exact detail were not exactly replicated between x86_cpu_has_work
and x86_cpu_exec_interrupt. Create a new function that extracts the
highest-priority non-masked interrupt, and use it in both functions.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
And convert it to a bool to use an existing hole
in the struct.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The AMD IOMMU does not (yet) support interrupt remapping. But
kvm_arch_fixup_msi_route assumes that all implementations do and crashes
when the AMD IOMMU is used in KVM mode.
Fixes: 8b5ed7dffa ("intel_iommu: add support for split irqchip")
Reported-by: Christopher Goldsworthy <christopher.goldsworthy@outlook.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <48ae78d8-58ec-8813-8680-6f407ea46041@siemens.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The capability macros are always defined, since they come from kernel
headers that are copied into the QEMU tree. Remove the unnecessary #ifdefs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The current implementation has three bugs,
* segment limits are not enforced in protected mode if the L bit is set
in the target segment descriptor
* segment limits are not enforced in compatibility mode (ljmp to 32-bit
code segment in long mode)
* #GP(new_cs) is generated rather than #GP(0)
Now the segment limits are enforced if we're not in long mode OR the
target code segment doesn't have the L bit set.
Signed-off-by: Andrew Oates <aoates@google.com>
Message-Id: <20180816011903.39816-1-andrew@andrewoates.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently call gates are always treated as 32-bit gates. In IA-32e mode
(either compatibility or 64-bit submode), system segment descriptors are
always 64-bit. Treating them as 32-bit has the expected unfortunate
effect: only the lower 32 bits of the offset are loaded, the stack
pointer is truncated, a bad new stack pointer is loaded from the TSS (if
switching privilege levels), etc.
This change adds support for 64-bit call gate to the lcall and ljmp
instructions. Additionally, there should be a check for non-canonical
stack pointers, but I've omitted that since there doesn't seem to be
checks for non-canonical addresses in this code elsewhere.
I've left the raise_exception_err_ra lines unwapped at 80 columns to
match the style in the rest of the file.
Signed-off-by: Andrew Oates <aoates@google.com>
Message-Id: <20180819181725.34098-1-andrew@andrewoates.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported by Coverity:
Error: RESOURCE_LEAK (CWE-772): [#def439]
qemu-2.12.0/target/i386/cpu.c:3179: alloc_fn: Storage is returned from allocation function "qdict_new".
qemu-2.12.0/qobject/qdict.c:34:5: alloc_fn: Storage is returned from allocation function "g_malloc0".
qemu-2.12.0/qobject/qdict.c:34:5: var_assign: Assigning: "qdict" = "g_malloc0(4120UL)".
qemu-2.12.0/qobject/qdict.c:37:5: return_alloc: Returning allocated memory "qdict".
qemu-2.12.0/target/i386/cpu.c:3179: var_assign: Assigning: "props" = storage returned from "qdict_new()".
qemu-2.12.0/target/i386/cpu.c:3217: leaked_storage: Variable "props" going out of scope leaks the storage it points to.
This was introduced by commit b8097deb35 ("i386: Improve
query-cpu-model-expansion full mode").
The leak is only theoretical: if ret->model->props is set to
props, the qapi_free_CpuModelExpansionInfo() call will free props
too in case of errors. The only way for this to not happen is if
we enter the default branch of the switch statement, which would
never happen because all CpuModelExpansionType values are being
handled.
It's still worth to change this to make the allocation logic
easier to follow and make the Coverity error go away. To make
everything simpler, initialize ret->model and ret->model->props
earlier in the function.
While at it, remove redundant check for !prop because prop is
always initialized at the beginning of the function.
Fixes: b8097deb35
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180816183509.8231-1-ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Many of these are marked as "intentional/fix required" because they
just need adding a fall through comment. This is exactly what this
patch does, except for target/mips/translate.c where it is easier to
duplicate the code, and hw/audio/sb16.c where I consulted the DOSBox
sources and decide to just remove the LOG_UNIMP before the fallthrough.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Enabling TOPOEXT is always allowed, but it can't be enabled
blindly by "-cpu host" because it may make guests crash if the
rest of the cache topology information isn't provided or isn't
consistent.
This addresses the bug reported at:
https://bugzilla.redhat.com/show_bug.cgi?id=1613277
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180809221852.15285-1-ehabkost@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
New CPU models mostly inherit features from ancestor Skylake, while addin new
features: UMIP, New Instructions ( PCONIFIG (server only), WBNOINVD,
AVX512_VBMI2, GFNI, AVX512_VNNI, VPCLMULQDQ, VAES, AVX512_BITALG),
Intel PT and 5-level paging (Server only). As well as
IA32_PRED_CMD, SSBD support for speculative execution
side channel mitigations.
Note:
For 5-level paging, Guest physical address width can be configured, with
parameter "phys-bits". Unless explicitly specified, we still use its default
value, even for Icelake-Server cpu model.
At present, hold on expose IA32_ARCH_CAPABILITIES to guest, as 1) This MSR
actually presents more than 1 'feature', maintainers are considering expanding current
features presentation of only CPUIDs to MSR bits; 2) a reasonable default value
for MSR_IA32_ARCH_CAPABILITIES needs to settled first. These 2 are actully
beyond Icelake CPU model itself but fundamental. So split these work apart
and do it later.
https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg00774.htmlhttps://lists.gnu.org/archive/html/qemu-devel/2018-07/msg00796.html
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1530781798-183214-6-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
WBNOINVD: Write back and do not invalidate cache, enumerated by
CPUID.(EAX=80000008H, ECX=0):EBX[bit 9].
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1530781798-183214-5-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Support of IA32_PRED_CMD MSR already be enumerated by same CPUID bit as
SPEC_CTRL.
At present, mark CPUID_7_0_EDX_ARCH_CAPABILITIES unmigratable, per Paolo's
comment.
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1530781798-183214-3-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
IA32_PRED_CMD MSR gives software a way to issue commands that affect the state
of indirect branch predictors. Enumerated by CPUID.(EAX=7H,ECX=0):EDX[26].
IA32_ARCH_CAPABILITIES MSR enumerates architectural features of RDCL_NO and
IBRS_ALL. Enumerated by CPUID.(EAX=07H, ECX=0):EDX[29].
https://software.intel.com/sites/default/files/managed/c5/63/336996-Speculative-Execution-Side-Channel-Mitigations.pdf
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1530781798-183214-2-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
MSR_SMI_COUNT started being migrated in QEMU 2.12. Do not migrate it
on older machine types, or the subsection causes a load failure for
guests that use SMM.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rename DCACHE to DATA_CACHE and ICACHE to INSTRUCTION_CACHE.
This avoids conflict with Linux asm/cachectl.h macros and fixes
build failure on mips hosts.
Reported-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180717194010.30096-1-ehabkost@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Hyper-V identifies vCPUs by Virtual Processor (VP) index which can be
queried by the guest via HV_X64_MSR_VP_INDEX msr. It is defined by the
spec as a sequential number which can't exceed the maximum number of
vCPUs per VM.
It has to be owned by QEMU in order to preserve it across migration.
However, the initial implementation in KVM didn't allow to set this
msr, and KVM used its own notion of VP index. Fortunately, the way
vCPUs are created in QEMU/KVM makes it likely that the KVM value is
equal to QEMU cpu_index.
So choose cpu_index as the value for vp_index, and push that to KVM on
kernels that support setting the msr. On older ones that don't, query
the kernel value and assert that it's in sync with QEMU.
Besides, since handling errors from vCPU init at hotplug time is
impossible, disable vCPU hotplug.
This patch also introduces accessor functions to encapsulate the mapping
between a vCPU and its vp_index.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180702134156.13404-3-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>