uintptr_t, or unsigned long which is equivalent on Linux I32LP64 systems,
is an unsigned type and there is no need to further cast to __u64 which is
another unsigned integer type; widening casts from unsigned integers
zero-extend the value.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For PC-relative translation blocks, env->eip changes during the
execution of a translation block, Therefore, QEMU must be able to
recover an instruction's PC just from the TranslationBlock struct and
the instruction data with. Because a TB will not span two pages, QEMU
stores all the low bits of EIP in the instruction data and replaces them
in x86_restore_state_to_opc. Bits 12 and higher (which may vary between
executions of a PCREL TB, since these only use the physical address in
the hash key) are kept unmodified from env->eip. The assumption is that
these bits of EIP, unlike bits 0-11, will not change as the translation
block executes.
Unfortunately, this is incorrect when the CS base is not aligned to a page.
Then the linear address of the instructions (i.e. the one with the
CS base addred) indeed will never span two pages, but bits 12+ of EIP
can actually change. For example, if CS base is 0x80262200 and EIP =
0x6FF4, the first instruction in the translation block will be at linear
address 0x802691F4. Even a very small TB will cross to EIP = 0x7xxx,
while the linear addresses will remain comfortably within a single page.
The fix is simply to use the low bits of the linear address for data[0],
since those don't change. Then x86_restore_state_to_opc uses tb->cs_base
to compute a temporary linear address (referring to some unknown
instruction in the TB, but with the correct values of bits 12 and higher);
the low bits are replaced with data[0], and EIP is obtained by subtracting
again the CS base.
Huge thanks to Mark Cave-Ayland for the image and initial debugging,
and to Gitlab user @kjliew for help with bisecting another occurrence
of (hopefully!) the same bug.
It should be relatively easy to write a testcase that performs MMIO on
an EIP with different bits 12+ than the first instruction of the translation
block; any help is welcome.
Fixes: e3a79e0e87 ("target/i386: Enable TARGET_TB_PCREL", 2022-10-11)
Cc: qemu-stable@nongnu.org
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Richard Henderson <richard.henderson@linaro.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1759
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1964
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2012
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The PCREL patches introduced a bug when updating EIP in the !CF_PCREL case.
Using s->pc in func gen_update_eip_next() solves the problem.
Cc: qemu-stable@nongnu.org
Fixes: b5e0d5d22f ("target/i386: Fix 32-bit wrapping of pc/eip computation")
Signed-off-by: guoguangyao <guoguangyao18@mails.ucas.ac.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240115020804.30272-1-guoguangyao18@mails.ucas.ac.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With PCREL, we have a page-relative view of EIP, and an
approximation of PC = EIP+CSBASE that is good enough to
detect page crossings. If we try to recompute PC after
masking EIP, we will mess up that approximation and write
a corrupt value to EIP.
We already handled masking properly for PCREL, so the
fix in b5e0d5d2 was only needed for the !PCREL path.
Cc: qemu-stable@nongnu.org
Fixes: b5e0d5d22f ("target/i386: Fix 32-bit wrapping of pc/eip computation")
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240101230617.129349-1-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The name "iothread" is overloaded. Use the term Big QEMU Lock (BQL)
instead, it is already widely used and unambiguous.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20240102153529.486531-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The name "iothread" is overloaded. Use the term Big QEMU Lock (BQL)
instead, it is already widely used and unambiguous.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20240102153529.486531-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The Big QEMU Lock (BQL) has many names and they are confusing. The
actual QemuMutex variable is called qemu_global_mutex but it's commonly
referred to as the BQL in discussions and some code comments. The
locking APIs, however, are called qemu_mutex_lock_iothread() and
qemu_mutex_unlock_iothread().
The "iothread" name is historic and comes from when the main thread was
split into into KVM vcpu threads and the "iothread" (now called the main
loop thread). I have contributed to the confusion myself by introducing
a separate --object iothread, a separate concept unrelated to the BQL.
The "iothread" name is no longer appropriate for the BQL. Rename the
locking APIs to:
- void bql_lock(void)
- void bql_unlock(void)
- bool bql_locked(void)
There are more APIs with "iothread" in their names. Subsequent patches
will rename them. There are also comments and documentation that will be
updated in later patches.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Fabiano Rosas <farosas@suse.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Acked-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20240102153529.486531-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Use generic cpu_model_from_type() when the CPU model name needs to
be extracted from the CPU type name.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20231114235628.534334-23-gshan@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The main difficulty here is that a page fault when writing to the destination
must not overwrite the flags. Therefore, the flags computation must be
inlined instead of using gen_jcc1*.
For simplicity, I am using an unconditional cmpxchg operation, that becomes
a NOP if the comparison fails.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ALU instructions can write to both memory and flags. If the CC_SRC*
and CC_DST locations have been written already when a memory access
causes a fault, the value in CC_SRC* and CC_DST might be interpreted
with the wrong CC_OP (the one that is in effect before the instruction.
Besides just using the wrong result for the flags, something like
subtracting -1 can have disastrous effects if the current CC_OP is
CC_OP_EFLAGS: this is because QEMU does not expect bits outside the ALU
flags to be set in CC_SRC, and env->eflags can end up set to all-ones.
In the case of the attached testcase, this sets IOPL to 3 and would
cause an assertion failure if SUB is moved to the new decoder.
This mechanism is not really needed for BMI instructions, which can
only write to a register, but put it to use anyway for cleanliness.
In the case of BZHI, the code has to be modified slightly to ensure
that decode->cc_src is written, otherwise the new assertions trigger.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_jcc() has been changed to accept a relative offset since the
new decoder was written. Adjust the J operand, which is meant
to be used with jump instructions such as gen_jcc(), to not
include the program counter and to not truncate the result, as
both operations are now performed by common code.
The result is that J is now the same as the I operand.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Similar to gen_setcc1, make gen_cmovcc1 receive TCGv. This is more friendly
to simultaneous implementation in the old and the new decoder.
A small wart is that s->T0 of CMOV is currently the *second* argument (which
would ordinarily be in T1). Therefore, the condition has to be inverted in
order to overwrite s->T0 with cpu_regs[reg] if the MOV is not performed.
This only applies to the old decoder, and this code will go away soon.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do not use gen_op, and pull the load from the accumulator into
disas_insn.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Create a new temporary, to ease the register allocator's work.
Creation of the temporary is pushed into gen_ext_tl, which
also allows NULL as the first parameter now.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Just create a temporary for the occasion.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new x86 decoder wants the gen_* functions to compute EFLAGS before
writeback, which can be an issue for instructions with a memory
destination such as ARPL or shifts.
Extract code to compute the EFLAGS without clobbering CC_SRC, in case
the memory write causes a fault. The flags writeback mechanism will
take care of copying the result to CC_SRC.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new decoder would rather have the operand in T0 when expanding SCAS, rather
than use R_EAX directly as gen_scas currently does. This makes SCAS more similar
to CMP and SUB, in that CC_DST = T0 - T1.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new decoder likes to compute the address in A0 very early, so the
gen_lea_v_seg in gen_pop_T0 would clobber the address of the memory
operand. Instead use T0 since it is already available and will be
overwritten immediately after.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
decode->mem is only used if one operand has has_ea == true. String
operations will not use decode->mem and will load A0 on their own, because
they are the only case of two memory operands in a single instruction.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Usually the registers are just moved into s->T0 without much care for
their operand size. However, in some cases we can get more efficient
code if the operand fetching logic syncs with the emission function
on what is nicer.
All the current uses are mostly demonstrative and only reduce the code
in the emission functions, because the instructions do not support
memory operands. However the logic is generic and applies to several
more instructions such as MOVSXD (aka movslq), one-byte shift
instructions, multiplications, XLAT, and indirect calls/jumps.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
X86_SPECIAL_ZExtOp0 and X86_SPECIAL_ZExtOp2 are poorly named; they are a hack
that is needed by scalar insertion and extraction instructions, and not really
related to zero extension: for PEXTR the zero extension is done by the generation
functions, for PINSR the high bits are not used at all and in fact are *not*
filled with zeroes when loaded into s->T1.
Rename the values to match the effect described in the manual, and explain
better in the comments.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use _tl operations for 32-bit operands on 32-bit targets, and only go
through trunc and extu ops for 64-bit targets. While the trunc/ext
ops should be pretty much free after optimization, the optimizer also
does not like having the same temporary used in multiple EBBs.
Therefore it is nicer to not use tmpN* unless necessary.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The previous check erroneously allowed CMP to be modified with LOCK.
Instead, tag explicitly the instructions that do support LOCK.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
cpu_cc_compute_all() has an argument that is always equal to CC_OP for historical
reasons (dating back to commit a7812ae412, "TCG variable type checking.", 2008-11-17,
which added the argument to helper_cc_compute_all). It does not make sense for the
argument to have any other value, so remove it and clean up some lines that are not
too long anymore.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_lea_v_seg (called by gen_add_A0_ds_seg) already zeroes any
bits of s->A0 beyond s->aflag. It does so before summing the
segment base and, if not in 64-bit mode, also after summing it.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
is_int is always 1, and error_code is always zero.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
OF is equal to the carry flag, so use the same CCPrepare.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Take advantage of the fact that there can be no 1 bits between SF and OF.
If they were adjacent, you could sum SF and get a carry only if SF was
already set. Then the value of OF in the sum is the XOR of OF itself,
the carry (which is SF) and 0 (the value of the OF bit in the addend):
this is OF^SF exactly.
Because OF and SF are not adjacent, just place more 1 bits to the
left so that the carry propagates, which means summing CC_O - CC_S.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20231221031652.119827-9-richard.henderson@linaro.org>
vCPU "reset" is only possible with system emulation.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-ID: <20231212113640.30287-5-philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
In 32-bit mode, pc = eip + cs_base is also 32-bit, and must wrap.
Failure to do so results in incorrect memory exceptions to the guest.
Before 732d548732, this was implicitly done via truncation to
target_ulong but only in qemu-system-i386, not qemu-system-x86_64.
To fix this, we must add conditional zero-extensions.
Since we have to test for 32 vs 64-bit anyway, note that cs_base
is always zero in 64-bit mode.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2022
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20231212172510.103305-1-richard.henderson@linaro.org>
Commit 7191f24c7f ("accel/kvm/kvm-all: Handle register access errors")
added error checking for KVM_SET_SREGS/KVM_SET_SREGS2. In doing so, it
exposed a long-running bug in current KVM support for SEV-ES where the
kernel assumes that MSR_EFER_LMA will be set explicitly by the guest
kernel, in which case EFER write traps would result in KVM eventually
seeing MSR_EFER_LMA get set and recording it in such a way that it would
be subsequently visible when accessing it via KVM_GET_SREGS/etc.
However, guest kernels currently rely on MSR_EFER_LMA getting set
automatically when MSR_EFER_LME is set and paging is enabled via
CR0_PG_MASK. As a result, the EFER write traps don't actually expose the
MSR_EFER_LMA bit, even though it is set internally, and when QEMU
subsequently tries to pass this EFER value back to KVM via
KVM_SET_SREGS* it will fail various sanity checks and return -EINVAL,
which is now considered fatal due to the aforementioned QEMU commit.
This can be addressed by inferring the MSR_EFER_LMA bit being set when
paging is enabled and MSR_EFER_LME is set, and synthesizing it to ensure
the expected bits are all present in subsequent handling on the host
side.
Ultimately, this handling will be implemented in the host kernel, but to
avoid breaking QEMU's SEV-ES support when using older host kernels, the
same handling can be done in QEMU just after fetching the register
values via KVM_GET_SREGS*. Implement that here.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Lara Lazier <laramglazier@gmail.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: <kvm@vger.kernel.org>
Fixes: 7191f24c7f ("accel/kvm/kvm-all: Handle register access errors")
Signed-off-by: Michael Roth <michael.roth@amd.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231206155821.1194551-1-michael.roth@amd.com>
Improve
$ qemu-system-x86_64 -device max-x86_64-cpu,vendor=me
qemu-system-x86_64: -device max-x86_64-cpu,vendor=me: Property '.vendor' doesn't take value 'me'
to
qemu-system-x86_64: -device max-x86_64-cpu,vendor=0123456789abc: value of property 'vendor' must consist of exactly 12 characters
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231031111059.3407803-8-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[Typo corrected]
In the nr_threads' comment, specify it represents the
number of threads in the "core" to avoid confusion.
Also add comment for nr_dies in CPUX86State.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20231024090323.1859210-5-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
From CPUState.nr_cores' comment, it represents "number of cores within
this CPU package".
After 003f230e37 ("machine: Tweak the order of topology members in
struct CpuTopology"), the meaning of smp.cores changed to "the number of
cores in one die", but this commit missed to change CPUState.nr_cores'
calculation, so that CPUState.nr_cores became wrong and now it
misses to consider numbers of clusters and dies.
At present, only i386 is using CPUState.nr_cores.
But as for i386, which supports die level, the uses of CPUState.nr_cores
are very confusing:
Early uses are based on the meaning of "cores per package" (before die
is introduced into i386), and later uses are based on "cores per die"
(after die's introduction).
This difference is due to that commit a94e142899 ("target/i386: Add
CPUID.1F generation support for multi-dies PCMachine") misunderstood
that CPUState.nr_cores means "cores per die" when calculated
CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
wrong understanding.
With the influence of 003f230e37 and a94e142899, for i386 currently
the result of CPUState.nr_cores is "cores per die", thus the original
uses of CPUState.cores based on the meaning of "cores per package" are
wrong when multiple dies exist:
1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
incorrect because it expects "cpus per package" but now the
result is "cpus per die".
2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
EAX[bits 31:26] is incorrect because they expect "cpus per package"
but now the result is "cpus per die". The error not only impacts the
EAX calculation in cache_info_passthrough case, but also impacts other
cases of setting cache topology for Intel CPU according to cpu
topology (specifically, the incoming parameter "num_cores" expects
"cores per package" in encode_cache_cpuid4()).
3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
15:00] is incorrect because the EBX of 0BH.01H (core level) expects
"cpus per package", which may be different with 1FH.01H (The reason
is 1FH can support more levels. For QEMU, 1FH also supports die,
1FH.01H:EBX[bits 15:00] expects "cpus per die").
4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
calculated, here "cpus per package" is expected to be checked, but in
fact, now it checks "cpus per die". Though "cpus per die" also works
for this code logic, this isn't consistent with AMD's APM.
5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
"cpus per package" but it obtains "cpus per die".
6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
package", but in these functions, it obtains "cpus per die" and
"cores per die".
On the other hand, these uses are correct now (they are added in/after
a94e142899):
1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
meets the actual meaning of CPUState.nr_cores ("cores per die").
2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
04H's calculation) considers number of dies, so it's correct.
3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
15:00] needs "cpus per die" and it gets the correct result, and
CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
When CPUState.nr_cores is correctly changed to "cores per package" again
, the above errors will be fixed without extra work, but the "currently"
correct cases will go wrong and need special handling to pass correct
"cpus/cores per die" they want.
Fix CPUState.nr_cores' calculation to fit the original meaning "cores
per package", as well as changing calculation of topo_info.cores_per_die,
vcpus_per_socket and CPUID.1FH.
Fixes: a94e142899 ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
Fixes: 003f230e37 ("machine: Tweak the order of topology members in struct CpuTopology")
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20231024090323.1859210-4-zhao1.liu@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The OBJECT_DECLARE_CPU_TYPE() macro forward-declares each
ArchCPUClass type. These forward declarations are sufficient
for code in hw/ to use the QOM definitions. No need to expose
these structure definitions. Keep each local to their target/
by moving them to the corresponding "cpu.h" header.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231013140116.255-13-philmd@linaro.org>
While the default "info lapic" always synchronizes cpu state ...
mon_get_cpu()
-> mon_get_cpu_sync(mon, true)
-> cpu_synchronize_state(cpu)
-> ioctl KVM_GET_LAPIC (taking KVM as example)
... the cpu state is not synchronized when the apic-id is available as
argument.
The cpu state should be synchronized when apic-id is available. Otherwise
the "info lapic <apic-id>" always returns stale data.
Reference:
https://lore.kernel.org/all/20211028155457.967291-19-berrange@redhat.com/
Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Message-ID: <20231030085336.2681386-1-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231026211938.162815-1-dongli.zhang@oracle.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230918160257.30127-4-philmd@linaro.org>
Follow the naming used by other files in target/i386/.
No functional changes.
Suggested-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20231020111136.44401-4-philmd@linaro.org>
Follow the naming used by other files in target/i386/.
No functional changes.
Suggested-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20231020111136.44401-3-philmd@linaro.org>
Follow C style guidelines and use CPUState forward
declaration from "qemu/typedefs.h".
No functional changes.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20231020111136.44401-2-philmd@linaro.org>
When CPUArchState* is available (here CPUX86State*), we can
use the fast env_archcpu() macro to get ArchCPU* (here X86CPU*).
The QOM cast X86_CPU() macro will be slower when building with
--enable-qom-cast-debug.
Pass CPUX86State* as argument to simulate_rdmsr / simulate_wrmsr
instead of a CPUState* to avoid an extra cast.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20231009110239.66778-7-philmd@linaro.org>
We already have 'x86_cpu = X86_CPU(cpu)'. Use the variable
instead of doing another QOM cast with X86_CPU().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20231009110239.66778-6-philmd@linaro.org>
Hegerogeneous code needs access to the FOO_CPU_TYPE_NAME()
macro to resolve target CPU types. Move the declaration
(along with the required FOO_CPU_TYPE_SUFFIX) to "cpu-qom.h".
"target/foo/cpu-qom.h" is supposed to be target agnostic
(include-able by any target). Add such mention in the
header.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231013140116.255-7-philmd@linaro.org>
Enforce the style described by commit 067109a11c ("docs/devel:
mention the spacing requirement for QOM"):
The first declaration of a storage or class structure should
always be the parent and leave a visual space between that
declaration and the new code. It is also useful to separate
backing for properties (options driven by the user) and internal
state to make navigation easier.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20231013140116.255-2-philmd@linaro.org>
The primary console is special because the toolstack maps a page into
the guest for its ring, and also allocates the guest-side event channel.
The guest's grant table is even primed to export that page using a known
grant ref#. Add support for all that in emulated mode, so that we can
have a primary console.
For reasons unclear, the backends running under real Xen don't just use
a mapping of the well-known GNTTAB_RESERVED_CONSOLE grant ref (which
would also be in the ring-ref node in XenStore). Instead, the toolstack
sets the ring-ref node of the primary console to the GFN of the guest
page. The backend is expected to handle that special case and map it
with foreignmem operations instead.
We don't have an implementation of foreignmem ops for emulated Xen mode,
so just make it map GNTTAB_RESERVED_CONSOLE instead. This would probably
work for real Xen too, but we can't work out how to make real Xen create
a primary console of type "ioemu" to make QEMU drive it, so we can't
test that; might as well leave it as it is for now under Xen.
Now at last we can boot the Xen PV shim and run PV kernels in QEMU.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This will allow Linux guests (since v6.0) to use the per-vCPU upcall
vector delivered as MSI through the local APIC.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
A guest which has configured the per-vCPU upcall vector may set the
HVM_PARAM_CALLBACK_IRQ param to fairly much anything other than zero.
For example, Linux v6.0+ after commit b1c3497e604 ("x86/xen: Add support
for HVMOP_set_evtchn_upcall_vector") will just do this after setting the
vector:
/* Trick toolstack to think we are enlightened. */
if (!cpu)
rc = xen_set_callback_via(1);
That's explicitly setting the delivery to GSI#1, but it's supposed to be
overridden by the per-vCPU vector setting. This mostly works in Qemu
*except* for the logic to enable the in-kernel handling of event channels,
which falsely determines that the kernel cannot accelerate GSI delivery
in this case.
Add a kvm_xen_has_vcpu_callback_vector() to report whether vCPU#0 has
the vector set, and use that in xen_evtchn_set_callback_param() to
enable the kernel acceleration features even when the param *appears*
to be set to target a GSI.
Preserve the Xen behaviour that when HVM_PARAM_CALLBACK_IRQ is set to
*zero* the event channel delivery is disabled completely. (Which is
what that bizarre guest behaviour is working round in the first place.)
Cc: qemu-stable@nongnu.org
Fixes: 91cce75617 ("hw/xen: Add xen_evtchn device for event channel emulation")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The per-vCPU upcall vector support had three problems. Firstly it was
using the wrong hypercall argument and would always return -EFAULT when
the guest tried to set it up. Secondly it was using the wrong ioctl() to
pass the vector to the kernel and thus the *kernel* would always return
-EINVAL. Finally, even when delivering the event directly from userspace
with an MSI, it put the destination CPU ID into the wrong bits of the
MSI address.
Linux doesn't (yet) use this mode so it went without decent testing
for a while.
Cc: qemu-stable@nongnu.org
Fixes: 105b47fdf2 ("i386/xen: implement HVMOP_set_evtchn_upcall_vector")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This confuses lscpu into thinking it's running in PVH mode.
Cc: qemu-stable@nongnu.org
Fixes: bedcc13924 ("i386/xen: implement HYPERVISOR_xen_version")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This was introduced in KVM in Linux 2.6.33, we can require it
unconditionally. KVM_CLOCK_TSC_STABLE was only added in Linux 4.9,
for now do not require it (though it would allow the removal of some
pretty yucky code).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This was introduced in KVM in Linux 2.6.36, and could already be used at
the time to save/restore FPU data even on older processor. We can require
it unconditionally and stop using KVM_GET/SET_FPU.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This was introduced in KVM in Linux 3.5, we can require it unconditionally
in kvm_irqchip_send_msi(). However, not all architectures have to implement
it so check it only in x86, the only architecture that ever had MSI injection
but not KVM_CAP_SIGNAL_MSI.
ARM uses it to detect the presence of the ITS emulation in the kernel,
introduced in Linux 4.8. Assume that it's there and possibly fail when
realizing the arm-its-kvm device.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
PAE mode in x86 supports 36 bit address space. Check the PAE CPUID on the
guest processor and set phys_bits to 36 if PAE feature is set. This is in
addition to checking the presence of PSE36 CPUID feature for setting 36 bit
phys_bits.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20230912120650.371781-1-anisinha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instructions in VEX exception class 6 generally look at the value of
VEX.W. Note that the manual places some instructions incorrectly in
class 4, for example VPERMQ which has no non-VEX encoding and no legacy
SSE analogue. AMD does a mess of its own, as documented in the comment
that this patch adds.
Most of them are checked for VEX.W=0, and are listed in the manual
(though with an omission) in table 2-16; VPERMQ and VPERMPD check for
VEX.W=1, which is only listed in the instruction description. Others,
such as VPSRLV, VPSLLV and the FMA3 instructions, use VEX.W to switch
between a 32-bit and 64-bit operation.
Fix more of the class 4/class 6 mismatches, and implement the check for
VEX.W in TCG.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In preparation for adding more similar checks, move the VEX.L=0 check
and several X86_SPECIAL_* checks to a new field, where each bit represent
a common check on unused bits, or a restriction on the processor mode.
Likewise, many SVM intercepts can be checked during the decoding phase,
the main exception being the selective CR0 write, MSR and IOIO intercepts.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The implementation was validated with OpenSSL and with the test vectors in
https://github.com/rust-lang/stdarch/blob/master/crates/core_arch/src/x86/sha.rs.
The instructions provide a ~25% improvement on hashing a 64 MiB file:
runtime goes down from 1.8 seconds to 1.4 seconds; instruction count on
the host goes down from 5.8 billion to 4.8 billion with slightly better
IPC too. Good job Intel. ;)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Modify migrate_add_blocker and migrate_del_blocker to take an Error **
reason. This allows migration to own the Error object, so that if
an error occurs in migrate_add_blocker, migration code can free the Error
and clear the client handle, simplifying client code. It also simplifies
the migrate_del_blocker call site.
In addition, this is a pre-requisite for a proposed future patch that would
add a mode argument to migration requests to support live update, and
maintain a list of blockers for each mode. A blocker may apply to a single
mode or to multiple modes, and passing Error** will allow one Error object
to be registered for multiple modes.
No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Tested-by: Michael Galaxy <mgalaxy@akamai.com>
Reviewed-by: Michael Galaxy <mgalaxy@akamai.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1697634216-84215-1-git-send-email-steven.sistare@oracle.com>
When explicitly booting a multiple vcpus vm with "-cpu +ht", it gets
warning of
warning: host doesn't support requested feature: CPUID.01H:EDX.ht [bit 28]
Make CPUID_HT as supported unconditionally can resolve the warning.
However it introduces another issue that it also expose CPUID_HT to
guest when "-cpu host/max" with only 1 vcpu. To fix this, need mark
CPUID_HT as the no_autoenable_flags.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20231010060539.210258-1-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit a908985971 ("target/i386/seg_helper: introduce tss_set_busy",
2023-09-26) failed to use the tss_selector argument of the new function,
which was therefore unused.
This shows up as a #GP fault when booting old versions of 32-bit
Linux.
Fixes: a908985971 ("target/i386/seg_helper: introduce tss_set_busy", 2023-09-26)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20231011135350.438492-1-pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
All implementations of gdb_arch_name() returns dynamic duplicates of
static strings. It's also unlikely that there will be an implementation
of gdb_arch_name() that returns a truly dynamic value due to the nature
of the function returning a well-known identifiers. Qualify the value
gdb_arch_name() with const and make all of its implementations return
static strings.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230912224107.29669-8-akihiko.odaki@daynix.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231009164104.369749-15-alex.bennee@linaro.org>
Finish the convertion started with commit de6cd7599b
("meson: Replace softmmu_ss -> system_ss"). If the
$target_type is 'system', then use the target_system_arch[]
source set :)
Mechanical change doing:
$ sed -i -e s/target_softmmu_arch/target_system_arch/g \
$(git grep -l target_softmmu_arch)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004090629.37473-13-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Software MMU is TCG specific. Here 'softmmu' is misused
for system emulation. Anyhow, since KVM is system emulation
specific, just rename as 'i386_kvm_ss'.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004090629.37473-10-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since we *might* have user emulation with softmmu,
replace the system emulation check by !user emulation one.
(target/ was cleaned from invalid CONFIG_SOFTMMU uses at
commit cab35c73be, but these files were merged few days
after, thus missed the cleanup.)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004082239.27251-1-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
hvf_get_supported_cpuid() is only defined for x86 targets
(in target/i386/hvf/x86_cpuid.c).
Its declaration is pointless on all other targets.
All the calls to it in target/i386/cpu.c are guarded by
a call on hvf_enabled(), so are elided when HVF is not
built in. Therefore we can remove the unnecessary function
stub.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004092510.39498-3-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The only non standard type -- CPUArchState -- is forward
declared in "qemu/typedefs.h", so no particular header is
required here.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Roman Bolshakov <roman@roolebo.dev>
Tested-by: Roman Bolshakov <roman@roolebo.dev>
Message-ID: <20231004092510.39498-2-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This function is now empty, so remove it. In the case of
m68k and tricore, this empties the class instance initfn,
so remove those as well.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Allow the name 'cpu_env' to be used for something else.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Retain the separate structure to emphasize its importance.
Enforce CPUArchState always follows CPUState without padding.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The omission of alignment has technically been wrong since
269bd5d8f6, where QEMU_ALIGNED was added to CPUTLBDescFast.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
accel_cpu_realize() is a generic function working with CPUs
from any target. Rename it using '_common_' to emphasis it is
not target specific.
Suggested-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231003123026.99229-4-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The AccelCPUClass::cpu_realizefn handler is meant for target
specific code, rename it using '_target_' to emphasis it.
Suggested-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20231003123026.99229-3-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We use the '*fn' suffix for handlers, this is a public method.
Drop the suffix.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20231003123026.99229-2-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Just remove the declaration. There is nothing in the function after the
switch statement, so it is safe to do.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Gather Data Sampling (GDS) is a side-channel attack using Gather
instructions. Some Intel processors will set ARCH_CAP_GDS_NO bit in
MSR IA32_ARCH_CAPABILITIES to report that they are not vulnerable to
GDS.
Make this bit available to guests.
Closes: https://lore.kernel.org/qemu-devel/CAMGffEmG6TNq0n3+4OJAgXc8J0OevY60KHZekXCBs3LoK9vehA@mail.gmail.com/
Reported-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Tested-by: Jack Wang <jinpu.wang@ionos.com>
Tested-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Message-ID: <fde42d81ce454477ca8e27d5429a190b7366fe86.1692074650.git.pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On parts that enumerate IA32_VMX_BASIC MSR bit as 1, any exception vector
can be delivered with or without an error code if the other consistency
checks are satisfied.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use generic routine for 64-bit carry-less multiply.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
x86_cpu_get_supported_cpuid() is generic and handles the different
accelerators. Use it instead of kvm_arch_get_supported_cpuid().
That fixes a link failure introduced by commit 3adce820cf
("target/i386: Remove unused KVM stubs") when QEMU is configured
as:
$ ./configure --cc=clang \
--target-list=x86_64-linux-user,x86_64-softmmu \
--enable-debug
We were getting:
[71/71] Linking target qemu-x86_64
FAILED: qemu-x86_64
/usr/bin/ld: libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o: in function `cpu_x86_cpuid':
cpu.c:(.text+0x1374): undefined reference to `kvm_arch_get_supported_cpuid'
/usr/bin/ld: libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o: in function `x86_cpu_filter_features':
cpu.c:(.text+0x81c2): undefined reference to `kvm_arch_get_supported_cpuid'
/usr/bin/ld: cpu.c:(.text+0x81da): undefined reference to `kvm_arch_get_supported_cpuid'
/usr/bin/ld: cpu.c:(.text+0x81f2): undefined reference to `kvm_arch_get_supported_cpuid'
/usr/bin/ld: cpu.c:(.text+0x820a): undefined reference to `kvm_arch_get_supported_cpuid'
/usr/bin/ld: libqemu-x86_64-linux-user.fa.p/target_i386_cpu.c.o:cpu.c:(.text+0x8225): more undefined references to `kvm_arch_get_supported_cpuid' follow
clang: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.
For the record, this is because '--enable-debug' disables
optimizations (CFLAGS=-O0).
While at this (un)optimization level GCC eliminate the
following dead code (CPP output of mentioned build):
static void x86_cpu_get_supported_cpuid(uint32_t func, uint32_t index,
uint32_t *eax, uint32_t *ebx,
uint32_t *ecx, uint32_t *edx)
{
if ((0)) {
*eax = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_EAX);
*ebx = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_EBX);
*ecx = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_ECX);
*edx = kvm_arch_get_supported_cpuid(kvm_state, func, index, R_EDX);
} else if (0) {
*eax = 0;
*ebx = 0;
*ecx = 0;
*edx = 0;
} else {
*eax = 0;
*ebx = 0;
*ecx = 0;
*edx = 0;
}
Clang does not (see commit 2140cfa51d "i386: Fix build by
providing stub kvm_arch_get_supported_cpuid()").
Cc: qemu-stable@nongnu.org
Fixes: 3adce820cf ("target/i386: Remove unused KVM stubs")
Reported-by: Kevin Wolf <kwolf@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230913093009.83520-4-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
x86_cpu_get_supported_cpuid() already checks for KVM/HVF
accelerators, so it is not needed to manually check it via
a call to accel_uses_host_cpuid() before calling it.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230913093009.83520-3-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In case more code is added after the kvm_hyperv_expand_features()
call, check its return value (since it can fail).
Fixes: 071ce4b03b ("i386: expand Hyper-V features during CPU feature expansion time")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230913093009.83520-2-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* target/i386: fix BQL handling of the legacy FERR interrupts
* target/i386: fix memory operand size for CVTPS2PD
* target/i386: Add support for AMX-COMPLEX in CPUID enumeration
* compile plugins on Darwin
* configure and meson cleanups
* drop mkvenv support for Python 3.7 and Debian10
* add wrap file for libblkio
* tweak KVM stubs
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmT5t6UUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMmjwf+MpvVuq+nn+3PqGUXgnzJx5ccA5ne
O9Xy8+1GdlQPzBw/tPovxXDSKn3HQtBfxObn2CCE1tu/4uHWpBA1Vksn++NHdUf2
P0yoHxGskJu5iYYTtIcNw5cH2i+AizdiXuEjhfNjqD5Y234cFoHnUApt9e3zBvVO
cwGD7WpPuSb4g38hHkV6nKcx72o7b4ejDToqUVZJ2N+RkddSqB03fSdrOru0hR7x
V+lay0DYdFszNDFm05LJzfDbcrHuSryGA91wtty7Fzj6QhR/HBHQCUZJxMB5PI7F
Zy4Zdpu60zxtSxUqeKgIi7UhNFgMcax2Hf9QEqdc/B4ARoBbboh4q4u8kQ==
=dH7/
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* only build util/async-teardown.c when system build is requested
* target/i386: fix BQL handling of the legacy FERR interrupts
* target/i386: fix memory operand size for CVTPS2PD
* target/i386: Add support for AMX-COMPLEX in CPUID enumeration
* compile plugins on Darwin
* configure and meson cleanups
* drop mkvenv support for Python 3.7 and Debian10
* add wrap file for libblkio
* tweak KVM stubs
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmT5t6UUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMmjwf+MpvVuq+nn+3PqGUXgnzJx5ccA5ne
# O9Xy8+1GdlQPzBw/tPovxXDSKn3HQtBfxObn2CCE1tu/4uHWpBA1Vksn++NHdUf2
# P0yoHxGskJu5iYYTtIcNw5cH2i+AizdiXuEjhfNjqD5Y234cFoHnUApt9e3zBvVO
# cwGD7WpPuSb4g38hHkV6nKcx72o7b4ejDToqUVZJ2N+RkddSqB03fSdrOru0hR7x
# V+lay0DYdFszNDFm05LJzfDbcrHuSryGA91wtty7Fzj6QhR/HBHQCUZJxMB5PI7F
# Zy4Zdpu60zxtSxUqeKgIi7UhNFgMcax2Hf9QEqdc/B4ARoBbboh4q4u8kQ==
# =dH7/
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 07 Sep 2023 07:44:37 EDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (51 commits)
docs/system/replay: do not show removed command line option
subprojects: add wrap file for libblkio
sysemu/kvm: Restrict kvm_pc_setup_irq_routing() to x86 targets
sysemu/kvm: Restrict kvm_has_pit_state2() to x86 targets
sysemu/kvm: Restrict kvm_get_apic_state() to x86 targets
sysemu/kvm: Restrict kvm_arch_get_supported_cpuid/msr() to x86 targets
target/i386: Restrict declarations specific to CONFIG_KVM
target/i386: Allow elision of kvm_hv_vpindex_settable()
target/i386: Allow elision of kvm_enable_x2apic()
target/i386: Remove unused KVM stubs
target/i386/cpu-sysemu: Inline kvm_apic_in_kernel()
target/i386/helper: Restrict KVM declarations to system emulation
hw/i386/fw_cfg: Include missing 'cpu.h' header
hw/i386/pc: Include missing 'cpu.h' header
hw/i386/pc: Include missing 'sysemu/tcg.h' header
Revert "mkvenv: work around broken pip installations on Debian 10"
mkvenv: assume presence of importlib.metadata
Python: Drop support for Python 3.7
configure: remove dead code
meson: list leftover CONFIG_* symbols
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
kvm_pc_setup_irq_routing() is only defined for x86 targets (in
hw/i386/kvm/apic.c). Its declaration is pointless on all
other targets.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-14-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_has_pit_state2() is only defined for x86 targets (in
target/i386/kvm/kvm.c). Its declaration is pointless on
all other targets. Have it return a boolean.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-13-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_get_apic_state() is only defined for x86 targets (in
hw/i386/kvm/apic.c). Its declaration is pointless on all
other targets.
Since we include "linux-headers/asm-x86/kvm.h", no need
to forward-declare 'struct kvm_lapic_state'.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-12-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_arch_get_supported_cpuid() / kvm_arch_get_supported_msr_feature()
are only defined for x86 targets (in target/i386/kvm/kvm.c). Their
declarations are pointless on other targets.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-11-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Keep the function accessed by target/i386/ and hw/i386/
exposed, restrict the ones accessed by target/i386/kvm/.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-10-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Call kvm_enabled() before kvm_hv_vpindex_settable()
to let the compiler elide its call.
kvm-stub.c is now empty, remove it.
Suggested-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-9-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Call kvm_enabled() before kvm_enable_x2apic() to let the compiler elide
its call. Cleanup the code by simplifying "!xen_enabled() &&
kvm_enabled()" to just "kvm_enabled()".
Suggested-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-8-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All these functions:
- kvm_arch_get_supported_cpuid()
- kvm_has_smm(()
- kvm_hyperv_expand_features()
- kvm_set_max_apic_id()
are called after checking for kvm_enabled(), which is
false when KVM is not built. Since the compiler elides
these functions, their stubs are not used and can be
removed.
Inspired-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-7-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In order to have cpu-sysemu.c become accelerator-agnostic,
inline kvm_apic_in_kernel() -- which is a simple wrapper
to kvm_irqchip_in_kernel() -- and use the generic "sysemu/kvm.h"
header.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-6-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
User emulation doesn't need any KVM declarations.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230904124325.79040-5-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Latest Intel platform GraniteRapids-D introduces AMX-COMPLEX, which adds
two instructions to perform matrix multiplication of two tiles containing
complex elements and accumulate the results into a packed single precision
tile.
AMX-COMPLEX is enumerated via CPUID.(EAX=7,ECX=1):EDX[bit 8]. Add the CPUID
definition for AMX-COMPLEX, AMX-COMPLEX will be enabled automatically when
using '-cpu host' and KVM advertises AMX-COMPLEX to userspace.
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20230830074324.84059-1-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CVTPS2PD only loads a half-register for memory, unlike the other
operations under 0x0F 0x5A. "Unpack" the group into separate
emission functions instead of using gen_unary_fp_sse.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CVTPS2PD only loads a half-register for memory, like CVTPH2PS. It can
reuse the "ph" packed half-precision size to load a half-register,
but rename it to "xh" because it is now a variation of "x" (it is not
used only for half-precision values).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All these files access the CPU LD/ST API declared in "exec/cpu_ldst.h".
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230828221314.18435-4-philmd@linaro.org>
By default, C function prototypes declared in headers are visible,
so there is no need to declare them as 'extern' functions.
Remove this redundancy in a single bulk commit; do not modify:
- meson.build (used to check function availability at runtime)
- pc-bios/
- libdecnumber/
- tests/
- *.c
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230605175647.88395-5-philmd@linaro.org>
HAX is deprecated since commits 73741fda6c ("MAINTAINERS: Abort
HAXM maintenance") and 90c167a1da ("docs/about/deprecated: Mark
HAXM in QEMU as deprecated"), released in v8.0.0.
Per the latest HAXM release (v7.8 [*]), the latest QEMU supported
is v7.2:
Note: Up to this release, HAXM supports QEMU from 2.9.0 to 7.2.0.
The next commit (https://github.com/intel/haxm/commit/da1b8ec072)
added:
HAXM v7.8.0 is our last release and we will not accept
pull requests or respond to issues after this.
It became very hard to build and test HAXM. Its previous
maintainers made it clear they won't help. It doesn't seem to be
a very good use of QEMU maintainers to spend their time in a dead
project. Save our time by removing this orphan zombie code.
[*] https://github.com/intel/haxm/releases/tag/v7.8.0
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230831082016.60885-1-philmd@linaro.org>
Current QEMU can expose waitpkg to guests when it is available. However,
VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE is still not recognized and
masked by QEMU. This can lead to an unexpected situation when a L1
hypervisor wants to expose waitpkg to a L2 guest. The L1 hypervisor can
assume that VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE exists as waitpkg is
available. The L1 hypervisor then can accidentally expose waitpkg to the
L2 guest. This will cause invalid opcode exception in the L2 guest when
it executes waitpkg related instructions.
This patch adds VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE support, and
sets up dependency between the bit and CPUID_7_0_ECX_WAITPKG. QEMU should
not expose waitpkg feature if VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE is
not available to avoid unexpected invalid opcode exception in L2 guests.
Signed-off-by: Ake Koomsin <ake@igel.co.jp>
Message-ID: <20230807093339.32091-2-ake@igel.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Changes the signature of the target-defined functions for
inserting/removing hvf hw breakpoints. The address and length arguments
are now of vaddr type, which both matches the type used internally in
accel/hvf/hvf-all.c and makes the api target-agnostic.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230807155706.9580-5-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Changes the signature of the target-defined functions for
inserting/removing kvm hw breakpoints. The address and length arguments
are now of vaddr type, which both matches the type used internally in
accel/kvm/kvm-all.c and makes the api target-agnostic.
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230807155706.9580-4-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
kvm_arch_get_default_type() returns the default KVM type. This hook is
particularly useful to derive a KVM type that is valid for "none"
machine model, which is used by libvirt to probe the availability of
KVM.
For MIPS, the existing mips_kvm_type() is reused. This function ensures
the availability of VZ which is mandatory to use KVM on the current
QEMU.
Cc: qemu-stable@nongnu.org
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230727073134.134102-2-akihiko.odaki@daynix.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: added doc comment for new function]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
When CR0.TS=1, execution of x87 FPU, MMX, and some SSE instructions will
cause a Device Not Available (DNA) exception (#NM). System software uses
this exception event to lazily context switch FPU state.
Before this patch, enter_mmx helpers may be generated just before #NM
generation, prematurely resetting FPU state before the guest has a
chance to save it.
Signed-off-by: Matt Borgerson <contact@mborgerson.com>
Message-ID: <CADc=-s5F10muEhLs4f3mxqsEPAHWj0XFfOC2sfFMVHrk9fcpMg@mail.gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Coverity points out (CID 1507534, 1507968) that we sometimes access
env->xen_singleshot_timer_ns under the protection of
env->xen_timers_lock and sometimes not.
This isn't always an issue. There are two modes for the timers; if the
kernel supports the EVTCHN_SEND capability then it handles all the timer
hypercalls and delivery internally, and all we use the field for is to
get/set the timer as part of the vCPU state via an ioctl(). If the
kernel doesn't have that support, then we do all the emulation within
qemu, and *those* are the code paths where we actually care about the
locking.
But it doesn't hurt to be a little bit more consistent and avoid having
to explain *why* it's OK.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20230801175747.145906-3-dwmw2@infradead.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This implements the AESDEC instruction.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This implements the AESENC instruction.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This implements the AESIMC instruction.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This implements the AESDECLAST instruction.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This implements the AESENCLAST instruction.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The GraniteRapids CPU model mainly adds the following new features
based on SapphireRapids:
- PREFETCHITI CPUID.(EAX=7,ECX=1):EDX[bit 14]
- AMX-FP16 CPUID.(EAX=7,ECX=1):EAX[bit 21]
And adds the following security fix for corresponding vulnerabilities:
- MCDT_NO CPUID.(EAX=7,ECX=2):EDX[bit 5]
- SBDR_SSDP_NO MSR_IA32_ARCH_CAPABILITIES[bit 13]
- FBSDP_NO MSR_IA32_ARCH_CAPABILITIES[bit 14]
- PSDP_NO MSR_IA32_ARCH_CAPABILITIES[bit 15]
- PBRSB_NO MSR_IA32_ARCH_CAPABILITIES[bit 24]
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20230706054949.66556-7-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SapphireRapids has bit 13, 14 and 15 of MSR_IA32_ARCH_CAPABILITIES
enabled, which are related to some security fixes.
Add version 2 of SapphireRapids CPU model with those bits enabled also.
Signed-off-by: Lei Wang <lei4.wang@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Message-ID: <20230706054949.66556-6-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, bit 13, 14, 15 and 24 of MSR_IA32_ARCH_CAPABILITIES are
disclosed for fixing security issues, so add those bit definitions.
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-ID: <20230706054949.66556-5-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MCDT_NO bit indicates HW contains the security fix and doesn't need to
be mitigated to avoid data-dependent behaviour for certain instructions.
It needs no hypervisor support. Treat it as supported regardless of what
KVM reports.
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20230706054949.66556-4-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID.(EAX=7,ECX=2):EDX[bit 5] enumerates MCDT_NO. Processors enumerate
this bit as 1 do not exhibit MXCSR Configuration Dependent Timing (MCDT)
behavior and do not need to be mitigated to avoid data-dependent behavior
for certain instructions.
Since MCDT_NO is in a new sub-leaf, add a new CPUID feature word
FEAT_7_2_EDX. Also update cpuid_level_func7 by FEAT_7_2_EDX.
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20230706054949.66556-3-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If FEAT_7_1_EAX is 0 and FEAT_7_1_EDX is non-zero, as is the case
with a Granite Rapids host and
'-cpu host,-avx-vnni,-avx512-bf16,-fzrm,-fsrs,-fsrc,-amx-fp16', we can't
get CPUID_7_1 leaf even though CPUID_7_1_EDX has non-zero value.
Update cpuid_level_func7 according to CPUID_7_1_EDX, otherwise
guest may report wrong maximum number sub-leaves in leaf 07H.
Fixes: eaaa197d5b ("target/i386: Add support for AVX-VNNI-INT8 in CPUID enumeration")
Cc: qemu-stable@nongnu.org
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20230706054949.66556-2-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Fix backwards time with -icount auto
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmSdRiQUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOqcwf9FGAqZ+0V34Y8XeXMu8Es3bFjEKG8
t3BpVNhTBOYDPvpshnPVx2I29nRT2opc1C4YkjMAv5/1nivj1kDM7hDObOSJQvqy
5FgTsJYqRtGj+J7uVBrspWZsP8BYeykKmXR6deBOPvCuw5nnLdDQ3dLV2F26lKUu
lsFyEVbi4dzf8+TVuNIXEg7mVBYytjBQwBmmHgeOofeikjq9WEudr49mwJMCHyzl
iXCatnctXGKZYSnp+eHIBiFRdSzjqdgrDRa0ysSqABoBI1pmkhyQKSay6cSjfG4n
gFlqPF/i9RqAWpsQrM1IMGgPK39SrT2dYlHDJV2P/NEQrS6kLh2HoW/ArQ==
=oj3B
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* Make named CPU models usable for qemu-{i386,x86_64}
* Fix backwards time with -icount auto
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmSdRiQUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOqcwf9FGAqZ+0V34Y8XeXMu8Es3bFjEKG8
# t3BpVNhTBOYDPvpshnPVx2I29nRT2opc1C4YkjMAv5/1nivj1kDM7hDObOSJQvqy
# 5FgTsJYqRtGj+J7uVBrspWZsP8BYeykKmXR6deBOPvCuw5nnLdDQ3dLV2F26lKUu
# lsFyEVbi4dzf8+TVuNIXEg7mVBYytjBQwBmmHgeOofeikjq9WEudr49mwJMCHyzl
# iXCatnctXGKZYSnp+eHIBiFRdSzjqdgrDRa0ysSqABoBI1pmkhyQKSay6cSjfG4n
# gFlqPF/i9RqAWpsQrM1IMGgPK39SrT2dYlHDJV2P/NEQrS6kLh2HoW/ArQ==
# =oj3B
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 29 Jun 2023 10:51:48 AM CEST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
target/i386: emulate 64-bit ring 0 for linux-user if LM feature is set
target/i386: ignore CPL0-specific features in user mode emulation
target/i386: ignore ARCH_CAPABILITIES features in user mode emulation
target/i386: Export MSR_ARCH_CAPABILITIES bits to guests
icount: don't adjust virtual time backwards after warp
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
32-bit binaries can run on a long mode processor even if the kernel
is 64-bit, of course, and this can have slightly different behavior;
for example, SYSCALL is allowed on Intel processors.
Allow reporting LM to programs running under user mode emulation,
so that "-cpu" can be used with named CPU models even for qemu-i386
and even without disabling LM by hand.
Fortunately, most of the runtime code in QEMU has to depend on HF_LMA_MASK
or on HF_CS64_MASK (which is anyway false for qemu-i386's 32-bit code
segment) rather than TARGET_X86_64, therefore all that is needed is an
update of linux-user's ring 0 setup.
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1534
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Features such as PCID are only accessible through privileged operations,
and therefore have no impact on any user-mode operation. Allow reporting
them to programs running under user mode emulation, so that "-cpu" can be
used with more named CPU models.
XSAVES would be similar, but it doesn't make sense to provide it until
XSAVEC is implemented.
With this change, all CPUs up to Broadwell-v4 can be emulate. Skylake-Client
requires XSAVEC, while EPYC also requires SHA-NI, MISALIGNSSE and TOPOEXT.
MISALIGNSSE is not hard to implement, but I am not sure it is worth using
a precious hflags bit for it.
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1534
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ARCH_CAPABILITIES is only accessible through a read-only MSR, so it has
no impact on any user-mode operation (user-mode cannot read the MSR).
So do not bother printing warnings about it in user mode emulation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On Intel CPUs there are certain bits in MSR_ARCH_CAPABILITIES that
indicates if the CPU is not affected by a vulnerability. Without these
bits guests may try to deploy the mitigation even if the CPU is not
affected.
Export the bits to guests that indicate immunity to hardware
vulnerabilities.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Message-ID: <63d85cc76d4cdc51e6c732478b81d8f13be11e5a.1687551881.git.pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add MEMORY_LISTNER_PRIORITY_ACCEL for the symbolic value for the memory
listener to replace the hard-coded value 10 for accel.
No functional change intended.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <feebe423becc6e2aa375f59f6abce9a85bc15abb.1687279702.git.isaku.yamahata@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
003f230e37 ("machine: Tweak the order of topology members in struct
CpuTopology") changes the meaning of MachineState.smp.cores from "the
number of cores in one package" to "the number of cores in one die"
and doesn't fix other uses of MachineState.smp.cores. And because of
the introduction of cluster, now smp.cores just means "the number of
cores in one cluster". This clearly does not fit the semantics here.
And before this error message, WHvSetPartitionProperty() is called to
set prop.ProcessorCount.
So the error message should show the prop.ProcessorCount other than
"cores per cluster" or "cores per package".
Cc: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230529124331.412822-1-zhao1.liu@linux.intel.com>
[PMD: Use '%u' format for ProcessorCount]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
We want all accelerators to share the same opaque pointer in
CPUState.
Rename the 'hvf_vcpu_state' structure as 'AccelCPUState'.
Use the generic 'accel' field of CPUState instead of 'hvf'.
Replace g_malloc0() by g_new0() for readability.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230624174121.11508-17-philmd@linaro.org>
Most of the codebase uses 'CPUState *cpu' or 'CPUState *cs'.
While 'cpu_state' is kind of explicit, it makes the code
harder to review. Simply rename as 'cs'.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230624174121.11508-16-philmd@linaro.org>
No need for this helper to access the CPUState::accel field.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230624174121.11508-15-philmd@linaro.org>
We want all accelerators to share the same opaque pointer in
CPUState. Rename WHPX 'whpx_vcpu' as 'AccelCPUState'; use
the typedef.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230624174121.11508-14-philmd@linaro.org>
No need for this helper to access the CPUState::accel field.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230624174121.11508-12-philmd@linaro.org>
We want all accelerators to share the same opaque pointer in
CPUState. Rename NVMM 'qemu_vcpu' as 'AccelCPUState'; directly
use the typedef, remove unnecessary casts.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230624174121.11508-11-philmd@linaro.org>
hThread variable is only used by the HAX accelerator,
so move it to the accelerator specific context.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230624174121.11508-9-philmd@linaro.org>
We want all accelerators to share the same opaque pointer in
CPUState. Start with the HAX context, renaming its forward
declarated structure 'hax_vcpu_state' as 'AccelCPUState'.
Document the CPUState field. Directly use the typedef.
Remove the amusing but now unnecessary casts in NVMM / WHPX.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230624174121.11508-8-philmd@linaro.org>
All accelerators will share a single opaque context
in CPUState. Start by renaming 'hax_vcpu' as 'accel'.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230624174121.11508-7-philmd@linaro.org>
When the vCPU thread finished its processing, destroy
it and signal its destruction to generic vCPU management
layer.
Add a sanity check for the vCPU accelerator context.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230624174121.11508-6-philmd@linaro.org>
hThread is only used on the error path in hax_kick_vcpu_thread().
Fixes: b0cb0a66d6 ("Plumb the HAXM-based hardware acceleration support")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230624174121.11508-5-philmd@linaro.org>
On Windows hosts, cpu->hThread is assigned but never accessed:
remove it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230624174121.11508-4-philmd@linaro.org>
Since MinGW commit 395dcfdea ("rename hyper-v headers and def
files to lower case") [*], WinHvPlatform.h and WinHvEmulation.h
got respectively renamed as winhvplatform.h / winhvemulation.h.
The mingw64-headers package included in the Fedora version we
use for CI does include this commit; and meson fails to detect
these present-but-renamed headers while cross-building (on
case-sensitive filesystems).
Use the renamed header in order to detect and successfully
cross-build with the WHPX accelerator.
Note, on Windows hosts, the libraries are still named as
WinHvPlatform.dll and WinHvEmulation.dll, so we don't bother
renaming the definitions used by load_whp_dispatch_fns() in
target/i386/whpx/whpx-all.c.
[*] https://sourceforge.net/p/mingw-w64/mingw-w64/ci/395dcfdea
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230624142211.8888-3-philmd@linaro.org>
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230621135633.1649-4-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
AMD supports both 32-bit and 64-bit SYSCALL/SYSRET, but the TCG only
exposes it for 64-bit targets. For system emulation just reuse the
helper; for user-mode emulation the ABI is the same as "int $80".
The BSDs does not support any fast system call mechanism in 32-bit
mode so add to bsd-user the same stub that FreeBSD has for 64-bit
compatibility mode.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
RDPID corresponds to a RDMSR(TSC_AUX); however, it is unprivileged
so for user-mode emulation we must provide the value that the kernel
places in the MSR. For Linux, it is a combination of the current CPU
and the current NUMA node, both of which can be retrieved with getcpu(2).
Also try sched_getcpu(), which might be there on the BSDs. If there is
no portable way to retrieve the current CPU id from userspace, return 0.
RDTSCP is reimplemented as RDTSC + RDPID ECX; the differences in terms
of serializability are not relevant to QEMU.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
WBNOINVD is the same as INVD or WBINVD as far as TCG is concerned,
since there is no cache in TCG and therefore no invalidation side effect
in WBNOINVD.
With respect to SVM emulation, processors that do not support WBNOINVD
will ignore the prefix and treat it as WBINVD, while those that support
it will generate exactly the same vmexit.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
XSAVEERPTR is actually a fix for an errata; TCG does not have the issue.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
TCG implements RDSEED, and in fact uses qcrypto_random_bytes which is
secure enough to match hardware behavior. Expose it to guests.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The AMD prefetch(w) instructions have not been deprecated together with the rest
of 3DNow!, and in fact are even supported by newer Intel processor. Mark them
as supported by TCG, as it supports all of 3DNow!.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Due to a typo or perhaps a brain fart, the INVD vmexit was never generated.
Fix it (but not that fixing just the typo would break both INVD and WBINVD,
due to a case of two wrongs making a right).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We use the user_ss[] array to hold the user emulation sources,
and the softmmu_ss[] array to hold the system emulation ones.
Hold the latter in the 'system_ss[]' array for parity with user
emulation.
Mechanical change doing:
$ sed -i -e s/softmmu_ss/system_ss/g $(git grep -l softmmu_ss)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-10-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since we *might* have user emulation with softmmu,
use the clearer 'CONFIG_SYSTEM_ONLY' key to check
for system emulation.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-9-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since cpu_mmu_index() is well-defined for user-only,
we can remove the surrounding #ifdef'ry entirely.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-2-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since commit 139c1837db ("meson: rename included C source files
to .c.inc"), QEMU standard procedure for included C files is to
use *.c.inc.
Besides, since commit 6a0057aa22 ("docs/devel: make a statement
about includes") this is documented as the Coding Style:
If you do use template header files they should be named with
the ``.c.inc`` or ``.h.inc`` suffix to make it clear they are
being included for expansion.
Therefore move the included templates in the tcg/ directory and
rename as '.h.inc'.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230608133108.72655-5-philmd@linaro.org>
Move the #ifdef'ry inside do_cpu_init() instead of
declaring an empty stub for user emulation.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230602224628.59546-3-philmd@linaro.org>
Since commit 604664726f ("target/i386: Restrict cpu_exec_interrupt()
handler to sysemu"), do_cpu_sipi() isn't called anymore on user
emulation. Remove the now pointless stub.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230602224628.59546-2-philmd@linaro.org>
Guests can now be debugged through the gdbstub. Support is added for
single-stepping, software breakpoints, hardware breakpoints and
watchpoints. The code has been structured like the KVM counterpart.
While guest debugging is enabled, the guest can still read and write the
DBG*_EL1 registers but they don't have any effect.
Signed-off-by: Francesco Cagnin <fcagnin@quarkslab.com>
Message-id: 20230601153107.81955-5-fcagnin@quarkslab.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Required for guest debugging. The code has been structured like the KVM
counterpart.
Signed-off-by: Francesco Cagnin <fcagnin@quarkslab.com>
Message-id: 20230601153107.81955-4-fcagnin@quarkslab.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
New wrapper around gen_io_start which takes care of the USE_ICOUNT
check, as well as marking the DisasContext to end the TB.
Remove exec/gen-icount.h.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will enable replacement of TARGET_INSN_START_WORDS in tcg.c.
Split out "tcg/insn-start-words.h" and use it in target/.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In preparation for compiling tcg/ only once, eliminate
the all_helpers array. Instantiate the info structs for
the generic helpers in accel/tcg/, and the structs for
the target-specific helpers in each translate.c.
Since we don't see all of the info structs at startup,
initialize at first use, using g_once_init_* to make
sure we don't race while doing so.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This had been pulled in from exec/cpu_ldst.h, via exec/exec-all.h,
but the include of tcg.h will be removed.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Based on the kernel commit "b0563468ee x86/CPU/AMD: Disable XSAVES on
AMD family 0x17", host system with EPYC-Rome can clear XSAVES capability
bit. In another words, EPYC-Rome host without XSAVES can occur. Thus, we
need an EPYC-Rome cpu model (without this feature) that matches the
solution of fixing this erratum
Signed-off-by: Maksim Davydov <davydov-max@yandex-team.ru>
Message-Id: <20230524213748.8918-1-davydov-max@yandex-team.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Before this change, MOVNTPS and MOVNTPD were labeled as Exception Class
4 (only requiring alignment for legacy SSE instructions). This changes
them to Exception Class 1 (always requiring memory alignment), as
documented in the Intel manual.
Message-Id: <20230501111428.95998-3-ricky@rzhou.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix the exception classes for some SSE/AVX instructions to match what is
documented in the Intel manual.
These changes are expected to have no functional effect on the behavior
that qemu implements (primarily >= 16-byte memory alignment checks). For
instance, since qemu does not implement the AC flag, there is no
difference in behavior between Exception Classes 4 and 5 for
instructions where the SSE version only takes <16 byte memory operands.
Message-Id: <20230501111428.95998-2-ricky@rzhou.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Adds some comments describing what instructions correspond to decoding
table entries and fixes some existing comments which named the wrong
instruction.
Message-Id: <20230501111428.95998-1-ricky@rzhou.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
vzeroall: xmm_regs should be used instead of xmm_t0
vpermdq: bit 3 and 7 of imm should be considered
Signed-off-by: Xinyu Li <lixinyu20s@ict.ac.cn>
Message-Id: <20230510145222.586487-1-lixinyu20s@ict.ac.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Compared to other SSE instructions, VUCOMISx and VCOMISx are different:
the single and double precision versions are distinguished through a
prefix, however they use no-prefix and 0x66 for SS and SD respectively.
Scalar values usually are associated with 0xF2 and 0xF3.
Because of these, they incorrectly perform a 128-bit memory load instead
of a 32- or 64-bit load. Fix this by writing a custom decoding function.
I tested that the reproducer is fixed and the test-avx output does not
change.
Reported-by: Gabriele Svelto <gsvelto@mozilla.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1637
Fixes: f8d19eec0d ("target/i386: reimplement 0x0f 0x28-0x2f, add AVX", 2022-10-18)
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As reported by the Intel's doc:
"FB_CLEAR: The processor will overwrite fill buffer values as part of
MD_CLEAR operations with the VERW instruction.
On these processors, L1D_FLUSH does not overwrite fill buffer values."
If this cpu feature is present in host, allow QEMU to choose whether to
show it to the guest too.
One disadvantage of not exposing it is that the guest will report
a non existing vulnerability in
/sys/devices/system/cpu/vulnerabilities/mmio_stale_data
because the mitigation is present only when the cpu has
(FLUSH_L1D and MD_CLEAR) or FB_CLEAR
features enabled.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20230201135759.555607-3-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As reported by Intel's doc:
"L1D_FLUSH: Writeback and invalidate the L1 data cache"
If this cpu feature is present in host, allow QEMU to choose whether to
show it to the guest too.
One disadvantage of not exposing it is that the guest will report
a non existing vulnerability in
/sys/devices/system/cpu/vulnerabilities/mmio_stale_data
because the mitigation is present only when the cpu has
(FLUSH_L1D and MD_CLEAR) or FB_CLEAR
features enabled.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20230201135759.555607-2-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Adds the support for AMD EPYC Genoa generation processors. The model
display for the new processor will be EPYC-Genoa.
Adds the following new feature bits on top of the feature bits from
the previous generation EPYC models.
avx512f : AVX-512 Foundation instruction
avx512dq : AVX-512 Doubleword & Quadword Instruction
avx512ifma : AVX-512 Integer Fused Multiply Add instruction
avx512cd : AVX-512 Conflict Detection instruction
avx512bw : AVX-512 Byte and Word Instructions
avx512vl : AVX-512 Vector Length Extension Instructions
avx512vbmi : AVX-512 Vector Byte Manipulation Instruction
avx512_vbmi2 : AVX-512 Additional Vector Byte Manipulation Instruction
gfni : AVX-512 Galois Field New Instructions
avx512_vnni : AVX-512 Vector Neural Network Instructions
avx512_bitalg : AVX-512 Bit Algorithms, add bit algorithms Instructions
avx512_vpopcntdq: AVX-512 AVX-512 Vector Population Count Doubleword and
Quadword Instructions
avx512_bf16 : AVX-512 BFLOAT16 instructions
la57 : 57-bit virtual address support (5-level Page Tables)
vnmi : Virtual NMI (VNMI) allows the hypervisor to inject the NMI
into the guest without using Event Injection mechanism
meaning not required to track the guest NMI and intercepting
the IRET.
auto-ibrs : The AMD Zen4 core supports a new feature called Automatic IBRS.
It is a "set-and-forget" feature that means that, unlike e.g.,
s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS mitigation
resources automatically across CPL transitions.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <20230504205313.225073-8-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the following featute bits.
vnmi: Virtual NMI (VNMI) allows the hypervisor to inject the NMI into the
guest without using Event Injection mechanism meaning not required to
track the guest NMI and intercepting the IRET.
The presence of this feature is indicated via the CPUID function
0x8000000A_EDX[25].
automatic-ibrs :
The AMD Zen4 core supports a new feature called Automatic IBRS.
It is a "set-and-forget" feature that means that, unlike e.g.,
s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS mitigation
resources automatically across CPL transitions.
The presence of this feature is indicated via the CPUID function
0x80000021_EAX[8].
The documention for the features are available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
Revision B1 Processors
b. AMD64 Architecture Programmer’s Manual Volumes 1–5 Publication No. Revision
40332 4.05 Date October 2022
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Link: https://www.amd.com/system/files/TechDocs/40332_4.05.pdf
Message-Id: <20230504205313.225073-7-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the following feature bits for EPYC-Milan model and bump the version.
vaes : Vector VAES(ENC|DEC), VAES(ENC|DEC)LAST instruction support
vpclmulqdq : Vector VPCLMULQDQ instruction support
stibp-always-on : Single Thread Indirect Branch Prediction Mode has enhanced
performance and may be left Always on
amd-psfd : Predictive Store Forward Disable
no-nested-data-bp : Processor ignores nested data breakpoints
lfence-always-serializing : LFENCE instruction is always serializing
null-sel-clr-base : Null Selector Clears Base. When this bit is
set, a null segment load clears the segment base
These new features will be added in EPYC-Milan-v2. The "-cpu help" output
after the change will be.
x86 EPYC-Milan (alias configured by machine type)
x86 EPYC-Milan-v1 AMD EPYC-Milan Processor
x86 EPYC-Milan-v2 AMD EPYC-Milan Processor
The documentation for the features are available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
Revision B1 Processors
b. SECURITY ANALYSIS OF AMD PREDICTIVE STORE FORWARDING
c. AMD64 Architecture Programmer’s Manual Volumes 1–5 Publication No. Revision
40332 4.05 Date October 2022
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Link: https://www.amd.com/system/files/documents/security-analysis-predictive-store-forwarding.pdf
Link: https://www.amd.com/system/files/TechDocs/40332_4.05.pdf
Message-Id: <20230504205313.225073-6-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the following feature bits.
no-nested-data-bp : Processor ignores nested data breakpoints.
lfence-always-serializing : LFENCE instruction is always serializing.
null-sel-cls-base : Null Selector Clears Base. When this bit is
set, a null segment load clears the segment base.
The documentation for the features are available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
Revision B1 Processors
b. AMD64 Architecture Programmer’s Manual Volumes 1–5 Publication No. Revision
40332 4.05 Date October 2022
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Link: https://www.amd.com/system/files/TechDocs/40332_4.05.pdf
Message-Id: <20230504205313.225073-5-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the following feature bits.
amd-psfd : Predictive Store Forwarding Disable:
PSF is a hardware-based micro-architectural optimization
designed to improve the performance of code execution by
predicting address dependencies between loads and stores.
While SSBD (Speculative Store Bypass Disable) disables both
PSF and speculative store bypass, PSFD only disables PSF.
PSFD may be desirable for the software which is concerned
with the speculative behavior of PSF but desires a smaller
performance impact than setting SSBD.
Depends on the following kernel commit:
b73a54321ad8 ("KVM: x86: Expose Predictive Store Forwarding Disable")
stibp-always-on :
Single Thread Indirect Branch Prediction mode has enhanced
performance and may be left always on.
The documentation for the features are available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
Revision B1 Processors
b. SECURITY ANALYSIS OF AMD PREDICTIVE STORE FORWARDING
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://www.amd.com/system/files/documents/security-analysis-predictive-store-forwarding.pdf
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Message-Id: <20230504205313.225073-4-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce new EPYC cpu versions: EPYC-v4 and EPYC-Rome-v3.
The only difference vs. older models is an updated cache_info with
the 'complex_indexing' bit unset, since this bit is not currently
defined for AMD and may cause problems should it be used for
something else in the future. Setting this bit will also cause
CPUID validation failures when running SEV-SNP guests.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20230504205313.225073-3-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
New EPYC CPUs versions require small changes to their cache_info's.
Because current QEMU x86 CPU definition does not support versioned
cach_info, we would have to declare a new CPU type for each such case.
To avoid the dup work, add "cache_info" in X86CPUVersionDefinition",
to allow new cache_info pointers to be specified for a new CPU version.
Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20230504205313.225073-2-babu.moger@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Latest Intel platform Granite Rapids has introduced a new instruction -
PREFETCHIT0/1, which moves code to memory (cache) closer to the
processor depending on specific hints.
The bit definition:
CPUID.(EAX=7,ECX=1):EDX[bit 14]
Add CPUID definition for PREFETCHIT0/1.
Signed-off-by: Jiaxi Chen <jiaxi.chen@linux.intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20230303065913.1246327-7-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
AVX-NE-CONVERT is a new set of instructions which can convert low
precision floating point like BF16/FP16 to high precision floating point
FP32, as well as convert FP32 elements to BF16. This instruction allows
the platform to have improved AI capabilities and better compatibility.
The bit definition:
CPUID.(EAX=7,ECX=1):EDX[bit 5]
Add CPUID definition for AVX-NE-CONVERT.
Signed-off-by: Jiaxi Chen <jiaxi.chen@linux.intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20230303065913.1246327-6-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
AVX-VNNI-INT8 is a new set of instructions in the latest Intel platform
Sierra Forest, aims for the platform to have superior AI capabilities.
This instruction multiplies the individual bytes of two unsigned or
unsigned source operands, then adds and accumulates the results into the
destination dword element size operand.
The bit definition:
CPUID.(EAX=7,ECX=1):EDX[bit 4]
AVX-VNNI-INT8 is on a new feature bits leaf. Add a CPUID feature word
FEAT_7_1_EDX for this leaf.
Add CPUID definition for AVX-VNNI-INT8.
Signed-off-by: Jiaxi Chen <jiaxi.chen@linux.intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20230303065913.1246327-5-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
AVX-IFMA is a new instruction in the latest Intel platform Sierra
Forest. This instruction packed multiplies unsigned 52-bit integers and
adds the low/high 52-bit products to Qword Accumulators.
The bit definition:
CPUID.(EAX=7,ECX=1):EAX[bit 23]
Add CPUID definition for AVX-IFMA.
Signed-off-by: Jiaxi Chen <jiaxi.chen@linux.intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20230303065913.1246327-4-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Latest Intel platform Granite Rapids has introduced a new instruction -
AMX-FP16, which performs dot-products of two FP16 tiles and accumulates
the results into a packed single precision tile. AMX-FP16 adds FP16
capability and allows a FP16 GPU trained model to run faster without
loss of accuracy or added SW overhead.
The bit definition:
CPUID.(EAX=7,ECX=1):EAX[bit 21]
Add CPUID definition for AMX-FP16.
Signed-off-by: Jiaxi Chen <jiaxi.chen@linux.intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20230303065913.1246327-3-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CMPccXADD is a new set of instructions in the latest Intel platform
Sierra Forest. This new instruction set includes a semaphore operation
that can compare and add the operands if condition is met, which can
improve database performance.
The bit definition:
CPUID.(EAX=7,ECX=1):EAX[bit 7]
Add CPUID definition for CMPCCXADD.
Signed-off-by: Jiaxi Chen <jiaxi.chen@linux.intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20230303065913.1246327-2-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Update the setting of CPUID 0x8000001F EBX to clearly document the ranges
associated with fields being set.
Fixes: 6cb8f2a663 ("cpu/i386: populate CPUID 0x8000_001F when SEV is active")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <5822fd7d02b575121380e1f493a8f6d9eba2b11a.1664550870.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The value of the reduced-phys-bits parameter is propogated to the CPUID
information exposed to the guest. Update the current validation check to
account for the size of the CPUID field (6-bits), ensuring the value is
in the range of 1 to 63.
Maintain backward compatibility, to an extent, by allowing a value greater
than 1 (so that the previously documented value of 5 still works), but not
allowing anything over 63.
Fixes: d8575c6c02 ("sev/i386: add command to initialize the memory encryption context")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <cca5341a95ac73f904e6300f10b04f9c62e4e8ff.1664550870.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We want to get rid of the "#ifdef TARGET_X86_64" compile-time switch
in the long run, so we can drop the separate compilation of the
"qemu-system-i386" binary one day - but we then still need a way to
run a guest with max. CPU settings in 32-bit mode. So the "max" CPU
should determine its family/model/stepping settings according to the
"large mode" (LM) CPU feature bit during runtime, so that it is
possible to run "qemu-system-x86_64 -cpu max,lm=off" and still get
a sane family/model/stepping setting for the guest CPU.
To be able to check the LM bit, we have to move the code that sets
up these properties to a "realize" function, since the LM setting is
not available yet when the "instance_init" function is being called.
Message-Id: <20230306154311.476458-1-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The previous patch wrongly replaced FEAT_XSAVE_XCR0_{LO|HI} with
FEAT_XSAVE_XSS_{LO|HI} in CPUID(EAX=12,ECX=1):{ECX,EDX}. As a result,
SGX enclaves only supported SSE and x87 feature (xfrm=0x3).
Fixes: 301e90675c ("target/i386: Enable support for XSAVES based features")
Signed-off-by: Yang Zhong <yang.zhong@linux.intel.com>
Reviewed-by: Yang Weijiang <weijiang.yang@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Message-Id: <20230406064041.420039-1-yang.zhong@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Coverity complains (CID 1507880) that the declaration "int error_code;"
in mmu_translate() is unreachable code. Since this is only a declaration,
this isn't actually a bug, but:
* it's a bear-trap for future changes, because if it was changed to
include an initialization 'int error_code = foo;' then the
initialization wouldn't actually happen (being dead code)
* it's against our coding style, which wants declarations to be
at the start of blocks
* it means that anybody reading the code has to go and look up
exactly what the C rules are for skipping over variable declarations
using a goto
Move the declaration to the top of the function.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230406155946.3362077-1-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This had been pulled in via qemu/plugin.h from hw/core/cpu.h,
but that will be removed.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230310195252.210956-5-richard.henderson@linaro.org>
[AJB: add various additional cases shown by CI]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230315174331.2959-15-alex.bennee@linaro.org>
Reviewed-by: Emilio Cota <cota@braap.org>
Upstream commit ddf0fd9ae1 "hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback"
added kvm_xen_maybe_deassert_callback usage to target/i386/kvm/kvm.c file without
conditional preprocessing check. This breaks any build not using CONFIG_XEN_EMU.
Protect call by conditional preprocessing to allow build without CONFIG_XEN_EMU.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20230308130557.2420-1-mrezanin@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All uses are strictly read-only. Most of the obviously so,
as direct arguments to gen_helper_*.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230306175230.7110-9-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
- split user and softmmu code
- use cleaner headers for tb_flush, target_ulong
- probe for gdb multiarch support at configure
- make syscall handling target independent
- add update guest debug of accel ops
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmQHomMACgkQ+9DbCVqe
KkTtFAf/aEyKY0iUNxtB4/oV1L2VnLaZi+iKoZT4RQgrhOhzr5WV6/3/V05cw1RJ
SIwcl8wB4gowYILs44eM/Qzcixiugl++2rvM4YVXiQyWKzkH6sY4X2iFuPGTwHLp
y+E7RM77QNS7M9xYaVkdsQawnbsgjG67wZKbb88aaekFEn61UuDg1V2Nqa2ICy7Y
/8yGIB2ixDfXOF0z4g8NOG44BXTDBtJbcEzf5GMz6D4HGnPZUbENy1nT0OcBk3zK
PqKPNkPFZ360pqA9MtougjZ3xTBb7Afe9nRRMquV2RoFmkkY2otSjdPBFQu5GBlm
NyTXEzjIQ6tCZlbS0eqdPVrUHHUx9g==
=Al36
-----END PGP SIGNATURE-----
Merge tag 'pull-gdbstub-070323-3' of https://gitlab.com/stsquad/qemu into staging
gdbstub refactor:
- split user and softmmu code
- use cleaner headers for tb_flush, target_ulong
- probe for gdb multiarch support at configure
- make syscall handling target independent
- add update guest debug of accel ops
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmQHomMACgkQ+9DbCVqe
# KkTtFAf/aEyKY0iUNxtB4/oV1L2VnLaZi+iKoZT4RQgrhOhzr5WV6/3/V05cw1RJ
# SIwcl8wB4gowYILs44eM/Qzcixiugl++2rvM4YVXiQyWKzkH6sY4X2iFuPGTwHLp
# y+E7RM77QNS7M9xYaVkdsQawnbsgjG67wZKbb88aaekFEn61UuDg1V2Nqa2ICy7Y
# /8yGIB2ixDfXOF0z4g8NOG44BXTDBtJbcEzf5GMz6D4HGnPZUbENy1nT0OcBk3zK
# PqKPNkPFZ360pqA9MtougjZ3xTBb7Afe9nRRMquV2RoFmkkY2otSjdPBFQu5GBlm
# NyTXEzjIQ6tCZlbS0eqdPVrUHHUx9g==
# =Al36
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Mar 2023 20:45:23 GMT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-gdbstub-070323-3' of https://gitlab.com/stsquad/qemu: (30 commits)
gdbstub: move update guest debug to accel ops
gdbstub: Build syscall.c once
stubs: split semihosting_get_target from system only stubs
gdbstub: Adjust gdb_do_syscall to only use uint32_t and uint64_t
gdbstub: Remove gdb_do_syscallv
gdbstub: split out softmmu/user specifics for syscall handling
include: split target_long definition from cpu-defs
testing: probe gdb for supported architectures ahead of time
gdbstub: only compile gdbstub twice for whole build
gdbstub: move syscall handling to new file
gdbstub: move register helpers into standalone include
gdbstub: don't use target_ulong while handling registers
gdbstub: fix address type of gdb_set_cpu_pc
gdbstub: specialise stub_can_reverse
gdbstub: introduce gdb_get_max_cpus
gdbstub: specialise target_memory_rw_debug
gdbstub: specialise handle_query_attached
gdbstub: abstract target specific details from gdb_put_packet_binary
gdbstub: rationalise signal mapping in softmmu
gdbstub: move chunks of user code into own files
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
These inline helpers are all used by target specific code so move them
out of the general header so we don't needlessly pollute the rest of
the API with target specific stuff.
Note we have to include cpu.h in semihosting as it was relying on a
side effect before.
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230302190846.2593720-21-alex.bennee@linaro.org>
Message-Id: <20230303025805.625589-21-richard.henderson@linaro.org>
This is only part of it; we will also need to get the PV back end drivers
to tear down their own mappings (or do it for them, but they kind of need
to stop using the pointers too).
Some more work on the actual PV back ends and xen-bus code is going to be
needed to really make soft reset and migration fully functional, and this
part is the basis for that.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Compute the eflags write mask separately, leaving one call
to the helper. Use tcg_constant_i32.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Translators are no longer required to free tcg temporaries.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since commit a0e61807a3 ("qapi: Remove QMP events and commands from
user-mode builds") we don't generate the "qapi-commands-machine.h"
header in a user-emulation-only build.
Guard qmp_query_cpu_definitions() within CONFIG_USER_ONLY; move
x86_cpu_class_check_missing_features() closer since it is only used
by this QMP command handler.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230223155540.30370-3-philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Since tcg_temp_new is now identical, use that.
In some cases we can avoid a copy from A0 or T0.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In preparation for returning the number of insns generated
via the same pointer. Adjust only the prototypes so far.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230227135202.9710-23-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230227135202.9710-11-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230227135202.9710-8-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230227135202.9710-3-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The default number of PIRQs is set to 256 to avoid issues with 32-bit MSI
devices. Allow it to be increased if the user desires.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The way that Xen handles MSI PIRQs is kind of awful.
There is a special MSI message which targets a PIRQ. The vector in the
low bits of data must be zero. The low 8 bits of the PIRQ# are in the
destination ID field, the extended destination ID field is unused, and
instead the high bits of the PIRQ# are in the high 32 bits of the address.
Using the high bits of the address means that we can't intercept and
translate these messages in kvm_send_msi(), because they won't be caught
by the APIC — addresses like 0x1000fee46000 aren't in the APIC's range.
So we catch them in pci_msi_trigger() instead, and deliver the event
channel directly.
That isn't even the worst part. The worst part is that Xen snoops on
writes to devices' MSI vectors while they are *masked*. When a MSI
message is written which looks like it targets a PIRQ, it remembers
the device and vector for later.
When the guest makes a hypercall to bind that PIRQ# (snooped from a
marked MSI vector) to an event channel port, Xen *unmasks* that MSI
vector on the device. Xen guests using PIRQ delivery of MSI don't
ever actually unmask the MSI for themselves.
Now that this is working we can finally enable XENFEAT_hvm_pirqs and
let the guest use it all.
Tested with passthrough igb and emulated e1000e + AHCI.
CPU0 CPU1
0: 65 0 IO-APIC 2-edge timer
1: 0 14 xen-pirq 1-ioapic-edge i8042
4: 0 846 xen-pirq 4-ioapic-edge ttyS0
8: 1 0 xen-pirq 8-ioapic-edge rtc0
9: 0 0 xen-pirq 9-ioapic-level acpi
12: 257 0 xen-pirq 12-ioapic-edge i8042
24: 9600 0 xen-percpu -virq timer0
25: 2758 0 xen-percpu -ipi resched0
26: 0 0 xen-percpu -ipi callfunc0
27: 0 0 xen-percpu -virq debug0
28: 1526 0 xen-percpu -ipi callfuncsingle0
29: 0 0 xen-percpu -ipi spinlock0
30: 0 8608 xen-percpu -virq timer1
31: 0 874 xen-percpu -ipi resched1
32: 0 0 xen-percpu -ipi callfunc1
33: 0 0 xen-percpu -virq debug1
34: 0 1617 xen-percpu -ipi callfuncsingle1
35: 0 0 xen-percpu -ipi spinlock1
36: 8 0 xen-dyn -event xenbus
37: 0 6046 xen-pirq -msi ahci[0000:00:03.0]
38: 1 0 xen-pirq -msi-x ens4
39: 0 73 xen-pirq -msi-x ens4-rx-0
40: 14 0 xen-pirq -msi-x ens4-rx-1
41: 0 32 xen-pirq -msi-x ens4-tx-0
42: 47 0 xen-pirq -msi-x ens4-tx-1
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This wires up the basic infrastructure but the actual interrupts aren't
there yet, so don't advertise it to the guest.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Just hook up the basic hypercalls to stubs in xen_evtchn.c for now.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Just the basic shell, with the event channel hookup. It only dumps the
buffer for now; a real ring implmentation will come in a subsequent patch.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Which is used to fetch xenstore PFN and port to be used
by the guest. This is preallocated by the toolstack when
guest will just read those and use it straight away.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Xen has eight frames at 0xfeff8000 for this; we only really need two for
now and KVM puts the identity map at 0xfeffc000, so limit ourselves to
four.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Introduce support for one shot and periodic mode of Xen PV timers,
whereby timer interrupts come through a special virq event channel
with deadlines being set through:
1) set_timer_op hypercall (only oneshot)
2) vcpu_op hypercall for {set,stop}_{singleshot,periodic}_timer
hypercalls
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The guest is permitted to specify an arbitrary domain/bus/device/function
and INTX pin from which the callback IRQ shall appear to have come.
In QEMU we can only easily do this for devices that actually exist, and
even that requires us "knowing" that it's a PCMachine in order to find
the PCI root bus — although that's OK really because it's always true.
We also don't get to get notified of INTX routing changes, because we
can't do that as a passive observer; if we try to register a notifier
it will overwrite any existing notifier callback on the device.
But in practice, guests using PCI_INTX will only ever use pin A on the
Xen platform device, and won't swizzle the INTX routing after they set
it up. So this is just fine.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The GSI callback (and later PCI_INTX) is a level triggered interrupt. It
is asserted when an event channel is delivered to vCPU0, and is supposed
to be cleared when the vcpu_info->evtchn_upcall_pending field for vCPU0
is cleared again.
Thankfully, Xen does *not* assert the GSI if the guest sets its own
evtchn_upcall_pending field; we only need to assert the GSI when we
have delivered an event for ourselves. So that's the easy part, kind of.
There's a slight complexity in that we need to hold the BQL before we
can call qemu_set_irq(), and we definitely can't do that while holding
our own port_lock (because we'll need to take that from the qemu-side
functions that the PV backend drivers will call). So if we end up
wanting to set the IRQ in a context where we *don't* already hold the
BQL, defer to a BH.
However, we *do* need to poll for the evtchn_upcall_pending flag being
cleared. In an ideal world we would poll that when the EOI happens on
the PIC/IOAPIC. That's how it works in the kernel with the VFIO eventfd
pairs — one is used to trigger the interrupt, and the other works in the
other direction to 'resample' on EOI, and trigger the first eventfd
again if the line is still active.
However, QEMU doesn't seem to do that. Even VFIO level interrupts seem
to be supported by temporarily unmapping the device's BARs from the
guest when an interrupt happens, then trapping *all* MMIO to the device
and sending the 'resample' event on *every* MMIO access until the IRQ
is cleared! Maybe in future we'll plumb the 'resample' concept through
QEMU's irq framework but for now we'll do what Xen itself does: just
check the flag on every vmexit if the upcall GSI is known to be
asserted.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Add the array of virq ports to each vCPU so that we can deliver timers,
debug ports, etc. Global virqs are allocated against vCPU 0 initially,
but can be migrated to other vCPUs (when we implement that).
The kernel needs to know about VIRQ_TIMER in order to accelerate timers,
so tell it via KVM_XEN_VCPU_ATTR_TYPE_TIMER. Also save/restore the value
of the singleshot timer across migration, as the kernel will handle the
hypercalls automatically now.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This finally comes with a mechanism for actually injecting events into
the guest vCPU, with all the atomic-test-and-set that's involved in
setting the bit in the shinfo, then the index in the vcpu_info, and
injecting either the lapic vector as MSI, or letting KVM inject the
bare vector.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
It calls an internal close_port() helper which will also be used from
EVTCHNOP_reset and will actually do the work to disconnect/unbind a port
once any of that is actually implemented in the first place.
That in turn calls a free_port() internal function which will be in
error paths after allocation.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This adds the basic structure for maintaining the port table and reporting
the status of ports therein.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The kvm_xen_inject_vcpu_callback_vector() function will either deliver
the per-vCPU local APIC vector (as an MSI), or just kick the vCPU out
of the kernel to trigger KVM's automatic delivery of the global vector.
Support for asserting the GSI/PCI_INTX callbacks will come later.
Also add kvm_xen_get_vcpu_info_hva() which returns the vcpu_info of
a given vCPU.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Include basic support for setting HVM_PARAM_CALLBACK_IRQ to the global
vector method HVM_PARAM_CALLBACK_TYPE_VECTOR, which is handled in-kernel
by raising the vector whenever the vCPU's vcpu_info->evtchn_upcall_pending
flag is set.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This is the hook for adding the HVM_PARAM_CALLBACK_IRQ parameter in a
subsequent commit.
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Split out from another commit]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
The HVMOP_set_evtchn_upcall_vector hypercall sets the per-vCPU upcall
vector, to be delivered to the local APIC just like an MSI (with an EOI).
This takes precedence over the system-wide delivery method set by the
HVMOP_set_param hypercall with HVM_PARAM_CALLBACK_IRQ. It's used by
Windows and Xen (PV shim) guests but normally not by Linux.
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Rework for upstream kernel changes and split from HVMOP_set_param]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Ditch event_channel_op_compat which was never available to HVM guests]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Allow guest to setup the vcpu runstates which is used as
steal clock.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
In order to support Linux vdso in Xen.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Handle the hypercall to set a per vcpu info, and also wire up the default
vcpu_info in the shared_info page for the first 32 vCPUs.
To avoid deadlock within KVM a vCPU thread must set its *own* vcpu_info
rather than it being set from the context in which the hypercall is
invoked.
Add the vcpu_info (and default) GPA to the vmstate_x86_cpu for migration,
and restore it in kvm_arch_put_registers() appropriately.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This is simply when guest tries to register a vcpu_info
and since vcpu_info placement is optional in the minimum ABI
therefore we can just fail with -ENOSYS
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This is when guest queries for support for HVMOP_pagetable_dying.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Specifically XENMEM_add_to_physmap with space XENMAPSPACE_shared_info to
allow the guest to set its shared_info page.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Use the xen_overlay device, add compat support]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Xen will "latch" the guest's 32-bit or 64-bit ("long mode") setting when
the guest writes the MSR to fill in the hypercall page, or when the guest
sets the event channel callback in HVM_PARAM_CALLBACK_IRQ.
KVM handles the former and sets the kernel's long_mode flag accordingly.
The latter will be handled in userspace. Keep them in sync by noticing
when a hypercall is made in a mode that doesn't match qemu's idea of
the guest mode, and resyncing from the kernel. Do that same sync right
before serialization too, in case the guest has set the hypercall page
but hasn't yet made a system call.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
They both do the same thing and just call sched_yield. This is enough to
stop the Linux guest panicking when running on a host kernel which doesn't
intercept SCHEDOP_poll and lets it reach userspace.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
It allows to shutdown itself via hypercall with any of the 3 reasons:
1) self-reboot
2) shutdown
3) crash
Implementing SCHEDOP_shutdown sub op let us handle crashes gracefully rather
than leading to triple faults if it remains unimplemented.
In addition, the SHUTDOWN_soft_reset reason is used for kexec, to reset
Xen shared pages and other enlightenments and leave a clean slate for the
new kernel without the hypervisor helpfully writing information at
unexpected addresses.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Ditch sched_op_compat which was never available for HVM guests,
Add SCHEDOP_soft_reset]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This is just meant to serve as an example on how we can implement
hypercalls. xen_version specifically since Qemu does all kind of
feature controllability. So handling that here seems appropriate.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Implement kvm_gva_rw() safely]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This means handling the new exit reason for Xen but still
crashing on purpose. As we implement each of the hypercalls
we will then return the right return code.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Add CPL to hypercall tracing, disallow hypercalls from CPL > 0]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
There are (at least) three different vCPU ID number spaces. One is the
internal KVM vCPU index, based purely on which vCPU was chronologically
created in the kernel first. If userspace threads are all spawned and
create their KVM vCPUs in essentially random order, then the KVM indices
are basically random too.
The second number space is the APIC ID space, which is consistent and
useful for referencing vCPUs. MSIs will specify the target vCPU using
the APIC ID, for example, and the KVM Xen APIs also take an APIC ID
from userspace whenever a vCPU needs to be specified (as opposed to
just using the appropriate vCPU fd).
The third number space is not normally relevant to the kernel, and is
the ACPI/MADT/Xen CPU number which corresponds to cs->cpu_index. But
Xen timer hypercalls use it, and Xen timer hypercalls *really* want
to be accelerated in the kernel rather than handled in userspace, so
the kernel needs to be told.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Introduce support for emulating CPUID for Xen HVM guests. It doesn't make
sense to advertise the KVM leaves to a Xen guest, so do Xen unconditionally
when the xen-version machine property is set.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Obtain xen_version from KVM property, make it automatic]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This just initializes the basic Xen support in KVM for now. Only permitted
on TYPE_PC_MACHINE because that's where the sysbus devices for Xen heap
overlay, event channel, grant tables and other stuff will exist. There's
no point having the basic hypercall support if nothing else works.
Provide sysemu/kvm_xen.h and a kvm_xen_get_caps() which will be used
later by support devices.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Change to match the recent change to probe_access_flags.
All existing callers updated to supply 0, so no change in behaviour.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
- Various header cleaned up (removing pointless headers)
- Mark various files/code user/system specific
- Make various objects target-independent
- Remove tswapN() calls from dump.o
- Suggest g_assert_not_reached() instead of assert(0)
- qdev / qom
- Replace various container_of() by QOM cast macros
- Declare some QOM macros using OBJECT_DECLARE_TYPE()
- Embed OHCI QOM child in SM501 chipset
- hw (ISA & IDE)
- add some documentation, improve function names
- un-inline, open-code few functions
- have ISA API accessing IRQ/DMA prefer ISABus over ISADevice
- Demote IDE subsystem maintenance to "Odd Fixes"
- ui: Improve Ctrl+Alt hint on Darwin Cocoa
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmP9IeAACgkQ4+MsLN6t
wN7bdQ//SxJYJuQvqTT6s+O0LmP6NbqvhxCXX7YAwK2jCTM+zTgcqqRZCcisLQol
3ENu2UhnZmiLKHSOxatOVozbws08/u8Vl+WkW4UTMUb1yo5KPaPtq808Y95RdAJB
7D7B5juDGnFRAHXZz38zVk9uIuEkm+Po/pD0JQa+upBtAAgOJTqGavDNSR5+T0Yl
VjGdwK0b10skPqiF6OABYoy/4IFHVJJFIbARZh+a7hrF0llsbzUts5JiYsOxEEHQ
t3woUItdMnS1m0+Ty4AQ8m0Yv9y4HZOIzixvsZ+vChj5ariwUhL9/7wC/s/UCYEg
gKVA5X8R6n/ME6DScK99a+CyR/MXkz70b/rOUZxoutXhV3xdh4X1stL4WN9W/m3z
D4i4ZrUsDUcKCGWlj49of/dKbOPwk1+e/mT0oDZD6JzG0ODjfdVxvJ/JEV2iHgS3
WqHuSKzX/20H9j7/MgfbQ0HjBFOQ8tl781vQzhD+y+cF/IiTsHhrE6esIWho4bob
kfSdVydUWWRnBsnyGoRZXoEMX9tn+pu0nKxEDm2Bo2+jajsa0aZZPokgjxaz4MnD
Hx+/p1E+8IuOn05JgzQSgTJmKFdSbya203tXIsTo1kL2aJTJ6QfMvgEPP/fkn+lS
oQyVBFZmb1JDdTM1MxOncnlWLg74rp/CWEc+u5pSdbxMO/M/uac=
=AV/+
-----END PGP SIGNATURE-----
Merge tag 'buildsys-qom-qdev-ui-20230227' of https://github.com/philmd/qemu into staging
- buildsys
- Various header cleaned up (removing pointless headers)
- Mark various files/code user/system specific
- Make various objects target-independent
- Remove tswapN() calls from dump.o
- Suggest g_assert_not_reached() instead of assert(0)
- qdev / qom
- Replace various container_of() by QOM cast macros
- Declare some QOM macros using OBJECT_DECLARE_TYPE()
- Embed OHCI QOM child in SM501 chipset
- hw (ISA & IDE)
- add some documentation, improve function names
- un-inline, open-code few functions
- have ISA API accessing IRQ/DMA prefer ISABus over ISADevice
- Demote IDE subsystem maintenance to "Odd Fixes"
- ui: Improve Ctrl+Alt hint on Darwin Cocoa
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmP9IeAACgkQ4+MsLN6t
# wN7bdQ//SxJYJuQvqTT6s+O0LmP6NbqvhxCXX7YAwK2jCTM+zTgcqqRZCcisLQol
# 3ENu2UhnZmiLKHSOxatOVozbws08/u8Vl+WkW4UTMUb1yo5KPaPtq808Y95RdAJB
# 7D7B5juDGnFRAHXZz38zVk9uIuEkm+Po/pD0JQa+upBtAAgOJTqGavDNSR5+T0Yl
# VjGdwK0b10skPqiF6OABYoy/4IFHVJJFIbARZh+a7hrF0llsbzUts5JiYsOxEEHQ
# t3woUItdMnS1m0+Ty4AQ8m0Yv9y4HZOIzixvsZ+vChj5ariwUhL9/7wC/s/UCYEg
# gKVA5X8R6n/ME6DScK99a+CyR/MXkz70b/rOUZxoutXhV3xdh4X1stL4WN9W/m3z
# D4i4ZrUsDUcKCGWlj49of/dKbOPwk1+e/mT0oDZD6JzG0ODjfdVxvJ/JEV2iHgS3
# WqHuSKzX/20H9j7/MgfbQ0HjBFOQ8tl781vQzhD+y+cF/IiTsHhrE6esIWho4bob
# kfSdVydUWWRnBsnyGoRZXoEMX9tn+pu0nKxEDm2Bo2+jajsa0aZZPokgjxaz4MnD
# Hx+/p1E+8IuOn05JgzQSgTJmKFdSbya203tXIsTo1kL2aJTJ6QfMvgEPP/fkn+lS
# oQyVBFZmb1JDdTM1MxOncnlWLg74rp/CWEc+u5pSdbxMO/M/uac=
# =AV/+
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 27 Feb 2023 21:34:24 GMT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* tag 'buildsys-qom-qdev-ui-20230227' of https://github.com/philmd/qemu: (125 commits)
ui/cocoa: user friendly characters for release mouse
dump: Add create_win_dump() stub for non-x86 targets
dump: Simplify compiling win_dump.o by introducing win_dump_available()
dump: Clean included headers
dump: Replace TARGET_PAGE_SIZE -> qemu_target_page_size()
dump: Replace tswapN() -> cpu_to_dumpN()
hw/ide/pci: Add PCIIDEState::isa_irq[]
hw/ide/via: Replace magic 2 value by ARRAY_SIZE / MAX_IDE_DEVS
hw/ide/piix: Refactor pci_piix_init_ports as pci_piix_init_bus per bus
hw/ide/piix: Pass Error* to pci_piix_init_ports() for better error msg
hw/ide/piix: Remove unused includes
hw/ide/pci: Unexport bmdma_active_if()
hw/ide/ioport: Remove unnecessary includes
hw/ide: Declare ide_get_[geometry/bios_chs_trans] in 'hw/ide/internal.h'
hw/ide: Rename idebus_active_if() -> ide_bus_active_if()
hw/ide: Rename ide_init2() -> ide_bus_init_output_irq()
hw/ide: Rename ide_exec_cmd() -> ide_bus_exec_cmd()
hw/ide: Rename ide_register_restart_cb -> ide_bus_register_restart_cb
hw/ide: Rename ide_create_drive() -> ide_bus_create_drive()
hw/ide: Rename ide_set_irq() -> ide_bus_set_irq()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The ioapic sources reside in hw/intc already. Move the headers there
as well.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213173033.98762-11-shentey@gmail.com>
[PMD: Keep ioapic_internal.h in hw/intc/, not under include/]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
x86_cpu_dump_local_apic_state() is called from monitor.c which
is only compiled for system emulation since commit bf95728400
("monitor: remove target-specific code from monitor.c").
Interestingly this stub was added few weeks later in commit
1f871d49e3 ("hmp: added local apic dump state") and was not
necessary by that time.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221216220158.6317-5-philmd@linaro.org>
NEED_CPU_H is always defined for these target-specific headers.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221216220158.6317-2-philmd@linaro.org>
The 'hwaddr' type is only available / meaningful on system emulation.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221216215519.5522-5-philmd@linaro.org>
The new CPU model mostly inherits features from Icelake-Server, while
adding new features:
- AMX (Advance Matrix eXtensions)
- Bus Lock Debug Exception
and new instructions:
- AVX VNNI (Vector Neural Network Instruction):
- VPDPBUS: Multiply and Add Unsigned and Signed Bytes
- VPDPBUSDS: Multiply and Add Unsigned and Signed Bytes with Saturation
- VPDPWSSD: Multiply and Add Signed Word Integers
- VPDPWSSDS: Multiply and Add Signed Integers with Saturation
- FP16: Replicates existing AVX512 computational SP (FP32) instructions
using FP16 instead of FP32 for ~2X performance gain
- SERIALIZE: Provide software with a simple way to force the processor to
complete all modifications, faster, allowed in all privilege levels and
not causing an unconditional VM exit
- TSX Suspend Load Address Tracking: Allows programmers to choose which
memory accesses do not need to be tracked in the TSX read set
- AVX512_BF16: Vector Neural Network Instructions supporting BFLOAT16
inputs and conversion instructions from IEEE single precision
- fast zero-length MOVSB (KVM doesn't support yet)
- fast short STOSB (KVM doesn't support yet)
- fast short CMPSB, SCASB (KVM doesn't support yet)
Features that may be added in future versions:
- CET (virtualization support hasn't been merged)
Signed-off-by: Wang, Lei <lei4.wang@intel.com>
Reviewed-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <20220812055751.14553-1-lei4.wang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are just a flag that documents the performance characteristic of
an instruction; it needs no hypervisor support. So include them even
if KVM does not show them. In particular, FZRM/FSRS/FSRC have only
been added very recently, but they are available on Sapphire Rapids
processors.
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are three more markers for string operation optimizations.
They can all be added to TCG, whose string operations are more or
less as fast as they can be for short lengths.
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fast short REP MOVS can be added to TCG, since a trivial translation
of string operation is a good option for short lengths.
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We did not correctly handle N >= operand size.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1374
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230114233206.3118472-1-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230207075115.1525-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Before this commit, when GDB attached an OS working on QEMU, order of FPU
stack registers printed by GDB command 'info float' was wrong. There was a
bug causing the problem in 'g' packets sent by QEMU to GDB. The packets have
values of registers of machine emulated by QEMU containing FPU stack
registers. There are 2 ways to specify a x87 FPU stack register. The first
is specifying by absolute indexed register names (R0, ..., R7). The second
is specifying by stack top relative indexed register names (ST0, ..., ST7).
Values of the FPU stack registers should be located in 'g' packet and be
ordered by the relative index. But QEMU had located these registers ordered
by the absolute index. After this commit, when QEMU reads registers to make
a 'g' packet, QEMU specifies FPU stack registers by the relative index.
Then, the registers are ordered correctly in the packet. As a result, GDB,
the packet receiver, can print FPU stack registers in the correct order.
Signed-off-by: TaiseiIto <taisei1212@outlook.jp>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <TY0PR0101MB4285923FBE9AD97CE832D95BA4E59@TY0PR0101MB4285.apcprd01.prod.exchangelabs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Failure to truncate the inputs results in garbage for the carry-out.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1373
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230115012103.3131796-1-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When ADCX is followed by ADOX or vice versa, the second instruction's
carry comes from EFLAGS and the condition codes use the CC_OP_ADCOX
operation. Retrieving the carry from EFLAGS is handled by this bit
of gen_ADCOX:
tcg_gen_extract_tl(carry_in, cpu_cc_src,
ctz32(cc_op == CC_OP_ADCX ? CC_C : CC_O), 1);
Unfortunately, in this case cc_op has been overwritten by the previous
"if" statement to CC_OP_ADCOX. This works by chance when the first
instruction is ADCX; however, if the first instruction is ADOX,
ADCX will incorrectly take its carry from OF instead of CF.
Fix by moving the computation of the new cc_op at the end of the function.
The included exhaustive test case fails without this patch and passes
afterwards.
Because ADCX/ADOX need not be invoked through the VEX prefix, this
regression bisects to commit 16fc5726a6 ("target/i386: reimplement
0x0f 0x38, add AVX", 2022-10-18). However, the mistake happened a
little earlier, when BMI instructions were rewritten using the new
decoder framework.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1471
Reported-by: Paul Jolly <https://gitlab.com/myitcv>
Fixes: 1d0b926150 ("target/i386: move scalar 0F 38 and 0F 3A instruction to new decoder", 2022-10-18)
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We forgot to set cc_src, which is used for computing C.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1370
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230114180601.2993644-1-richard.henderson@linaro.org>
Cc: qemu-stable@nongnu.org
Fixes: 1d0b926150 ("target/i386: move scalar 0F 38 and 0F 3A instruction to new decoder", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There were two problems here: not limiting the input to operand bits,
and not correctly handling large extraction length.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1372
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230114230542.3116013-3-richard.henderson@linaro.org>
Cc: qemu-stable@nongnu.org
Fixes: 1d0b926150 ("target/i386: move scalar 0F 38 and 0F 3A instruction to new decoder", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use tcg_gen_atomic_cmpxchg_i128 for the atomic case,
and tcg_gen_qemu_ld/st_i128 otherwise.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_gen_atomic_cmpxchg_i64 for the atomic case,
and tcg_gen_nonatomic_cmpxchg_i64 otherwise.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The HAXM project has been retired (see https://github.com/intel/haxm#status),
so we should mark the code in QEMU as deprecated (and finally remove it
unless somebody else picks the project up again - which is quite unlikely
since there are now whpx and hvf on these operating systems, too).
Message-Id: <20230126121034.1035138-1-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We have two inclusion loops:
block/block.h
-> block/block-global-state.h
-> block/block-common.h
-> block/blockjob.h
-> block/block.h
block/block.h
-> block/block-io.h
-> block/block-common.h
-> block/blockjob.h
-> block/block.h
I believe these go back to Emanuele's reorganization of the block API,
merged a few months ago in commit d7e2fe4aac.
Fortunately, breaking them is merely a matter of deleting unnecessary
includes from headers, and adding them back in places where they are
now missing.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221221133551.3967339-2-armbru@redhat.com>