Commit Graph

218 Commits

Author SHA1 Message Date
Paolo Bonzini
442e38c4fb target/i386: extract gen_far_call/jmp, reordering temporaries
Extract the code into new functions, and swap T0/T1 so that T0 corresponds
to the first immediate in the instruction stream.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:52:19 +02:00
Paolo Bonzini
cc1d28bdbe target/i386: move 00-5F opcodes to new decoder
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:52:19 +02:00
Paolo Bonzini
445457693c target/i386: reintroduce debugging mechanism
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:52:12 +02:00
Paolo Bonzini
8b5de7ea56 target/i386: cleanup *gen_eob*
Create a new wrapper for syscall/sysret, and do not go through multiple
layers of wrappers.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:52:09 +02:00
Paolo Bonzini
ccfabc00e0 target/i386: clarify the "reg" argument of functions returning CCPrepare
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:52:02 +02:00
Paolo Bonzini
89e4e65ac0 target/i386: do not use s->T0 and s->T1 as scratch registers for CCPrepare
Instead of using s->T0 or s->T1, create a scratch register
when computing the C, NC, L or LE conditions.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:51:59 +02:00
Paolo Bonzini
bccb0c138e target/i386: extend cc_* when using them to compute flags
Instead of using s->tmp0 or s->tmp4 as the result, just extend the cc_*
registers in place.  It is harmless and, if multiple setcc instructions
are used, the optimizer will be able to remove the redundant ones.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:51:53 +02:00
Paolo Bonzini
dd17322be7 target/i386: pull cc_op update to callers of gen_jmp_rel{,_csize}
gen_update_cc_op must be called before control flow splits.  Doing it
in gen_jmp_rel{,_csize} may hide bugs; instead, assert that cc_op is
clean---even if that means a few more calls to gen_update_cc_op().

With this new invariant, setting cc_op to CC_OP_DYNAMIC is unnecessary
since the caller should have done it.
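
For illustration, a minimal sketch of the new invariant, assuming DisasContext
keeps a cc_op_dirty flag (the exact field and helper names are assumptions):

    /* Hypothetical sketch: gen_jmp_rel now checks, and never updates. */
    static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num)
    {
        /* Callers must have called gen_update_cc_op() before the split. */
        tcg_debug_assert(!s->cc_op_dirty);
        /* ... emit the jump exactly as before ... */
    }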

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:51:43 +02:00
Paolo Bonzini
bbba9594e8 target/i386: cleanup cc_op changes for REP/REPZ/REPNZ
gen_update_cc_op must be called before control flow splits.  Do it
where the jump on ECX!=0 is translated.

On the other hand, remove the call before gen_jcc1, which takes care of
it already, and explain why REPZ/REPNZ need not use CC_OP_DYNAMIC---the
translation block ends before any control-flow-dependent cc_op could
be observed.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:51:31 +02:00
Paolo Bonzini
64ddadc6bb target/i386: cc_op is not dynamic in gen_jcc1
Resetting cc_op to CC_OP_DYNAMIC should be done at control flow junctions,
and this is not one: the translation block is ending, and the
only effect of calling set_cc_op() would be a discard of s->cc_srcT.
This discard is useless (it's a temporary, not a global) and in fact
prevents gen_prepare_cc from returning s->cc_srcT.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:51:17 +02:00
Paolo Bonzini
e995f3f944 target/i386: remove mask from CCPrepare
With the introduction of TSTEQ and TSTNE the .mask field is always -1,
so remove all the now-unnecessary code.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:50:39 +02:00
Paolo Bonzini
9309b53e83 target/i386: use TSTEQ/TSTNE to check flags
The new conditions obviously come in handy when testing individual bits
of EFLAGS, and they make it possible to remove the .mask field of
CCPrepare.

Lowering to shift+and is done by the optimizer if necessary.
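
As an illustrative sketch (not the literal patch), testing ZF becomes a single
branch on the flags word, assuming cc_src holds EFLAGS-format flags
(CC_OP_EFLAGS) and using the CC_Z bit mask from cpu.h:

    /* Branch if ZF is set: (cc_src & CC_Z) != 0, with no explicit AND. */
    tcg_gen_brcondi_tl(TCG_COND_TSTNE, cpu_cc_src, CC_Z, label);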

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:50:39 +02:00
Paolo Bonzini
15957eb9ef target/i386: use TSTEQ/TSTNE to test low bits
When testing the sign bit or equality to zero of a partial register, it
is useful to use a single TSTEQ or TSTNE operation.  It can also be used
to test the parity flag, using bit 0 of the population count.

Do not do this for target_ulong-sized values, however; the optimizer would
produce a comparison against zero anyway, and skipping the transformation
avoids shifts by 64, which are undefined behavior.
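
A sketch of the idea for a 16-bit sign test (register and label names are
illustrative):

    /* Sign of a 16-bit value: no shift or extension, just test bit 15. */
    tcg_gen_brcondi_tl(TCG_COND_TSTNE, s->T0, 0x8000, label_signed);
    /* Parity: popcount the low byte, then TSTEQ on bit 0 of the count. */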

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-07 08:50:39 +02:00
Ruihan Li
6a5a63f74b target/i386: Give IRQs a chance when resetting HF_INHIBIT_IRQ_MASK
When emulated with QEMU, interrupts never arrive in the following
loop. However, if the NOP instruction is uncommented, interrupts
fire as normal.

    loop:
        cli
        call do_sti
        jmp loop

    do_sti:
        sti
        # nop
        ret

This behavior is different from that of a real processor. For example,
if KVM is enabled, interrupts always fire regardless of whether the
NOP instruction is commented out. Also, the Intel Software Developer's
Manual states that after the STI instruction is executed, the interrupt
inhibit ends as soon as the next instruction (e.g., the RET
instruction if the NOP instruction is commented out) is executed.

The problem is that the previous code could choose not to end
the TB even if HF_INHIBIT_IRQ_MASK had just been reset (e.g., in the
case where the STI instruction is immediately followed by the RET
instruction), so IRQs might not get a chance to trigger. This commit
fixes the problem by always terminating the current TB to give IRQs a
chance to trigger when HF_INHIBIT_IRQ_MASK is reset.
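
A hedged sketch of the fix in the translator (the helper and field names
follow translate.c conventions, but the exact placement is an assumption):

    if (s->flags & HF_INHIBIT_IRQ_MASK) {
        gen_reset_hflag(s, HF_INHIBIT_IRQ_MASK);
        /* Always end the TB so that pending IRQs are polled before the
         * next instruction, matching the one-instruction window. */
        s->base.is_jmp = DISAS_EOB_NEXT;
    }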

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn>
Message-ID: <20240415064518.4951-4-lrh2000@pku.edu.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-03 14:15:48 +02:00
Mark Cave-Ayland
7653b44534 target/i386/translate.c: always write 32-bits for SGDT and SIDT
The various Intel CPU manuals claim that SGDT and SIDT can write either 24
or 32 bits depending upon the operand size, but this is incorrect. Not only do
the Intel CPU manuals give contradictory information between processor
revisions, but this information doesn't even match real-life behaviour.

In fact, tests on real hardware show that the CPU always writes 32 bits for SGDT
and SIDT, and this behaviour is required for at least OS/2 Warp and WFW 3.11 with
Win32s to function correctly. Remove the masking applied due to the operand size
for SGDT and SIDT so that the TCG behaviour matches the behaviour on real
hardware.
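
Sketched against translate.c (a simplification, not the literal diff), the
operand-size masking of the base is simply dropped:

    tcg_gen_ld_tl(s->T0, tcg_env, offsetof(CPUX86State, gdt.base));
    /* removed: if (dflag == MO_16) tcg_gen_andi_tl(s->T0, s->T0, 0xffffff); */
    gen_op_st_v(s, CODE64(s) + MO_32, s->T0, s->A0);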

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198

--
MCA: Whilst I don't have a copy of OS/2 Warp handy, I've confirmed that this
patch fixes the issue in WFW 3.11 with Win32s. For more technical information I
highly recommend the excellent write-up at
https://www.os2museum.com/wp/sgdtsidt-fiction-and-reality/.
Message-ID: <20240419195147.434894-1-mark.cave-ayland@ilande.co.uk>

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-23 17:35:26 +02:00
Richard Henderson
8df1ba49d7 target/i386: Preserve DisasContextBase.insn_start across rewind
When aborting translation of the current insn, restore the
previous value of insn_start.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Jørgen Hansen <Jorgen.Hansen@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-04-09 07:45:09 -10:00
Ziqiao Kong
99d0dcd7f1 target/i386: Generate an illegal opcode exception on cmp instructions with lock prefix
As specified by the Intel Manual Vol. 2 (3-180), cmp instructions
are not allowed to have a lock prefix, and #UD should be raised. Without
this patch, s1->T0 is used uninitialized in the OP_CMPL case.
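
The shape of the guard, sketched (its exact placement in gen_op is an
assumption):

    if (s1->prefix & PREFIX_LOCK) {
        gen_illegal_opcode(s1);   /* LOCK CMP: #UD, before touching s1->T0 */
        return;
    }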

Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
Message-ID: <20240215095015.570748-2-ziqiaokong@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-02-16 13:56:09 +01:00
Richard Henderson
3b91614004 include/exec: Change cpu_mmu_index argument to CPUState
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-02-03 16:46:10 +10:00
Anton Johansson
32f0c394bb target: Use vaddr in gen_intermediate_code
Makes gen_intermediate_code() signature target agnostic so the function
can be called from accel/tcg/translate-all.c without target specifics.

Signed-off-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20240119144024.14289-9-anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-01-29 07:06:03 +10:00
Paolo Bonzini
729ba8e933 target/i386: pcrel: store low bits of physical address in data[0]
For PC-relative translation blocks, env->eip changes during the
execution of a translation block.  Therefore, QEMU must be able to
recover an instruction's PC just from the TranslationBlock struct and
the instruction data.  Because a TB will not span two pages, QEMU
stores all the low bits of EIP in the instruction data and replaces them
in x86_restore_state_to_opc.  Bits 12 and higher (which may vary between
executions of a PCREL TB, since these only use the physical address in
the hash key) are kept unmodified from env->eip.  The assumption is that
these bits of EIP, unlike bits 0-11, will not change as the translation
block executes.

Unfortunately, this is incorrect when the CS base is not aligned to a page.
Then the linear address of the instructions (i.e. the one with the
CS base added) will indeed never span two pages, but bits 12+ of EIP
can actually change.  For example, if the CS base is 0x80262200 and EIP =
0x6FF4, the first instruction in the translation block will be at linear
address 0x802691F4.  Even a very small TB will cross to EIP = 0x7xxx,
while the linear addresses will remain comfortably within a single page.

The fix is simply to use the low bits of the linear address for data[0],
since those don't change.  Then x86_restore_state_to_opc uses tb->cs_base
to compute a temporary linear address (referring to some unknown
instruction in the TB, but with the correct values of bits 12 and higher);
the low bits are replaced with data[0], and EIP is obtained by subtracting
again the CS base.

Huge thanks to Mark Cave-Ayland for the image and initial debugging,
and to Gitlab user @kjliew for help with bisecting another occurrence
of (hopefully!) the same bug.

It should be relatively easy to write a testcase that performs MMIO on
an EIP with different bits 12+ than the first instruction of the translation
block; any help is welcome.
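
In outline, the recovery in x86_restore_state_to_opc becomes something like
this sketch (EIP masking elided; assumes 4 KiB pages):

    /* Bits 12 and higher of env->eip + cs_base are still valid for this TB. */
    uint64_t linear = (env->eip + tb->cs_base) & TARGET_PAGE_MASK;
    linear |= data[0];                  /* stored low bits of the linear PC */
    env->eip = linear - tb->cs_base;    /* subtract the CS base again */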

Fixes: e3a79e0e87 ("target/i386: Enable TARGET_TB_PCREL", 2022-10-11)
Cc: qemu-stable@nongnu.org
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Richard Henderson <richard.henderson@linaro.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1759
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1964
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2012
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-01-18 10:43:42 +01:00
guoguangyao
2926eab896 target/i386: fix incorrect EIP in PC-relative translation blocks
The PCREL patches introduced a bug when updating EIP in the !CF_PCREL case.
Using s->pc in gen_update_eip_next() solves the problem.

Cc: qemu-stable@nongnu.org
Fixes: b5e0d5d22f ("target/i386: Fix 32-bit wrapping of pc/eip computation")
Signed-off-by: guoguangyao <guoguangyao18@mails.ucas.ac.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240115020804.30272-1-guoguangyao18@mails.ucas.ac.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-01-18 10:43:14 +01:00
Richard Henderson
a58506b748 target/i386: Do not re-compute new pc with CF_PCREL
With PCREL, we have a page-relative view of EIP, and an
approximation of PC = EIP+CSBASE that is good enough to
detect page crossings.  If we try to recompute PC after
masking EIP, we will mess up that approximation and write
a corrupt value to EIP.

We already handled masking properly for PCREL, so the
fix in b5e0d5d2 was only needed for the !PCREL path.

Cc: qemu-stable@nongnu.org
Fixes: b5e0d5d22f ("target/i386: Fix 32-bit wrapping of pc/eip computation")
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240101230617.129349-1-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-01-18 10:43:14 +01:00
Paolo Bonzini
405c7c0708 target/i386: implement CMPccXADD
The main difficulty here is that a page fault when writing to the destination
must not overwrite the flags.  Therefore, the flags computation must be
inlined instead of using gen_jcc1*.

For simplicity, I am using an unconditional cmpxchg operation, that becomes
a NOP if the comparison fails.
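
The unconditional-cmpxchg trick, sketched with made-up temporaries:

    /* newv = cc ? oldv + addend : oldv; when the condition fails, the
     * cmpxchg stores back the value it just read, i.e. it is a NOP. */
    tcg_gen_add_tl(sum, oldv, addend);
    tcg_gen_movcond_tl(cond, newv, cmp_lhs, cmp_rhs, sum, oldv);
    tcg_gen_atomic_cmpxchg_tl(oldv, s->A0, oldv, newv, s->mem_index, memop);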

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:04:40 +01:00
Paolo Bonzini
d4f611711a target/i386: move operand load and writeback out of gen_cmovcc1
Similar to gen_setcc1, make gen_cmovcc1 receive TCGv.  This is more friendly
to simultaneous implementation in the old and the new decoder.

A small wart is that s->T0 of CMOV is currently the *second* argument (which
would ordinarily be in T1).  Therefore, the condition has to be inverted in
order to overwrite s->T0 with cpu_regs[reg] if the MOV is not performed.

This only applies to the old decoder, and this code will go away soon.
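
A sketch of the resulting expansion (CCPrepare fields as in translate.c; the
use_reg2/imm handling and surrounding plumbing are omitted):

    CCPrepare cc = gen_prepare_cc(s, b, s->T1);
    /* dest = cond ? src : dest; the old decoder inverts cond because
     * s->T0 holds the second operand rather than the first. */
    tcg_gen_movcond_tl(cc.cond, dest, cc.reg, cc.reg2, src, dest);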

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:04:15 +01:00
Paolo Bonzini
3497f1646f target/i386: prepare for implementation of STOS/SCAS in new decoder
Do not use gen_op, and pull the load from the accumulator into
disas_insn.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:03:59 +01:00
Paolo Bonzini
9a5922d6bd target/i386: do not use s->tmp0 for jumps on ECX ==/!= 0
Create a new temporary, to ease the register allocator's work.

Creation of the temporary is pushed into gen_ext_tl, which
also allows NULL as the first parameter now.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:03:55 +01:00
Paolo Bonzini
1ec46bf237 target/i386: do not use s->tmp4 for push
Just create a temporary for the occasion.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:03:53 +01:00
Paolo Bonzini
80e55f54ac target/i386: split eflags computation out of gen_compute_eflags
The new x86 decoder wants the gen_* functions to compute EFLAGS before
writeback, which can be an issue for instructions with a memory
destination such as ARPL or shifts.

Extract code to compute the EFLAGS without clobbering CC_SRC, in case
the memory write causes a fault.  The flags writeback mechanism will
take care of copying the result to CC_SRC.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:03:50 +01:00
Paolo Bonzini
c0099cd40e target/i386: do not clobber T0 on string operations
The new decoder wants the operand in T0 when expanding SCAS, rather
than using R_EAX directly as gen_scas currently does.  This makes SCAS more
similar to CMP and SUB, in that CC_DST = T0 - T1.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:03:24 +01:00
Paolo Bonzini
24c0573bb0 target/i386: do not clobber A0 in POP translation
The new decoder likes to compute the address in A0 very early, so the
gen_lea_v_seg in gen_pop_T0 would clobber the address of the memory
operand.  Instead use T0 since it is already available and will be
overwritten immediately after.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:03:21 +01:00
Paolo Bonzini
8cc746525c target/i386: remove unnecessary truncations
gen_lea_v_seg (called by gen_add_A0_ds_seg) already zeroes any
bits of s->A0 beyond s->aflag.  It does so before summing the
segment base and, if not in 64-bit mode, also after summing it.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:02:58 +01:00
Paolo Bonzini
1e7dde8008 target/i386: speedup JO/SETO after MUL or IMUL
OF is equal to the carry flag, so use the same CCPrepare.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:02:52 +01:00
Paolo Bonzini
6032627f07 target/i386: optimize computation of JL and JLE from flags
Take advantage of the fact that there can be no 1 bits between SF and OF.
If they were adjacent, adding a 1 in the SF position would produce a carry
if and only if SF was already set.  The value of OF in the sum would then be
the XOR of OF itself, the carry (which is SF) and 0 (the value of the OF bit
in the addend): this is exactly OF^SF.

Because OF and SF are not adjacent, just place more 1 bits to the
left so that the carry propagates, which means summing CC_O - CC_S.
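
A worked example with the EFLAGS bit positions CC_S = 0x080 and CC_O = 0x800,
so the addend is CC_O - CC_S = 0x780 and bit 11 of the sum is SF ^ OF:

    0x000 + 0x780 = 0x0780   /* SF=0 OF=0 -> bit 11 clear: SF^OF = 0 */
    0x080 + 0x780 = 0x0800   /* SF=1 OF=0 -> bit 11 set:   SF^OF = 1 */
    0x800 + 0x780 = 0x0f80   /* SF=0 OF=1 -> bit 11 set:   SF^OF = 1 */
    0x880 + 0x780 = 0x1000   /* SF=1 OF=1 -> bit 11 clear: SF^OF = 0 */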

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:02:48 +01:00
Richard Henderson
b5e0d5d22f target/i386: Fix 32-bit wrapping of pc/eip computation
In 32-bit mode, pc = eip + cs_base is also 32-bit, and must wrap.
Failure to do so results in incorrect memory exceptions to the guest.
Before 732d548732, this was implicitly done via truncation to
target_ulong but only in qemu-system-i386, not qemu-system-x86_64.

To fix this, we must add conditional zero-extensions.
Since we have to test for 32 vs 64-bit anyway, note that cs_base
is always zero in 64-bit mode.
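
A sketch of the shape of the fix (a simplification; new_pc stands for the
translation-time target and is illustrative):

    if (CODE64(s)) {
        tcg_gen_movi_tl(cpu_eip, new_pc);                 /* cs_base == 0 */
    } else {
        tcg_gen_movi_tl(cpu_eip, (uint32_t)(new_pc - s->cs_base));
    }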

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2022
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20231212172510.103305-1-richard.henderson@linaro.org>
2023-12-12 13:35:08 -08:00
Richard Henderson
23f3d586e4 target/i386: Use tcg_gen_ext_tl
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-22 16:43:31 -07:00
Richard Henderson
46c684c862 target/i386: Use i128 for 128 and 256-bit loads and stores
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-22 16:32:28 -07:00
Paolo Bonzini
24b34590d0 target/i386: check intercept for XSETBV
Note that this intercept is special; it is checked before the #GP
exception.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-17 15:20:53 +02:00
Philippe Mathieu-Daudé
1da389c5db target/i386: Check for USER_ONLY definition instead of SOFTMMU one
Since we *might* have user emulation with softmmu,
replace the system emulation check with a !user emulation one.

(target/ was cleaned of invalid CONFIG_SOFTMMU uses in
commit cab35c73be, but these files were merged a few days
later and thus missed the cleanup.)
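
The mechanical shape of the change, as a diff-style sketch:

    -#ifdef CONFIG_SOFTMMU       /* wrong: user emulation may use softmmu */
    +#ifndef CONFIG_USER_ONLY    /* right: everything except user emulation */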

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004082239.27251-1-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-07 19:02:33 +02:00
Richard Henderson
b77af26e97 accel/tcg: Replace CPUState.env_ptr with cpu_env()
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-04 11:03:54 -07:00
Richard Henderson
ad75a51e84 tcg: Rename cpu_env to tcg_env
Allow the name 'cpu_env' to be used for something else.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-03 08:01:02 -07:00
Paolo Bonzini
19729affe1 target/i386/translate: avoid shadowed local variables
Just remove the declaration.  There is nothing in the function after the
switch statement, so it is safe to do so.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-09-26 16:41:49 +02:00
Michael Tokarev
bad5cfcd60 i386: spelling fixes
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2023-09-20 07:54:34 +03:00
Paolo Bonzini
c1f27a0c6a target/i386: raise FERR interrupt with iothread locked
Otherwise tcg_handle_interrupt() triggers an assertion failure:

  #5  0x0000555555c97369 in tcg_handle_interrupt (cpu=0x555557434cb0, mask=2) at ../accel/tcg/tcg-accel-ops.c:83
  #6  tcg_handle_interrupt (cpu=0x555557434cb0, mask=2) at ../accel/tcg/tcg-accel-ops.c:81
  #7  0x0000555555b4d58b in pic_irq_request (opaque=<optimized out>, irq=<optimized out>, level=1) at ../hw/i386/x86.c:555
  #8  0x0000555555b4f218 in gsi_handler (opaque=0x5555579423d0, n=13, level=1) at ../hw/i386/x86.c:611
  #9  0x00007fffa42bde14 in code_gen_buffer ()
  #10 0x0000555555c724bb in cpu_tb_exec (cpu=cpu@entry=0x555557434cb0, itb=<optimized out>, tb_exit=tb_exit@entry=0x7fffe9bfd658) at ../accel/tcg/cpu-exec.c:457
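
A sketch of the fix, using the iothread-lock API of that release (the exact
placement in the FPU helper is an assumption):

    qemu_mutex_lock_iothread();     /* IRQ delivery requires the BQL */
    /* ... raise FERR / IRQ 13 via the interrupt controller ... */
    qemu_mutex_unlock_iothread();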

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1808
Reported-by: NyanCatTW1 <https://gitlab.com/a0939712328>
Co-developed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-09-01 23:44:39 +02:00
Paolo Bonzini
40a205da41 target/i386: emulate 64-bit ring 0 for linux-user if LM feature is set
32-bit binaries can run on a long mode processor even if the kernel
is 64-bit, of course, and this can have slightly different behavior;
for example, SYSCALL is allowed on Intel processors.

Allow reporting LM to programs running under user mode emulation,
so that "-cpu" can be used with named CPU models even for qemu-i386
and even without disabling LM by hand.

Fortunately, most of the runtime code in QEMU has to depend on HF_LMA_MASK
or on HF_CS64_MASK (which is anyway false for qemu-i386's 32-bit code
segment) rather than TARGET_X86_64, therefore all that is needed is an
update of linux-user's ring 0 setup.

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1534
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-29 10:49:43 +02:00
Paolo Bonzini
63fd8ef080 target/i386: implement SYSCALL/SYSRET in 32-bit emulators
AMD supports both 32-bit and 64-bit SYSCALL/SYSRET, but TCG only
exposes them for 64-bit targets.  For system emulation just reuse the
helper; for user-mode emulation the ABI is the same as "int $80".

The BSDs do not support any fast system call mechanism in 32-bit
mode, so add to bsd-user the same stub that FreeBSD has for 64-bit
compatibility mode.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-26 10:23:56 +02:00
Paolo Bonzini
6750485bf4 target/i386: implement RDPID in TCG
RDPID corresponds to a RDMSR(TSC_AUX); however, it is unprivileged
so for user-mode emulation we must provide the value that the kernel
places in the MSR.  For Linux, it is a combination of the current CPU
and the current NUMA node, both of which can be retrieved with getcpu(2).
Also try sched_getcpu(), which might be there on the BSDs.  If there is
no portable way to retrieve the current CPU id from userspace, return 0.

RDTSCP is reimplemented as RDTSC + RDPID ECX; the differences in terms
of serializability are not relevant to QEMU.
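
A self-contained sketch of the user-mode TSC_AUX value, assuming the Linux
encoding (node << 12) | cpu; the helper name and configure guards are
hypothetical:

    static uint64_t user_tsc_aux(void)
    {
    #if defined(HAVE_GETCPU)                /* assumed configure probe */
        unsigned cpu, node;
        if (getcpu(&cpu, &node) == 0) {
            return ((uint64_t)node << 12) | cpu;
        }
    #elif defined(HAVE_SCHED_GETCPU)
        int cpu = sched_getcpu();
        if (cpu >= 0) {
            return cpu;
        }
    #endif
        return 0;   /* no portable way to get the current CPU: report 0 */
    }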

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-26 10:23:56 +02:00
Paolo Bonzini
53b9b4cc9f target/i386: sysret and sysexit are privileged
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-26 10:23:56 +02:00
Paolo Bonzini
75a02adf81 target/i386: AMD only supports SYSENTER/SYSEXIT in 32-bit mode
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-26 10:23:44 +02:00
Paolo Bonzini
fd5dcb1ccd target/i386: Intel only supports SYSCALL/SYSRET in long mode
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-26 10:23:07 +02:00
Paolo Bonzini
431c51e9d4 target/i386: TCG supports WBNOINVD
WBNOINVD is the same as INVD or WBINVD as far as TCG is concerned,
since there is no cache in TCG and therefore no invalidation side effect
in WBNOINVD.

With respect to SVM emulation, processors that do not support WBNOINVD
will ignore the prefix and treat it as WBINVD, while those that support
it will generate exactly the same vmexit.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-26 10:23:07 +02:00