Commit Graph

243 Commits

Author SHA1 Message Date
Paolo Bonzini
9a5922d6bd target/i386: do not use s->tmp0 for jumps on ECX ==/!= 0
Create a new temporary, to ease the register allocator's work.

Creation of the temporary is pushed into gen_ext_tl, which
also allows NULL as the first parameter now.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:03:55 +01:00
Paolo Bonzini
1ec46bf237 target/i386: do not use s->tmp4 for push
Just create a temporary for the occasion.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:03:53 +01:00
Paolo Bonzini
80e55f54ac target/i386: split eflags computation out of gen_compute_eflags
The new x86 decoder wants the gen_* functions to compute EFLAGS before
writeback, which can be an issue for instructions with a memory
destination such as ARPL or shifts.

Extract code to compute the EFLAGS without clobbering CC_SRC, in case
the memory write causes a fault.  The flags writeback mechanism will
take care of copying the result to CC_SRC.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:03:50 +01:00
Paolo Bonzini
c0099cd40e target/i386: do not clobber T0 on string operations
The new decoder would rather have the operand in T0 when expanding SCAS, rather
than use R_EAX directly as gen_scas currently does.  This makes SCAS more similar
to CMP and SUB, in that CC_DST = T0 - T1.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:03:24 +01:00
Paolo Bonzini
24c0573bb0 target/i386: do not clobber A0 in POP translation
The new decoder likes to compute the address in A0 very early, so the
gen_lea_v_seg in gen_pop_T0 would clobber the address of the memory
operand.  Instead use T0 since it is already available and will be
overwritten immediately after.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:03:21 +01:00
Paolo Bonzini
8cc746525c target/i386: remove unnecessary truncations
gen_lea_v_seg (called by gen_add_A0_ds_seg) already zeroes any
bits of s->A0 beyond s->aflag.  It does so before summing the
segment base and, if not in 64-bit mode, also after summing it.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:02:58 +01:00
Paolo Bonzini
1e7dde8008 target/i386: speedup JO/SETO after MUL or IMUL
OF is equal to the carry flag, so use the same CCPrepare.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:02:52 +01:00
Paolo Bonzini
6032627f07 target/i386: optimize computation of JL and JLE from flags
Take advantage of the fact that there can be no 1 bits between SF and OF.
If they were adjacent, you could sum SF and get a carry only if SF was
already set.  Then the value of OF in the sum is the XOR of OF itself,
the carry (which is SF) and 0 (the value of the OF bit in the addend):
this is OF^SF exactly.

Because OF and SF are not adjacent, just place more 1 bits to the
left so that the carry propagates, which means summing CC_O - CC_S.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-29 22:02:48 +01:00
Richard Henderson
b5e0d5d22f target/i386: Fix 32-bit wrapping of pc/eip computation
In 32-bit mode, pc = eip + cs_base is also 32-bit, and must wrap.
Failure to do so results in incorrect memory exceptions to the guest.
Before 732d548732, this was implicitly done via truncation to
target_ulong but only in qemu-system-i386, not qemu-system-x86_64.

To fix this, we must add conditional zero-extensions.
Since we have to test for 32 vs 64-bit anyway, note that cs_base
is always zero in 64-bit mode.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2022
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20231212172510.103305-1-richard.henderson@linaro.org>
2023-12-12 13:35:08 -08:00
Richard Henderson
23f3d586e4 target/i386: Use tcg_gen_ext_tl
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-22 16:43:31 -07:00
Richard Henderson
46c684c862 target/i386: Use i128 for 128 and 256-bit loads and stores
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-22 16:32:28 -07:00
Paolo Bonzini
24b34590d0 target/i386: check intercept for XSETBV
Note that this intercept is special; it is checked before the #GP
exception.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-17 15:20:53 +02:00
Philippe Mathieu-Daudé
1da389c5db target/i386: Check for USER_ONLY definition instead of SOFTMMU one
Since we *might* have user emulation with softmmu,
replace the system emulation check by !user emulation one.

(target/ was cleaned from invalid CONFIG_SOFTMMU uses at
commit cab35c73be, but these files were merged few days
after, thus missed the cleanup.)

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004082239.27251-1-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-07 19:02:33 +02:00
Richard Henderson
b77af26e97 accel/tcg: Replace CPUState.env_ptr with cpu_env()
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-04 11:03:54 -07:00
Richard Henderson
ad75a51e84 tcg: Rename cpu_env to tcg_env
Allow the name 'cpu_env' to be used for something else.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-10-03 08:01:02 -07:00
Paolo Bonzini
19729affe1 target/i386/translate: avoid shadowed local variables
Just remove the declaration.  There is nothing in the function after the
switch statement, so it is safe to do.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-09-26 16:41:49 +02:00
Michael Tokarev
bad5cfcd60 i386: spelling fixes
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2023-09-20 07:54:34 +03:00
Paolo Bonzini
c1f27a0c6a target/i386: raise FERR interrupt with iothread locked
Otherwise tcg_handle_interrupt() triggers an assertion failure:

  #5  0x0000555555c97369 in tcg_handle_interrupt (cpu=0x555557434cb0, mask=2) at ../accel/tcg/tcg-accel-ops.c:83
  #6  tcg_handle_interrupt (cpu=0x555557434cb0, mask=2) at ../accel/tcg/tcg-accel-ops.c:81
  #7  0x0000555555b4d58b in pic_irq_request (opaque=<optimized out>, irq=<optimized out>, level=1) at ../hw/i386/x86.c:555
  #8  0x0000555555b4f218 in gsi_handler (opaque=0x5555579423d0, n=13, level=1) at ../hw/i386/x86.c:611
  #9  0x00007fffa42bde14 in code_gen_buffer ()
  #10 0x0000555555c724bb in cpu_tb_exec (cpu=cpu@entry=0x555557434cb0, itb=<optimized out>, tb_exit=tb_exit@entry=0x7fffe9bfd658) at ../accel/tcg/cpu-exec.c:457

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1808
Reported-by: NyanCatTW1 <https://gitlab.com/a0939712328>
Co-developed-by: Richard Henderson <richard.henderson@linaro.org>'
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-09-01 23:44:39 +02:00
Paolo Bonzini
40a205da41 target/i386: emulate 64-bit ring 0 for linux-user if LM feature is set
32-bit binaries can run on a long mode processor even if the kernel
is 64-bit, of course, and this can have slightly different behavior;
for example, SYSCALL is allowed on Intel processors.

Allow reporting LM to programs running under user mode emulation,
so that "-cpu" can be used with named CPU models even for qemu-i386
and even without disabling LM by hand.

Fortunately, most of the runtime code in QEMU has to depend on HF_LMA_MASK
or on HF_CS64_MASK (which is anyway false for qemu-i386's 32-bit code
segment) rather than TARGET_X86_64, therefore all that is needed is an
update of linux-user's ring 0 setup.

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1534
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-29 10:49:43 +02:00
Paolo Bonzini
63fd8ef080 target/i386: implement SYSCALL/SYSRET in 32-bit emulators
AMD supports both 32-bit and 64-bit SYSCALL/SYSRET, but the TCG only
exposes it for 64-bit targets.  For system emulation just reuse the
helper; for user-mode emulation the ABI is the same as "int $80".

The BSDs does not support any fast system call mechanism in 32-bit
mode so add to bsd-user the same stub that FreeBSD has for 64-bit
compatibility mode.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-26 10:23:56 +02:00
Paolo Bonzini
6750485bf4 target/i386: implement RDPID in TCG
RDPID corresponds to a RDMSR(TSC_AUX); however, it is unprivileged
so for user-mode emulation we must provide the value that the kernel
places in the MSR.  For Linux, it is a combination of the current CPU
and the current NUMA node, both of which can be retrieved with getcpu(2).
Also try sched_getcpu(), which might be there on the BSDs.  If there is
no portable way to retrieve the current CPU id from userspace, return 0.

RDTSCP is reimplemented as RDTSC + RDPID ECX; the differences in terms
of serializability are not relevant to QEMU.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-26 10:23:56 +02:00
Paolo Bonzini
53b9b4cc9f target/i386: sysret and sysexit are privileged
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-26 10:23:56 +02:00
Paolo Bonzini
75a02adf81 target/i386: AMD only supports SYSENTER/SYSEXIT in 32-bit mode
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-26 10:23:44 +02:00
Paolo Bonzini
fd5dcb1ccd target/i386: Intel only supports SYSCALL/SYSRET in long mode
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-26 10:23:07 +02:00
Paolo Bonzini
431c51e9d4 target/i386: TCG supports WBNOINVD
WBNOINVD is the same as INVD or WBINVD as far as TCG is concerned,
since there is no cache in TCG and therefore no invalidation side effect
in WBNOINVD.

With respect to SVM emulation, processors that do not support WBNOINVD
will ignore the prefix and treat it as WBINVD, while those that support
it will generate exactly the same vmexit.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-26 10:23:07 +02:00
Paolo Bonzini
f9e0dbae78 target/i386: do not accept RDSEED if CPUID bit absent
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-26 10:23:02 +02:00
Paolo Bonzini
4d714d1a0b target/i386: fix INVD vmexit
Due to a typo or perhaps a brain fart, the INVD vmexit was never generated.
Fix it (but not that fixing just the typo would break both INVD and WBINVD,
due to a case of two wrongs making a right).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-26 10:23:01 +02:00
Philippe Mathieu-Daudé
1dc7bb0e96 target/i386: Simplify i386_tr_init_disas_context()
Since cpu_mmu_index() is well-defined for user-only,
we can remove the surrounding #ifdef'ry entirely.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-2-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20 10:01:30 +02:00
Richard Henderson
dfd1b81274 accel/tcg: Introduce translator_io_start
New wrapper around gen_io_start which takes care of the USE_ICOUNT
check, as well as marking the DisasContext to end the TB.
Remove exec/gen-icount.h.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson
d53106c997 tcg: Pass TCGHelperInfo to tcg_gen_callN
In preparation for compiling tcg/ only once, eliminate
the all_helpers array.  Instantiate the info structs for
the generic helpers in accel/tcg/, and the structs for
the target-specific helpers in each translate.c.

Since we don't see all of the info structs at startup,
initialize at first use, using g_once_init_* to make
sure we don't race while doing so.

Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson
732e89f4c4 tcg: Replace tcg_abort with g_assert_not_reached
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-04-23 08:17:46 +01:00
Richard Henderson
3df11bb14a target/i386: Avoid use of tcg_const_* throughout
All uses are strictly read-only.  Most of the obviously so,
as direct arguments to gen_helper_*.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-03-13 06:44:37 -07:00
Richard Henderson
3e7da311d7 target/i386: Simplify POPF
Compute the eflags write mask separately, leaving one call
to the helper.  Use tcg_constant_i32.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-03-05 13:45:31 -08:00
Richard Henderson
5128d58480 target/i386: Drop tcg_temp_free
Translators are no longer required to free tcg temporaries.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-03-05 13:44:08 -08:00
Richard Henderson
3a5d177322 target/i386: Don't use tcg_temp_local_new
Since tcg_temp_new is now identical, use that.
In some cases we can avoid a copy from A0 or T0.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-03-01 07:33:28 -10:00
Richard Henderson
597f9b2d30 accel/tcg: Pass max_insn to gen_intermediate_code by pointer
In preparation for returning the number of insns generated
via the same pointer.  Adjust only the prototypes so far.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-03-01 07:33:27 -10:00
Anton Johansson
2e3afe8e19 target/i386: Replace TARGET_TB_PCREL with CF_PCREL
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230227135202.9710-8-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-03-01 07:31:56 -10:00
Richard Henderson
5f0dd8cd33 target/i386: Inline cmpxchg16b
Use tcg_gen_atomic_cmpxchg_i128 for the atomic case,
and tcg_gen_qemu_ld/st_i128 otherwise.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-02-04 06:19:43 -10:00
Richard Henderson
326ad06cf5 target/i386: Inline cmpxchg8b
Use tcg_gen_atomic_cmpxchg_i64 for the atomic case,
and tcg_gen_nonatomic_cmpxchg_i64 otherwise.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-02-04 06:19:43 -10:00
Richard Henderson
6218c177af target/i386: Split out gen_cmpxchg8b, gen_cmpxchg16b
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-02-04 06:19:43 -10:00
Paolo Bonzini
35d95e4126 target/i386: hardcode R_EAX as destination register for LAHF/SAHF
When translating code that is using LAHF and SAHF in combination with the
REX prefix, the instructions should not use any other register than AH;
however, QEMU selects SPL (SP being register 4, just like AH) if the
REX prefix is present.  To fix this, use deposit directly without
going through gen_op_mov_v_reg and gen_op_mov_reg_v.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/130
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-15 09:34:42 +10:00
Paolo Bonzini
d1bb978ba1 target/i386: fix cmpxchg with 32-bit register destination
Unlike the memory case, where "the destination operand receives a write
cycle without regard to the result of the comparison", rm must not be
touched altogether if the write fails, including not zero-extending
it on 64-bit processors.  This is not how the movcond currently works,
because it is always followed by a gen_op_mov_reg_v to rm.

To fix it, introduce a new function that is similar to gen_op_mov_reg_v
but writes to a TCG temporary.

Considering that gen_extu(ot, oldv) is not needed in the memory case
either, the two cases for register and memory destinations are different
enough that one might as well fuse the two "if (mod == 3)" into one.
So do that too.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/508
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[rth: Add a test case ]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-15 09:34:42 +10:00
Stefan Hajnoczi
7f5acfcb66 * bug fixes
* reduced memory footprint for IPI virtualization on Intel processors
 * asynchronous teardown support (Linux only)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmNiVykUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroN0Swf/YxjphCtFgYYSO14WP+7jAnfRZLhm
 0xWChWP8rco5I352OBFeFU64Av5XoLGNn6SZLl8lcg86lQ/G0D27jxu6wOcDDHgw
 0yTDO1gevj51UKsbxoC66OWSZwKTEo398/BHPDcI2W41yOFycSdtrPgspOrFRVvf
 7M3nNjuNPsQorZeuu8NGr3jakqbt99ZDXcyDEWbrEAcmy2JBRMbGgT0Kdnc6aZfW
 CvL+1ljxzldNwGeNBbQW2QgODbfHx5cFZcy4Daze35l5Ra7K/FrgAzr6o/HXptya
 9fEs5LJQ1JWI6JtpaWwFy7fcIIOsJ0YW/hWWQZSDt9JdAJFE5/+vF+Kz5Q==
 =CgrO
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* bug fixes
* reduced memory footprint for IPI virtualization on Intel processors
* asynchronous teardown support (Linux only)

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmNiVykUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroN0Swf/YxjphCtFgYYSO14WP+7jAnfRZLhm
# 0xWChWP8rco5I352OBFeFU64Av5XoLGNn6SZLl8lcg86lQ/G0D27jxu6wOcDDHgw
# 0yTDO1gevj51UKsbxoC66OWSZwKTEo398/BHPDcI2W41yOFycSdtrPgspOrFRVvf
# 7M3nNjuNPsQorZeuu8NGr3jakqbt99ZDXcyDEWbrEAcmy2JBRMbGgT0Kdnc6aZfW
# CvL+1ljxzldNwGeNBbQW2QgODbfHx5cFZcy4Daze35l5Ra7K/FrgAzr6o/HXptya
# 9fEs5LJQ1JWI6JtpaWwFy7fcIIOsJ0YW/hWWQZSDt9JdAJFE5/+vF+Kz5Q==
# =CgrO
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 02 Nov 2022 07:40:25 EDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
  target/i386: Fix test for paging enabled
  util/log: Close per-thread log file on thread termination
  target/i386: Set maximum APIC ID to KVM prior to vCPU creation
  os-posix: asynchronous teardown for shutdown on Linux
  target/i386: Fix calculation of LOCK NEG eflags

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-11-03 10:54:37 -04:00
Richard Henderson
6317933086 target/i386: Expand eflags updates inline
The helpers for reset_rf, cli, sti, clac, stac are
completely trivial; implement them inline.

Drop some nearby #if 0 code.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-01 08:31:41 +11:00
Qi Hu
1215317510 target/i386: Fix calculation of LOCK NEG eflags
After:

        lock negl -0x14(%rbp)
        pushf
        pop    %rax

%rax will contain the wrong value because the "lock neg" calculates the
wrong eflags.  Simple test:

        #include <assert.h>

        int main()
        {
          __volatile__ unsigned test = 0x2363a;
          __volatile__ char cond = 0;
          asm(
              "lock negl %0 \n\t"
              "sets %1"
              : "=m"(test), "=r"(cond));
          assert(cond & 1);
          return 0;
        }

Reported-by: Jinyang Shen <shenjinyang@loongson.cn>
Co-Developed-by: Xuehai Chen <chenxuehai@loongson.cn>
Signed-off-by: Xuehai Chen <chenxuehai@loongson.cn>
Signed-off-by: Qi Hu <huqi@loongson.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-31 09:46:34 +01:00
Richard Henderson
434382e640 target/i386: Convert to tcg_ops restore_state_to_opc
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Paolo Bonzini
2872b0f390 target/i386: implement FMA instructions
The only issue with FMA instructions is that there are _a lot_ of them (30
opcodes, each of which comes in up to 4 versions depending on VEX.W and
VEX.L; a total of 96 possibilities).  However, they can be implement with
only 6 helpers, two for scalar operations and four for packed operations.
(Scalar versions do not do any merging; they only affect the bottom 32
or 64 bits of the output operand.  Therefore, there is no separate XMM
and YMM of the scalar helpers).

First, we can reduce the number of helpers to one third by passing four
operands (one output and three inputs); the reordering of which operands
go to the multiply and which go to the add is done in emit.c.

Second, the different instructions also dispatch to the same softfloat
function, so the flags for float32_muladd and float64_muladd are passed
in the helper as int arguments, with a little extra complication to
handle FMADDSUB and FMSUBADD.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-22 09:05:54 +02:00
Paolo Bonzini
653fad2497 target/i386: remove old SSE decoder
With all SSE (and AVX!) instructions now implemented in disas_insn_new,
it's possible to remove gen_sse, as well as the helpers for instructions
that now use gvec.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:05 +02:00
Paolo Bonzini
71a0891d61 target/i386: move 3DNow to the new decoder
This adds another kind of weirdness when you thought you had seen it all:
an opcode byte that comes _after_ the address, not before.  It's not
worth adding a new X86_SPECIAL_* constant for it, but it's actually
not unlike VCMP; so, forgive me for exploiting the similarity and just
deciding to dispatch to the right gen_helper_* call in a single code
generation function.

In fact, the old decoder had a bug where s->rip_offset should have
been set to 1 for 3DNow! instructions, and it's fixed now.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:05 +02:00
Paolo Bonzini
f8d19eec0d target/i386: reimplement 0x0f 0x28-0x2f, add AVX
Here the code is a bit uglier due to the truncation and extension
of registers to and from 32-bit.  There is also a mistake in the
manual with respect to the size of the memory operand of CVTPS2PI
and CVTTPS2PI, reported by Ricky Zhou.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:05 +02:00
Paolo Bonzini
7170a17ec3 target/i386: reimplement 0x0f 0x10-0x17, add AVX
These are mostly moves, and yet are a total pain.  The main issue
is that:

1) some instructions are selected by mod==11 (register operand)
vs. mod=00/01/10 (memory operand)

2) stores to memory are two-operand operations, while the 3-register
and load-from-memory versions operate on the entire contents of the
destination; this makes it easier to separate the gen_* function for
the store case

3) it's inefficient to load into xmm_T0 only to move the value out
again, so the gen_* function for the load case is separated too

The manual also has various mistakes in the operands here, for example
the store case of MOVHPS operates on a 128-bit source (albeit discarding
the bottom 64 bits) and therefore should be Mq,Vdq rather than Mq,Vq.
Likewise for the destination and source of MOVHLPS.

VUNPCK?PS and VUNPCK?PD are the same as VUNPCK?DQ and VUNPCK?QDQ,
but encoded as prefixes rather than separate operands.  The helpers
can be reused however.

For MOVSLDUP, MOVSHDUP and MOVDDUP I chose to reimplement them as
helpers.  I named the helper for MOVDDUP "movdldup" in preparation
for possible future introduction of MOVDHDUP and to clarify the
similarity with MOVSLDUP.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:05 +02:00
Paolo Bonzini
aba2b8ecb9 target/i386: reimplement 0x0f 0xc2, 0xc4-0xc6, add AVX
Nothing special going on here, for once.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:05 +02:00
Paolo Bonzini
16fc5726a6 target/i386: reimplement 0x0f 0x38, add AVX
There are several special cases here:

1) extending moves have different widths for the helpers vs. for the
memory loads, and the width for memory loads depends on VEX.L too.
This is represented by X86_SPECIAL_AVXExtMov.

2) some instructions, such as variable-width shifts, select the vector element
size via REX.W.

3) VSIB instructions (VGATHERxPy, VPGATHERxy) are also part of this group,
and they have (among other things) two output operands.

3) the macros for 4-operand blends (which are under 0x0f 0x3a) have to be
extended to support 2-operand blends.  The 2-operand variant actually
came a few years earlier, but it is clearer to implement them in the
opposite order.

X86_TYPE_WM, introduced earlier for unaligned loads, is reused for helpers
that accept a Reg* but have a M argument.

These three-byte opcodes also include AVX new instructions, for which
the helpers were originally implemented by Paul Brook <paul@nowt.org>.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:05 +02:00
Paolo Bonzini
7906847768 target/i386: reimplement 0x0f 0x3a, add AVX
The more complicated operations here are insertions and extractions.
Otherwise, there are just more entries than usual because the PS/PD/SS/SD
variations are encoded in the opcode rater than in the prefixes.

These three-byte opcodes also include AVX new instructions, whose
implementation in the helpers was originally done by Paul Brook
<paul@nowt.org>.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:05 +02:00
Paolo Bonzini
6bbeb98d10 target/i386: reimplement 0x0f 0xd0-0xd7, 0xe0-0xe7, 0xf0-0xf7, add AVX
The more complicated ones here are d6-d7, e6-e7, f7.  The others
are trivial.

For LDDQU, using gen_load_sse directly might corrupt the register if
the second part of the load fails.  Therefore, add a custom X86_TYPE_WM
value; like X86_TYPE_W it does call gen_load(), but it also rejects a
value of 11 in the ModRM field like X86_TYPE_M.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:05 +02:00
Paolo Bonzini
ce4fcb9478 target/i386: reimplement 0x0f 0x70-0x77, add AVX
This includes shifts by immediate, which use bits 3-5 of the ModRM byte
as an opcode extension.  With the exception of 128-bit shifts, they are
implemented using gvec.

This also covers VZEROALL and VZEROUPPER, which use the same opcode
as EMMS.  If we were wanting to optimize out gen_clear_ymmh then this
would be one of the starting points.  The implementation of the VZEROALL
and VZEROUPPER helpers is by Paul Brook.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:05 +02:00
Paolo Bonzini
d1c1a4222c target/i386: reimplement 0x0f 0x78-0x7f, add AVX
These are a mixed batch, including the first two horizontal
(66 and F2 only) operations, more moves, and SSE4a extract/insert.

Because SSE4a is pretty rare, I chose to leave the helper as they are,
but it is possible to unify them by loading index and length from the
source XMM register and generating deposit or extract TCG ops.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:05 +02:00
Paolo Bonzini
03b4588070 target/i386: reimplement 0x0f 0x50-0x5f, add AVX
These are mostly floating-point SSE operations.  The odd ones out
are MOVMSK and CVTxx2yy, the others are straightforward.

Unary operations are a bit special in AVX because they have 2 operands
for PD/PS operands (VEX.vvvv must be 1111b), and 3 operands for SD/SS.
They are handled using X86_OP_GROUP3 for compactness.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:05 +02:00
Paolo Bonzini
1d0efbdb35 target/i386: reimplement 0x0f 0xd8-0xdf, 0xe8-0xef, 0xf8-0xff, add AVX
These are more simple integer instructions present in both MMX and SSE/AVX,
with no holes that were later occupied by newer instructions.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:04 +02:00
Paolo Bonzini
92ec056a6b target/i386: reimplement 0x0f 0x60-0x6f, add AVX
These are both MMX and SSE/AVX instructions, except for vmovdqu.  In both
cases the inputs and output is in s->ptr{0,1,2}, so the only difference
between MMX, SSE, and AVX is which helper to call.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:04 +02:00
Paolo Bonzini
620f75566a target/i386: provide 3-operand versions of unary scalar helpers
Compared to Paul's implementation, the new decoder will use a different approach
to implement AVX's merging of dst with src1 on scalar operations.  Adjust the
old SSE decoder to be compatible with new-style helpers.

The affected instructions are CVTSx2Sx, ROUNDSx, RSQRTSx, SQRTSx, RCPSx.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:04 +02:00
Paolo Bonzini
f05f9789f5 target/i386: extend helpers to support VEX.V 3- and 4- operand encodings
Add to the helpers all the operands that are needed to implement AVX.

Extracted from a patch by Paul Brook <paul@nowt.org>.

Message-Id: <20220424220204.2493824-26-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:04 +02:00
Paolo Bonzini
1d0b926150 target/i386: move scalar 0F 38 and 0F 3A instruction to new decoder
Because these are the only VEX instructions that QEMU supports, the
new decoder is entered on the first byte of a valid VEX prefix, and VEX
decoding only needs to be done in decode-new.c.inc.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:04 +02:00
Paolo Bonzini
20581aadec target/i386: validate VEX prefixes via the instructions' exception classes
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:04 +02:00
Paolo Bonzini
268dc4648f target/i386: add CPUID[EAX=7,ECX=0].ECX to DisasContext
TCG will shortly implement VAES instructions, so add the relevant feature
word to the DisasContext.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:04 +02:00
Paolo Bonzini
6ba13999be target/i386: add ALU load/writeback core
Add generic code generation that takes care of preparing operands
around calls to decode.e.gen in a table-driven manner, so that ALU
operations need not take care of that.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:04 +02:00
Paolo Bonzini
b3e22b2318 target/i386: add core of new i386 decoder
The new decoder is based on three principles:

- use mostly table-driven decoding, using tables derived as much as possible
  from the Intel manual.  Centralizing the decode the operands makes it
  more homogeneous, for example all immediates are signed.  All modrm
  handling is in one function, and can be shared between SSE and ALU
  instructions (including XMM<->GPR instructions).  The SSE/AVX decoder
  will also not have duplicated code between the 0F, 0F38 and 0F3A tables.

- keep the code as "non-branchy" as possible.  Generally, the code for
  the new decoder is more verbose, but the control flow is simpler.
  Conditionals are not nested and have small bodies.  All instruction
  groups are resolved even before operands are decoded, and code
  generation is separated as much as possible within small functions
  that only handle one instruction each.

- keep address generation and (for ALU operands) memory loads and writeback
  as much in common code as possible.  All ALU operations for example
  are implemented as T0=f(T0,T1).  For non-ALU instructions,
  read-modify-write memory operations are rare, but registers do not
  have TCGv equivalents: therefore, the common logic sets up pointer
  temporaries with the operands, while load and writeback are handled
  by gvec or by helpers.

These principles make future code review and extensibility simpler, at
the cost of having a relatively large amount of code in the form of this
patch.  Even EVEX should not be _too_ hard to implement (it's just a crazy
large amount of possibilities).

This patch introduces the main decoder flow, and integrates the old
decoder with the new one.  The old decoder takes care of parsing
prefixes and then optionally drops to the new one.  The changes to the
old decoder are minimal and allow it to be replaced incrementally with
the new one.

There is a debugging mechanism through a "LIMIT" environment variable.
In user-mode emulation, the variable is the number of instructions
decoded by the new decoder before permanently switching to the old one.
In system emulation, the variable is the highest opcode that is decoded
by the new decoder (this is less friendly, but it's the best that can
be done without requiring deterministic execution).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:04 +02:00
Paolo Bonzini
a61ef762f9 target/i386: make rex_w available even in 32-bit mode
REX.W can be used even in 32-bit mode by AVX instructions, where it is retroactively
renamed to VEX.W.  Make the field available even in 32-bit mode but keep the REX_W()
macro as it was; this way, that the handling of dflag does not use it by mistake and
the AVX code more clearly points at the special VEX behavior of the bit.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:04 +02:00
Paolo Bonzini
12a2c9c72c target/i386: make ldo/sto operations consistent with ldq
ldq takes a pointer to the first byte to load the 64-bit word in;
ldo takes a pointer to the first byte of the ZMMReg.  Make them
consistent, which will be useful in the new SSE decoder's
load/writeback routines.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:04 +02:00
Richard Henderson
913f0836f6 target/i386: Save and restore pc_save before tcg_remove_ops_after
Restore pc_save while undoing any state change that may have
happened while decoding the instruction.  Leave a TODO about
removing all of that when the table-based decoder is complete.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221016222303.288551-1-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-18 13:58:04 +02:00
Richard Henderson
e3a79e0e87 target/i386: Enable TARGET_TB_PCREL
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221001140935.465607-27-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
7db973bece target/i386: Inline gen_jmp_im
Expand this function at each of its callers.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221001140935.465607-26-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
f771ca6a61 target/i386: Add cpu_eip
Create a tcg global temp for this, and use it instead of explicit stores.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221001140935.465607-25-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
75ec746a07 target/i386: Create eip_cur_tl
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221001140935.465607-24-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
900cc7e536 target/i386: Merge gen_jmp_tb and gen_goto_tb into gen_jmp_rel
These functions have only one caller, and the logic is more
obvious this way.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221001140935.465607-23-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
0ebacb5d1e target/i386: Remove MemOp argument to gen_op_j*_ecx
These functions are always passed aflag, so we might as well
read it from DisasContext directly.  While we're at it, use
a common subroutine for these two functions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221001140935.465607-22-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
5f7ec6efcc target/i386: Use gen_jmp_rel for DISAS_TOO_MANY
With gen_jmp_rel, we may chain between two translation blocks
which may only be separated because of TB size limits.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-21-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
54b191de67 target/i386: Use gen_jmp_rel for gen_jcc
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221001140935.465607-20-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
2255da493a target/i386: Use gen_jmp_rel for loop, repz, jecxz insns
With gen_jmp_rel, we may chain to the next tb instead of merely
writing to eip and exiting.  For repz, subtract cur_insn_len to
restart the current insn.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221001140935.465607-19-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
8760ded661 target/i386: Create gen_jmp_rel
Create a common helper for pc-relative branches.  The jmp jb insn
was missing a mask for CODE32.  In all cases the CODE64 check was
incorrectly placed, allowing PREFIX_DATA to truncate %rip to 16 bits.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221001140935.465607-18-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
202005f1f8 target/i386: Use DISAS_TOO_MANY to exit after gen_io_start
We can set is_jmp early, using only one if, and let that
be overwritten by gen_rep*'s calls to gen_jmp_tb.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-17-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
9e599bf707 target/i386: Create eip_next_*
Create helpers for loading the address of the next insn.
Use tcg_constant_* in adjacent code where convenient.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-16-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
8c03ab9f74 target/i386: Truncate values for lcall_real to i32
Use i32 not int or tl for eip and cs arguments.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-15-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
faf9ea5fa5 target/i386: Introduce DISAS_JUMP
Drop the unused dest argument to gen_jr().
Remove most of the calls to gen_jr, and use DISAS_JUMP.
Remove some unused loads of eip for lcall and ljmp.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-14-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
122e6d7b4a target/i386: Remove cur_eip, next_eip arguments to gen_repz*
All callers pass s->base.pc_next and s->pc, which we can just
as well compute within the functions.  Pull out common helpers
and reduce the amount of code under macros.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-13-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
ad1d6f072d target/i386: Create cur_insn_len, cur_insn_len_i32
Create common routines for computing the length of the insn.
Use tcg_constant_i32 in the new function, while we're at it.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-12-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
6424ac8eec target/i386: USe DISAS_EOB_ONLY
Replace lone calls to gen_eob() with the new enumerator.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-11-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
634a405193 target/i386: Use DISAS_EOB_NEXT
Replace sequences of gen_update_cc_op, gen_update_eip_next,
and gen_eob with the new is_jmp enumerator.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-10-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
4da4523c6c target/i386: Use DISAS_EOB* in gen_movl_seg_T0
Set is_jmp properly in gen_movl_seg_T0, so that the callers
need to nothing special.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-9-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
200ef60399 target/i386: Introduce DISAS_EOB*
Add a few DISAS_TARGET_* aliases to reduce the number of
calls to gen_eob() and gen_eob_inhibit_irq().  So far,
only update i386_tr_translate_insn for exiting the block
because of single-step or previous inhibit irq.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-8-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
09e99df4d5 target/i386: Create gen_update_eip_next
Sync EIP before exiting a translation block.
Replace all gen_jmp_im that use s->pc.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-7-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
65e4af230d target/i386: Create gen_update_eip_cur
Like gen_update_cc_op, sync EIP before doing something
that could raise an exception.  Replace all gen_jmp_im
that use s->base.pc_next.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-6-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
8ed6c98501 target/i386: Remove cur_eip, next_eip arguments to gen_interrupt
All callers pass s->base.pc_next and s->pc, which we can just as
well compute within the function.  Adjust to use tcg_constant_i32
while we're at it.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-5-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
522365508e target/i386: Remove cur_eip argument to gen_exception
All callers pass s->base.pc_next - s->cs_base, which we can just
as well compute within the function.  Note the special case of
EXCP_VSYSCALL in which s->cs_base wasn't subtracted, but cs_base
is always zero in 64-bit mode, when vsyscall is used.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-4-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
f66c8e8cd9 target/i386: Return bool from disas_insn
Instead of returning the new pc, which is present in
DisasContext, return true if an insn was translated.
This is false when we detect a page crossing and must
undo the insn under translation.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221001140935.465607-3-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Richard Henderson
ddf83b35bd target/i386: Remove pc_start
The DisasContext member and the disas_insn local variable of
the same name are identical to DisasContextBase.pc_next.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221001140935.465607-2-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Paolo Bonzini
efcca7ef17 target/i386: introduce insn_get_addr
The "O" operand type in the Intel SDM needs to load an 8- to 64-bit
unsigned value, while insn_get is limited to 32 bits.  Extract the code
out of disas_insn and into a separate function.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-19 15:16:00 +02:00
Paolo Bonzini
5c2f60bd1b target/i386: REPZ and REPNZ are mutually exclusive
The later prefix wins if both are present, make it show in s->prefix too.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-19 15:16:00 +02:00
Paolo Bonzini
ca4b1b43bc target/i386: fix INSERTQ implementation
INSERTQ is defined to not modify any bits in the lower 64 bits of the
destination, other than the ones being replaced with bits from the
source operand.  QEMU instead is using unshifted bits from the source
for those bits.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-19 15:16:00 +02:00
Paolo Bonzini
958e1dd130 target/i386: Raise #GP on unaligned m128 accesses when required.
Many instructions which load/store 128-bit values are supposed to
raise #GP when the memory operand isn't 16-byte aligned. This includes:
 - Instructions explicitly requiring memory alignment (Exceptions Type 1
   in the "AVX and SSE Instruction Exception Specification" section of
   the SDM)
 - Legacy SSE instructions that load/store 128-bit values (Exceptions
   Types 2 and 4).

This change sets MO_ALIGN_16 on 128-bit memory accesses that require
16-byte alignment. It adds cpu_record_sigbus and cpu_do_unaligned_access
hooks that simulate a #GP exception in qemu-user and qemu-system,
respectively.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/217
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ricky Zhou <ricky@rzhou.org>
Message-Id: <20220830034816.57091-2-ricky@rzhou.org>
[Do not bother checking PREFIX_VEX, since AVX is not supported. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-18 09:17:40 +02:00