mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Paolo Bonzini	24c0573bb0	target/i386: do not clobber A0 in POP translation The new decoder likes to compute the address in A0 very early, so the gen_lea_v_seg in gen_pop_T0 would clobber the address of the memory operand. Instead use T0 since it is already available and will be overwritten immediately after. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-12-29 22:03:21 +01:00
Paolo Bonzini	8cc746525c	target/i386: remove unnecessary truncations gen_lea_v_seg (called by gen_add_A0_ds_seg) already zeroes any bits of s->A0 beyond s->aflag. It does so before summing the segment base and, if not in 64-bit mode, also after summing it. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-12-29 22:02:58 +01:00
Paolo Bonzini	1e7dde8008	target/i386: speedup JO/SETO after MUL or IMUL OF is equal to the carry flag, so use the same CCPrepare. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-12-29 22:02:52 +01:00
Paolo Bonzini	6032627f07	target/i386: optimize computation of JL and JLE from flags Take advantage of the fact that there can be no 1 bits between SF and OF. If they were adjacent, you could sum SF and get a carry only if SF was already set. Then the value of OF in the sum is the XOR of OF itself, the carry (which is SF) and 0 (the value of the OF bit in the addend): this is OF^SF exactly. Because OF and SF are not adjacent, just place more 1 bits to the left so that the carry propagates, which means summing CC_O - CC_S. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-12-29 22:02:48 +01:00
Richard Henderson	b5e0d5d22f	target/i386: Fix 32-bit wrapping of pc/eip computation In 32-bit mode, pc = eip + cs_base is also 32-bit, and must wrap. Failure to do so results in incorrect memory exceptions to the guest. Before `732d548732`, this was implicitly done via truncation to target_ulong but only in qemu-system-i386, not qemu-system-x86_64. To fix this, we must add conditional zero-extensions. Since we have to test for 32 vs 64-bit anyway, note that cs_base is always zero in 64-bit mode. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2022 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20231212172510.103305-1-richard.henderson@linaro.org>	2023-12-12 13:35:08 -08:00
Richard Henderson	23f3d586e4	target/i386: Use tcg_gen_ext_tl Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-10-22 16:43:31 -07:00
Richard Henderson	46c684c862	target/i386: Use i128 for 128 and 256-bit loads and stores Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-10-22 16:32:28 -07:00
Paolo Bonzini	24b34590d0	target/i386: check intercept for XSETBV Note that this intercept is special; it is checked before the #GP exception. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-10-17 15:20:53 +02:00
Philippe Mathieu-Daudé	1da389c5db	target/i386: Check for USER_ONLY definition instead of SOFTMMU one Since we might have user emulation with softmmu, replace the system emulation check by !user emulation one. (target/ was cleaned from invalid CONFIG_SOFTMMU uses at commit `cab35c73be`, but these files were merged few days after, thus missed the cleanup.) Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20231004082239.27251-1-philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-10-07 19:02:33 +02:00
Richard Henderson	b77af26e97	accel/tcg: Replace CPUState.env_ptr with cpu_env() Reviewed-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-10-04 11:03:54 -07:00
Richard Henderson	ad75a51e84	tcg: Rename cpu_env to tcg_env Allow the name 'cpu_env' to be used for something else. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-10-03 08:01:02 -07:00
Paolo Bonzini	19729affe1	target/i386/translate: avoid shadowed local variables Just remove the declaration. There is nothing in the function after the switch statement, so it is safe to do. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-09-26 16:41:49 +02:00
Michael Tokarev	bad5cfcd60	i386: spelling fixes Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>	2023-09-20 07:54:34 +03:00
Paolo Bonzini	c1f27a0c6a	target/i386: raise FERR interrupt with iothread locked Otherwise tcg_handle_interrupt() triggers an assertion failure: #5 0x0000555555c97369 in tcg_handle_interrupt (cpu=0x555557434cb0, mask=2) at ../accel/tcg/tcg-accel-ops.c:83 #6 tcg_handle_interrupt (cpu=0x555557434cb0, mask=2) at ../accel/tcg/tcg-accel-ops.c:81 #7 0x0000555555b4d58b in pic_irq_request (opaque=<optimized out>, irq=<optimized out>, level=1) at ../hw/i386/x86.c:555 #8 0x0000555555b4f218 in gsi_handler (opaque=0x5555579423d0, n=13, level=1) at ../hw/i386/x86.c:611 #9 0x00007fffa42bde14 in code_gen_buffer () #10 0x0000555555c724bb in cpu_tb_exec (cpu=cpu@entry=0x555557434cb0, itb=<optimized out>, tb_exit=tb_exit@entry=0x7fffe9bfd658) at ../accel/tcg/cpu-exec.c:457 Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1808 Reported-by: NyanCatTW1 <https://gitlab.com/a0939712328> Co-developed-by: Richard Henderson <richard.henderson@linaro.org>' Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-09-01 23:44:39 +02:00
Paolo Bonzini	40a205da41	target/i386: emulate 64-bit ring 0 for linux-user if LM feature is set 32-bit binaries can run on a long mode processor even if the kernel is 64-bit, of course, and this can have slightly different behavior; for example, SYSCALL is allowed on Intel processors. Allow reporting LM to programs running under user mode emulation, so that "-cpu" can be used with named CPU models even for qemu-i386 and even without disabling LM by hand. Fortunately, most of the runtime code in QEMU has to depend on HF_LMA_MASK or on HF_CS64_MASK (which is anyway false for qemu-i386's 32-bit code segment) rather than TARGET_X86_64, therefore all that is needed is an update of linux-user's ring 0 setup. Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1534 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-06-29 10:49:43 +02:00
Paolo Bonzini	63fd8ef080	target/i386: implement SYSCALL/SYSRET in 32-bit emulators AMD supports both 32-bit and 64-bit SYSCALL/SYSRET, but the TCG only exposes it for 64-bit targets. For system emulation just reuse the helper; for user-mode emulation the ABI is the same as "int $80". The BSDs does not support any fast system call mechanism in 32-bit mode so add to bsd-user the same stub that FreeBSD has for 64-bit compatibility mode. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-06-26 10:23:56 +02:00
Paolo Bonzini	6750485bf4	target/i386: implement RDPID in TCG RDPID corresponds to a RDMSR(TSC_AUX); however, it is unprivileged so for user-mode emulation we must provide the value that the kernel places in the MSR. For Linux, it is a combination of the current CPU and the current NUMA node, both of which can be retrieved with getcpu(2). Also try sched_getcpu(), which might be there on the BSDs. If there is no portable way to retrieve the current CPU id from userspace, return 0. RDTSCP is reimplemented as RDTSC + RDPID ECX; the differences in terms of serializability are not relevant to QEMU. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-06-26 10:23:56 +02:00
Paolo Bonzini	53b9b4cc9f	target/i386: sysret and sysexit are privileged Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-06-26 10:23:56 +02:00
Paolo Bonzini	75a02adf81	target/i386: AMD only supports SYSENTER/SYSEXIT in 32-bit mode Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-06-26 10:23:44 +02:00
Paolo Bonzini	fd5dcb1ccd	target/i386: Intel only supports SYSCALL/SYSRET in long mode Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-06-26 10:23:07 +02:00
Paolo Bonzini	431c51e9d4	target/i386: TCG supports WBNOINVD WBNOINVD is the same as INVD or WBINVD as far as TCG is concerned, since there is no cache in TCG and therefore no invalidation side effect in WBNOINVD. With respect to SVM emulation, processors that do not support WBNOINVD will ignore the prefix and treat it as WBINVD, while those that support it will generate exactly the same vmexit. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-06-26 10:23:07 +02:00
Paolo Bonzini	f9e0dbae78	target/i386: do not accept RDSEED if CPUID bit absent Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-06-26 10:23:02 +02:00
Paolo Bonzini	4d714d1a0b	target/i386: fix INVD vmexit Due to a typo or perhaps a brain fart, the INVD vmexit was never generated. Fix it (but not that fixing just the typo would break both INVD and WBINVD, due to a case of two wrongs making a right). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-06-26 10:23:01 +02:00
Philippe Mathieu-Daudé	1dc7bb0e96	target/i386: Simplify i386_tr_init_disas_context() Since cpu_mmu_index() is well-defined for user-only, we can remove the surrounding #ifdef'ry entirely. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230613133347.82210-2-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-06-20 10:01:30 +02:00
Richard Henderson	dfd1b81274	accel/tcg: Introduce translator_io_start New wrapper around gen_io_start which takes care of the USE_ICOUNT check, as well as marking the DisasContext to end the TB. Remove exec/gen-icount.h. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-06-05 12:04:29 -07:00
Richard Henderson	d53106c997	tcg: Pass TCGHelperInfo to tcg_gen_callN In preparation for compiling tcg/ only once, eliminate the all_helpers array. Instantiate the info structs for the generic helpers in accel/tcg/, and the structs for the target-specific helpers in each translate.c. Since we don't see all of the info structs at startup, initialize at first use, using g_once_init_* to make sure we don't race while doing so. Reviewed-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-06-05 12:04:29 -07:00
Richard Henderson	732e89f4c4	tcg: Replace tcg_abort with g_assert_not_reached Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-04-23 08:17:46 +01:00
Richard Henderson	3df11bb14a	target/i386: Avoid use of tcg_const_* throughout All uses are strictly read-only. Most of the obviously so, as direct arguments to gen_helper_*. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-03-13 06:44:37 -07:00
Richard Henderson	3e7da311d7	target/i386: Simplify POPF Compute the eflags write mask separately, leaving one call to the helper. Use tcg_constant_i32. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-03-05 13:45:31 -08:00
Richard Henderson	5128d58480	target/i386: Drop tcg_temp_free Translators are no longer required to free tcg temporaries. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-03-05 13:44:08 -08:00
Richard Henderson	3a5d177322	target/i386: Don't use tcg_temp_local_new Since tcg_temp_new is now identical, use that. In some cases we can avoid a copy from A0 or T0. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-03-01 07:33:28 -10:00
Richard Henderson	597f9b2d30	accel/tcg: Pass max_insn to gen_intermediate_code by pointer In preparation for returning the number of insns generated via the same pointer. Adjust only the prototypes so far. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-03-01 07:33:27 -10:00
Anton Johansson	2e3afe8e19	target/i386: Replace `TARGET_TB_PCREL` with `CF_PCREL` Signed-off-by: Anton Johansson <anjo@rev.ng> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230227135202.9710-8-anjo@rev.ng> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-03-01 07:31:56 -10:00
Richard Henderson	5f0dd8cd33	target/i386: Inline cmpxchg16b Use tcg_gen_atomic_cmpxchg_i128 for the atomic case, and tcg_gen_qemu_ld/st_i128 otherwise. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-02-04 06:19:43 -10:00
Richard Henderson	326ad06cf5	target/i386: Inline cmpxchg8b Use tcg_gen_atomic_cmpxchg_i64 for the atomic case, and tcg_gen_nonatomic_cmpxchg_i64 otherwise. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-02-04 06:19:43 -10:00
Richard Henderson	6218c177af	target/i386: Split out gen_cmpxchg8b, gen_cmpxchg16b Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-02-04 06:19:43 -10:00
Paolo Bonzini	35d95e4126	target/i386: hardcode R_EAX as destination register for LAHF/SAHF When translating code that is using LAHF and SAHF in combination with the REX prefix, the instructions should not use any other register than AH; however, QEMU selects SPL (SP being register 4, just like AH) if the REX prefix is present. To fix this, use deposit directly without going through gen_op_mov_v_reg and gen_op_mov_reg_v. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/130 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2022-11-15 09:34:42 +10:00
Paolo Bonzini	d1bb978ba1	target/i386: fix cmpxchg with 32-bit register destination Unlike the memory case, where "the destination operand receives a write cycle without regard to the result of the comparison", rm must not be touched altogether if the write fails, including not zero-extending it on 64-bit processors. This is not how the movcond currently works, because it is always followed by a gen_op_mov_reg_v to rm. To fix it, introduce a new function that is similar to gen_op_mov_reg_v but writes to a TCG temporary. Considering that gen_extu(ot, oldv) is not needed in the memory case either, the two cases for register and memory destinations are different enough that one might as well fuse the two "if (mod == 3)" into one. So do that too. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/508 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [rth: Add a test case ] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2022-11-15 09:34:42 +10:00
Stefan Hajnoczi	7f5acfcb66	* bug fixes * reduced memory footprint for IPI virtualization on Intel processors * asynchronous teardown support (Linux only) -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmNiVykUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroN0Swf/YxjphCtFgYYSO14WP+7jAnfRZLhm 0xWChWP8rco5I352OBFeFU64Av5XoLGNn6SZLl8lcg86lQ/G0D27jxu6wOcDDHgw 0yTDO1gevj51UKsbxoC66OWSZwKTEo398/BHPDcI2W41yOFycSdtrPgspOrFRVvf 7M3nNjuNPsQorZeuu8NGr3jakqbt99ZDXcyDEWbrEAcmy2JBRMbGgT0Kdnc6aZfW CvL+1ljxzldNwGeNBbQW2QgODbfHx5cFZcy4Daze35l5Ra7K/FrgAzr6o/HXptya 9fEs5LJQ1JWI6JtpaWwFy7fcIIOsJ0YW/hWWQZSDt9JdAJFE5/+vF+Kz5Q== =CgrO -----END PGP SIGNATURE----- Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging * bug fixes * reduced memory footprint for IPI virtualization on Intel processors * asynchronous teardown support (Linux only) # -----BEGIN PGP SIGNATURE----- # # iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmNiVykUHHBib256aW5p # QHJlZGhhdC5jb20ACgkQv/vSX3jHroN0Swf/YxjphCtFgYYSO14WP+7jAnfRZLhm # 0xWChWP8rco5I352OBFeFU64Av5XoLGNn6SZLl8lcg86lQ/G0D27jxu6wOcDDHgw # 0yTDO1gevj51UKsbxoC66OWSZwKTEo398/BHPDcI2W41yOFycSdtrPgspOrFRVvf # 7M3nNjuNPsQorZeuu8NGr3jakqbt99ZDXcyDEWbrEAcmy2JBRMbGgT0Kdnc6aZfW # CvL+1ljxzldNwGeNBbQW2QgODbfHx5cFZcy4Daze35l5Ra7K/FrgAzr6o/HXptya # 9fEs5LJQ1JWI6JtpaWwFy7fcIIOsJ0YW/hWWQZSDt9JdAJFE5/+vF+Kz5Q== # =CgrO # -----END PGP SIGNATURE----- # gpg: Signature made Wed 02 Nov 2022 07:40:25 EDT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream' of https://gitlab.com/bonzini/qemu: target/i386: Fix test for paging enabled util/log: Close per-thread log file on thread termination target/i386: Set maximum APIC ID to KVM prior to vCPU creation os-posix: asynchronous teardown for shutdown on Linux target/i386: Fix calculation of LOCK NEG eflags Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-11-03 10:54:37 -04:00
Richard Henderson	6317933086	target/i386: Expand eflags updates inline The helpers for reset_rf, cli, sti, clac, stac are completely trivial; implement them inline. Drop some nearby #if 0 code. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2022-11-01 08:31:41 +11:00
Qi Hu	1215317510	target/i386: Fix calculation of LOCK NEG eflags After: lock negl -0x14(%rbp) pushf pop %rax %rax will contain the wrong value because the "lock neg" calculates the wrong eflags. Simple test: #include <assert.h> int main() { __volatile__ unsigned test = 0x2363a; __volatile__ char cond = 0; asm( "lock negl %0 \n\t" "sets %1" : "=m"(test), "=r"(cond)); assert(cond & 1); return 0; } Reported-by: Jinyang Shen <shenjinyang@loongson.cn> Co-Developed-by: Xuehai Chen <chenxuehai@loongson.cn> Signed-off-by: Xuehai Chen <chenxuehai@loongson.cn> Signed-off-by: Qi Hu <huqi@loongson.cn> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-31 09:46:34 +01:00
Richard Henderson	434382e640	target/i386: Convert to tcg_ops restore_state_to_opc Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2022-10-26 11:11:28 +10:00
Paolo Bonzini	2872b0f390	target/i386: implement FMA instructions The only issue with FMA instructions is that there are _a lot_ of them (30 opcodes, each of which comes in up to 4 versions depending on VEX.W and VEX.L; a total of 96 possibilities). However, they can be implement with only 6 helpers, two for scalar operations and four for packed operations. (Scalar versions do not do any merging; they only affect the bottom 32 or 64 bits of the output operand. Therefore, there is no separate XMM and YMM of the scalar helpers). First, we can reduce the number of helpers to one third by passing four operands (one output and three inputs); the reordering of which operands go to the multiply and which go to the add is done in emit.c. Second, the different instructions also dispatch to the same softfloat function, so the flags for float32_muladd and float64_muladd are passed in the helper as int arguments, with a little extra complication to handle FMADDSUB and FMSUBADD. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-22 09:05:54 +02:00
Paolo Bonzini	653fad2497	target/i386: remove old SSE decoder With all SSE (and AVX!) instructions now implemented in disas_insn_new, it's possible to remove gen_sse, as well as the helpers for instructions that now use gvec. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-18 13:58:05 +02:00
Paolo Bonzini	71a0891d61	target/i386: move 3DNow to the new decoder This adds another kind of weirdness when you thought you had seen it all: an opcode byte that comes _after_ the address, not before. It's not worth adding a new X86_SPECIAL_* constant for it, but it's actually not unlike VCMP; so, forgive me for exploiting the similarity and just deciding to dispatch to the right gen_helper_* call in a single code generation function. In fact, the old decoder had a bug where s->rip_offset should have been set to 1 for 3DNow! instructions, and it's fixed now. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-18 13:58:05 +02:00
Paolo Bonzini	f8d19eec0d	target/i386: reimplement 0x0f 0x28-0x2f, add AVX Here the code is a bit uglier due to the truncation and extension of registers to and from 32-bit. There is also a mistake in the manual with respect to the size of the memory operand of CVTPS2PI and CVTTPS2PI, reported by Ricky Zhou. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-18 13:58:05 +02:00
Paolo Bonzini	7170a17ec3	target/i386: reimplement 0x0f 0x10-0x17, add AVX These are mostly moves, and yet are a total pain. The main issue is that: 1) some instructions are selected by mod==11 (register operand) vs. mod=00/01/10 (memory operand) 2) stores to memory are two-operand operations, while the 3-register and load-from-memory versions operate on the entire contents of the destination; this makes it easier to separate the gen_* function for the store case 3) it's inefficient to load into xmm_T0 only to move the value out again, so the gen_* function for the load case is separated too The manual also has various mistakes in the operands here, for example the store case of MOVHPS operates on a 128-bit source (albeit discarding the bottom 64 bits) and therefore should be Mq,Vdq rather than Mq,Vq. Likewise for the destination and source of MOVHLPS. VUNPCK?PS and VUNPCK?PD are the same as VUNPCK?DQ and VUNPCK?QDQ, but encoded as prefixes rather than separate operands. The helpers can be reused however. For MOVSLDUP, MOVSHDUP and MOVDDUP I chose to reimplement them as helpers. I named the helper for MOVDDUP "movdldup" in preparation for possible future introduction of MOVDHDUP and to clarify the similarity with MOVSLDUP. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-18 13:58:05 +02:00
Paolo Bonzini	aba2b8ecb9	target/i386: reimplement 0x0f 0xc2, 0xc4-0xc6, add AVX Nothing special going on here, for once. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-18 13:58:05 +02:00
Paolo Bonzini	16fc5726a6	target/i386: reimplement 0x0f 0x38, add AVX There are several special cases here: 1) extending moves have different widths for the helpers vs. for the memory loads, and the width for memory loads depends on VEX.L too. This is represented by X86_SPECIAL_AVXExtMov. 2) some instructions, such as variable-width shifts, select the vector element size via REX.W. 3) VSIB instructions (VGATHERxPy, VPGATHERxy) are also part of this group, and they have (among other things) two output operands. 3) the macros for 4-operand blends (which are under 0x0f 0x3a) have to be extended to support 2-operand blends. The 2-operand variant actually came a few years earlier, but it is clearer to implement them in the opposite order. X86_TYPE_WM, introduced earlier for unaligned loads, is reused for helpers that accept a Reg* but have a M argument. These three-byte opcodes also include AVX new instructions, for which the helpers were originally implemented by Paul Brook <paul@nowt.org>. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-18 13:58:05 +02:00
Paolo Bonzini	7906847768	target/i386: reimplement 0x0f 0x3a, add AVX The more complicated operations here are insertions and extractions. Otherwise, there are just more entries than usual because the PS/PD/SS/SD variations are encoded in the opcode rater than in the prefixes. These three-byte opcodes also include AVX new instructions, whose implementation in the helpers was originally done by Paul Brook <paul@nowt.org>. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-18 13:58:05 +02:00

1 2 3 4

189 Commits