Commit 9d8bf2d1 moved the softmmu slow path out of line and introduce a
regression at the same time by always calling tcg_out_tlb_load with
is_load=1. This makes impossible to run any significant code under
qemu-system-mips*.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Since all backends have been converted, remove the compatibility code.
Acked-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Now that the code_gen_buffer is constrained to not cross 256mb
regions, we are assured that we can use J to reach another TB.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use the same table to fold comparisons as with setcond.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emitting a single branch instead of (up to) 3, using setcond2
to generate the composite compare.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The original code results in one too many insns per zero
present in the input. And since comparing 64-bit numbers
vs zero is common...
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Using tcg_unsigned_cond and tcg_high_cond.
Also, move the function up in the file for future cleanups.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use the same table to fold comparisons as with setcond.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use a table to fold comparisons to less-than.
Also, move the function up in the file for futher simplifications.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Most opcodes fall in to one of a couple of patterns.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Since we must use ADDUI, we would generate incorrect code for -32768.
Leaving off subtract of +32768 makes things easier for a follow-on patch.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
At the same time, tidy deposit by introducing tcg_out_opc_bf.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
T0 is an argument register for the n32 and n64 abis. T9 is the call
address register for the abis, and is more directly under the control
of the backend.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use these instead of hard-coding the registers to use for temporaries.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use FP (also known as S8) as a normal call-saved register.
Include T0 in the allocation order and call-clobbered list
even though it's currently used as a TCG temporary.
Put the argument registers at the end of the allocation order.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
In addition, fill delay slots calling the helpers and tail
call to the store helpers.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
At the same time, tidy up the call helpers, avoiding a memory reference.
Split out several subroutines. Use TCGMemOp constants. Make endianness
selectable at runtime.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
For userland builds calls will normally be in range,
and for the exit_tb opcode the branch to the epilogue.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The INDEX_op_call case has just been obsoleted; the mov and movi
cases have not been reachable for years. Attempt to document this
both in each tcg_out_op switch, and via TCG_OPF_NOT_PRESENT.
Because of the TCG_OPF_NOT_PRESENT change, this must be done for
all targets in a single commit.
Signed-off-by: Richard Henderson <rth@twiddle.net>
And use tcg pointer differencing functions as appropriate.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Instead require either mulu2_i32 or muluh_i32. The code in tcg-op.h
already supports looking for both. Previous incomplete conversion?
Signed-off-by: Richard Henderson <rth@twiddle.net>
Most 64-bit targets need to be able to ignore the high bits
of a TCG_TYPE_I32 value.
Suggested-by: Stuart Brady <sdb@zubnet.me.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Step two in the transition, adding the new ldst opcodes. Keep the old
opcodes around until all backends support the new opcodes.
Signed-off-by: Richard Henderson <rth@twiddle.net>
This is a no-op backend data implementation, for those targets that
are not currently using the load/store optimization path.
This is prepatory to always requiring these functions in all backends.
Signed-off-by: Richard Henderson <rth@twiddle.net>
* 'tcg-next' of git://github.com/rth7680/qemu: (29 commits)
tcg-i386: Make use of zero-extended memory helper routines
tcg: Introduce zero and sign-extended versions of load helpers
exec: Split softmmu_defs.h
target: Include softmmu_exec.h where forgotten
exec: Rename USUFFIX to LSUFFIX
tcg-i386: Don't perform GETPC adjustment in TCG code
exec: Reorganize the GETRA/GETPC macros
configure: Allow x32 as a host
tcg-i386: Adjust tcg_out_tlb_load for x32
tcg-i386: Use intptr_t appropriately
tcg: Fix jit debug for x32
tcg: Use appropriate types in tcg_reg_alloc_call
tcg: Change tcg_out_ld/st offset to intptr_t
tcg: Change tcg_gen_exit_tb argument to uintptr_t
tcg: Use uintptr_t in TCGHelperInfo
tcg: Change relocation offsets to intptr_t
tcg: Change memory offsets to intptr_t
tcg: Change frame pointer offsets to intptr_t
tcg: Define TCG_ptr properly
tcg: Define TCG_TYPE_PTR properly
...
On MIPS ext8s and ext16s ops are implemented with a dedicated
instruction only on MIPS32R2, otherwise the same kind of implementation
than at TCG level (shift left followed by shift right) is used.
Change that by only implementing the ext8s and ext16s ops on MIPS32R2 so
that optimizations can be done by the optimizer. Use an inline version to
avoid having to test again for MIPS32R2 instructions. Keep the shift
implementation for the ld/st routines.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use an inline version for the bswap16 and bswap32 ops to avoid
testing for MIPS32R2 instructions availability, as these ops are
only available in that case.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Now that TCG supports enabling and disabling ops at runtime, it's
possible to detect the available host instructions at runtime, and
enable the corresponding ops accordingly.
Unfortunately it's not easy to probe for available instructions on
MIPS, the information is partially available in /proc/cpuinfo, and
not available in AUXV. This patch therefore probes for the instructions
by trying to execute them and by catching a possible SIGILL signal.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The _cmmu helpers can be moved to exec-all.h. The helpers that are
used from TCG will shortly need access to tcg_target_long so move
their declarations into tcg.h.
This requires minor include adjustments to all TCG backends.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
With the optimization in tcg_liveness_analysis,
we can avoid the MFLO when it is unused.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use them in places where mulu2 and muls2 are used.
Optimize mulx2 with dead low part to mulxh.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
tcg/mips/tcg-target.h defines various operations conditionally depending
upon the isa revision, however these operations are included in
mips_op_defs[] unconditionally resulting in the following runtime errors
if CONFIG_DEBUG_TCG is defined:
Invalid op definition for movcond_i32
Invalid op definition for rotl_i32
Invalid op definition for rotr_i32
Invalid op definition for deposit_i32
Invalid op definition for bswap16_i32
Invalid op definition for bswap32_i32
tcg/tcg.c:1196: tcg fatal error
Fix with ifdefs like the i386 backend does for movcond_i32.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
There are several hosts with only a "div" insn. Remainder is computed
manually from the quotient and inputs. We can do this generically.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
MIPS32 and later instruction sets have a multiplication instruction
directly operating on GPRs. It only produces a 32-bit result but
it is exactly what is needed by QEMU.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>