Use them in places where mulu2 and muls2 are used.
Optimize mulx2 with dead low part to mulxh.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
There are several hosts with only a "div" insn. Remainder is computed
manually from the quotient and inputs. We can do this generically.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
On 32-bit TCG targets, when emulating deposit_i64 with a mov_i32 +
deposit_i32, care should be taken to not overwrite the low part of
the second argument before the deposit when it is the same the
destination.
This fixes the shld instruction in qemu-system-x86_64, which in turns
fixes booting "system rescue CD version 2.8.0" on this target.
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
We're going to have use for this shortly in implementing other helpers.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The current helper flags, TCG_CALL_CONST and TCG_CALL_PURE might be
confusing and doesn't provide enough granularity for some helpers (FP
helpers for example).
This patch changes them into the following helpers flags:
- TCG_CALL_NO_READ_GLOBALS means that the helper does not read globals,
either directly or via an exception. They will not be saved to their
canonical location before calling the helper.
- TCG_CALL_NO_WRITE_GLOBALS means that the helper does not modify any
globals. They will only be saved to their canonical locations before
calling helpers, but they won't be reloaded afterwise.
- TCG_CALL_NO_SIDE_EFFECTS means that the call to the function is
removed if the return value is not used.
It provides convenience flags, to avoid helper definitions longer than
80 characters. It also provides compatibility flags, and updates the
documentation.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Like add2, do operand ordering, constant folding, and dead operand
elimination. The latter happens about 15% of all mulu2 during an
x86_64 bios boot.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
There are several cases that can be handled easier inside both
translators and code generators if we have out-of-band values
for conditions. It's easy enough to handle ALWAYS and NEVER in
the natural way inside the tcg middle-end.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
When movcond_i32 is available we can further reduce the generated
op count from 12 to 6, and the generated code size on i686 from
88 to 74 bytes.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Avoiding 64-bit arithmetic (outside of the compare) reduces the
generated op count from 15 to 12, and the generated code size on
i686 from 105 to 88 bytes.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Checking that we don't try for idx != [01] is trivial. Checking
that we don't issue more than one of any index requires a tad
more data and some ifdefs protecting that new variable.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Given these are constants, checking once here means everything
after can assume they're correct.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
For tcg_gen_concat_i32_i64 we only use deposit if the host supports it.
For tcg_gen_concat32_i64 even if the host does not, as we get identical
code before and after.
Note that this relies on the ANDI -> EXTU patch for the identity claim.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Note that xori_i64 failed to perform even the minimal
optimizations promised by the README.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Note that andi_i64 failed to perform even the minimal
optimizations promised by the README.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implemented with setcond if the target does not provide
the optional opcode.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Make tcg_const_ptr() include a cast so that you can pass it a
pointer. This allows us to drop the casts we had in all the places
that use this macro.
Acked-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
The error being caused by the failure to copy the other half of
the input to the output after having narrowed the deposit operation.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: malc <av1474@comtv.ru>
If the deposit replaces the entire word, optimize to a move.
If we're inserting to the top of the word, avoid the mask of arg2
as we'll be shifting out all of the garbage and shifting in zeros.
If the host is 32-bit, reduce a 64-bit deposit to a 32-bit deposit
when possible.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
x86 cannot provide an optimized generic deposit implementation. But at
least for a few special cases, namely for writing bits 0..7, 8..15, and
0..15, versions using only a single instruction are feasible.
Introducing such limited support improves emulating 16-bit x86 code on
x86, but also rarer cases where 32-bit or 64-bit code accesses bytes or
words.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
By always defining these symbols, we can eliminate a lot of ifdefs.
To allow this to be checked reliably, the semantics of the
TCG_TARGET_HAS_* macros must be changed from def/undef to true/false.
This allows even more ifdefs to be removed, converting them into
C if statements.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
When compiling with DEBUG_TCGV enabled, make the TCGv_ptr type distinct
from TCGv_i32/TCGv_i64. This means that using an i32 or i64 TCG op to
manipulate a TCGv_ptr will always be detected at compile time, rather
than only if compiling on a host system with the other word size.
NB: the tcg_add_ptr and tcg_sub_ptr macros have been removed as they
were not used anywhere.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The prototypes for the ld/st functions on a 64 bit host declared
the address parameter as a TCGv_i64 rather than a TCGv_ptr. This
worked OK (since the two are aliases), but needs to be fixed to
allow extension of TCG type debugging to i64/i32/ptr mismatches.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The arguments to tcg_gen_helper32 for these functions were not
updated correctly in rev 2bece2c883.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
Some hosts (amd64, ia64) have an ABI that ignores the high bits
of the 64-bit register when passing 32-bit arguments. Others
require the value to be properly sign-extended for the type.
I.e. "int32_t" must be sign-extended and "uint32_t" must be
zero-extended to 64-bits.
To effect this, extend the "sizemask" parameter to tcg_gen_callN
to include the signedness of the type of each parameter. If the
tcg target requires it, extend each 32-bit argument into a 64-bit
temp and pass that to the function call.
This ABI feature is required by sparc64, ppc64 and s390x.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Some targets (e.g. Alpha and MIPS64) need to keep 32-bit operands
sign-extended in 64-bit registers (regardless of the "real" sign
of the operand). For that, we need to be able to distinguish
between a 32-bit load with a 32-bit result and a 32-bit load with
a given extension to a 64-bit result. This distinction already
exists for the ld* loads, but not the qemu_ld* loads.
Reserve qemu_ld32u for 64-bit outputs and introduce qemu_ld32 for
32-bit outputs. Adjust all code generators to match.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The TCGType name was already used consistently. Changing it
to an enumeration instead of a set of defines aids debugging.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use the TCGCond enumeration type in the brcond and setcond
related prototypes in tcg-op.h and each code generator.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Give the enumeration formed from tcg-opc.h a name: TCGOpcode.
Use that enumeration type instead of "int" whereever appropriate.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
TCG internal helpers only access to the values passed in arguments, and
do not modify the CPU internal state. Thus they can be declared as
const and pure.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Some targets like ARM would benefit to use 32-bit helpers for
div/rem/divu/remu.
Create a #define for div2 so that targets can select between
div, div2 and helper implementation. Use the helper version if none
of the #define are present.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Previously ORC was always implemented by tcg-op.h with
an explicit NOT opcode. Allow a target implementation.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Previously ANDC was always implemented by tcg-op.h with
an explicit NOT opcode. Allow a target implementation.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Defines setcond_{i32,i64} and setcond2_i32 for 64-on-32-bit.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Currently zero extensions ops are implemented by a and op with a
constant. This is then catched in some backend, and replaced by
a zero extension instruction. While this works well on RISC
machines, this adds a useless register move on non-RISC machines.
Example on x86:
ext16u_i32 r1, r2
is translated into
mov %eax,%ebx
movzwl %bx, %ebx
while the optimized version should be:
movzwl %ax, %ebx
This patch adds ext{8,16,32}u_i{32,64} TCG ops that can be
implemented in the backends to avoid emitting useless register
moves.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Rename bswap_i32 into bswap32_i32 and bswap_i64 into bswap64_i64
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6829 c046a42c-6fe2-441c-8c8c-71466251a162
Simplify nand/nor/eqv and move their optimizations to and/or/xor
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6805 c046a42c-6fe2-441c-8c8c-71466251a162
The introduction of TCGV_EQUAL and not op is slightly broken.
The definition of DEBUG_TCGV shows that.
Signed-off-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6802 c046a42c-6fe2-441c-8c8c-71466251a162
This patch removes useless type information in some calls to
tcg_temp_local_new. It also removes the parameter from the
macro declaration; if a target has to use a specific non-default
size then it should use tcg_temp_local_new_{i32,i64}.
Signed-off-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6146 c046a42c-6fe2-441c-8c8c-71466251a162
this patch removes some now unused things after dyngen removal.
1. dyngen-exec.h: op_param, op _jmp and some associated macros
are now unused;
2. Makefile.target: tcg-dyngen is not needed anymore
2. tcg/tcg-op.h, tcg/tcg-opc.h: gen-op.h is dead
3. tcg.c:
- INDEX_op_end is now the first op
- CONFIG_DYNGEN_OP is never defined
4. tcg.h: dyngen_op not needed anymore
5. exec-all.h: remove some ASM macros.
Signed-off-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5922 c046a42c-6fe2-441c-8c8c-71466251a162
This fixes a few things after Paul's improvements for TCG debugging:
- change TCGv_i64 field name to something different from
TCGv_i32
- fix things in tcg that the above change made visible.
Signed-off-by: Laurent Desnogues <laurent.desnogues@gmail.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5919 c046a42c-6fe2-441c-8c8c-71466251a162
Use the same order as the _i32 version (pure code move). Suggested by
Laurent Laurent Desnogues.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5606 c046a42c-6fe2-441c-8c8c-71466251a162
Add a subfi (subtract from immediate) wrapper, useful for the PPC target.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5599 c046a42c-6fe2-441c-8c8c-71466251a162
Move addi_i64, muli_i64 and subi_i64 out of #if TCG_TARGET_REG_BITS
as both implementations are strictly identical. Use the same
optimisation (ie when imm == 0) for addi_i64 and subi_64 than the
32-bit version.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5598 c046a42c-6fe2-441c-8c8c-71466251a162