Using only indirect calls results in 3 bundles (one to load the
descriptor address), and 4 stop bits. By looking through the
descriptor to the constants, we can perform the call with 2
bundles and only 1 stop bit.
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
There's no need to go through the full opcode-to-insn function call
to generate nops. This makes the source a bit more readable.
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Step two in the transition, adding the new ldst opcodes. Keep the old
opcodes around until all backends support the new opcodes.
Signed-off-by: Richard Henderson <rth@twiddle.net>
This is a no-op backend data implementation, for those targets that
are not currently using the load/store optimization path.
This is prepatory to always requiring these functions in all backends.
Signed-off-by: Richard Henderson <rth@twiddle.net>
The _cmmu helpers can be moved to exec-all.h. The helpers that are
used from TCG will shortly need access to tcg_target_long so move
their declarations into tcg.h.
This requires minor include adjustments to all TCG backends.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use them in places where mulu2 and muls2 are used.
Optimize mulx2 with dead low part to mulxh.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
There are several hosts with only a "div" insn. Remainder is computed
manually from the quotient and inputs. We can do this generically.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Note that in the general reg=reg,reg case we're restricted
to 16-bit insertions. This makes it easy to allow "any"
constant as input, as post-truncation it will fit into the
constant load insn for which we have room in the bundle.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
It is possible to slightly optimize the TLB access code, by replacing
the movi + and instructions by a deposit instruction.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Remove suboptimal register shifting in qemu_ld/st ops, introduced at the
CONFIG_TCG_PASS_AREG0 time.
As mem_idx is now loaded in register R58/R59 for the slow path, we have
to make sure to do it last, to not add additional register constraints.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implement movcond_i32/64 on ia64 hosts. It is not possible to have
immediate compare arguments without adding a new bundle, but it is
possible to have 22-bit immediate value arguments.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use stack instead of temp_buf array in CPUState for TCG temps.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
GUEST_BASE support is now supported by all TCG backends, and is
now mandatory. Drop the now-pointless TCG_TARGET_HAS_GUEST_BASE
define (set by every backend) and the error if it is unset.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The TCG jmp operation doesn't really make sense in the QEMU context, it
is unused, it is not implemented by some targets, and it is wrongly
implemented by some others.
This patch simply removes it.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: Blue Swirl <blauwirbel@gmail.com>
Acked-by: Stefan Weil<sw@weilnetz.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The TCG targets no longer need individual implementations.
Since commit 6a18ae2d29,
'flags' is no longer used in tcg_target_get_call_iarg_regs_count.
The remaining tcg_target_get_call_iarg_regs_count is trivial and only
called once. Therefore the patch eliminates it completely.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implemented with setcond if the target does not provide
the optional opcode.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Now that CONFIG_TCG_PASS_AREG0 is enabled for all targets,
remove dead code and support for !CONFIG_TCG_PASS_AREG0 case.
Remove dyngen-exec.h and all references to it. Although included by
hw/spapr_hcall.c, it does not seem to use it.
Remove unused HELPER_CFLAGS.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Store slow path has been broken in e141ab52d:
- the arguments are shifted before the last one (mem_index) is written.
- the shift is done for both slow and fast paths.
Fix that. Also optimize a bit by bundling the move together. This still
can be optimized, but it's better to wait for a decision to be taken on
the arguments order.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Optionally, make memory access helpers take a parameter for CPUState
instead of relying on global env.
On most targets, perform simple moves to reorder registers. On i386,
switch from regparm(3) calling convention to standard stack-based
version.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Scripted conversion:
for file in *.[hc] hw/*.[hc] hw/kvm/*.[hc] linux-user/*.[hc] linux-user/m68k/*.[hc] bsd-user/*.[hc] darwin-user/*.[hc] tcg/*/*.[hc] target-*/cpu.h; do
sed -i "s/CPUState/CPUArchState/g" $file
done
All occurrences of CPUArchState are expected to be replaced by QOM CPUState,
once all targets are QOM'ified and common fields have been extracted.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
The TCG targets i386 and tci needed a change of the function
prototype for w64.
This change is currently not needed for the other TCG targets,
but it can be applied to avoid code differences.
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Including tcg_out_ld, tcg_out_st, tcg_out_mov, tcg_out_movi.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Most targets did not name the enum; tci used TCGRegister.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Remove the unused function tcg_out_addi() from the ia64 TCG backend;
this brings it into line with other backends.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
It is now declared for all tcg targets in tcg.h,
so the tcg target specific declarations are redundant.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The second register is never used for ia64 hosts.
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
By always defining these symbols, we can eliminate a lot of ifdefs.
To allow this to be checked reliably, the semantics of the
TCG_TARGET_HAS_* macros must be changed from def/undef to true/false.
This allows even more ifdefs to be removed, converting them into
C if statements.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Make functions take a parameter for CPUState instead of relying
on global env. Pass CPUState pointer to TCG prologue, which moves
it to AREG0.
Thanks to Peter Maydell and Laurent Desnogues for the ARM prologue
change.
Revert the hacks to avoid AREG0 use on Sparc hosts.
Move cpu_has_work() and cpu_pc_from_tb() from exec.h to cpu.h.
Compile the file without HELPER_CFLAGS.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Add a comment about cache coherency and retranslation, so that people
developping new targets based on existing ones are warned of the issue.
Acked-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The usermode version of qemu_ld doesn't used mem_index,
leading to set-but-not-used warnings.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
A typo in the usermode address calculation path; R3 used where R2 needed.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
Use ld4 not ld8 for reading the tlb of 32-bit targets.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
The port was not properly merged following
86feb1c860
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
Fix compilation error when GUEST_BASE is not defined.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
Both tcg_target_init and tcg_target_qemu_prologue
are unused outside of tcg.c.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Mirror tcg_out_movi in having a TYPE parameter. This allows x86_64
to perform the move at the proper width, which may elide a REX prefix.
Introduce a TCG_TYPE_REG enumerator to represent the "native width"
of the host register, and to distinguish the usage from "pointer data"
as represented by the existing TCG_TYPE_PTR.
Update all targets to match.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(1) The output registers were not marked call-clobbered, even though
they can be modified by called functions.
(2) The thread pointer was not marked reserved.
(3) R4-R6 are call-saved, but not saved by the prologue. Rather than
save them, mark them reserved so that we don't use them.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
A few words about design choices:
* On IA64, instructions should be grouped by bundle, and dependencies
between instructions declared. A first version of this code tried to
schedule instructions automatically, but was very complex and too
invasive for the current common TCG code (ops not ending at
instruction boundaries, code retranslation breaking already generated
code, etc.) It was also not very efficient, as dependencies between
TCG ops is not available.
Instead the option taken by the current implementation does not try
to fill the bundle by scheduling instructions, but by providing ops
not available as an ia64 instruction, and by offering 22-bit constant
loading for most of the instructions. With both options the bundle are
filled at approximately the same level.
* Up to 128 registers can be affected to a function on IA64, but TCG
limits this number to 64, which is actually more than enough. The
register affectation is the following:
- r0: used to map a constant argument with value 0
- r1: global pointer
- r2, r3: internal use
- r4 to r6: not used to avoid saving them
- r7: env structure
- r8 to r11: free for TCG (call clobbered)
- r12: stack pointer
- r13: thread pointer
- r14 to r31: free for TCG (call clobbered)
- r32: reserved (return address)
- r33: reserved (PFS)
- r33 to r63: free for TCG
* The IA64 architecture has only 64-bit registers and no 32-bit
instructions (the only exception being cmp4). Therefore 64-bit
registers and instructions are used for 32-bit ops. The adopted
strategy is the same as the ABI, that is the higher 32 bits are
undefined. Most ops (and, or, add, shl, etc.) can directly use
the 64-bit registers, while some others have to sign-extend (sar,
div, etc.) or zero-extend (shr, divu, etc.) the register first.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>