As seen with ubuntu-5.10-live-powerpc.iso.
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Pre-allocating 512 of them per TB is a waste.
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This is less about improved type checking than enabling a
subsequent change to the representation of labels.
Acked-by: Claudio Fontana <claudio.fontana@huawei.com>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This is improved type checking for the translators -- it's no longer
possible to accidentally swap arguments to the branch functions.
Note that the code generating backends still manipulate labels as int.
With notable exceptions, the scope of the change is just a few lines
for each target, so it's not worth building extra machinery to do this
change in per-target increments.
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Cc: Michael Walle <michael@walle.cc>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Anthony Green <green@moxielogic.com>
Cc: Jia Liu <proljc@gmail.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Pre-allocating 640 of them per TB is a waste.
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
We no longer need INDEX_op_end to terminate the list, nor do we
need 5 forms of nop, since we just remove the TCGOp instead.
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Rather reserving space in the op stream for optimization,
let the optimizer add ops as necessary.
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
With the linked list scheme we need not leave nops in the stream
that we need to process later.
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The previous setup required ops and args to be completely sequential,
and was error prone when it came to both iteration and optimization.
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The method by which we count the number of ops emitted
is going to change. Abstract that away into some inlines.
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Almost completely eliminates the ifdefs in this file, improving
confidence in the lesser used 32-bit builds.
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Some of these functions are really quite large. We have a number of
things that ought to be circularly dependent, but we duplicated code
to break that chain for the inlines.
This saved 25% of the code size of one of the translators I examined.
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Currently 'info jit' outputs half of the information to monitor and the
rest to qemu log. Dumping opcode counts to monitor as a part of 'info
jit' command doesn't sound useful. Add new monitor command 'info
opcount' that only dumps opcode counters.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Commit 9d8bf2d1 moved the softmmu slow path out of line and introduce a
regression at the same time by always calling tcg_out_tlb_load with
is_load=1. This makes impossible to run any significant code under
qemu-system-mips*.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Instead of using structures, which imply some amount of overhead
on certain ABIs, use pointer types.
This actually reduces the size of the binaries vs a NON-debug
build on ppc64 and x86_64, due to a reduction in the number of
sign-extension insns.
Signed-off-by: Richard Henderson <rth@twiddle.net>
The "old" qemu_ld opcode did not specify the size of the result,
and so we had to assume full register width. With the new opcodes,
we can narrow the result.
Signed-off-by: Richard Henderson <rth@twiddle.net>
The pre-v9 ADDX/SUBX insns were renamed ADDC/SUBC for v9.
Standardizing on the v9 name makes things less confusing.
Signed-off-by: Richard Henderson <rth@twiddle.net>
On T4 and newer Sparc chips we have an add-with-carry insn
that takes its input from %xcc instead of %icc.
Signed-off-by: Richard Henderson <rth@twiddle.net>
fopen() may fail and it does not check its return vaule here,
it is better to dump op count to the normal log file.
Signed-off-by: Li Liu <john.liuli@huawei.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Add back in the support for 64-bit PPC MacOSX hosts that was
broken in the recent merge of the 32-bit and 64-bit TCG backends.
Reported-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tested-by: Andreas Färber <andreas.faerber@web.de>
With rt != r0 on loads, we use rt for scratch. If we need an index
register different from base, we can't use rt, but r0 is usable.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1403843160-30332-1-git-send-email-rth@twiddle.net
Tested-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The tcg_out* and tcg_patch* functions are utility routines that may or
may not be used by a particular backend; mark them with the 'unused'
attribute to suppress spurious warnings if they aren't used.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This can significantly reduce code size for generation of (some)
64-bit constants. With the side effect that we know for a fact
that exit_tb can use the register to good effect.
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
As a "utility", it only supported ppc, and in a way that other
tcg backends provided directly in tcg-target.h. Removing this
disparity is easier now that the two ppc backends are merged.
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The other tcg backends that support 32- and 64-bit modes
use the 32-bit name for the port. Follow suit.
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Good enough to run some instructions before things go awry.
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Now passes tcg_add_target_add_op_defs assertions, but
not complete enough to function.
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Just enough to compile, assuming you edit config-host.mak manually.
It will still abort at runtime, due to missing brcond2, setcond2, mulu2.
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The new ELFv2 ABI, used by default on powerpc64le-linux hosts,
introduced some changes that are incompatible with code currently
generated by the ppc64 TGC target. In particular, we no longer
use function descriptors.
This patch adds support for the ELFv2 ABI in the ppc64 TGC
function call and function prologue sequences.
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The correct test uses the _CALL_AIX macro, not a host-specific macro.
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The calling convention reserves space for the 8 register parameters on
the stack, so using only 6*8=48 as the offset was wrong. We never saw
this bug because we don't have any helpers with more than 5 parameters.
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
These values are private to tcg.c; we don't need to expose
this nonsense to the translators.
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Rather than using tcg_out32 and opcodes directly. This allows us
to remove LD_ADDR and CMP_L macros.
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
In order to be able to use tcg_out_ld/st sensibly with scratch
registers, assert only when we'd incorrectly clobber a scratch.
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Code movement only. This will allow us to make use of the
other tcg_out_* functions in tidying their implementations.
Tested-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
With the "old" ldst ops we didn't know the real width of the
result of the load, but with the "new" ldst ops we do.
Signed-off-by: Richard Henderson <rth@twiddle.net>
* remotes/bonzini/softmmu-smap: (33 commits)
target-i386: cleanup x86_cpu_get_phys_page_debug
target-i386: fix protection bits in the TLB for SMEP
target-i386: support long addresses for 4MB pages (PSE-36)
target-i386: raise page fault for reserved bits in large pages
target-i386: unify reserved bits and NX bit check
target-i386: simplify pte/vaddr calculation
target-i386: raise page fault for reserved physical address bits
target-i386: test reserved PS bit on PML4Es
target-i386: set correct error code for reserved bit access
target-i386: introduce support for 1 GB pages
target-i386: introduce do_check_protect label
target-i386: tweak handling of PG_NX_MASK
target-i386: commonize checks for PAE and non-PAE
target-i386: commonize checks for 4MB and 4KB pages
target-i386: commonize checks for 2MB and 4KB pages
target-i386: fix coding standards in x86_cpu_handle_mmu_fault
target-i386: simplify SMAP handling in MMU_KSMAP_IDX
target-i386: fix kernel accesses with SMAP and CPL = 3
target-i386: move check_io helpers to seg_helper.c
target-i386: rename KSMAP to KNOSMAP
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Unify pieces of cpu-all.h, exec-all.h, softmmu_exec.h and tcg/tcg.h
into a single new header file with all helpers.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We expose a generic helper "tcg_gen_extr_i64_tl" for 64bit targets, but the
same function for 32bit targets is a misnomer and refers to an invalid function
name.
Fix up the definition to point to the correct internal helper names instead.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Since all backends have been converted, remove the compatibility code.
Acked-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The first non-register argument isn't placed at offset 0.
Cc: qemu-stable@nongnu.org
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
For a 64-bit host, the high bits of a register after a 32-bit operation
are undefined. Adjust the temps mask for all 32-bit ops to reflect that.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Adjust the FDE to point to the code_buffer after we've copied it
to the image, rather than requiring that the backend set it prior.
This allows the backend to use read-only storage for its data.
Signed-off-by: Richard Henderson <rth@twiddle.net>
This will let us find all the info from the hash table.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Rather than special casing them, use the standard mechanisms
for tcg helper generation.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Rather than include helper.h with N values of GEN_HELPER, include a
secondary file that sets up the macros to include helper.h. This
minimizes the files that must be rebuilt when changing the macros
for file N.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
If either the high or low pair can be resolved, we can
simplify to either a constant or to a 32-bit comparison.
Signed-off-by: Richard Henderson <rth@twiddle.net>
* remotes/rth/tcg-mips: (24 commits)
tcg-mips: Enable direct chaining of TBs
tcg-mips: Simplify movcond
tcg-mips: Simplify brcond2
tcg-mips: Improve setcond eq/ne vs zeros
tcg-mips: Simplify setcond2
tcg-mips: Simplify brcond
tcg-mips: Simplify setcond
tcg-mips: Commonize opcode implementations
tcg-mips: Improve add2/sub2
tcg-mips: Hoist args loads
tcg-mips: Fix subtract immediate range
tcg-mips: Name the opcode enumeration
tcg-mips: Use EXT for AND on mips32r2
tcg-mips: Use T9 for TCG_TMP1
tcg-mips: Introduce TCG_TMP0, TCG_TMP1
tcg-mips: Rearrange register allocation
tcg-mips: Convert to new_ldst
tcg-mips: Convert to new qemu_l/st helpers
tcg-mips: Move softmmu slow path out of line
tcg-mips: Split large ldst offsets
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Now that the code_gen_buffer is constrained to not cross 256mb
regions, we are assured that we can use J to reach another TB.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use the same table to fold comparisons as with setcond.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Emitting a single branch instead of (up to) 3, using setcond2
to generate the composite compare.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The original code results in one too many insns per zero
present in the input. And since comparing 64-bit numbers
vs zero is common...
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Using tcg_unsigned_cond and tcg_high_cond.
Also, move the function up in the file for future cleanups.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use the same table to fold comparisons as with setcond.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use a table to fold comparisons to less-than.
Also, move the function up in the file for futher simplifications.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Most opcodes fall in to one of a couple of patterns.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Since we must use ADDUI, we would generate incorrect code for -32768.
Leaving off subtract of +32768 makes things easier for a follow-on patch.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
At the same time, tidy deposit by introducing tcg_out_opc_bf.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
T0 is an argument register for the n32 and n64 abis. T9 is the call
address register for the abis, and is more directly under the control
of the backend.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use these instead of hard-coding the registers to use for temporaries.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use FP (also known as S8) as a normal call-saved register.
Include T0 in the allocation order and call-clobbered list
even though it's currently used as a TCG temporary.
Put the argument registers at the end of the allocation order.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
In addition, fill delay slots calling the helpers and tail
call to the store helpers.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
At the same time, tidy up the call helpers, avoiding a memory reference.
Split out several subroutines. Use TCGMemOp constants. Make endianness
selectable at runtime.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
For userland builds calls will normally be in range,
and for the exit_tb opcode the branch to the epilogue.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Broken since dddbb2e1e3.
Do all the rest of the things that tcg_out_op did before
and after the big switch statement.
Signed-off-by: Richard Henderson <rth@twiddle.net>