1539 Commits

Author SHA1 Message Date
Richard Henderson
880ad9626c tcg/arm: Use constant pool for movi
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
2a8ab93c6b tcg/arm: Extract INSN_NOP
We'll want this for tcg_out_nop_fill.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
1507061637 tcg/arm: Code rearrangement
Move constants before all of the functions.
Move tcg_out_<format> functions before all
of the others.  No functional change.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
95ede84f4d tcg/arm: Tighten tlb indexing offset test
We are not going to use ldrd for loading the comparator
for 32-bit guests, so don't limit cmp_off to 8 bits then.
This eliminates one insn in the tlb load for some guests.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
647ab96aaf tcg/arm: Improve tlb load for armv7
Use UBFX to avoid limitation on CPU_TLB_BITS.  Since we're dropping
the initial shift, we need to replace the page masking.  We can use
MOVW+BIC to do this without shifting.  The result is the same size
as the armv6 path with one less conditional instruction.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
e9823b4c33 tcg/sparc: Use constant pool for movi
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
ab20bdc116 tcg/sparc: Introduce TCG_REG_TB
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
55129955e9 tcg/aarch64: Use constant pool for movi
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
a534bb15f3 tcg/s390: Use constant pool for cmpi
Also use CHI/CGHI for 16-bit signed constants.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
5bf67a9217 tcg/s390: Use constant pool for xori
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
4046d9ca04 tcg/s390: Use constant pool for ori
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
bdcd5d1926 tcg/s390: Use constant pool for andi
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
28eef8aaec tcg/s390: Use constant pool for movi
Split out maybe_out_small_movi for use with other operations
that want to add to the constant pool.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
e692a3492d tcg/s390: Fix sign of patch_reloc addend
We were passing in -2 instead of +2, but then ignoring
the actual contents of addend in the calculation.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
829e1376d9 tcg/s390: Introduce TCG_REG_TB
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
4e45f23943 tcg/i386: Store out-of-range call targets in constant pool
Already it saves 2 bytes per call, but also the constant pool
entry may well be shared across multiple calls.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
57a269469d tcg: Infrastructure for managing constant pools
A new shared header tcg-pool.inc.c adds new_pool_label,
for registering a tcg_target_ulong to be emitted after
the generated code, plus relocation data to install a
pointer to the data.

A new pointer is added to the TCGContext, so that we
dump the constant pool as data, not code.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
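
A minimal sketch of the constant pool idea from 57a269469d, assuming a fixed-size table and omitting the relocation step entirely. All names here (ConstPool, pool_add, pool_emit, POOL_MAX) are hypothetical and are not the interface of tcg-pool.inc.c. It only illustrates that equal constants can share one slot and that the pool is written out as data after the generated code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not QEMU's tcg-pool.inc.c. */
enum { POOL_MAX = 64 };

typedef struct {
    uint64_t value[POOL_MAX];   /* constants to be emitted after the code */
    size_t count;               /* number of slots in use */
} ConstPool;

/* Register a constant and return its slot; equal values share one slot. */
static size_t pool_add(ConstPool *p, uint64_t v)
{
    for (size_t i = 0; i < p->count; i++) {
        if (p->value[i] == v) {
            return i;
        }
    }
    assert(p->count < POOL_MAX);
    p->value[p->count] = v;
    return p->count++;
}

/* Copy the pool into 'out' (standing in for the area after the code)
 * and return the number of bytes written. */
static size_t pool_emit(const ConstPool *p, uint64_t *out)
{
    for (size_t i = 0; i < p->count; i++) {
        out[i] = p->value[i];
    }
    return p->count * sizeof(uint64_t);
}
```

In a real backend each pool_add would also record where the referencing instruction lives, so that its displacement can be patched once the final pool address is known.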
Richard Henderson
659ef5cbb8 tcg: Rearrange ldst label tracking
Dispense with TCGBackendData, as it has never been used for more than
holding a single pointer.  Use a define in the cpu/tcg-target.h to
signal requirement for TCGLabelQemuLdst, so that we can drop the no-op
tcg-be-null.h stubs.  Rename tcg-be-ldst.h to tcg-ldst.inc.c.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:35 -07:00
Richard Henderson
a858339336 tcg: Move USE_DIRECT_JUMP discriminator to tcg/cpu/tcg-target.h
Replace the USE_DIRECT_JUMP ifdef with a TCG_TARGET_HAS_direct_jump
boolean test.  Replace the tb_set_jmp_target1 ifdef with an unconditional
function tb_target_set_jmp_target.

While we're touching all backends, add a parameter for tb->tc_ptr;
we're going to need it shortly for some backends.

Move tb_set_jmp_target and tb_add_jump from exec-all.h to cpu-exec.c.

This opens the possibility for TCG_TARGET_HAS_direct_jump to be
a runtime decision -- based on host cpu capabilities, the size of
code_gen_buffer, or a future debugging switch.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-07 11:57:34 -07:00
Richard Henderson
cda4a338c4 tcg/tci: Add TCG_TARGET_DEFAULT_MO
Missed being added as part of 71650df7b0.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-09-07 18:57:34 +01:00
Richard Henderson
4609190b5f tcg/s390: Use slbgr for setcond le and leu
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:24:46 -07:00
Richard Henderson
7af525af01 tcg/s390: Use load-on-condition-2 facility
This allows LOAD HALFWORD IMMEDIATE ON CONDITION,
eliminating one insn in some common cases.

Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:24:41 -07:00
Richard Henderson
c2097136ad tcg/s390: Use distinct-operands facility
This allows using a 3-operand insn form for some arithmetic,
logicals and shifts.

Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:24:38 -07:00
Richard Henderson
e42349cbd6 tcg/s390: Merge ori+xori facilities check to tcg_target_op_def
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:24:35 -07:00
Richard Henderson
ba18b07dc6 tcg/s390: Merge add2i facilities check to tcg_target_op_def
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:24:33 -07:00
Richard Henderson
a8f0269e9e tcg/s390: Merge muli facilities check to tcg_target_op_def
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:24:31 -07:00
Richard Henderson
07952d9570 tcg/s390: Merge cmpi facilities check to tcg_target_op_def
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:24:28 -07:00
Richard Henderson
9b5500b697 tcg/s390: Fully convert tcg_target_op_def
Use a switch instead of searching a table.

Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-09-06 07:22:24 -07:00
Pranith Kumar
b32dc3370a tcg: Implement implicit ordering semantics
Currently, we cannot use mttcg for running strong memory model guests
on weak memory model hosts due to missing ordering semantics.

We implicitly generate fence instructions for stronger guests if an
ordering mismatch is detected. We generate fences only for the orders
for which fence instructions are necessary, for example a fence is not
necessary between a store and a subsequent load on x86 since its
absence in the guest binary tells that ordering need not be
ensured. Also note that if we find multiple subsequent fence
instructions in the generated IR, we combine them in the TCG
optimization pass.

This patch allows us to boot an x86 guest on ARM64 hosts using mttcg.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170829063313.10237-4-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-09-05 13:41:46 -07:00
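
The fence selection described in b32dc3370a can be sketched as a bitmask computation. This is an illustration under assumptions: the enum names and bit values below are made up for the example, and the function is not taken from the patch. The idea is that a fence is needed only for orderings the guest guarantees and the host does not.

```c
#include <assert.h>

/* Illustrative ordering bits: "earlier access kind" _ "later access kind". */
enum {
    MO_LD_LD = 1,   /* load before later load */
    MO_LD_ST = 2,   /* load before later store */
    MO_ST_LD = 4,   /* store before later load */
    MO_ST_ST = 8,   /* store before later store */
    MO_ALL   = 15
};

/* Orderings that must be enforced with an explicit fence:
 * relevant to this access, guaranteed by the guest, not by the host. */
static unsigned fence_needed(unsigned access_orders,
                             unsigned guest_mo, unsigned host_mo)
{
    return access_orders & guest_mo & ~host_mo;
}
```

With an x86-like guest (everything except store-then-load) on a host that guarantees nothing, a load that must be ordered after earlier loads and stores yields only MO_LD_LD, which matches the commit's remark that no fence is needed between a store and a subsequent load.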
Pranith Kumar
71650df7b0 tcg: Add tcg target default memory ordering
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170829063313.10237-3-bobby.prani@gmail.com>
[rth: Dropped ia64 hunk]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-09-05 12:56:40 -07:00
Richard Henderson
a46c1244a0 tcg: Remove support for ia64 as host
We threatened to remove ia64 as host in v2.9.0.  Its time has now come.

There are still some usages of defined(__ia64__) throughout the source
code that would be triggered if one were to enable TCI on an ia64 host.
Leave those alone for now.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-09-05 12:39:25 -07:00
Richard Henderson
13aaef678e tcg: Increase minimum alignment from tcg_malloc to 8
For a 64-bit ILP32 host, aligning to sizeof(long) is not enough.
Guess the minimum for any host is 8, as that covers uint64_t.
Qemu doesn't use a host long double or host vectors, except in
extremely limited circumstances.

Fixes a bus error for a sparc v8plus host.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-08-03 11:00:30 -07:00
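
The alignment problem fixed by 13aaef678e can be shown with a small rounding helper. The function name is hypothetical; only the arithmetic matters. On a 64-bit ILP32 host sizeof(long) is 4, so rounding a 12-byte request to that alignment leaves the next allocation at an offset that is not a multiple of 8, which is unsuitable for uint64_t.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to a multiple of align; align must be a power of two. */
static size_t round_up_pow2(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}
```

For a 12-byte request, round_up_pow2(12, 4) stays at 12 while round_up_pow2(12, 8) gives 16, so only the second keeps subsequent allocations 8-byte aligned.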
Richard Henderson
ca671de8af tcg/arm: Fix runtime overalignment test
Patch 85aa80813d changed the IF emitting the TST instruction,
but failed to change the ?: converting CMP to CMPEQ, so the
result of the TST is ignored.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-08-03 10:56:44 -07:00
Philippe Mathieu-Daudé
b208ac07ea docs: fix broken paths to docs/devel/atomics.txt
With the move of some docs/ to docs/devel/ in ac06724a71,
a couple of references were not updated.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-07-31 13:12:47 +03:00
Richard Henderson
5dd8990841 util: Introduce include/qemu/cpuid.h
Clang 3.9 passes the CONFIG_AVX2_OPT configure test.  However, the
supplied <cpuid.h> does not contain the bit_AVX2 define that we use
when detecting whether the routine can be enabled.

Introduce a qemu-specific header that uses the compiler's definition
of __cpuid et al, but supplies any missing bit_* definitions needed.
This avoids introducing any extra ifdefs to util/bufferiszero.c, and
allows quite a few to be removed from tcg/i386/tcg-target.inc.c.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20170719044018.18063-1-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-24 12:42:55 +01:00
Philippe Mathieu-Daudé
797ed66d29 tcg/tci: enable bswap16_i64
Although correctly implemented, bswap16_i64() was never tested or executed,
so the safety TODO() statement was never removed.

Now that it has been tested, the TODO() can be removed.

The TODO() was hit while running Alex Bennée's image aarch64-linux-3.15rc2-buildroot.img:

Trace 0x7fa1904b0890 [0: ffffffc00036cd04]
----------------
IN:
0xffffffc00036cd24:  5ac00694      rev16 w20, w20

OP:
 ---- ffffffc00036cd24 0000000000000000 0000000000000000
 ext32u_i64 tmp3,x20
 ext16u_i64 tmp2,tmp3
 bswap16_i64 x20,tmp2
 movi_i64 tmp4,$0x10
 shr_i64 tmp2,tmp3,tmp4
 ext16u_i64 tmp2,tmp2
 bswap16_i64 tmp2,tmp2
 deposit_i64 x20,x20,tmp2,$0x10,$0x10

Linking TBs 0x7fa1904b0890 [ffffffc00036cd04] index 0 -> 0x7fa1904b0aa0 [ffffffc00036cd24]
Trace 0x7fa1904b0aa0 [0: ffffffc00036cd24]
TODO qemu/tci.c:1049: tcg_qemu_tb_exec()
qemu/tci.c:1049: tcg fatal error
Aborted

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jaroslaw Pelczar <j.pelczar@samsung.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20170718045540.16322-11-f4bug@amsat.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-07-19 14:45:16 -07:00
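
The op sequence in the trace of 797ed66d29 can be followed in plain C. This is a sketch with hypothetical helper names, assuming bswap16 receives a zero-extended 16-bit input as it does in the dump; it is not the TCI interpreter code. Each statement in rev16_w mirrors one op from the trace.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two low bytes; the input is assumed zero-extended to 16 bits. */
static uint64_t bswap16_u64(uint64_t x)
{
    return ((x & 0xff) << 8) | ((x >> 8) & 0xff);
}

/* Insert the low 'len' bits of 'field' into 'base' at bit 'pos' (len < 64). */
static uint64_t deposit_u64(uint64_t base, uint64_t field,
                            unsigned pos, unsigned len)
{
    uint64_t mask = ((UINT64_C(1) << len) - 1) << pos;
    return (base & ~mask) | ((field << pos) & mask);
}

/* rev16 w20, w20 expressed as the ops shown in the trace. */
static uint64_t rev16_w(uint64_t x20)
{
    uint64_t tmp3 = x20 & UINT64_C(0xffffffff);   /* ext32u_i64 tmp3,x20 */
    uint64_t tmp2 = tmp3 & 0xffff;                /* ext16u_i64 tmp2,tmp3 */
    x20 = bswap16_u64(tmp2);                      /* bswap16_i64 x20,tmp2 */
    tmp2 = tmp3 >> 16;                            /* shr_i64 tmp2,tmp3,$0x10 */
    tmp2 &= 0xffff;                               /* ext16u_i64 tmp2,tmp2 */
    tmp2 = bswap16_u64(tmp2);                     /* bswap16_i64 tmp2,tmp2 */
    return deposit_u64(x20, tmp2, 16, 16);        /* deposit_i64 ...,$0x10,$0x10 */
}
```

Each 16-bit half is byte-swapped independently, so an input of 0x11223344 becomes 0x22114433.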
Jiang Biao
4df9cac57f tcg/mips: reserve a register for the guest_base.
Reserve a register for the guest_base using ppc code for reference.
By doing so, we do not have to recompute it for every memory load.

Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1499677934-2249-1-git-send-email-jiang.biao2@zte.com.cn>
2017-07-19 14:45:15 -07:00
Lluís Vilanova
61a67f71dd exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
Every vCPU now uses a separate set of TBs for each set of dynamic
tracing event state values. Each set of TBs can be used by any number of
vCPUs to maximize TB reuse when vCPUs have the same tracing state.

This feature is later used by tracetool to optimize tracing of guest
code events.

The maximum number of TB sets is defined as 2^E, where E is the number
of events that have the 'vcpu' property (their state is stored in
CPUState->trace_dstate).

For this to work, a change on the dynamic tracing state of a vCPU will
force it to flush its virtual TB cache (which is only indexed by
address), and fall back to the physical TB cache (which now contains the
vCPU's dynamic tracing state as part of the hashing function).

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-id: 149915775266.6295.10060144081246467690.stgit@frigg.lan
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-07-17 13:11:05 +01:00
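
The 2^E bound in 61a67f71dd and the role of the tracing state in TB lookup can be sketched as follows. The struct and function names are hypothetical; the actual hashing in QEMU is not reproduced here. The point is only that two lookups for the same address match the same TB when their tracing state is also equal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Upper bound on TB sets: one per possible value of an E-bit state mask. */
static uint64_t max_tb_sets(unsigned num_vcpu_events)
{
    assert(num_vcpu_events < 64);
    return UINT64_C(1) << num_vcpu_events;
}

/* Hypothetical lookup key: the tracing state is part of the TB identity. */
typedef struct {
    uint64_t phys_pc;
    uint32_t trace_dstate;
} ToyTBKey;

static bool toy_tb_key_equal(ToyTBKey a, ToyTBKey b)
{
    return a.phys_pc == b.phys_pc && a.trace_dstate == b.trace_dstate;
}
```

Two vCPUs with the same trace_dstate produce equal keys and can share a TB, while a vCPU that toggles an event gets a different key for the same address.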
Jiang Biao
8b8d768f19 tcg/mips: Bugfix for crash when running program with qemu-i386.
When running a helloworld program with qemu-i386 in linux-user
mode on Loongson 3A3000, it crashes. This patch fixes the bug.

Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Message-Id: <1499669979-25904-1-git-send-email-jiang.biao2@zte.com.cn>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-07-09 21:11:38 -10:00
Pranith Kumar
2acee8b2b5 tcg/aarch64: Enable indirect jump path using LDR (literal)
This patch enables the indirect jump path using an LDR (literal)
instruction. It will be interesting to test and see which performs
better among the two paths.

CC: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170630143614.31059-3-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-07-09 21:10:23 -10:00
Pranith Kumar
b68686bd4b tcg/aarch64: Use ADRP+ADD to compute target address
We use ADRP+ADD to compute the target address for goto_tb. This patch
introduces the NOP instruction which is used to align the above
instruction pair so that we can use one atomic instruction to patch
the destination offsets.

CC: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170630143614.31059-2-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-07-09 21:10:23 -10:00
Pranith Kumar
23b7aa1d2a tcg/aarch64: Introduce and use long branch to register
We can use a branch to register instruction for exit_tb for offsets
greater than 128MB.

CC: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170630143614.31059-1-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-07-09 21:10:23 -10:00
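
The 128MB limit mentioned in 23b7aa1d2a follows from the AArch64 B instruction encoding, which holds a signed 26-bit offset counted in 4-byte words. A range check along these lines decides whether a direct branch is usable or a branch to register is needed. The function name is hypothetical and the check is a sketch, not the backend's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a byte offset is encodable in B/BL: a multiple of 4 within
 * the signed 26-bit word range, i.e. [-128 MiB, +128 MiB). */
static bool fits_direct_branch(int64_t byte_offset)
{
    const int64_t limit = INT64_C(1) << 27;   /* 2^26 words * 4 bytes */
    return (byte_offset & 3) == 0
        && byte_offset >= -limit
        && byte_offset < limit;
}
```

When this returns false, the target address has to be materialized in a register first and reached with an indirect branch.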
Paolo Bonzini
beeaef55e4 tcg: move tb_lock out of translate-all.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 16:01:16 +02:00
Peter Maydell
db7a99cdc1 Queued TCG patches
Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20170619' into staging

Queued TCG patches

# gpg: Signature made Mon 19 Jun 2017 19:12:06 BST
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* remotes/rth/tags/pull-tcg-20170619:
  target/arm: Exit after clearing aarch64 interrupt mask
  target/s390x: Exit after changing PSW mask
  target/alpha: Use tcg_gen_lookup_and_goto_ptr
  tcg: Increase hit rate of lookup_tb_ptr
  tcg/arm: Use ldr (literal) for goto_tb
  tcg/arm: Try pc-relative addresses for movi
  tcg/arm: Remove limit on code buffer size
  tcg/arm: Use indirect branch for goto_tb
  tcg/aarch64: Use ADR in tcg_out_movi
  translate-all: consolidate tb init in tb_gen_code
  tcg: allocate TB structs before the corresponding translated code
  util: add cacheinfo

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-22 10:25:03 +01:00
Richard Henderson
308714e6bc tcg/arm: Use ldr (literal) for goto_tb
The new placement of the TB means that we can use one insn
to load the goto_tb destination directly from the TB.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-19 11:10:59 -07:00
Richard Henderson
9c39b94f14 tcg/arm: Try pc-relative addresses for movi
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-19 11:10:59 -07:00
Richard Henderson
3fb53fb4d1 tcg/arm: Use indirect branch for goto_tb
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-19 11:10:59 -07:00
Richard Henderson
cc74d332ff tcg/aarch64: Use ADR in tcg_out_movi
The new placement of the TB means that we can use one insn
to load the return value for exit_tb returning the TB pointer.

Tested-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-19 11:10:59 -07:00
Emilio G. Cota
6e3b2bfd6a tcg: allocate TB structs before the corresponding translated code
Allocating an arbitrarily-sized array of tbs results in either
(a) a lot of memory wasted or (b) unnecessary flushes of the code
cache when we run out of TB structs in the array.

An obvious solution would be to just malloc a TB struct when needed,
and keep the TB array as an array of pointers (recall that tb_find_pc()
needs the TB array to run in O(log n)).

Perhaps a better solution, which is implemented in this patch, is to
allocate TB's right before the translated code they describe. This
results in some memory waste due to padding to have code and TBs in
separate cache lines--for instance, I measured 4.7% of padding in the
used portion of code_gen_buffer when booting aarch64 Linux on a
host with 64-byte cache lines. However, it can allow for optimizations
in some host architectures, since TCG backends could safely assume that
the TB and the corresponding translated code are very close to each
other in memory. See this message by rth for a detailed explanation:

  https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05172.html
  Subject: Re: GSoC 2017 Proposal: TCG performance enhancements
  Message-ID: <1e67644b-4b30-887e-d329-1848e94c9484@twiddle.net>

Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1496790745-314-3-git-send-email-cota@braap.org>
[rth: Simplify the arithmetic in tcg_tb_alloc]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-19 11:10:59 -07:00
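
The placement scheme from 6e3b2bfd6a can be sketched with offsets into a single buffer. The types and function names are hypothetical and the sketch works on offsets only, so it does not reproduce tcg_tb_alloc. It shows the TB descriptor starting on a cache-line boundary and the translated code starting on the next boundary after it, which is where the padding measured in the commit comes from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t used;   /* bytes consumed so far */
    size_t size;   /* total buffer size */
    size_t line;   /* host cache line size, a power of two */
} ToyCodeBuf;

typedef struct {
    size_t tb_off;     /* where the TB descriptor goes */
    size_t code_off;   /* where its translated code begins */
} ToyPlacement;

static size_t align_up(size_t n, size_t a)
{
    return (n + a - 1) & ~(a - 1);
}

/* Reserve a TB descriptor of tb_size bytes followed by its code area.
 * Returns false when the buffer is full (the caller would flush). */
static bool toy_tb_place(ToyCodeBuf *b, size_t tb_size, ToyPlacement *out)
{
    size_t tb_off = align_up(b->used, b->line);
    size_t code_off = align_up(tb_off + tb_size, b->line);

    if (code_off > b->size) {
        return false;
    }
    out->tb_off = tb_off;
    out->code_off = code_off;
    b->used = code_off;
    return true;
}

/* Account for code_len bytes of translated code just emitted. */
static void toy_code_done(ToyCodeBuf *b, size_t code_len)
{
    b->used += code_len;
}
```

Because the descriptor sits immediately before its code, a backend can reach it with a short pc-relative offset, which is the optimization opportunity the commit message refers to.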
Emilio G. Cota
b255b2c8a5 util: add cacheinfo
Add helpers to gather cache info from the host at init-time.

For now, only export the host's I/D cache line sizes, which we
will use to improve cache locality to avoid false sharing.

Suggested-by: Richard Henderson <rth@twiddle.net>
Suggested-by: Geert Martin Ijewski <gm.ijewski@web.de>
Tested-by:    Geert Martin Ijewski <gm.ijewski@web.de>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1496794624-4083-1-git-send-email-cota@braap.org>
[rth: Move all implementations from tcg/ppc/]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-19 11:10:59 -07:00