We require either 2 or 4 registers to hold int128_t.
Failure to do so results in a register allocation assert.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since adding MO_ATOM_MASK, the maximum MemOpIdx requires 15 bits,
which overflows the 12 bit field allocated for TCI memory ops.
Expand the field to 16 bits for 2-operand memory ops, and place
the value in TCG_REG_TMP for 3-operand memory ops (same as we
already do for 4-operand memory ops).
Cures a debug assert for aarch64, with FEAT_LSE2 enabled.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In commit d56fea79f9 ("tcg: Move TCG_{LOW,HIGH} to tcg-internal.h")
we replaced the "_link_error" definitions with modern QEMU_ERROR()
attribute markup. We covered tcg-op.c but forgot to completely
clean tcg-op-vec.c. Do it now.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230605175647.88395-3-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Create two static libraries for use by each execution mode.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This finally paves the way for tcg/ to be built once per mode.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The bug was hidden because they happen to have the same values.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is all that is required by tcg/ from exec-all.h.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
From this remove, it's no longer clear what this is attempting
to protect. The last time a use of this define was added to
the source tree, as opposed to merely moved around, was 2008.
There have been many cleanups since that time and this is
no longer required for the build to succeed.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Create tcg/tcg-op-gvec-common.h, moving everything that does not
concern TARGET_LONG_BITS. Adjust tcg-op-gvec.c to use the new header.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The replacement isn't ideal, as the raw count of bits
is not easily synced with exec/cpu-all.h, but it does
remove from tcg.h the target dependency on TARGET_PAGE_BITS_MIN
which is built into TLB_FLAGS_MASK.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This replaces of TCG_GUEST_DEFAULT_MO in tcg-op-ldst.c.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will enable replacement of TARGET_INSN_START_WORDS in tcg.c.
Split out "tcg/insn-start-words.h" and use it in target/.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Make tcg_gen_callN a static function. Create tcg_gen_call[0-7]
functions for use by helper-gen.h.inc.
Removes a multiplicty of calls to __stack_chk_fail, saving up
to 143kiB of .text space as measured on an x86_64 host.
Old New Less %Change
8888680 8741816 146864 1.65% qemu-system-aarch64
5911832 5856152 55680 0.94% qemu-system-riscv64
5816728 5767512 49216 0.85% qemu-system-mips64
6707832 6659144 48688 0.73% qemu-system-ppc64
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Removes a multiplicity of calls to __assert_fail, saving up
to 360kiB of .text space as measured on an x86_64 host.
Old New Less %Change
9257272 8888680 368592 3.98% qemu-system-aarch64
6100968 5911832 189136 3.10% qemu-system-riscv64
5839112 5707032 132080 2.26% qemu-system-mips
4447608 4341752 105856 2.38% qemu-system-s390x
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In preparation for compiling tcg/ only once, eliminate
the all_helpers array. Instantiate the info structs for
the generic helpers in accel/tcg/, and the structs for
the target-specific helpers in each translate.c.
Since we don't see all of the info structs at startup,
initialize at first use, using g_once_init_* to make
sure we don't race while doing so.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will be required outside of tcg-internal.h soon.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Create tcg/tcg-op-common.h, moving everything that does not concern
TARGET_LONG_BITS or TCGv. Adjust tcg/*.c to use the new header
instead of tcg-op.h, in preparation for compiling tcg/ only once.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Often, the only thing we need to know about the TCG host
is the register size.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Disconnect the layout of ArchCPU from TCG compilation.
Pass the relative offset of 'env' and 'neg.tlb.f' as a parameter.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This makes CPUTLBEntry agnostic to the address size of the guest.
When 32-bit addresses are in effect, we can simply read the low
32 bits of the 64-bit field. Similarly when we need to update
the field for setting TLB_NOTDIRTY.
For TCG backends that could in theory be big-endian, but in
practice are not (arm, loongarch, riscv), use QEMU_BUILD_BUG_ON
to document and ensure this is not accidentally missed.
For s390x, which is always big-endian, use HOST_BIG_ENDIAN anyway,
to document the reason for the adjustment.
For sparc64 and ppc64, always perform a 64-bit load, and rely on
the following 32-bit comparison to ignore the high bits.
Rearrange mips and ppc if ladders for clarity.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All uses replaced with TCGContext.addr_type.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All uses replaced with TCGContext.addr_type.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All uses replaced with TCGContext.addr_type.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All uses replaced with TCGContext.addr_type.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The last use was removed by e77c89fb08.
Fixes: e77c89fb08 ("cputlb: Remove static tlb sizing")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use LPQ/STPQ when 16-byte atomicity is required.
Note that these instructions require 16-byte alignment.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use LQ/STQ with ISA v2.07, and 16-byte atomicity is required.
Note that these instructions do not require 16-byte alignment.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With FEAT_LSE2, LDP/STP suffices. Without FEAT_LSE2, use LDXP+STXP
16-byte atomicity is required and LDP/STP otherwise.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Adjust the softmmu tlb to use TMP[0-2], not any of the normally available
registers. Since we handle overlap betwen inputs and helper arguments,
we can allow any allocatable reg.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need to allocate a second general-purpose temporary.
Rename the existing temps to add a distinguishing number.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The first move was incorrectly using TCG_TYPE_I32 while the second
move was correctly using TCG_TYPE_REG. This prevents a 64-bit host
from moving all 128-bits of the return value.
Fixes: ebebea53ef ("tcg: Support TCG_TYPE_I128 in tcg_out_{ld,st}_helper_{args,ret}")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Implement with and without Zicond. Without Zicond, we were letting
the middle-end expand to a 5 insn sequence; better to use a branch
over a single insn.
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split out a helper function, tcg_out_setcond_int, which does not
always produce the complete boolean result, but returns a set of
flags to do so.
Based on 21af161984, the same improvement for loongarch64.
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The instruction is a combined zero-extend and add.
Use it for exactly that.
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Define a useful subset of the extensions. Probe for them
via compiler pre-processor feature macros and SIGILL.
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since e03b56863d, which replaced HOST_WORDS_BIGENDIAN
with HOST_BIG_ENDIAN, there is no need to define a second
symbol which is [0,1].
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These sequences are inexpensive to test. Maxing out at three insns
results in the same space as a load plus the constant pool entry.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These addresses are often loaded by the qemu_ld/st slow path,
for loading the retaddr value.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Repeated calls to a single helper are common -- especially
the ones for softmmu memory access. Prefer the constant pool
to longer sequences to increase sharing.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
During normal processing, the constant pool is accessible via
TCG_REG_TB. During the prologue, it is accessible via TCG_REG_T9.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emit all constants that can be loaded in exactly one insn.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This vastly reduces the size of code generated for 64-bit addresses.
The code for exit_tb, for instance, where we load a (tagged) pointer
to the current TB, goes from
0x400aa9725c: li v0,64
0x400aa97260: dsll v0,v0,0x10
0x400aa97264: ori v0,v0,0xaa9
0x400aa97268: dsll v0,v0,0x10
0x400aa9726c: j 0x400aa9703c
0x400aa97270: ori v0,v0,0x7083
to
0x400aa97240: j 0x400aa97040
0x400aa97244: daddiu v0,s6,-189
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In tcg_out_qemu_ld/st, we already check for guest_base matching int16_t.
Mirror that when setting up TCG_GUEST_BASE_REG in the prologue.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
No functional change; just moving the saved reserved regs to the end.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
No functional change; just moving the saved reserved regs to the end.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is always defined, and the optimization pass is
essential to producing reasonable code.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This had been set since the beginning, is never undefined,
and it would seem to be harmful to debugging to do so.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With the current structure of cputlb.c, there is no difference
between the little-endian and big-endian entry points, aside
from the assert. Unify the pairs of functions.
The only use of the functions with explicit endianness was in
target/sparc64, and that was only to satisfy the assert: the
correct endianness is already built into memop.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move the code from tcg/. The only use of these bits so far
is with respect to the atomicity of tcg operations.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use the CPUINFO_* bits instead of the individual boolean
variables that we had been using. Remove all of the init
code that was moved over to cpuinfo-i386.c.
Note that have_avx512* check both AVX512{F,VL}, as we had
previously done during tcg_target_init.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
After static_kwargs has been changed to an empty dictionary, it has
no functional effect and can be removed.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
TCG will need this declaration, without all of the other
bits that come with cpu-all.h.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Disconnect guest tlb parameters from TCG compilation.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Disconnect guest page size from TCG compilation.
While this could be done via exec/target_page.h, we want to cache
the value across multiple memory access operations, so we might
as well initialize this early.
The changes within tcg/ are entirely mechanical:
sed -i s/TARGET_PAGE_BITS/s->page_bits/g
sed -i s/TARGET_PAGE_MASK/s->page_mask/g
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All uses replaced with TCGContext.addr_type.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All uses replaced with TCGContext.addr_type.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All uses replaced with TCGContext.addr_type.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All uses replaced with TCGContext.addr_type.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Eliminate the test vs TARGET_LONG_BITS by considering this
predicate to be always true, and simplify accordingly.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All uses can be infered from the INDEX_op_qemu_*_a{32,64}_*
opcode being used.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All uses can be infered from the INDEX_op_qemu_*_a{32,64}_* opcode
being used. Add a field into TCGLabelQemuLdst to record the usage.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Because of its use on tgen_arithi, this value must be a signed
32-bit quantity, as that is what may be encoded in the insn.
The truncation of the value to unsigned for 32-bit guests is
done via the REX bit via 'trexw'.
Removes the only uses of target_ulong from this tcg backend.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since TCG_TYPE_I32 values are kept zero-extended in registers, via
omission of the REXW bit, we need not extend if the register matches.
This is already relied upon by qemu_{ld,st}.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Keep all 32-bit values zero extended in the register, not solely when
addresses are 32 bits. This eliminates a dependency on TARGET_LONG_BITS.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We now have the address size as part of the opcode, so
we no longer need to test TARGET_LONG_BITS. We can use
uint64_t for target_ulong, as passed into load/store helpers.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For 32-bit hosts, we cannot simply rely on TCGContext.addr_bits,
as we need one or two host registers to represent the guest address.
Create the new opcodes and update all users. Since we have not
yet eliminated TARGET_LONG_BITS, only one of the two opcodes will
ever be used, so we can get away with treating them the same in
the backends.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Expand from TCGv to TCGTemp inline in the translators,
and validate that the size matches tcg_ctx->addr_type.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Expand from TCGv to TCGTemp inline in the translators,
and validate that the size matches tcg_ctx->addr_type.
These inlines will eventually be seen only by target-specific code.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will enable replacement of TARGET_LONG_BITS within tcg/.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since we do this inside gen_empty_mem_cb anyway, let's
do this earlier inside tcg expansion.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We only need to make copies for loads, when the destination
overlaps the address. For now, only eliminate the copy for
stores and 128-bit loads.
Rename plugin_prep_mem_callbacks to plugin_maybe_preserve_addr,
returning NULL if no copy is made.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Always pass the target address as uint64_t.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Always pass the target address as uint64_t.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Always pass the target address as uint64_t.
Adjust tcg_out_{ld,st}_helper_args to match.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We already pass uint64_t to restore_state_to_opc; this changes all
of the other uses from insn_start through the encoding to decoding.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
No change to the ultimate load/store routines yet, so some atomicity
conditions not yet honored, but plumbs the change to alignment through
the relevant functions.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
No change to the ultimate load/store routines yet, so some atomicity
conditions not yet honored, but plumbs the change to alignment through
the relevant functions.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Examine MemOp for atomicity and alignment, adjusting alignment
as required to implement atomicity on the host.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that tcg_out_helper_load_regs is not recursive, we can
merge it into its only caller, tcg_out_helper_load_slots.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With x86_64 as host, we do not have any temporaries with which to
resolve cycles, but we do have xchg. As a side bonus, the set of
graphs that can be made with 3 nodes and all nodes conflicting is
small: two. We can solve the cycle with a single temp.
This is required for x86_64 to handle stores of i128: 1 address
register and 2 data registers.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add opcodes for backend support for 128-bit memory operations.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace the unparameterized TCG_TARGET_HAS_MEMORY_BSWAP macro
with a function with a memop argument.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The system is required to emulate unaligned accesses, even if the
hardware does not support it. The resulting trap may or may not
be more efficient than the qemu slow path. There are linux kernel
patches in flight to allow userspace to query hardware support;
we can re-evaluate whether to enable this by default after that.
In the meantime, softmmu now matches useronly, where we already
assumed that unaligned accesses are supported.
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Test the final byte of an unaligned access.
Use BSTRINS.D to clear the range of bits, rather than AND.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This should be true of all loongarch64 running Linux.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Drop the target-specific trampolines for the standard slow path.
This lets us use tcg_out_helper_{ld,st}_args, and handles the new
atomicity bits within MemOp.
At the same time, use the full load/store helpers for user-only mode.
Drop inline unaligned access support for user-only mode, as it does
not handle atomicity.
Use TCG_REG_T[1-3] in the tlb lookup, instead of TCG_REG_O[0-2].
This allows the constraints to be simplified.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emphasize that the constant is unsigned.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Shuffle the order in tcg_out_movi_int to check s13 first, and
drop this check from tcg_out_movi_imm32. This might make the
sequence for in_prologue larger, but not worth worrying about.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Emphasize that the constant is signed.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Always reserve r3 for tlb softmmu lookup. Fix a bug in user-only
ALL_QLDST_REGS, in that r14 is clobbered by the BLNE that leads
to the misaligned trap. Remove r0+r1 from user-only ALL_QLDST_REGS;
I believe these had been reserved for bswap, which we no longer
perform during qemu_st.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers.
This will allow the fast path to increase alignment to implement atomicity
while not immediately raising an alignment exception.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These features are present for Apple M1.
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Notice when the host has additional atomic instructions.
The new variables will also be used in generated code.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Notice when Intel or AMD have guaranteed that vmovdqa is atomic.
The new variable will also be used in generated code.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can now fold these two pieces of code.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
TCG backends may need to defer to a helper to implement
the atomicity required by a given operation. Mirror the
interface used in system mode.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With the current structure of cputlb.c, there is no difference
between the little-endian and big-endian entry points, aside
from the assert. Unify the pairs of functions.
Hoist the qemu_{ld,st}_helpers arrays to tcg.c.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This field may be used to describe the precise atomicity requirements
of the guest, which may then be used to constrain the methods by which
it may be emulated by the host.
For instance, the AArch64 LDP (32-bit) instruction changes semantics
with ARMv8.4 LSE2, from
MO_64 | MO_ATOM_IFALIGN_PAIR
(64-bits, single-copy atomic only on 4 byte units,
nonatomic if not aligned by 4),
to
MO_64 | MO_ATOM_WITHIN16
(64-bits, single-copy atomic within a 16 byte block)
The former may be implemented with two 4 byte loads, or a single 8 byte
load if that happens to be efficient on the host. The latter may not
be implemented with two 4 byte loads and may also require a helper when
misaligned.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All uses have now been expunged.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Adjust the softmmu tlb to use R0+R1, not any of the normally available
registers. Since we handle overlap betwen inputs and helper arguments,
we can allow any allocatable reg.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rather than zero-extend the guest address into a register,
use an add instruction which zero-extends the second input.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The softmmu tlb uses TCG_REG_TMP[0-2], not any of the normally available
registers. Now that we handle overlap betwen inputs and helper arguments,
we can allow any allocatable reg.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Never used since its introduction.
Fixes: 3d582c6179 ("tcg-ppc64: Rearrange integer constant constraints")
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These constraints have not been used for quite some time.
Fixes: 77b73de676 ("Use rem/div[u]_i32 drop div[u]2_i32")
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The softmmu tlb uses TCG_REG_{TMP1,TMP2,R0}, not any of the normally
available registers. Now that we handle overlap betwen inputs and
helper arguments, we can allow any allocatable reg.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Allocate TCG_REG_TMP2. Use R0, TMP1, TMP2 instead of any of
the normally allocated registers for the tlb load.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The softmmu tlb uses TCG_REG_TMP[0-3], not any of the normally available
registers. Now that we handle overlap betwen inputs and helper arguments,
and have eliminated use of A0, we can allow any allocatable reg.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Compare the address vs the tlb entry with sign-extended values.
This simplifies the page+alignment mask constant, and the
generation of the last byte address for the misaligned test.
Move the tlb addend load up, and the zero-extension down.
This frees up a register, which allows us use TMP3 as the returned base
address register instead of A0, which we were using as a 5th temporary.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While performing the load in the delay slot of the call to the common
bswap helper function is cute, it is not worth the added complexity.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The softmmu tlb uses TCG_REG_TMP[0-2], not any of the normally available
registers. Now that we handle overlap betwen inputs and helper arguments,
we can allow any allocatable reg.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args. This allows our local
tcg_out_arg_* infrastructure to be removed.
We are no longer filling the call or return branch
delay slots, nor are we tail-calling for the store,
but this seems a small price to pay.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args. This allows our local
tcg_out_arg_* infrastructure to be removed.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_out_st_helper_args. This eliminates the use of a tail call to
the store helper. This may or may not be an improvement, depending on
the call/return branch prediction of the host microarchitecture.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_out_ld_helper_args and tcg_out_ld_helper_ret.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args. These and their subroutines
use the existing knowledge of the host function call abi
to load the function call arguments and return results.
These will be used to simplify the backends in turn.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
tcg_prepare_user_ldst, and some code that lived in both tcg_out_qemu_ld
and tcg_out_qemu_st into one function that returns HostAddress and
TCGLabelQemuLdst structures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
into one function that returns TCGReg and TCGLabelQemuLdst.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
into one function that returns HostAddress and TCGLabelQemuLdst structures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
into one function that returns HostAddress and TCGLabelQemuLdst structures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
tcg_out_zext_addr_if_32_bit, and some code that lived in both
tcg_out_qemu_ld and tcg_out_qemu_st into one function that returns
HostAddress and TCGLabelQemuLdst structures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label, and some code that lived
in both tcg_out_qemu_ld and tcg_out_qemu_st into one function that
returns HostAddress and TCGLabelQemuLdst structures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
into one function that returns HostAddress and TCGLabelQemuLdst structures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since tcg_out_{ld,st}_helper_args, the slow path no longer requires
the address argument to be set up by the tlb load sequence. Use a
plain load for the addend and indexed addressing with the original
input address register.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_tlb_load, add_qemu_ldst_label,
tcg_out_test_alignment, and some code that lived in both
tcg_out_qemu_ld and tcg_out_qemu_st into one function
that returns HostAddress and TCGLabelQemuLdst structures.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Unify all computation of argument stack offset in one function.
This requires that we adjust ref_slot to be in the same units,
by adding max_reg_slots during init_call_layout.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
An inline function is safer than a macro, and REG_P
was rather too generic.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We need to set this in TCGLabelQemuLdst, so plumb this
all the way through from tcg_out_op.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In tcg_canonicalize_memop, we remove MO_SIGN from MO_32 operations
with TCG_TYPE_I32. Thus this is never set. We already have an
identical test just above which does not include is_64
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Collect the 3 potential parts of the host address into a struct.
Reorg tcg_out_qemu_{ld,st}_direct to use it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We need to set this in TCGLabelQemuLdst, so plumb this
all the way through from tcg_out_op.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Interpret the variable argument placement in the caller. Pass data_type
instead of is64 -- there are several places where we already convert back
from bool to type. Clean things up by using type throughout.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The port currently does not support "oversize" guests, which
means riscv32 can only target 32-bit guests. We will soon be
building TCG once for all guests. This implies that we can
only support riscv64.
Since all Linux distributions target riscv64 not riscv32,
this is not much of a restriction and simplifies the code.
The brcond2 and setcond2 opcodes are exclusive to 32-bit hosts,
so we can and should remove the stubs.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Collect the parts of the host address into a struct.
Reorg tcg_out_qemu_{ld,st} to use it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Interpret the variable argument placement in the caller. Pass data_type
instead of is64 -- there are several places where we already convert back
from bool to type. Clean things up by using type throughout.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Interpret the variable argument placement in the caller. There are
several places where we already convert back from bool to type.
Clean things up by using type throughout.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Collect the 2 parts of the host address into a struct.
Reorg tcg_out_qemu_{ld,st}_direct to use it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Interpret the variable argument placement in the caller. Shift some
code around slightly to share more between softmmu and user-only.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Collect the parts of the host address, and condition, into a struct.
Merge tcg_out_qemu_*_{index,direct} and use it.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Interpret the variable argument placement in the caller.
Pass data_type instead of is_64. We need to set this in
TCGLabelQemuLdst, so plumb this all the way through from tcg_out_op.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Collect the 3 potential parts of the host address into a struct.
Reorg tcg_out_qemu_{ld,st}_direct to use it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename the 'ext' parameter 'data_type' to make the use clearer;
pass it to tcg_out_qemu_st as well to even out the interfaces.
Rename the 'otype' local 'addr_type' to make the use clearer.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split out a helper for choosing testb vs testl.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use TCG_REG_L[01] constants directly.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Collect the 4 potential parts of the host address into a struct.
Reorg tcg_out_qemu_{ld,st}_direct to use it.
Reorg guest_base handling to use it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Test for both base and index; use datahi as a temporary, overwritten
by the final load. Always perform the loads in ascending order, so
that any (user-only) fault sees the correct address.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Interpret the variable argument placement in the caller. Pass data_type
instead of is64 -- there are several places where we already convert back
from bool to type. Clean things up by using type throughout.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is common code in most qemu_{ld,st} slow paths, moving two
registers when there may be overlap between sources and destinations.
At present, this is only used by 32-bit hosts for 64-bit data,
but will shortly be used for more than that.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since TCG_TYPE_I32 values are kept sign-extended in registers, we need not
extend if the register matches. This is already relied upon by comparisons.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since TCG_TYPE_I32 values are kept sign-extended in registers,
via ".w" instructions, we need not extend if the register matches.
This is already relied upon by comparisons.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add tcg expander and helper functions for rotate right
vector with scalar operand.
Signed-off-by: Nazar Kazakov <nazar.kazakov@codethink.co.uk>
Message-Id: <20230428144757.57530-10-lawrence.hunter@codethink.co.uk>
[rth: Split out of larger patch; mask rotation count.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add tcg expander and helper functions for and-compliment
vector with scalar operand.
Signed-off-by: Nazar Kazakov <nazar.kazakov@codethink.co.uk>
Message-Id: <20230428144757.57530-10-lawrence.hunter@codethink.co.uk>
[rth: Split out of larger patch.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Document that the -singlestep command line option is now
deprecated, as it is replaced by either the TCG accelerator
property 'one-insn-per-tb' for system emulation or the new
'-one-insn-per-tb' option for usermode emulation, and remove
the only use of the deprecated syntax from a README.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230417164041.684562-7-peter.maydell@linaro.org
Since TCG_TYPE_I32 values are kept sign-extended in registers, via "w"
instructions, we don't need to extend if the register matches.
This is already relied upon by comparisons.
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will want a backend interface for register swapping.
This is only properly defined for x86; all others get a
stub version that always indicates failure.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is common code in most qemu_{ld,st} slow paths, extending the
input value for the store helper data argument or extending the
return value from the load helper.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need a backend interface for type truncation. For those backends
that did not enable TCG_TARGET_HAS_extrl_i64_i32, use tcg_out_mov.
Use it in tcg_reg_alloc_op in the meantime.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need a backend interface for type extension with zero.
Use it in tcg_reg_alloc_op in the meantime.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need a backend interface for type extension with sign.
Use it in tcg_reg_alloc_op in the meantime.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need a backend interface for performing 32-bit zero-extend.
Use it in tcg_reg_alloc_op in the meantime.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need a backend interface for performing 32-bit sign-extend.
Use it in tcg_reg_alloc_op in the meantime.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need a backend interface for performing 16-bit zero-extend.
Use it in tcg_reg_alloc_op in the meantime.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need a backend interface for performing 16-bit sign-extend.
Use it in tcg_reg_alloc_op in the meantime.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need a backend interface for performing 8-bit zero-extend.
Use it in tcg_reg_alloc_op in the meantime.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need a backend interface for performing 8-bit sign-extend.
Use it in tcg_reg_alloc_op in the meantime.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For both _CALL_SYSV and _CALL_DARWIN, return is by reference,
not in 4 integer registers. For _CALL_SYSV, argument is also
by reference.
This error resulted in
$ ./qemu-system-i386 -nographic
qemu-system-i386: tcg/ppc/tcg-target.c.inc:185: \
tcg_target_call_oarg_reg: Assertion `slot >= 0 && slot <= 1' failed.
Fixes: 5427a9a760 ("tcg: Add TCG_TARGET_CALL_{RET,ARG}_I128")
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The return is by reference, not in 4 integer registers.
This error resulted in
qemu-system-i386: tcg/mips/tcg-target.c.inc:140: \
tcg_target_call_oarg_reg: Assertion `slot >= 0 && slot <= 1' failed.
Fixes: 5427a9a760 ("tcg: Add TCG_TARGET_CALL_{RET,ARG}_I128")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can arrive here on _WIN64 because Int128 is passed by reference.
Change the assert to check that the immediate is in range,
instead of attempting to check the host ABI.
Fixes: 6a6d772e30 ("tcg: Introduce tcg_out_addi_ptr")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1581
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Something is wrong with this code, and also wrong with gdb on the
sparc systems to which I have access, so I cannot debug it either.
Disable for now, so the release is not broken.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
qemu-user can hang in a multi-threaded fork. One common
reason is that when creating a TB, between fork and exec
we manipulate a GTree whose memory allocator (GSlice) is
not fork-safe.
Although POSIX does not mandate it, the system's allocator
(e.g. tcmalloc, libc malloc) is probably fork-safe.
Fix some of these hangs by using QTree, which uses the system's
allocator regardless of the Glib version that we used at
configuration time.
Tested with the test program in the original bug report, i.e.:
```
void garble() {
int pid = fork();
if (pid == 0) {
exit(0);
} else {
int wstatus;
waitpid(pid, &wstatus, 0);
}
}
void supragarble(unsigned depth) {
if (depth == 0)
return ;
std::thread a(supragarble, depth-1);
std::thread b(supragarble, depth-1);
garble();
a.join();
b.join();
}
int main() {
supragarble(10);
}
```
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/285
Reported-by: Valentin David <me@valentindavid.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Emilio Cota <cota@braap.org>
Message-Id: <20230205163758.416992-3-cota@braap.org>
[rth: Add QEMU_DISABLE_CFI for all callback using functions.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we call qemu_plugin_disable_mem_helpers in cpu_tb_exec,
we don't need to do this in generated code as well.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230310195252.210956-3-richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230315174331.2959-13-alex.bennee@linaro.org>
Reviewed-by: Emilio Cota <cota@braap.org>
These functions are no longer used.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace with tcg_constant_vec*.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These three instances got missed in previous conversion.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move the tcg_temp_free_* and tcg_temp_ebb_new_* declarations
and inlines to the new header. These are private to the
implementation, and will prevent tcg_temp_free_* from creeping
back into the guest front ends.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since all temps allocated by guest front-ends are now TEMP_TB,
and we don't recycle TEMP_TB, there's no point in requiring
that the front-ends free the temps at all. Begin by dropping
the inner-most checks that all temps have been freed.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While we do not include these in tcg_target_reg_alloc_order,
and therefore they ought never be allocated, it seems safer
to mark them reserved as well.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace the two uses of asm to expand xgetbv with an inline function.
Since one of the two has been using the mnemonic, assume that the
comment about "older versions of the assember" is obsolete, as even
that is 4 years old.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Remove the first label and redirect all uses to the second.
Tested-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This allows us to easily find all branches that use a label.
Since 'refs' is only tested vs zero, remove it and test for
an empty list instead. Drop the use of bitfields, which had
been used to pack refs into a single 32-bit word.
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When CONFIG_PROFILER is set there are various undefined references to
profile_getclock. Include the header which defines this function.
For example:
../tcg/tcg.c: In function ‘tcg_gen_code’:
../tcg/tcg.c:4905:51: warning: implicit declaration of function ‘profile_getclock’ [-Wimplicit-function-declaration]
4905 | qatomic_set(&prof->opt_time, prof->opt_time - profile_getclock());
| ^~~~~~~~~~~~~~~~
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230303084948.3351546-1-rjones@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reusing TEMP_TB interferes with detecting whether the
temp can be adjusted to TEMP_EBB.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All of these have obvious and quite local scope.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While the argument can only be TEMP_EBB or TEMP_TB,
it's more obvious this way.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
TEMP_NORMAL is a subset of TEMP_EBB. Promote single basic
block temps to single extended basic block.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Attempt to reduce the lifetime of TEMP_TB.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This makes it easier to assign blame with perf.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use TEMP_TB as that is more explicit about the default
lifetime of the data. While "global" and "local" used
to be contrasting, we have more lifetimes than that now.
Do not yet rename tcg_temp_local_new_*, just the enum.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Just because the label reference count is more than 1 does
not mean we cannot remove a branch-to-next. By doing this
first, the label reference count may drop to 0, and then
the label itself gets removed as before.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Change the temps_in_use check to use assert not fprintf.
Move the assert for double-free before the check for count,
since that is the more immediate problem.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221219170806.60580-3-philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221219170806.60580-2-philmd@linaro.org>
This commit was created with scripts/clean-includes.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20230202133830.2152150-19-armbru@redhat.com>
'offset' should be bits [23:5] of LDR instruction, rather than [4:0].
Fixes: d59d83a1c3 ("tcg/aarch64: Reorg goto_tb implementation")
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Normally this is automatically handled by the CF_PARALLEL checks
with in tcg_gen_atomic_cmpxchg_i{32,64}, but x86 has a special
case of !PREFIX_LOCK where it always wants the non-atomic version.
Split these out so that x86 does not have to roll its own.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will allow targets to avoid rolling their own.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These are not yet considering atomicity of the 16-byte value;
this is a direct replacement for the current target code which
uses a pair of 8-byte operations.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add code generation functions for data movement between
TCGv_i128 (mov) and to/from TCGv_i64 (concat, extract).
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This enables allocation of i128. The type is not yet
usable, as we have not yet added data movement ops.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fill in the parameters for the host ABI for Int128 for
those backends which require no extra modification.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fill in the parameters for libffi for Int128.
Adjust the interpreter to allow for 16-byte return values.
Adjust tcg_out_call to record the return value length.
Call parameters are no longer all the same size, so we
cannot reuse the same call_slots array for every function.
Compute it each time now, but only fill in slots required
for the call we're about to make.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We expect the backend to require register pairs in
host-endian ordering, thus for big-endian the first
register of a pair contains the high part.
We were forcing R0 to contain the low part for calls.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fill in the parameters for the host ABI for Int128.
Adjust tcg_target_call_oarg_reg for _WIN64, and
tcg_out_call for i386 sysv. Allow TCG_TYPE_V128
stores without AVX enabled.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will be used by _WIN64 to return i128. Not yet used,
because allocation is not yet enabled.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace the flat array tcg_target_call_oarg_regs[] with
a function call including the TCGCallReturnKind.
Extend the set of registers for ARM to r0-r3 to match the ABI:
https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#result-return
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These will be used by some hosts, both 32 and 64-bit, to pass and
return i128. Not yet used, because allocation is not yet enabled.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Implement the function for arm, i386, and s390x, which will use it.
Add stubs for all other backends.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>