Commit Graph

2513 Commits

Author SHA1 Message Date
Peter Maydell
bf4460a8d9 common-user: Re-enable ppc32 host
tcg: Avoid recursion in tcg_gen_mulu2_i32
 tcg: Mark tcg helpers noinline to avoid an issue with LTO
 tcg/arm: Use register pair allocation for qemu_{ld,st}_i64
 disas: Enable loongarch disassembler, and fixes
 tcg/loongarch64: Improve move immediate
 tcg/loongarch64: Improve add immediate
 tcg/loongarch64: Improve setcond
 tcg/loongarch64: Implement movcond
 tcg/loongarch64: Use tcg_pcrel_diff in tcg_out_ldst
 tcg/loongarch64: Reorg goto_tb implementation
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmPPO+0dHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV93jwgAhG+H5XHtJqF2isCc
 a6pYuUWRbhsOFL23FmWKx2O41tHlJ2Seort8M9eIHOu21L9DUJFd291O/4ckiMQM
 13+KH/Kl5fumM+uEkO9YMyplOddmvygdTd5dCi5y349Gi3CgJH3n4HUl0qnioM/7
 Dy3n8JIvYsBp+8jUsLXo1gSl5P1kLMLwJmP68qgy8z8Xly4bDco1Nb2UKb7qKevO
 lMr6L+2/ALbKLZ6OU50erdUrlbgNs0eiQyJAfJ47SQ57RGuqF4pZ09+9yRI2FPZt
 UlSn+srsec1ieYyM2e5krVWbNcXaj6FouV7CkbgFXoUZt29xA1HTXsso+8vLgDPu
 g8vvuw==
 =Up0b
 -----END PGP SIGNATURE-----

Merge tag 'pull-tcg-20230123' of https://gitlab.com/rth7680/qemu into staging

common-user: Re-enable ppc32 host
tcg: Avoid recursion in tcg_gen_mulu2_i32
tcg: Mark tcg helpers noinline to avoid an issue with LTO
tcg/arm: Use register pair allocation for qemu_{ld,st}_i64
disas: Enable loongarch disassembler, and fixes
tcg/loongarch64: Improve move immediate
tcg/loongarch64: Improve add immediate
tcg/loongarch64: Improve setcond
tcg/loongarch64: Implement movcond
tcg/loongarch64: Use tcg_pcrel_diff in tcg_out_ldst
tcg/loongarch64: Reorg goto_tb implementation

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmPPO+0dHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV93jwgAhG+H5XHtJqF2isCc
# a6pYuUWRbhsOFL23FmWKx2O41tHlJ2Seort8M9eIHOu21L9DUJFd291O/4ckiMQM
# 13+KH/Kl5fumM+uEkO9YMyplOddmvygdTd5dCi5y349Gi3CgJH3n4HUl0qnioM/7
# Dy3n8JIvYsBp+8jUsLXo1gSl5P1kLMLwJmP68qgy8z8Xly4bDco1Nb2UKb7qKevO
# lMr6L+2/ALbKLZ6OU50erdUrlbgNs0eiQyJAfJ47SQ57RGuqF4pZ09+9yRI2FPZt
# UlSn+srsec1ieYyM2e5krVWbNcXaj6FouV7CkbgFXoUZt29xA1HTXsso+8vLgDPu
# g8vvuw==
# =Up0b
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 24 Jan 2023 02:01:17 GMT
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20230123' of https://gitlab.com/rth7680/qemu:
  tcg/loongarch64: Reorg goto_tb implementation
  tcg/loongarch64: Use tcg_pcrel_diff in tcg_out_ldst
  tcg/loongarch64: Implement movcond
  tcg/loongarch64: Improve setcond expansion
  tcg/loongarch64: Introduce tcg_out_addi
  tcg/loongarch64: Update tcg-insn-defs.c.inc
  tcg/loongarch64: Optimize immediate loading
  target/loongarch: Disassemble pcadd* addresses
  target/loongarch: Disassemble jirl properly
  target/loongarch: Enable the disassembler for host tcg
  tcg: Mark tcg helpers noinline to avoid an issue with LTO
  linux-user: Implment host/ppc/host-signal.h
  common-user/host/ppc: Implement safe-syscall.inc.S
  tcg/arm: Use register pair allocation for qemu_{ld,st}_i64
  tcg: Avoid recursion in tcg_gen_mulu2_i32

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-02-03 09:30:45 +00:00
Emilio Cota
17083f6fa6 tcg: exclude non-memory effecting helpers from instrumentation
There are actually a whole bunch of helpers that don't affect memory
that we shouldn't instrument. They are helpfully identified by the
TCG_CALL_NO_SIDE_EFFECTS flag which marks out lookup_tb_ptr as well as
a lot of the maths helpers. To avoid the string compare we introduce a
new flag for plugin internals so we skip that too.

Related: #1381
Signed-off-by: Emilio Cota <cota@braap.org>
Message-Id: <20230108164731.61469-4-cota@braap.org>
[AJB: updated to skip all no SE plugins, add flag for plugin helper]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230124180127.1881110-34-alex.bennee@linaro.org>
2023-02-02 11:48:20 +00:00
Richard Henderson
709bcd7da3 tcg/loongarch64: Reorg goto_tb implementation
The old implementation replaces two insns, swapping between

        b       <dest>
        nop
and
        pcaddu18i tmp, <dest>
        jirl      zero, tmp, <dest> & 0xffff

There is a race condition in which a thread could be stopped at
the jirl, i.e. with the top of the address loaded, and when
restarted we have re-linked to a different TB, so that the top
half no longer matches the bottom half.

Note that while we never directly re-link to a different TB, we
can link, unlink, and link again all while the stopped thread
remains stopped.

The new implementation replaces only one insn, swapping between

        b       <dest>
and
        pcadd   tmp, <jmp_addr>

falling through to load the address from tmp, and branch.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-23 16:00:13 -10:00
Richard Henderson
21199bfbef tcg/loongarch64: Use tcg_pcrel_diff in tcg_out_ldst
Take the w^x split into account when computing the
pc-relative distance to an absolute pointer.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-23 16:00:13 -10:00
Richard Henderson
7bc76a4c2e tcg/loongarch64: Implement movcond
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-23 16:00:13 -10:00
Richard Henderson
21af161984 tcg/loongarch64: Improve setcond expansion
Split out a helper function, tcg_out_setcond_int, which
does not always produce the complete boolean result, but
returns a set of flags to do so.

Accept all int32_t as constant input, so that LE/GT can
adjust the constant to LT.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-23 16:00:13 -10:00
Richard Henderson
0e95be93c1 tcg/loongarch64: Introduce tcg_out_addi
Adjust the constraints to allow any int32_t for immediate
addition.  Split immediate adds into addu16i + addi, which
covers quite a lot of the immediate space.  For the hole in
the middle, load the constant into TMP0 instead.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-23 16:00:13 -10:00
Richard Henderson
76baa33a10 tcg/loongarch64: Update tcg-insn-defs.c.inc
Regenerate with ADDU16I included:

   $ cd loongarch-opcodes/scripts/go
   $ go run ./genqemutcgdefs > $QEMU/tcg/loongarch64/tcg-insn-defs.c.inc

Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-23 15:59:59 -10:00
Rui Wang
3fe7e36b44 tcg/loongarch64: Optimize immediate loading
diff:
  Imm                 Before                  After
  0000000000000000    addi.w  rd, zero, 0     addi.w  rd, zero, 0
                      lu52i.d rd, zero, 0
  00000000fffff800    lu12i.w rd, -1          addi.w  rd, zero, -2048
                      ori     rd, rd, 2048    lu32i.d rd, 0
                      lu32i.d rd, 0

Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Rui Wang <wangrui@loongson.cn>
Message-Id: <20221107144713.845550-1-wangrui@loongson.cn>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-23 15:36:36 -10:00
Richard Henderson
1b18d1fa05 tcg/arm: Use register pair allocation for qemu_{ld,st}_i64
Although we still can't use ldrd and strd for all operations,
increase the chances by getting the register allocation correct.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-23 14:39:48 -10:00
Richard Henderson
9fd86b518e tcg: Avoid recursion in tcg_gen_mulu2_i32
We have a test for one of TCG_TARGET_HAS_mulu2_i32 or
TCG_TARGET_HAS_muluh_i32 being defined, but the test
became non-functional when we changed to always define
all of these macros.

Replace this with a build-time test in tcg_gen_mulu2_i32.

Fixes: 25c4d9cc84 ("tcg: Always define all of the TCGOpcode enum members.")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1435
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-23 14:39:48 -10:00
Richard Henderson
9d9db41373 tcg/riscv: Use tcg_pcrel_diff in tcg_out_ldst
We failed to update this with the w^x split, so misses the fact
that true pc-relative offsets are usually small.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230117230415.354239-1-richard.henderson@linaro.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-01-20 10:14:14 +10:00
Richard Henderson
493c9b19a7 tcg/riscv: Implement direct branch for goto_tb
Now that tcg can handle direct and indirect goto_tb simultaneously,
we can optimistically leave space for a direct branch and fall back
to loading the pointer from the TB for an indirect branch.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 22:36:17 +00:00
Richard Henderson
9ae958e4d7 tcg/riscv: Introduce OPC_NOP
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 10:44:23 -10:00
Richard Henderson
79ffece444 tcg/arm: Implement direct branch for goto_tb
Now that tcg can handle direct and indirect goto_tb
simultaneously, we can optimistically leave space for
a direct branch and fall back to loading the pointer
from the TB for an indirect branch.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 10:44:18 -10:00
Richard Henderson
a228ae3ea7 tcg/sparc64: Reorg goto_tb implementation
The old sparc64 implementation may replace two insns, which leaves
a race condition in which a thread could be stopped at a PC in the
middle of the sequence, and when restarted does not see the complete
address computation and branches to nowhere.

The new implemetation replaces only one insn, swapping between a
direct branch and a direct call.  The TCG_REG_TB register is loaded
from tb->jmp_target_addr[] in the delay slot.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 10:38:18 -10:00
Richard Henderson
1e42b4f807 tcg/sparc64: Remove USE_REG_TB
This is always true for sparc64, so this is dead since 3a5f6805c7.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 10:38:18 -10:00
Richard Henderson
20b6643324 tcg/ppc: Reorg goto_tb implementation
The old ppc64 implementation replaces 2 or 4 insns, which leaves a race
condition in which a thread could be stopped at a PC in the middle of
the sequence, and when restarted does not see the complete address
computation and branches to nowhere.

The new implemetation replaces only one insn, swapping between

	b       <dest>
and
	mtctr	r31

falling through to a general-case indirect branch.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 10:36:16 -10:00
Richard Henderson
d59d83a1c3 tcg/aarch64: Reorg goto_tb implementation
The old implementation replaces two insns, swapping between

	b	<dest>
	nop
	br	x30
and
	adrp	x30, <dest>
	addi	x30, x30, lo12:<dest>
	br	x30

There is a race condition in which a thread could be stopped at
the PC of the second insn, and when restarted does not see the
complete address computation and branches to nowhere.

The new implemetation replaces only one insn, swapping between

	b	<dest>
	br	tmp
and
	ldr	tmp, <jmp_addr>
	br	tmp

Reported-by: hev <r@hev.cc>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 10:36:02 -10:00
Richard Henderson
2fd2e78d1b tcg: Remove TCG_TARGET_HAS_direct_jump
We now have the option to generate direct or indirect
goto_tb depending on the dynamic displacement, thus
the define is no longer necessary or completely accurate.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 10:25:49 -10:00
Richard Henderson
90c0fee3a2 tcg: Always define tb_target_set_jmp_target
Install empty versions for !TCG_TARGET_HAS_direct_jump hosts.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 10:22:35 -10:00
Richard Henderson
0012e3516e tcg: Move tb_target_set_jmp_target declaration to tcg.h
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 10:22:35 -10:00
Richard Henderson
0fe1c98da9 tcg: Change tb_target_set_jmp_target arguments
Replace 'tc_ptr' and 'addr' with 'tb' and 'n'.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 10:21:54 -10:00
Richard Henderson
9da6079b26 tcg: Add TranslationBlock.jmp_insn_offset
Stop overloading jmp_target_arg for both offset and address,
depending on TCG_TARGET_HAS_direct_jump.  Instead, add a new
field to hold the jump insn offset and always set the target
address in jmp_target_addr[].  This will allow a tcg backend
to use either direct or indirect depending on displacement.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 10:08:55 -10:00
Richard Henderson
b7e4afbd9f tcg: Add gen_tb to TCGContext
This can replace four other variables that are references
into the TranslationBlock structure.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 10:08:51 -10:00
Richard Henderson
3a50f424c9 tcg: Rename TB_JMP_RESET_OFFSET_INVALID to TB_JMP_OFFSET_INVALID
This will shortly be used for more than reset.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 09:56:45 -10:00
Richard Henderson
cf7d6b8e98 tcg: Split out tcg_out_goto_tb
The INDEX_op_goto_tb opcode needs no register allocation.
Split out a dedicated helper function for it.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 09:56:45 -10:00
Richard Henderson
becc452a36 tcg: Introduce get_jmp_target_addr
Similar to the existing set_jmp_reset_offset.  Include the
rw->rx address space conversion done by arm and s390x, and
forgotten by mips and riscv.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 09:56:40 -10:00
Richard Henderson
b52a2c03b7 tcg: Introduce set_jmp_insn_offset
Similar to the existing set_jmp_reset_offset.  Move any assert for
TCG_TARGET_HAS_direct_jump into the new function (which now cannot
be build-time).  Will be unused if TCG_TARGET_HAS_direct_jump is
constant 0, but we can't test for constant in the preprocessor,
so just mark it G_GNUC_UNUSED.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 09:55:13 -10:00
Richard Henderson
7f83167c61 tcg: Replace asserts on tcg_jmp_insn_offset
Test TCG_TARGET_HAS_direct_jump instead of testing an
implementation pointer.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 09:55:13 -10:00
Richard Henderson
1ce41e0443 tcg/sparc64: Remove unused goto_tb code for indirect jump
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 09:55:13 -10:00
Richard Henderson
cea583d13c tcg/ppc: Remove unused goto_tb code for indirect jump
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 09:55:13 -10:00
Richard Henderson
3bb8500ef8 tcg/i386: Remove unused goto_tb code for indirect jump
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 09:55:13 -10:00
Richard Henderson
b55a8d9d0b tcg: Split out tcg_out_exit_tb
The INDEX_op_exit_tb opcode needs no register allocation.
Split out a dedicated helper function for it.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-17 09:55:13 -10:00
Ilya Leoshkevich
5584e2dbe8 tcg: add perfmap and jitdump
Add ability to dump /tmp/perf-<pid>.map and jit-<pid>.dump.
The first one allows the perf tool to map samples to each individual
translation block. The second one adds the ability to resolve symbol
names, line numbers and inspect JITed code.

Example of use:

    perf record qemu-x86_64 -perfmap ./a.out
    perf report

or

    perf record -k 1 qemu-x86_64 -jitdump ./a.out
    DEBUGINFOD_URLS= perf inject -j -i perf.data -o perf.data.jitted
    perf report -i perf.data.jitted

Co-developed-by: Vanderson M. do Rosario <vandersonmr2@gmail.com>
Co-developed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230112152013.125680-4-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-16 10:14:12 -10:00
Peter Maydell
528d9f33ca tcg/s390x improvements:
- drop support for pre-z196 cpus (eol before 2017)
  - add support for misc-instruction-extensions-3
  - misc cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmO5I68dHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8qOAgAgfLyzz5yajzVD0TZ
 ufEw07Jlgw2th4Il4FZ3cKVaO9X1Sec9yTLaDXSByKJzTHQXso4Y7hBHciKjC8s4
 LatuVWJaUdYUHry39iDgjvKb+ZrQ1ajve1sDNrIrCj5ITqoGT23f1NxQSS+0MhPB
 cMVIEmiblPlynKDtOYBE9lcrkUFOBUqxqnV734AP1gOVyJOMjaVtm9T4wKIipC2M
 UOXKOo/YPYeygSUFZdrmjoaJE0qCyJbpQqVDGUPXN2Md7UADfVhIo9jrbnvmr7BR
 OfWDPuuVdt5g7P+TzLe0BXKPmbyqGi4vnjwPWNSl4ow+8xbxQ8fsSjpyx9kKd1i1
 swoGSQ==
 =gWI0
 -----END PGP SIGNATURE-----

Merge tag 'pull-tcg-20230106' of https://gitlab.com/rth7680/qemu into staging

tcg/s390x improvements:
 - drop support for pre-z196 cpus (eol before 2017)
 - add support for misc-instruction-extensions-3
 - misc cleanups

# gpg: Signature made Sat 07 Jan 2023 07:47:59 GMT
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20230106' of https://gitlab.com/rth7680/qemu: (27 commits)
  tcg/s390x: Avoid the constant pool in tcg_out_movi
  tcg/s390x: Cleanup tcg_out_movi
  tcg/s390x: Tighten constraints for 64-bit compare
  tcg/s390x: Implement ctpop operation
  tcg/s390x: Use tgen_movcond_int in tgen_clz
  tcg/s390x: Support SELGR instruction in movcond
  tcg/s390x: Generalize movcond implementation
  tcg/s390x: Create tgen_cmp2 to simplify movcond
  tcg/s390x: Support MIE3 logical operations
  tcg/s390x: Tighten constraints for and_i64
  tcg/s390x: Tighten constraints for or_i64 and xor_i64
  tcg/s390x: Issue XILF directly for xor_i32
  tcg/s390x: Support MIE2 MGRK instruction
  tcg/s390x: Support MIE2 multiply single instructions
  tcg/s390x: Distinguish RIE formats
  tcg/s390x: Distinguish RRF-a and RRF-c formats
  tcg/s390x: Use LARL+AGHI for odd addresses
  tcg/s390x: Remove DISTINCT_OPERANDS facility check
  tcg/s390x: Remove FAST_BCR_SER facility check
  tcg/s390x: Check for load-on-condition facility at startup
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-01-08 11:23:17 +00:00
Richard Henderson
90497e03ca tcg/s390x: Avoid the constant pool in tcg_out_movi
Load constants in no more than two insns, which turns
out to be faster than using the constant pool.

Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:10 +00:00
Richard Henderson
1818c71ba1 tcg/s390x: Cleanup tcg_out_movi
Merge maybe_out_small_movi, as it no longer has additional users.
Use is_const_p{16,32}.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:10 +00:00
Richard Henderson
32c256eda6 tcg/s390x: Tighten constraints for 64-bit compare
Give 64-bit comparison second operand a signed 33-bit immediate.
This is the smallest superset of uint32_t and int32_t, as used
by CLGFI and CGFI respectively.  The rest of the 33-bit space
can be loaded into TCG_TMP0.  Drop use of the constant pool.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:09 +00:00
Richard Henderson
29a5ea738a tcg/s390x: Implement ctpop operation
There is an older form that produces per-byte results,
and a newer form that produces per-register results.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:09 +00:00
Richard Henderson
bfff851842 tcg/s390x: Use tgen_movcond_int in tgen_clz
Reuse code from movcond to conditionally copy a2 to dest,
based on the condition codes produced by FLOGR.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
0bbf0f7acf tcg/s390x: Support SELGR instruction in movcond
The new select instruction provides two separate register inputs,
whereas the old load-on-condition instruction overlaps one of the
register inputs with the destination.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
23d1394a6d tcg/s390x: Generalize movcond implementation
Generalize movcond to support pre-computed conditions, and the same
set of arguments at all times.  This will be assumed by a following
patch, which needs to reuse tgen_movcond_int.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
5c837bbca6 tcg/s390x: Create tgen_cmp2 to simplify movcond
Return both regular and inverted condition codes from tgen_cmp2.
This lets us choose after the fact which comparision we want.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
6c9b5c0f53 tcg/s390x: Support MIE3 logical operations
This is andc, orc, nand, nor, eqv.
We can use nor for implementing not.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
4134083f80 tcg/s390x: Tighten constraints for and_i64
Let the register allocator handle such immediates by matching
only what one insn can achieve.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
b2509acc60 tcg/s390x: Tighten constraints for or_i64 and xor_i64
Drop support for sequential OR and XOR, as the serial dependency is
slower than loading the constant first.  Let the register allocator
handle such immediates by matching only what one insn can achieve.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:06 +00:00
Richard Henderson
a0332aca67 tcg/s390x: Issue XILF directly for xor_i32
There is only one instruction that is applicable
to a 32-bit immediate xor.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:06 +00:00
Richard Henderson
668ce343f6 tcg/s390x: Support MIE2 MGRK instruction
The MIE2 facility adds a 3-operand signed 64x64->128 multiply.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:06 +00:00
Richard Henderson
92c89a074c tcg/s390x: Support MIE2 multiply single instructions
The MIE2 facility adds 3-operand versions of multiply.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:06 +00:00