Commit Graph

2479 Commits

Author SHA1 Message Date
Ilya Leoshkevich
5584e2dbe8 tcg: add perfmap and jitdump
Add ability to dump /tmp/perf-<pid>.map and jit-<pid>.dump.
The first one allows the perf tool to map samples to each individual
translation block. The second one adds the ability to resolve symbol
names, line numbers and inspect JITed code.

Example of use:

    perf record qemu-x86_64 -perfmap ./a.out
    perf report

or

    perf record -k 1 qemu-x86_64 -jitdump ./a.out
    DEBUGINFOD_URLS= perf inject -j -i perf.data -o perf.data.jitted
    perf report -i perf.data.jitted

Co-developed-by: Vanderson M. do Rosario <vandersonmr2@gmail.com>
Co-developed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230112152013.125680-4-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-16 10:14:12 -10:00
Peter Maydell
528d9f33ca tcg/s390x improvements:
- drop support for pre-z196 cpus (eol before 2017)
  - add support for misc-instruction-extensions-3
  - misc cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmO5I68dHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8qOAgAgfLyzz5yajzVD0TZ
 ufEw07Jlgw2th4Il4FZ3cKVaO9X1Sec9yTLaDXSByKJzTHQXso4Y7hBHciKjC8s4
 LatuVWJaUdYUHry39iDgjvKb+ZrQ1ajve1sDNrIrCj5ITqoGT23f1NxQSS+0MhPB
 cMVIEmiblPlynKDtOYBE9lcrkUFOBUqxqnV734AP1gOVyJOMjaVtm9T4wKIipC2M
 UOXKOo/YPYeygSUFZdrmjoaJE0qCyJbpQqVDGUPXN2Md7UADfVhIo9jrbnvmr7BR
 OfWDPuuVdt5g7P+TzLe0BXKPmbyqGi4vnjwPWNSl4ow+8xbxQ8fsSjpyx9kKd1i1
 swoGSQ==
 =gWI0
 -----END PGP SIGNATURE-----

Merge tag 'pull-tcg-20230106' of https://gitlab.com/rth7680/qemu into staging

tcg/s390x improvements:
 - drop support for pre-z196 cpus (eol before 2017)
 - add support for misc-instruction-extensions-3
 - misc cleanups

# gpg: Signature made Sat 07 Jan 2023 07:47:59 GMT
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20230106' of https://gitlab.com/rth7680/qemu: (27 commits)
  tcg/s390x: Avoid the constant pool in tcg_out_movi
  tcg/s390x: Cleanup tcg_out_movi
  tcg/s390x: Tighten constraints for 64-bit compare
  tcg/s390x: Implement ctpop operation
  tcg/s390x: Use tgen_movcond_int in tgen_clz
  tcg/s390x: Support SELGR instruction in movcond
  tcg/s390x: Generalize movcond implementation
  tcg/s390x: Create tgen_cmp2 to simplify movcond
  tcg/s390x: Support MIE3 logical operations
  tcg/s390x: Tighten constraints for and_i64
  tcg/s390x: Tighten constraints for or_i64 and xor_i64
  tcg/s390x: Issue XILF directly for xor_i32
  tcg/s390x: Support MIE2 MGRK instruction
  tcg/s390x: Support MIE2 multiply single instructions
  tcg/s390x: Distinguish RIE formats
  tcg/s390x: Distinguish RRF-a and RRF-c formats
  tcg/s390x: Use LARL+AGHI for odd addresses
  tcg/s390x: Remove DISTINCT_OPERANDS facility check
  tcg/s390x: Remove FAST_BCR_SER facility check
  tcg/s390x: Check for load-on-condition facility at startup
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-01-08 11:23:17 +00:00
Richard Henderson
90497e03ca tcg/s390x: Avoid the constant pool in tcg_out_movi
Load constants in no more than two insns, which turns
out to be faster than using the constant pool.

Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:10 +00:00
Richard Henderson
1818c71ba1 tcg/s390x: Cleanup tcg_out_movi
Merge maybe_out_small_movi, as it no longer has additional users.
Use is_const_p{16,32}.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:10 +00:00
Richard Henderson
32c256eda6 tcg/s390x: Tighten constraints for 64-bit compare
Give 64-bit comparison second operand a signed 33-bit immediate.
This is the smallest superset of uint32_t and int32_t, as used
by CLGFI and CGFI respectively.  The rest of the 33-bit space
can be loaded into TCG_TMP0.  Drop use of the constant pool.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:09 +00:00
Richard Henderson
29a5ea738a tcg/s390x: Implement ctpop operation
There is an older form that produces per-byte results,
and a newer form that produces per-register results.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:09 +00:00
Richard Henderson
bfff851842 tcg/s390x: Use tgen_movcond_int in tgen_clz
Reuse code from movcond to conditionally copy a2 to dest,
based on the condition codes produced by FLOGR.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
0bbf0f7acf tcg/s390x: Support SELGR instruction in movcond
The new select instruction provides two separate register inputs,
whereas the old load-on-condition instruction overlaps one of the
register inputs with the destination.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
23d1394a6d tcg/s390x: Generalize movcond implementation
Generalize movcond to support pre-computed conditions, and the same
set of arguments at all times.  This will be assumed by a following
patch, which needs to reuse tgen_movcond_int.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
5c837bbca6 tcg/s390x: Create tgen_cmp2 to simplify movcond
Return both regular and inverted condition codes from tgen_cmp2.
This lets us choose after the fact which comparision we want.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
6c9b5c0f53 tcg/s390x: Support MIE3 logical operations
This is andc, orc, nand, nor, eqv.
We can use nor for implementing not.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
4134083f80 tcg/s390x: Tighten constraints for and_i64
Let the register allocator handle such immediates by matching
only what one insn can achieve.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
b2509acc60 tcg/s390x: Tighten constraints for or_i64 and xor_i64
Drop support for sequential OR and XOR, as the serial dependency is
slower than loading the constant first.  Let the register allocator
handle such immediates by matching only what one insn can achieve.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:06 +00:00
Richard Henderson
a0332aca67 tcg/s390x: Issue XILF directly for xor_i32
There is only one instruction that is applicable
to a 32-bit immediate xor.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:06 +00:00
Richard Henderson
668ce343f6 tcg/s390x: Support MIE2 MGRK instruction
The MIE2 facility adds a 3-operand signed 64x64->128 multiply.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:06 +00:00
Richard Henderson
92c89a074c tcg/s390x: Support MIE2 multiply single instructions
The MIE2 facility adds 3-operand versions of multiply.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:06 +00:00
Richard Henderson
d84ca80462 tcg/s390x: Distinguish RIE formats
There are multiple variations, with different fields.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:05 +00:00
Richard Henderson
1dd06b1aab tcg/s390x: Distinguish RRF-a and RRF-c formats
One has 3 register arguments; the other has 2 plus an m3 field.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:05 +00:00
Richard Henderson
1b74cf6ea2 tcg/s390x: Use LARL+AGHI for odd addresses
Add one instead of dropping odd addresses to the constant pool.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:04 +00:00
Richard Henderson
238da1c942 tcg/s390x: Remove DISTINCT_OPERANDS facility check
The distinct-operands facility is bundled into facility 45,
along with load-on-condition.  We are checking this at startup.
Remove the a0 == a1 checks for 64-bit sub, and, or, xor, as there
is no space savings for avoiding the distinct-operands insn.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:04 +00:00
Richard Henderson
e62d5752f5 tcg/s390x: Remove FAST_BCR_SER facility check
The fast-bcr-serialization facility is bundled into facility 45,
along with load-on-condition.  We are checking this at startup.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:04 +00:00
Richard Henderson
c68d5b7a6a tcg/s390x: Check for load-on-condition facility at startup
The general-instruction-extension facility was introduced in z196,
which itself was end-of-life in 2021.  In addition, z196 is the
minimum CPU supported by our set of supported operating systems:
RHEL 7 (z196), SLES 12 (z196) and Ubuntu 16.04 (zEC12).

Check for facility number 45, which will be the consilidated check
for several facilities.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:04 +00:00
Richard Henderson
9c3bfb79f4 tcg/s390x: Check for general-instruction-extension facility at startup
The general-instruction-extension facility was introduced in z10,
which itself was end-of-life in 2019.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:03 +00:00
Richard Henderson
3e25f7da9a tcg/s390x: Check for extended-immediate facility at startup
The extended-immediate facility was introduced in z9-109,
which itself was end-of-life in 2017.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:03 +00:00
Richard Henderson
761ea52213 tcg/s390x: Check for long-displacement facility at startup
We are already assuming the existance of long-displacement, but were
not being explicit about it.  This has been present since z990.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:03 +00:00
Richard Henderson
0a3afe09cb tcg/s390x: Remove USE_LONG_BRANCHES
The size of a compiled TB is limited by the uint16_t used by
gen_insn_end_off[] -- there is no need for a 32-bit branch.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:03 +00:00
Richard Henderson
6bd739ed37 tcg/s390x: Always set TCG_TARGET_HAS_direct_jump
Since USE_REG_TB is removed, there is no need to load the
target TB address into a register.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:03 +00:00
Richard Henderson
ccbecb441e tcg/s390x: Remove TCG_REG_TB
This reverts 829e1376d9 ("tcg/s390: Introduce TCG_REG_TB"), and
several follow-up patches.  The primary motivation is to reduce the
less-tested code paths, pre-z10.  Secondarily, this allows the
unconditional use of TCG_TARGET_HAS_direct_jump, which might be more
important for performance than any slight increase in code size.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v4: Do not simplify tgen_ori, tgen_xori.
2023-01-06 23:07:03 +00:00
Richard Henderson
4143f78dad tcg/s390x: Use register pair allocation for div and mulu2
Previously we hard-coded R2 and R3.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:03 +00:00
Peter Maydell
052e6534c4 First RISC-V PR for QEMU 8.0
* Fix PMP propagation for tlb
 * Collection of bug fixes
 * Bump the OpenTitan supported version
 * Add smstateen support
 * Support native debug icount trigger
 * Remove the redundant ipi-id property in the virt machine
 * Support cache-related PMU events in virtual mode
 * Add some missing PolarFire SoC io regions
 * Fix mret exception cause when no pmp rule is configured
 * Fix bug where disabling compressed instructions would crash QEMU
 * Add Zawrs ISA extension support
 * A range of code refactoring and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE9sSsRtSTSGjTuM6PIeENKd+XcFQFAmO3b5sACgkQIeENKd+X
 cFTD8Af+L0OaGzE4C0jil7LdITFKk7ltdTg3yw84ZBvIxrEWMWlt5Yj3Ez5OIPbY
 WpPmCLoJ9XM/5CV5PuPjxcFFExWjXLNeMEvaaT/3/3jPPnA/M/AbZa4hefKRluTg
 kkvBiOuRMPXiYLki5xAHmjD/1j6SQ8ghylPLxYQqyGq87WZt4Dx9msOTZLVzmmYl
 8y9oC8j6yO2UBZYC1TxSkKxFbj+Cp2HmtBZ09tDzwRL6DpRvIlfftiLl8X3awMlK
 YTzCIrmllc38/+cV9IpQhdXzDUZ1kM7Zy56JbJl3XOsS4VnUYGmEtrKYpYQ2CKLY
 /tcmrDKNw1ArWcP1axNN8FHfhy1FyQ==
 =SH+C
 -----END PGP SIGNATURE-----

Merge tag 'pull-riscv-to-apply-20230106' of https://github.com/alistair23/qemu into staging

First RISC-V PR for QEMU 8.0

* Fix PMP propagation for tlb
* Collection of bug fixes
* Bump the OpenTitan supported version
* Add smstateen support
* Support native debug icount trigger
* Remove the redundant ipi-id property in the virt machine
* Support cache-related PMU events in virtual mode
* Add some missing PolarFire SoC io regions
* Fix mret exception cause when no pmp rule is configured
* Fix bug where disabling compressed instructions would crash QEMU
* Add Zawrs ISA extension support
* A range of code refactoring and cleanups

# gpg: Signature made Fri 06 Jan 2023 00:47:23 GMT
# gpg:                using RSA key F6C4AC46D4934868D3B8CE8F21E10D29DF977054
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [full]
# Primary key fingerprint: F6C4 AC46 D493 4868 D3B8  CE8F 21E1 0D29 DF97 7054

* tag 'pull-riscv-to-apply-20230106' of https://github.com/alistair23/qemu: (43 commits)
  hw/intc: sifive_plic: Fix the pending register range check
  hw/riscv: opentitan: Drop "hartid-base" and "priority-base" initialization
  hw/intc: sifive_plic: Change "priority-base" to start from interrupt source 0
  hw/riscv: virt: Fix the value of "riscv, ndev" in the dtb
  hw/riscv: sifive_u: Avoid using magic number for "riscv, ndev"
  hw/riscv: sifive_e: Fix the number of interrupt sources of PLIC
  hw/riscv: microchip_pfsoc: Fix the number of interrupt sources of PLIC
  hw/intc: sifive_plic: Update "num-sources" property default value
  hw/intc: sifive_plic: Use error_setg() to propagate the error up via errp in sifive_plic_realize()
  hw/intc: sifive_plic: Improve robustness of the PLIC config parser
  hw/intc: sifive_plic: Drop PLICMode_H
  hw/riscv: spike: Remove misleading comments
  hw/riscv: Sort machines Kconfig options in alphabetical order
  hw/riscv: Fix opentitan dependency to SIFIVE_PLIC
  hw/intc: Select MSI_NONBROKEN in RISC-V AIA interrupt controllers
  hw/riscv: Select MSI_NONBROKEN in SIFIVE_PLIC
  RISC-V: Add Zawrs ISA extension support
  target/riscv: Clear mstatus.MPRV when leaving M-mode for priv spec 1.12+
  target/riscv: Simplify helper_sret() a little bit
  target/riscv: Set pc_succ_insn for !rvc illegal insn
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-01-06 22:15:53 +00:00
Richard Henderson
2e3a933abb tcg/riscv: Fix base register for user-only qemu_ld/st
When guest_base != 0, we were not coordinating the usage of
TCG_REG_TMP0 as base properly, leading to a previous zero-extend
of the input address being discarded.

Shuffle the alignment check to the front, because that does not
depend on the zero-extend, and it keeps the register usage clear.
Set base after each step of the address arithmetic instead of before.

Return the base register used from tcg_out_tlb_load, so as to
keep that register choice localized to that function.

Reported-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20221023233337.2846860-1-richard.henderson@linaro.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-01-06 10:42:55 +10:00
Richard Henderson
9b246685b3 tcg/riscv: Fix reg overlap case in tcg_out_addsub2
There was a typo using opc_addi instead of opc_add with the
two registers.  While we're at it, simplify the gating test
to al == bl to improve dynamic scheduling even when the
output register does not overlap the inputs.

Reported-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20221020233836.2341671-1-richard.henderson@linaro.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-01-06 10:42:55 +10:00
Richard Henderson
6272276361 tcg/riscv: Fix range matched by TCG_CT_CONST_M12
We were matching a signed 13-bit range, not a 12-bit range.
Expand the commentary within the function and be explicit
about all of the ranges.

Reported-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20221022095821.2441874-1-richard.henderson@linaro.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-01-06 10:42:55 +10:00
Richard Henderson
cee44b037b tcg: Add TCGHelperInfo argument to tcg_out_call
This eliminates an ifdef for TCI, and will be required for
expanding the call for TCGv_i128.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-05 11:41:29 -08:00
Richard Henderson
fa3cb9f9ff tcg/aarch64: Merge tcg_out_callr into tcg_out_call
There is only one use, and BLR is perhaps even more
self-documentary than CALLR.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-05 11:41:29 -08:00
Richard Henderson
f9c4bb804d tcg: Move ffi_cif pointer into TCGHelperInfo
Instead of requiring a separate hash table lookup,
put a pointer to the CIF into TCGHelperInfo.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221111074101.2069454-27-richard.henderson@linaro.org>
[PMD: Split from bigger patch]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221122180804.938-4-philmd@linaro.org>
2023-01-05 11:41:29 -08:00
Philippe Mathieu-Daudé
0c22e17658 tcg: Factor init_ffi_layouts() out of tcg_context_init()
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221111074101.2069454-27-richard.henderson@linaro.org>
[PMD: Split from bigger patch]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221122180804.938-3-philmd@linaro.org>
2023-01-05 11:41:29 -08:00
Philippe Mathieu-Daudé
c6ef8c7b35 tcg: Convert typecode_to_ffi from array to function
In the unlikely case of invalid typecode mask, the function
will abort instead of returning a NULL pointer.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221111074101.2069454-27-richard.henderson@linaro.org>
[PMD: Split from bigger patch]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221122180804.938-2-philmd@linaro.org>
2023-01-05 11:41:29 -08:00
Richard Henderson
39004a71d8 tcg: Reorg function calls
Pre-compute the function call layout for each helper at startup.
Drop TCG_CALL_DUMMY_ARG, as we no longer need to leave gaps
in the op->args[] array.  This allows several places to stop
checking for NULL TCGTemp, to which TCG_CALL_DUMMY_ARG mapped.

For tcg_gen_callN, loop over the arguments once.  Allocate the TCGOp
for the call early but delay emitting it, collecting arguments first.
This allows the argument processing loop to emit code for extensions
and have them sequenced before the call.

For tcg_reg_alloc_call, loop over the arguments in reverse order,
which allows stack slots to be filled first naturally.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-05 11:41:29 -08:00
Richard Henderson
31fd884b2e tcg: Use output_pref wrapper function
We will shortly have the possibility of more that two outputs,
though only for calls (for which preferences are moot).  Avoid
direct references to op->output_pref[] when possible.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-05 11:41:29 -08:00
Richard Henderson
cb10bc63b7 tcg: Vary the allocation size for TCGOp
We have been allocating a worst case number of arguments
to support calls.  Instead, allow the size to vary.
By default leave space for 4 args, to maximize reuse,
but allow calls to increase the number of args to 32.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[PMD: Split patch in two]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221218211832.73312-3-philmd@linaro.org>
2023-01-05 11:41:29 -08:00
Philippe Mathieu-Daudé
d44789434b tcg: Pass number of arguments to tcg_emit_op() / tcg_op_insert_*()
In order to have variable size allocated TCGOp, pass the number
of arguments we use (and would allocate) up to tcg_op_alloc().

This alters tcg_emit_op(), tcg_op_insert_before() and
tcg_op_insert_after() prototypes.

In tcg_op_alloc() ensure the number of arguments is in range.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[PMD: Extracted from bigger patch]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221218211832.73312-2-philmd@linaro.org>
2023-01-05 11:41:29 -08:00
Richard Henderson
e2a9dd6b6b tcg: Use TCG_CALL_ARG_EVEN for TCI special case
Change 32-bit tci TCG_TARGET_CALL_ARG_I32 to TCG_CALL_ARG_EVEN, to
force 32-bit values to be aligned to 64-bit.  With a small reorg
to the argument processing loop, this neatly replaces an ifdef for
CONFIG_TCG_INTERPRETER.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-05 11:41:29 -08:00
Richard Henderson
eb8b0224fc tcg: Replace TCG_TARGET_EXTEND_ARGS with TCG_TARGET_CALL_ARG_I32
For 64-bit hosts that had TCG_TARGET_EXTEND_ARGS, set
TCG_TARGET_CALL_ARG_I32 to TCG_CALL_ARG_EXTEND.
Otherwise, use TCG_CALL_ARG_NORMAL.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-05 11:41:29 -08:00
Richard Henderson
c8eef96046 tcg: Replace TCG_TARGET_CALL_ALIGN_ARGS with TCG_TARGET_CALL_ARG_I64
For 32-bit hosts when TCG_TARGET_CALL_ALIGN_ARGS was set, use
TCG_CALL_ARG_EVEN.  For 64-bit hosts, TCG_TARGET_CALL_ALIGN_ARGS
was silently ignored, so always use TCG_CALL_ARG_NORMAL.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-05 11:41:29 -08:00
Richard Henderson
18ff36ab71 tcg: Introduce TCGCallReturnKind and TCGCallArgumentKind
Prepare to replace a bunch of separate ifdefs with a
consistent way to describe the ABI of a function call.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-05 11:41:28 -08:00
Richard Henderson
31c9641746 tcg: Introduce tcg_type_size
Add a helper function for computing the size of a type.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-05 11:41:28 -08:00
Richard Henderson
aef8540290 tcg: Allocate TCGTemp pairs in host memory order
Allocate the first of a pair at the lower address, and the
second of a pair at the higher address.  This will make it
easier to find the beginning of the larger memory block.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-05 11:41:28 -08:00
Richard Henderson
f01847c251 tcg: Simplify calls to temp_sync vs mem_coherent
The first thing that temp_sync does is check mem_coherent,
so there's no need for the caller to do so.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-05 11:41:28 -08:00
Richard Henderson
fac87bd2a4 tcg: Add temp_subindex to TCGTemp
Record the location of a TCGTemp within a larger object.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-05 11:41:28 -08:00