Commit Graph

21 Commits

Author SHA1 Message Date
Richard Henderson
50695fb83e tcg/s390x: Optimize cmpsel with constant 0/-1 arguments
These can be simplified to and/or/andc/orc,
avoiding the load of the constantinto a register.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-09-22 06:54:50 +02:00
Richard Henderson
1c7d05ff70 tcg/s390x: Implement cmpsel_vec
Do not allow cmpsel_vec to be expanded early, so that we can
make the correct decision wrt the sense of the comparison.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-09-22 06:54:50 +02:00
Richard Henderson
d95b51d3fb tcg/s390x: Add TCG_CT_CONST_CMP
Better constraint for tcg_out_cmp, based on the comparison.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-02-03 23:53:49 +00:00
Richard Henderson
cbaddf3094 tcg/s390x: Split constraint A into J+U
Signed 33-bit == signed 32-bit + unsigned 32-bit.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-02-03 23:53:49 +00:00
Ilya Leoshkevich
22d2e5351a tcg/{i386, s390x}: Add earlyclobber to the op_add2's first output
i386 and s390x implementations of op_add2 require an earlyclobber,
which is currently missing. This breaks VCKSM in s390x guests. E.g., on
x86_64 the following op:

    add2_i32 tmp2,tmp3,tmp2,tmp3,tmp3,tmp2   dead: 0 2 3 4 5  pref=none,0xffff

is translated to:

    addl     %ebx, %r12d
    adcl     %r12d, %ebx

Introduce a new C_N1_O1_I4 constraint, and make sure that earlyclobber
of aliased outputs is honored.

Cc: qemu-stable@nongnu.org
Fixes: 82790a8709 ("tcg: Add markup for output requires new register")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230719221310.1968845-7-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-23 17:58:19 +01:00
Richard Henderson
4caad79f8d tcg/s390x: Support 128-bit load/store
Use LPQ/STPQ when 16-byte atomicity is required.
Note that these instructions require 16-byte alignment.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30 09:51:11 -07:00
Richard Henderson
9490142284 tcg/s390x: Simplify constraints on qemu_ld/st
Adjust the softmmu tlb to use R0+R1, not any of the normally available
registers.  Since we handle overlap betwen inputs and helper arguments,
we can allow any allocatable reg.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-11 09:53:41 +01:00
Richard Henderson
32c256eda6 tcg/s390x: Tighten constraints for 64-bit compare
Give 64-bit comparison second operand a signed 33-bit immediate.
This is the smallest superset of uint32_t and int32_t, as used
by CLGFI and CGFI respectively.  The rest of the 33-bit space
can be loaded into TCG_TMP0.  Drop use of the constant pool.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:09 +00:00
Richard Henderson
bfff851842 tcg/s390x: Use tgen_movcond_int in tgen_clz
Reuse code from movcond to conditionally copy a2 to dest,
based on the condition codes produced by FLOGR.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
23d1394a6d tcg/s390x: Generalize movcond implementation
Generalize movcond to support pre-computed conditions, and the same
set of arguments at all times.  This will be assumed by a following
patch, which needs to reuse tgen_movcond_int.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
6c9b5c0f53 tcg/s390x: Support MIE3 logical operations
This is andc, orc, nand, nor, eqv.
We can use nor for implementing not.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
4134083f80 tcg/s390x: Tighten constraints for and_i64
Let the register allocator handle such immediates by matching
only what one insn can achieve.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:07 +00:00
Richard Henderson
b2509acc60 tcg/s390x: Tighten constraints for or_i64 and xor_i64
Drop support for sequential OR and XOR, as the serial dependency is
slower than loading the constant first.  Let the register allocator
handle such immediates by matching only what one insn can achieve.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:06 +00:00
Richard Henderson
668ce343f6 tcg/s390x: Support MIE2 MGRK instruction
The MIE2 facility adds a 3-operand signed 64x64->128 multiply.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:06 +00:00
Richard Henderson
92c89a074c tcg/s390x: Support MIE2 multiply single instructions
The MIE2 facility adds 3-operand versions of multiply.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:06 +00:00
Richard Henderson
4143f78dad tcg/s390x: Use register pair allocation for div and mulu2
Previously we hard-coded R2 and R3.

Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-01-06 23:07:03 +00:00
Richard Henderson
9bca986df8 tcg/s390x: Implement TCG_TARGET_HAS_bitsel_vec
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-05 16:53:17 -07:00
Richard Henderson
22cb37b417 tcg/s390x: Implement vector shift operations
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-05 16:53:17 -07:00
Richard Henderson
ae77bbe574 tcg/s390x: Implement andc, orc, abs, neg, not vector operations
These logical and arithmetic operations are optional but trivial.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-05 16:53:17 -07:00
Richard Henderson
34ef767609 tcg/s390x: Add host vector framework
Add registers and function stubs.  The functionality
is disabled via squashing s390_facilities[2] to 0.

We must still include results for the mandatory opcodes in
tcg_target_op_def, as all opcodes are checked during tcg init.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-05 16:53:17 -07:00
Richard Henderson
3704993f54 tcg/s390x: Rename from tcg/s390
This emphasizes that we don't support s390, only 64-bit s390x hosts.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-05 16:53:17 -07:00