The avx512 vpblendm* instructions exactly implement cmpsel,
using a predicate input. Of course this matches nicely with
the avx512 predicate comparison instructions.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Extend tcg_out_evex_opc to handle the predicate and
zero-merging parameters of the evex prefix.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The sse/avx instruction set only has EQ and GT as direct comparisons.
Other signed comparisons can be generated from swapping and inversion.
However unsigned comparisons are not available and must be transformed
to signed comparisons by biasing the inputs.
The avx512 instruction set has a complete set of comparisons, with
results placed into a predicate register. We can produce the normal
cmp_vec result by using VPMOVM2*.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Expand during output instead of during opcode generation.
Remove x86_vpblendvb_vec opcode, this this removes the only user.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move most of expansion to opcode generation, leaving the
conversion of unsigned to signed to be done in the early phase.
Small inefficiencies, but not incorrect results, are introduced
until cmpsel_vec is converted in the next patch.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Helper function to handle setting of VEXL based
on the type of the operation.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This reverts commit 45ccdbcb24.
The x86-64 instruction set can now be tuned down to x86-64 v1
or i386 Pentium Pro.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU now requires an x86-64-v2 host, which has the POPCNT instruction.
Use it freely in TCG-generated code.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU now requires an x86-64-v2 host, which always has CMOV.
Use it freely in TCG generated code.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Prelude to supporting TCG_COND_TST* in vector comparisons.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This may be treated as a 32-bit EQ/NE comparison against 0,
which is in turn treated as a LTU/GEU comparison against 1.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The x86 isa does not have this operation, so we need an expansion.
Use the same algorithm that we use for expanding this vector
operation with integers: perform the shift with a wider type
and then mask the bits that must be zero.
This reduces the instruction count from 5 to 2.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Just like when testing against the sign bits, TEST r,r can be used when the
immediate is 0xff, 0xff00, 0xffff, 0xffffffff.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tcg_out_testi into tcg_out_cmp and adjust the two uses.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Return the x86 condition codes to use after the compare.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Hoist the tcg_cond_to_jcc index outside the function.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fill the new argument from any condition within the opcode.
Not yet used within any backend.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Define as 0 for all tcg backends.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In the case where OR or XOR has an 8-bit immediate between 128 and 255,
we can operate on a low-byte register and shorten the output by two or
three bytes (two if a prefix byte is needed for REX.B).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20231228120524.70239-1-pbonzini@redhat.com>
[rth: Incorporate into switch.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Extend the existing conditional that generates INC/DEC, to also swap an
ADD for a SUB and vice versa when the immediate is 128. This facilitates
using OPC_ARITH_EvIb instead of OPC_ARITH_EvIz.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20231228120514.70205-1-pbonzini@redhat.com>
[rth: Use a switch on C]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The movcond opcode is now mandatory for backends to implement.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231026041404.1229328-7-richard.henderson@linaro.org>
The movcond opcode is now mandatory for backends to implement.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231026041404.1229328-4-richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231004090629.37473-6-philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This hook may emit code at the beginning of the TB.
Suggested-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pass vece to tcg_target_const_match() to allow correct interpretation of
const args of vector ops.
Signed-off-by: Jiajie Chen <c@jia.je>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230908022302.180442-4-c@jia.je>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For LT/GE vs zero, shift down the sign bit.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Using XOR first is both smaller and more efficient,
though cannot be applied if it clobbers an input.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use the carry bit to optimize some forms of setcond.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pass a rexw parameter instead of duplicating the functions.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pass a rexw parameter instead of duplicating the functions.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pass a rexw parameter instead of duplicating the functions.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Introduce a new opcode for negative setcond.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace the separate defines with TCG_TARGET_HAS_extr_i64_i32,
so that the two parts of backend-specific type changing cannot
be out of sync.
Reported-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: <20230822175127.1173698-1-richard.henderson@linaro.org>
We can use MOVB and MOVW with an immediate just as easily
as with a register input.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
It is more useful to allow low-part deposits into all registers
than to restrict allocation for high-byte deposits.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Missing the segment prefix means that user-only fails
to add guest_base for some 128-bit load/store.
Fixes: 098d0fc10d ("tcg/i386: Support 128-bit load/store")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1763
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
i386 and s390x implementations of op_add2 require an earlyclobber,
which is currently missing. This breaks VCKSM in s390x guests. E.g., on
x86_64 the following op:
add2_i32 tmp2,tmp3,tmp2,tmp3,tmp3,tmp2 dead: 0 2 3 4 5 pref=none,0xffff
is translated to:
addl %ebx, %r12d
adcl %r12d, %ebx
Introduce a new C_N1_O1_I4 constraint, and make sure that earlyclobber
of aliased outputs is honored.
Cc: qemu-stable@nongnu.org
Fixes: 82790a8709 ("tcg: Add markup for output requires new register")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230719221310.1968845-7-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Often, the only thing we need to know about the TCG host
is the register size.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Disconnect the layout of ArchCPU from TCG compilation.
Pass the relative offset of 'env' and 'neg.tlb.f' as a parameter.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The last use was removed by e77c89fb08.
Fixes: e77c89fb08 ("cputlb: Remove static tlb sizing")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use the CPUINFO_* bits instead of the individual boolean
variables that we had been using. Remove all of the init
code that was moved over to cpuinfo-i386.c.
Note that have_avx512* check both AVX512{F,VL}, as we had
previously done during tcg_target_init.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Disconnect guest tlb parameters from TCG compilation.
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Disconnect guest page size from TCG compilation.
While this could be done via exec/target_page.h, we want to cache
the value across multiple memory access operations, so we might
as well initialize this early.
The changes within tcg/ are entirely mechanical:
sed -i s/TARGET_PAGE_BITS/s->page_bits/g
sed -i s/TARGET_PAGE_MASK/s->page_mask/g
Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All uses can be infered from the INDEX_op_qemu_*_a{32,64}_* opcode
being used. Add a field into TCGLabelQemuLdst to record the usage.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>