Merge VERLL and VERLLV into op_vesv and op_ves, alongside
all of the other vector shift operations.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The gen_gvec_dupi switch is unnecessary with the new function.
Replace it with a local gen_gvec_dup_imm that takes care of the
register to offset conversion and length arguments.
Drop zero_vec and use use gen_gvec_dup_imm with 0.
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
I believe that the separate allocation of DisasFields from DisasContext
was meant to limit the places from which we could access fields. But
that plan did not go unchanged, and since DisasContext contains a pointer
to fields, the substructure is accessible everywhere.
By allocating the substructure with DisasContext, we improve the locality
of the accesses by avoiding one level of pointer chasing. In addition,
we avoid a dangling pointer to stack allocated memory, diagnosed by static
checkers.
Launchpad: https://bugs.launchpad.net/bugs/1661815
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200123232248.1800-5-richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
All callers pass s->fields, so we might as well pass s directly.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200123232248.1800-4-richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The numbers are unsigned, the computation is wrong. "Each operand is
treated as an unsigned binary integer".
Let's implement as given in the PoP:
"A subtraction is performed by adding the contents of the second operand
with the bitwise complement of the third operand along with a borrow
indication from the rightmost bit of the fourth operand."
Reuse gen_accc2_i64().
Fixes: bc725e6515 ("s390x/tcg: Implement VECTOR SUBTRACT WITH BORROW COMPUTE BORROW INDICATION")
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20191021085715.3797-7-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Testing this, there seems to be something messed up. We are dealing with
unsigned numbers. "Each operand is treated as an unsigned binary integer."
Let's just implement as written in the PoP:
"A subtraction is performed by adding the contents of
the second operand with the bitwise complement of
the third operand along with a borrow indication from
the rightmost bit position of the fourth operand and
the result is placed in the first operand."
We can reuse gen_ac2_i64().
Fixes: 48390a7c27 ("s390x/tcg: Implement VECTOR SUBTRACT WITH BORROW INDICATION")
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20191021085715.3797-6-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Looks like my idea of what a "borrow" is was wrong. The PoP says:
"If the resulting subtraction results in a carry out of bit zero, a value
of one is placed in the corresponding element of the first operand;
otherwise, a value of zero is placed in the corresponding element"
As clarified by Richard, all we have to do is invert the result.
Fixes: 1ee2d7ba72 ("s390x/tcg: Implement VECTOR SUBTRACT COMPUTE BORROW INDICATION")
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20191021085715.3797-5-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Preparation for collapsing the two byte swaps, adjust_endianness and
handle_bswap, along the I/O path.
Target dependant attributes are conditionalized upon NEED_CPU_H.
Signed-off-by: Tony Nguyen <tony.nguyen@bt.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <81d9cd7d7f5aaadfa772d6c48ecee834e9cf7882.1566466906.git.tony.nguyen@bt.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Wrong order of operands. The constant always comes last. Makes QEMU crash
reliably on specific git fetch invocations.
Reported-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190814151242.27199-1-david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Fixes: 5c4b0ab460 ("s390x/tcg: Implement VECTOR ELEMENT ROTATE AND INSERT UNDER MASK")
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
...so that the compiler properly recognizes it.
Reported-by: Stefan Weil <sw@weilnetz.de>
Fixes: f180da83c0 ("s390x/tcg: Implement VECTOR LOAD LOGICAL ELEMENT AND ZERO")
Message-Id: <20190708125433.16927-3-cohuck@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This replaces the target-specific implementations for VSEL.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Simulate XxC=0 and ERM=0 (current mode), so we can use the existing
helper function.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
The only FP instruction we can implement without an helper.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can reuse some of the infrastructure introduced for
VECTOR FP CONVERT FROM FIXED 64-BIT and friends.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Take care of reading/indicating the 32-bit elements.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can reuse most of the infrastructure introduced for
VECTOR FP CONVERT FROM FIXED 64-BIT and friends.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can reuse most of the infrastructure added for VECTOR FP ADD.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
1. We'll reuse op_vcdg() for similar instructions later, prepare for
that.
2. We'll reuse vop64_2() later for other instructions.
We have to mangle the erm (effective rounding mode) and the m4 into
the simd_data(), and properly unmangle them again.
Make sure to restore the erm before triggering an exception.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Provide for all three instructions all four combinations of cc bit and
s bit.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
As far as I can see, there is only a tiny difference.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
1. We'll reuse op_vfa() for similar instructions later, prepare for
that.
2. We'll reuse vop64_3() for other instructions later.
3. Take care of modifying the vector register only if no trap happened.
- on traps, flags are not updated and no elements are modified
- traps don't modify the fpc flags
- without traps, all exceptions of all elements are merged
4. We'll reuse check_ieee_exc() later when we need the XxC flag.
We have to check for exceptions after processing each element.
Provide separate handlers for single/all element processing. We'll do
the same for all applicable FP instructions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Unfortunately, there is no easy way to avoid looping over all elements
in v2. Provide specialized variants for !cc,!rt/!cc,rt/cc,!rt/cc,rt and
all element types. Especially for different values of rt, the compiler
might be able to optimize the code a lot.
Add s390_vec_write_element().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR FIND ELEMENT EQUAL. Core logic courtesy of Richard H.
Add s390_vec_read_element() that can deal with element sizes.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Complicated stuff. Provide two different helpers for CC an !CC handling.
We might want to add more helpers later.
zero_search() and match_index() are courtesy of Richard H.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's return the cc value directly via cpu_env. Unfortunately there
isn't a simple way to calculate the value lazily - one would have to
calculate and store e.g. the population count of the mask and the
result so it can be evaluated in a cc helper.
But as VTM only sets the cc, we can assume the value will be needed soon
either way.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR SUM ACROSS DOUBLEWORD.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR SUM ACROSS DOUBLEWORD, however without a loop and
using 128-bit calculations.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Perform the calculations without a helper. Only 16 bit or 32 bit values
have to be added.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Fairly easy as only 128-bit handling is required. Simply perform the
subtraction and then subtract the borrow.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's keep it simple for now and handle 8/16 bit elements via helpers.
Especially for 8/16, we could come up with some bit tricks.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can use tcg_gen_sub2_i64() to do 128-bit subtraction and otherwise
existing gvec helpers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR SHIFT RIGHT ARITHMETICAL.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR SHIFT LEFT ARITHMETIC. Add s390_vec_sar() similar to
s390_vec_shr().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Inline expansion courtesy of Richard H.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can reuse the existing 128-bit shift utility function.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
We can use all the fancy new vector helpers implemented by Richard.
One important thing to take care of is always to properly mask of
unused bits from the shift count.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Use the new vector expansion for GVecGen3i.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Take care of properly taking the modulo of the count. We might later
want to come back and create a variant of VERLL where the base register
is 0, resulting in an immediate.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR COUNT TRAILING ZEROES.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>