Commit Graph

150 Commits

Author SHA1 Message Date
Lucas Mateus Castro (alqotel)
da3c53bac3 target/ppc: Moved XSTSTDC[QDS]P to decodetree
Moved XSTSTDCSP, XSTSTDCDP and XSTSTDCQP to decodetree and moved some of
its decoding away from the helper as previously the DCMX, XB and BF were
calculated in the helper with the help of cpu_env, now that part was
moved to the decodetree with the rest.

xvtstdcsp:
rept    loop    master             patch
8       12500   1,85393600         1,94683600 (+5.0%)
25      4000    1,78779800         1,92479000 (+7.7%)
100     1000    2,12775000         2,28895500 (+7.6%)
500     200     2,99655300         3,23102900 (+7.8%)
2500    40      6,89082200         7,44827500 (+8.1%)
8000    12     17,50585500        18,95152100 (+8.3%)

xvtstdcdp:
rept    loop    master             patch
8       12500   1,39043100         1,33539800 (-4.0%)
25      4000    1,35731800         1,37347800 (+1.2%)
100     1000    1,51514800         1,56053000 (+3.0%)
500     200     2,21014400         2,47906000 (+12.2%)
2500    40      5,39488200         6,68766700 (+24.0%)
8000    12     13,98623900        18,17661900 (+30.0%)

xvtstdcdp:
rept    loop    master             patch
8       12500   1,35123800         1,34455800 (-0.5%)
25      4000    1,36441200         1,36759600 (+0.2%)
100     1000    1,49763500         1,54138400 (+2.9%)
500     200     2,19020200         2,46196400 (+12.4%)
2500    40      5,39265700         6,68147900 (+23.9%)
8000    12     14,04163600        18,19669600 (+29.6%)

As some values are now decoded outside the helper and passed to it as an
argument the number of arguments of the helper increased, the number
of TCGop needed to load the arguments increased. I suspect that's why
the slow-down in the tests with a high REPT but low LOOP.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-12-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28 13:15:22 -03:00
Lucas Mateus Castro (alqotel)
a70a524710 target/ppc: Moved XVTSTDC[DS]P to decodetree
Moved XVTSTDCSP and XVTSTDCDP to decodetree an restructured the helper
to be simpler and do all decoding in the decodetree (so XB, XT and DCMX
are all calculated outside the helper).

Obs: The tests in this one are slightly different, these are the sum of
these instructions with all possible immediate and those instructions
are repeated 10 times.

xvtstdcsp:
rept    loop    master             patch
8       12500   2,76402100         2,70699100 (-2.1%)
25      4000    2,64867100         2,67884100 (+1.1%)
100     1000    2,73806300         2,78701000 (+1.8%)
500     200     3,44666500         3,61027600 (+4.7%)
2500    40      5,85790200         6,47475500 (+10.5%)
8000    12     15,22102100        17,46062900 (+14.7%)

xvtstdcdp:
rept    loop    master             patch
8       12500   2,11818000         1,61065300 (-24.0%)
25      4000    2,04573400         1,60132200 (-21.7%)
100     1000    2,13834100         1,69988100 (-20.5%)
500     200     2,73977000         2,48631700 (-9.3%)
2500    40      5,05067000         5,25914100 (+4.1%)
8000    12     14,60507800        15,93704900 (+9.1%)

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-11-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28 13:15:22 -03:00
Víctor Colombo
c3f24257e3 target/ppc: Clear fpstatus flags on helpers missing it
In ppc emulation, exception flags are not cleared at the end of an
instruction. Instead, the next instruction is responsible to clear
it before its emulation. However, some helpers are not doing it,
causing an issue where the previously set exception flags are being
used and leading to incorrect values being set in FPSCR.
Fix this by clearing fp_status before doing the instruction 'real' work
for the following helpers that were missing this behavior:

- VSX_CVT_INT_TO_FP_VECTOR
- VSX_CVT_FP_TO_FP
- VSX_CVT_FP_TO_INT_VECTOR
- VSX_CVT_FP_TO_INT2
- VSX_CVT_FP_TO_INT
- VSX_CVT_FP_TO_FP_HP
- VSX_CVT_FP_TO_FP_VECTOR
- VSX_CMP
- VSX_ROUND
- xscvqpdp
- xscvdpsp[n]

Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220906125523.38765-9-victor.colombo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-09-20 10:54:06 -03:00
Víctor Colombo
9f097daa54 target/ppc: Zero second doubleword for VSX madd instructions
In 205eb5a89e we updated most VSX instructions to zero the
second doubleword, as is requested by PowerISA since v3.1.
However, VSX_MADD helper was left behind unchanged, while it
is also affected and should be fixed as well.

This patch applies the fix for MADD instructions.

Fixes: 205eb5a89e ("target/ppc: Change VSX instructions behavior to fill with zeros")
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220906125523.38765-6-victor.colombo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-09-20 10:54:06 -03:00
Víctor Colombo
74177ec661 target/ppc: Merge fsqrt and fsqrts helpers
These two helpers are almost identical, differing only by the softfloat
operation it calls. Merge them into one using a macro.
Also, take this opportunity to capitalize the helper name as we moved
the instruction to decodetree in a previous patch.

Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220905123746.54659-4-victor.colombo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-09-20 10:54:06 -03:00
Lucas Mateus Castro (alqotel)
08e185cadb target/ppc: Bugfix FP when OE/UE are set
When an overflow exception occurs and OE is set the intermediate result
should be adjusted (by subtracting from the exponent) to avoid rounding
to inf. The same applies to an underflow exceptionion and UE (but adding
to the exponent). To do this set the fp_status.rebias_overflow when OE
is set and fp_status.rebias_underflow when UE is set as the FPU will
recalculate in case of a overflow/underflow if the according rebias* is
set.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220805141522.412864-3-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-08-31 14:08:05 -03:00
Daniel Henrique Barboza
5980167e07 target/ppc: fix unreachable code in fpu_helper.c
Commit c29018cc73 added an env->fpscr OR operation using a ternary
that checks if 'error' is not zero:

    env->fpscr |= error ? FP_FEX : 0;

However, in the current body of do_fpscr_check_status(), 'error' is
granted to be always non-zero at that point. The result is that Coverity
is less than pleased:

  Control flow issues  (DEADCODE)
Execution cannot reach the expression "0ULL" inside this statement:
"env->fpscr |= (error ? 1073...".

Remove the ternary and always make env->fpscr |= FP_FEX.

Cc: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Cc: Richard Henderson <richard.henderson@linaro.org>
Fixes: Coverity CID 1489442
Fixes: c29018cc73 ("target/ppc: Implemented xvf*ger*")
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Message-Id: <20220602191048.137511-1-danielhb413@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-06-20 08:38:58 -03:00
Lucas Mateus Castro (alqotel)
5724e131ca target/ppc: Implemented [pm]xvbf16ger2*
Implement the following PowerISA v3.1 instructions:
xvbf16ger2:   VSX Vector bfloat16 GER (rank-2 update)
xvbf16ger2nn: VSX Vector bfloat16 GER (rank-2 update) Negative multiply,
Negative accumulate
xvbf16ger2np: VSX Vector bfloat16 GER (rank-2 update) Negative multiply,
Positive accumulate
xvbf16ger2pn: VSX Vector bfloat16 GER (rank-2 update) Positive multiply,
Negative accumulate
xvbf16ger2pp: VSX Vector bfloat16 GER (rank-2 update) Positive multiply,
Positive accumulate
pmxvbf16ger2:   Prefixed Masked VSX Vector bfloat16 GER (rank-2 update)
pmxvbf16ger2nn: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update)
Negative multiply, Negative accumulate
pmxvbf16ger2np: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update)
Negative multiply, Positive accumulate
pmxvbf16ger2pn: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update)
Positive multiply, Negative accumulate
pmxvbf16ger2pp: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update)
Positive multiply, Positive accumulate

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220524140537.27451-8-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26 17:11:33 -03:00
Lucas Mateus Castro (alqotel)
2d9cba74ef target/ppc: Implemented xvf16ger*
Implement the following PowerISA v3.1 instructions:
xvf16ger2:   VSX Vector 16-bit Floating-Point GER (rank-2 update)
xvf16ger2nn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative
multiply, Negative accumulate
xvf16ger2np: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative
multiply, Positive accumulate
xvf16ger2pn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive
multiply, Negative accumulate
xvf16ger2pp: VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive
multiply, Positive accumulate

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220524140537.27451-6-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26 17:11:33 -03:00
Lucas Mateus Castro (alqotel)
c29018cc73 target/ppc: Implemented xvf*ger*
Implement the following PowerISA v3.1 instructions:
xvf32ger:   VSX Vector 32-bit Floating-Point GER (rank-1 update)
xvf32gernn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative
multiply, Negative accumulate
xvf32gernp: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative
multiply, Positive accumulate
xvf32gerpn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive
multiply, Negative accumulate
xvf32gerpp: VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive
multiply, Positive accumulate
xvf64ger:   VSX Vector 64-bit Floating-Point GER (rank-1 update)
xvf64gernn: VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative
multiply, Negative accumulate
xvf64gernp: VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative
multiply, Positive accumulate
xvf64gerpn: VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive
multiply, Negative accumulate
xvf64gerpp: VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive
multiply, Positive accumulate

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220524140537.27451-5-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26 17:11:33 -03:00
Matheus Ferst
c36ab970ac target/ppc: declare xvxsigsp helper with call flags
Move xvxsigsp to decodetree, declare helper_xvxsigsp with
TCG_CALL_NO_RWG, and drop the unused env argument.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220517123929.284511-8-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26 17:11:32 -03:00
Matheus Ferst
cf862bee0e target/ppc: declare xscvspdpn helper with call flags
Move xscvspdpn to decodetree, declare helper_xscvspdpn with
TCG_CALL_NO_RWG_SE and drop the unused env argument.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220517123929.284511-7-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26 17:11:32 -03:00
Matheus Ferst
eb69a84bb0 target/ppc: Use TCG_CALL_NO_RWG_SE in fsel helper
fsel doesn't change FPSCR and CR1 is handled by gen_set_cr1_from_fpscr,
so helper_fsel doesn't need the env argument and can be declared with
TCG_CALL_NO_RWG_SE. We also take this opportunity to move the insn to
decodetree.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220517123929.284511-6-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26 17:11:32 -03:00
Víctor Colombo
dd657a35b4 target/ppc: Rename sfprf to sfifprf where it's also used as set fi flag
The bit FI fix used the sfprf flag as a flag for the set_fi parameter
in do_float_check_status where applicable. Now, this patch rename this
flag to sfifprf to state this dual usage.

Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Message-Id: <20220517161522.36132-4-victor.colombo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26 17:11:32 -03:00
Víctor Colombo
c582a1dbc8 target/ppc: Fix FPSCR.FI changing in float_overflow_excp()
This patch fixes another not-so-clear situation in Power ISA
regarding the inexact bits in FPSCR. The ISA states that:

"""
When Overflow Exception is disabled (OE=0) and an
Overflow Exception occurs, the following actions are
taken:
...
2. Inexact Exception is set
XX <- 1
...
FI is set to 1
...
"""

However, when tested on a Power 9 hardware, some instructions that
trigger an OX don't set the FI bit:

xvcvdpsp(0x4050533fcdb7b95ff8d561c40bf90996) = FI: CLEARED -> CLEARED
xvnmsubmsp(0xf3c0c1fc8f3230, 0xbeaab9c5) = FI: CLEARED -> CLEARED
(just a few examples. Other instructions are also affected)

The root cause for this seems to be that only instructions that list
the bit FI in the "Special Registers Altered" should modify it.

QEMU is, today, not working like the hardware:

xvcvdpsp(0x4050533fcdb7b95ff8d561c40bf90996) = FI: CLEARED -> SET
xvnmsubmsp(0xf3c0c1fc8f3230, 0xbeaab9c5) = FI: CLEARED -> SET

(all tests assume FI is cleared beforehand)

Fix this by making float_overflow_excp() return float_flag_inexact
if it should update the inexact flags.

Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Message-Id: <20220517161522.36132-3-victor.colombo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26 17:11:32 -03:00
Víctor Colombo
3278677f6a target/ppc: Fix FPSCR.FI bit being cleared when it shouldn't
According to Power ISA, the FI bit in FPSCR is non-sticky.
This means that if an instruction is said to modify the FI bit, then
it should be set or cleared depending on the result of the
instruction. Otherwise, it should be kept as was before.

However, the following inconsistency was found when comparing results
from the hardware (tested on both a Power 9 processor and in
Power 10 Mambo):

(FI bit is set before the execution of the instruction)
Hardware: xscmpeqdp(0xff..ff, 0xff..ff) = FI: SET -> SET
QEMU: xscmpeqdp(0xff..ff, 0xff..ff) = FI: SET -> CLEARED

As the FI bit is non-sticky, and xscmpeqdp does not list it as a field
that is changed by the instruction, it should not be changed after its
execution.
This is happening to multiple instructions in the vsx implementations.

If the ISA does not list the FI bit as altered for a particular
instruction, then it should be kept as it was before the instruction.

QEMU is not following this behavior. Affected instructions include:
- xv* (all vsx-vector instructions);
- xscmp*, xsmax*, xsmin*;
- xstdivdp and similars;
(to identify the affected instructions, just search in the ISA for
 the instructions that does not list FI in "Special Registers Altered")

Most instructions use the function do_float_check_status() to commit
changes in the inexact flag. So the fix is to add a parameter to it
that will control if the bit FI should be changed or not.
All users of do_float_check_status() are then modified to provide this
argument, controlling if that specific instruction changes bit FI or
not.
Some macro helpers are responsible for both instructions that change
and instructions that aren't suposed to change FI. This seems to always
overlap with the sfprf flag. So, reuse this flag for this purpose when
applicable.

Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220517161522.36132-2-victor.colombo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26 17:11:32 -03:00
Víctor Colombo
208d803326 target/ppc: Remove fpscr_* macros from cpu.h
fpscr_* defined macros are hiding the usage of *env behind them.
Substitute the usage of these macros with `env->fpscr & FP_*` to make
the code cleaner.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Message-Id: <20220504210541.115256-2-victor.colombo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-05 15:36:17 -03:00
Matheus Ferst
b3d4520585 target/ppc: implement xscvqp[su]qz
Implement the following PowerISA v3.1 instructions:
xscvqpsqz: VSX Scalar Convert with round to zero Quad-Precision to
           Signed Quadword
xscvqpuqz: VSX Scalar Convert with round to zero Quad-Precision to
           Unsigned Quadword

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220330175932.6995-9-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-04-20 18:00:30 -03:00
Matheus Ferst
67332e0718 target/ppc: implement xscv[su]qqp
Implement the following PowerISA v3.1 instructions:
xscvsqqp: VSX Scalar Convert with round Signed Quadword to
          Quad-Precision
xscvuqqp: VSX Scalar Convert with round Unsigned Quadword to
          Quad-Precision format

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220330175932.6995-8-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-04-20 18:00:30 -03:00
Lucas Coutinho
3515553bf6 target/ppc: Replicate Double->Single-Precision result
Power ISA v3.1 formalizes the previously undefined result in
words 1 and 3 to be a copy of the result in words 0 and 2.

This affects: xvcvsxdsp, xvcvuxdsp, xvcvdpsp.

And the previously undefined result in word 1 to be a copy of
the result in word 0.

This affects: xscvdpsp.

Signed-off-by: Lucas Coutinho <lucas.coutinho@eldorado.org.br>
Message-Id: <20220316200427.3410437-1-lucas.coutinho@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-20 23:35:27 +01:00
Richard Henderson
217979d33e target/ppc: Replicate double->int32 result for some vector insns
Power ISA v3.1 formalizes the previously undefined result in
words 1 and 3 to be a copy of the result in words 0 and 2.

This affects: xscvdpsxws, xscvdpuxws, xvcvdpsxws, xvcvdpuxws.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/852
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[ clg: checkpatch fixes ]
Message-Id: <20220315053934.377519-1-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-20 23:35:27 +01:00
Víctor Colombo
a9eb50376f target/ppc: Add missing helper_reset_fpstatus to helper_XVCVSPBF16
Fixes: 3909ff1fac ("target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions")
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220304175156.2012315-8-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-05 07:16:48 +01:00
Víctor Colombo
e1428e5b57 target/ppc: Add missing helper_reset_fpstatus to VSX_MAX_MINC
Fixes: da499405aa ("target/ppc: Refactor VSX_MAX_MINC helper")
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220304175156.2012315-7-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-05 07:16:48 +01:00
Matheus Ferst
4e4b5a3eac target/ppc: change xs[n]madd[am]sp to use float64r32_muladd
Change VSX Scalar Multiply-Add/Subtract Type-A/M Single Precision
helpers to use float64r32_muladd. This method should correctly handle
all rounding modes, so the workaround for float_round_nearest_even can
be dropped.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20220304165417.1981159-3-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-05 07:16:46 +01:00
Víctor Colombo
3909ff1fac target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220225210936.1749575-47-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02 06:51:38 +01:00
Víctor Colombo
7b8d6e3e79 target/ppc: Implement xs{max,min}cqp
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20220225210936.1749575-46-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02 06:51:38 +01:00
Víctor Colombo
da499405aa target/ppc: Refactor VSX_MAX_MINC helper
Refactor xs{max,min}cdp VSX_MAX_MINC helper to prepare for
xs{max,min}cqp implementation.

Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20220225210936.1749575-45-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02 06:51:38 +01:00
Víctor Colombo
5307df8f3a target/ppc: Move xs{max, min}[cj]dp to use do_helper_XX3
Also, fixes these instructions not being capitalized.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20220225210936.1749575-44-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02 06:51:38 +01:00
Víctor Colombo
72d24354ca target/ppc: Move xscmp{eq,ge,gt}dp to decodetree
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20220225210936.1749575-43-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02 06:51:38 +01:00
Víctor Colombo
568e7c4d45 target/ppc: Implement xscmp{eq,ge,gt}qp
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20220225210936.1749575-42-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02 06:51:38 +01:00
Víctor Colombo
4439586a2b target/ppc: Refactor VSX_SCALAR_CMP_DP
Refactor VSX_SCALAR_CMP_DP, changing its name to VSX_SCALAR_CMP and
prepare the helper to be used for quadword comparisons.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220225210936.1749575-41-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02 06:51:38 +01:00
Víctor Colombo
0efbb8dc2f target/ppc: Remove xscmpnedp instruction
xscmpnedp was added in ISA v3.0 but removed in v3.0B. This patch
removes this instruction as it was not in the final version of v3.0.

Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20220225210936.1749575-40-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02 06:51:38 +01:00
Matheus Ferst
3bb1aed246 target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o]
Implement the following PowerISA v3.0 instuctions:
xsmaddqp[o]: VSX Scalar Multiply-Add Quad-Precision [using round to Odd]
xsmsubqp[o]: VSX Scalar Multiply-Subtract Quad-Precision [using round
             to Odd]
xsnmaddqp[o]: VSX Scalar Negative Multiply-Add Quad-Precision [using
              round to Odd]
xsnmsubqp[o]: VSX Scalar Negative Multiply-Subtract Quad-Precision
              [using round to Odd]

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20220225210936.1749575-38-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02 06:51:38 +01:00
Matheus Ferst
e4318ab2e4 target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetree
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20220225210936.1749575-37-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02 06:51:38 +01:00
Matheus Ferst
6a94bf196c target/ppc: move xxperm/xxpermr to decodetree
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20220225210936.1749575-31-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02 06:51:38 +01:00
Víctor Colombo
205eb5a89e target/ppc: Change VSX instructions behavior to fill with zeros
ISA v3.1 changed some VSX instructions behavior by changing what the
other words/doubleword in the result should contain when the result is
only one word/doubleword. e.g. xsmaxdp operates on doubleword 0 and
saves the result also in doubleword 0.
Before, the second doubleword result was undefined according to the
ISA, but now it's stated that it should be zeroed.

Even tough the result was undefined before, hardware implementing these
instructions already filled these fields with 0s. Changing every ISA
version in QEMU to this behavior makes the results match what happens
in hardware.

Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220204181944.65063-1-victor.colombo@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-09 09:08:56 +01:00
Matheus Ferst
84ade98e87 target/ppc: do not silence snan in xscvspdpn
The non-signalling versions of VSX scalar convert to shorter/longer
precision insns doesn't silence SNaNs in the hardware. To better match
this behavior, use the non-arithmatic conversion of helper_todouble
instead of float32_to_float64. A test is added to prevent future
regressions.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20211228120310.1957990-1-matheus.ferst@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-04 07:55:34 +01:00
Matheus Ferst
caf6f9b568 target/ppc: move xscvqpdp to decodetree
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20211213120958.24443-5-victor.colombo@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:18 +01:00
Victor Colombo
201fc774e0 target/ppc: Fix xs{max, min}[cj]dp to use VSX registers
PPC instruction xsmaxcdp, xsmincdp, xsmaxjdp, and xsminjdp are using
vector registers when they should be using VSX ones. This happens
because the instructions are using GEN_VSX_HELPER_R3, which adds 32
to the register numbers, effectively making them vector registers.

This patch fixes it by changing these instructions to use
GEN_VSX_HELPER_X3.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Victor Colombo <victor.colombo@eldorado.org.br>
Message-Id: <20211213120958.24443-2-victor.colombo@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:18 +01:00
Richard Henderson
a1f1c731c6 target/ppc: Use helper_todouble/tosingle in helper_xststdcsp
When computing the predicate "is this value currently formatted
for single precision", we do not want to round the value according
to the current rounding mode, nor perform a floating-point equality.
We want to see if the N bits that make up single-precision are the
only ones set within the register, and then a bitwise equality.

Fixes a bug in which a single-precision NaN is considered !SP,
because float64_eq(nan, nan) is always false.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-35-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:16 +01:00
Richard Henderson
7d82ea3484 target/ppc: Update fres to new flags and float64r32
There is no double-rounding bug here, because the result is
merely an estimate to within 1 part in 256, but perform the
operation with float64r32_div for consistency.

Use float_flag_invalid_snan instead of recomputing the
snan-ness of the operand.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-34-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:16 +01:00
Richard Henderson
dedbfda765 target/ppc: Add helper for frsqrtes
There is no double-rounding bug here, because the result is
merely an estimate to within 1 part in 32, but perform the
operation with float64r32_div for consistency.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-33-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:16 +01:00
Richard Henderson
7f87214e3b target/ppc: Add helper for fmuls
Use float64r32_mul.  Fixes a double-rounding issue with performing
the compuation in float64 and then rounding afterward.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-32-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:16 +01:00
Richard Henderson
d9e792a1c1 target/ppc: Add helpers for fadds, fsubs, fdivs
Use float64r32_{add,sub,div}.  Fixes a double-rounding issue with
performing the compuation in float64 and then rounding afterward.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-31-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:16 +01:00
Richard Henderson
41ae890d08 target/ppc: Add helper for fsqrts
Use float64r32_sqrt.  Fixes a double-rounding issue with performing
the compuation in float64 and then rounding afterward.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-30-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:16 +01:00
Richard Henderson
d04ca895dc target/ppc: Add helpers for fmadds et al
Use float64r32_muladd.  Fixes a double-rounding issue with performing
the compuation in float64 and then rounding afterward.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-29-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:16 +01:00
Richard Henderson
8ea0b1408e target/ppc: Update fre to new flags
Use float_flag_invalid_snan instead of recomputing
the snan-ness of the operand.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-27-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:15 +01:00
Richard Henderson
053e23a694 target/ppc: Update xsrqpi and xsrqpxp to new flags
Use float_flag_invalid_snan instead of recomputing
the snan-ness of the operand.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-26-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:15 +01:00
Richard Henderson
3d3050cc8d target/ppc: Update sqrt for new flags
Now that vxsqrt and vxsnan are computed directly by softfloat,
we don't need to recompute it.  Split out float_invalid_op_sqrt
to be used in several places.  This fixes VSX_SQRT, which did
not order its tests correctly to eliminate NaN with sign set.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-25-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:15 +01:00
Richard Henderson
58c7edef61 target/ppc: Use helper_todouble in do_frsp
We only needed one ieee arithmetic operation to raise
exceptions.  To convert back to register form, we can
use our simpler non-arithmetic function.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-24-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:15 +01:00