The VSX register array is a block of 64 128-bit registers where the first 32
registers consist of the existing 64-bit FP registers extended to 128-bit
using new VSR registers, and the last 32 registers are the VMX 128-bit
registers as show below:
64-bit 64-bit
+--------------------+--------------------+
| FP0 | | VSR0
+--------------------+--------------------+
| FP1 | | VSR1
+--------------------+--------------------+
| ... | ... | ...
+--------------------+--------------------+
| FP30 | | VSR30
+--------------------+--------------------+
| FP31 | | VSR31
+--------------------+--------------------+
| VMX0 | VSR32
+-----------------------------------------+
| VMX1 | VSR33
+-----------------------------------------+
| ... | ...
+-----------------------------------------+
| VMX30 | VSR62
+-----------------------------------------+
| VMX31 | VSR63
+-----------------------------------------+
In order to allow for future conversion of VSX instructions to use TCG vector
operations, recreate the same layout using an aligned version of the existing
vsr register array.
Since the old fpr and avr register arrays are removed, the existing callers
must also be updated to use the correct offset in the vsr register array. This
also includes switching the relevant VMState fields over to using subarrays
to make sure that migration is preserved.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead of accessing the FPR, VMX and VSX registers through static arrays of
TCGv_i64 globals, remove them and change the helpers to load/store data directly
within cpu_env.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
These helpers allow us to move VSR register values to/from the specified TCGv_i64
argument.
To prevent VSX helpers accessing the cpu_vsr array directly, add extra TCG
temporaries as required.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
These helpers allow us to move AVR register values to/from the specified TCGv_i64
argument.
To prevent VMX helpers accessing the cpu_avr{l,h} arrays directly, add extra TCG
temporaries as required.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
These helpers allow us to move FP register values to/from the specified TCGv_i64
argument in the VSR helpers to be introduced shortly.
To prevent FP helpers accessing the cpu_fpr array directly, add extra TCG
temporaries as required.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Changes requirement for "vsubsbs" instruction, which has been supported
since ISA 2.03. (Please see section 5.9.1.2 of ISA 2.03)
Reported-by: Paul A. Clarke <pc@us.ibm.com>
Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
Signed-off-by: Leonardo Bras <leonardo@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
External PID is a mechanism present on BookE 2.06 that enables application to
store/load data from different address spaces. There are special version of some
instructions, which operate on alternate address space, which is specified in
the EPLC/EPSC regiser.
This implementation uses two additional MMU modes (mmu_idx) to provide the
address space for the load and store instructions. The QEMU TLB fill code was
modified to recognize these MMU modes and use the values in EPLC/EPSC to find
the proper entry in he PPC TLB. These two QEMU TLBs are also flushed on each
write to EPLC/EPSC.
Following instructions are implemented: dcbfep dcbstep dcbtep dcbtstep dcbzep
dcbzlep icbiep lbepx ldepx lfdepx lhepx lwepx stbepx stdepx stfdepx sthepx
stwepx.
Following vector instructions are not: evlddepx evstddepx lvepx lvepxl stvepx
stvepxl.
Signed-off-by: Roman Kapl <rka@sysgo.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Memory operations have no side effects on fp state.
The use of a "real" conversions between float64 and float32
would raise exceptions for SNaN and out-of-range inputs.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A couple of notes:
- removed ctx->nip in favour of base->pc_next. Yes, it is annoying,
but didn't want to waste its 4 bytes.
- ctx->singlestep_enabled does a lot more than
base.singlestep_enabled; this confused me at first.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Done with the Coccinelle semantic patch
scripts/coccinelle/tcg_gen_extract.cocci.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20170718045540.16322-6-f4bug@amsat.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
xscvqpudz: VSX Scalar truncate & Convert Quad-Precision format to
Unsigned Doubleword format
xscvqpuwz: VSX Scalar truncate & Convert Quad-Precision format to
Unsigned Word format
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xsaddqpo: VSX Scalar Add Quad-Precision using round to Odd
xsmulqo: VSX Scalar Multiply Quad-Precision using round to Odd
xsdivqpo: VSX Scalar Divide Quad-Precision using round to Odd
xscvqpdpo: VSX Scalar round & Convert Quad-Precision format to
Double-Precision format using round to Odd
xssqrtqpo: VSX Scalar Square Root Quad-Precision using round to Odd
xssubqpo: VSX Scalar Subtract Quad-Precision using round to Odd
In addition, fix the invalid bitmask in the instruction encoding
of xssqrtqp[o].
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
CC: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xststdcsp: VSX Scalar Test Data Class Single-Precision
xststdcdp: VSX Scalar Test Data Class Double-Precision
xststdcqp: VSX Scalar Test Data Class Quad-Precision
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xvtstdcsp: VSX Vector Test Data Class Single-Precision
xvtstdcdp: VSX Vector Test Data Class Double-Precision
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xvcvhpsp: VSX Vector Convert Half Precision to Single Precision
xvcvsphp: VSX Vector Convert Single Precision to Half Precision
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xscvsdqp: VSX Scalar Convert Signed Doubleword format to
Quad-Precision format
xscvudqp: VSX Scalar Convert Unsigned Doubleword format to
Quad-Precision format
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
bcdutrunc. Decimal unsigned truncate. Works like bcdtrunc. with
unsigned BCD numbers.
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
bcdtrunc.: Decimal integer truncate. Given a BCD number in vrb and the
number of bytes to truncate in vra, the return register will have vrb
with such bits truncated.
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xscvqpsdz: VSX Scalar truncate & Convert Quad-Precision format to
Signed Doubleword format
xscvqpswz: VSX Scalar truncate & Convert Quad-Precision format to
Signed Word format
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
bcdsr.: Decimal shift and round. This instruction works like bcds.
however, when performing right shift, 1 will be added to the
result if the last digit was >= 5.
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
bcdus.: Decimal unsigned shift. This instruction works like bcds. but
considers only unsigned BCDs (no sign in least meaning 4 bits).
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
bcds.: Decimal shift. Given two registers vra and vrb, this instruction
shift the vrb value by vra bits into the result register.
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xscvqpdp: VSX Scalar round & Convert Quad-Precision format to
Double-Precision format
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xscvdpqp: VSX Scalar Convert Double-Precision format to
Quad-Precision format
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xscvdphp: VSX Scalar round & Convert Double-Precision format to
Half-Precision format
xscvhpdp: VSX Scalar Convert Half-Precision format to
Double-Precision format
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since helper_compute_fprf() works on float64 argument, rename it
to helper_compute_fprf_float64(). Also use a macro to generate
helper_compute_fprf_float64() so that float128 version of the same
helper can be introduced easily later.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xxinsertw: VSX Vector Insert Word
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xxextractuw: VSX Vector Extract Unsigned Word
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
stxvll: Store VSX Vector Left-justified with Length
Vector (8-bit elements) in BE/LE:
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+--+--+
|“T”|“h”|“i”|“s”|“ ”|“i”|“s”|“ ”|“a”|“ ”|“T”|“E”|“S”|“T”|00|00|
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+--+--+
Storing 14 bytes would result in following Little/Big-endian Storage:
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+--+--+
|“T”|“h”|“i”|“s”|“ ”|“i”|“s”|“ ”|“a”|“ ”|“T”|“E”|“S”|“T”|FF|FF|
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+--+--+
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>