Commit Graph

173 Commits

Author SHA1 Message Date
Victor Colombo
201fc774e0 target/ppc: Fix xs{max, min}[cj]dp to use VSX registers
PPC instruction xsmaxcdp, xsmincdp, xsmaxjdp, and xsminjdp are using
vector registers when they should be using VSX ones. This happens
because the instructions are using GEN_VSX_HELPER_R3, which adds 32
to the register numbers, effectively making them vector registers.

This patch fixes it by changing these instructions to use
GEN_VSX_HELPER_X3.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Victor Colombo <victor.colombo@eldorado.org.br>
Message-Id: <20211213120958.24443-2-victor.colombo@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:18 +01:00
Fabiano Rosas
a09410ed1f target/ppc: Remove the software TLB model of 7450 CPUs
(Applies to 7441, 7445, 7450, 7451, 7455, 7457, 7447, 7447a and 7448)

The QEMU-side software TLB implementation for the 7450 family of CPUs
is being removed due to lack of known users in the real world. The
last users in the code were removed by the two previous commits.

A brief history:

The feature was added in QEMU by commit 7dbe11acd8 ("Handle all MMU
models in switches...") with the mention that Linux was not able to
handle the TLB miss interrupts and the MMU model would be kept
disabled.

At some point later, commit 8ca3f6c382 ("Allow selection of all
defined PowerPC 74xx (aka G4) CPUs.") enabled the model for the 7450
family without further justification.

We have since the year 2011 [1] been unable to run OpenBIOS in the
7450s and have not heard of any other software that is used with those
CPUs in QEMU. Attempts were made to find a guest OS that implemented
the TLB miss handlers and none were found among Linux 5.15, FreeBSD 13,
MacOS9, MacOSX and MorphOS 3.15.

All CPUs that registered this feature were moved to an MMU model that
replaces the software TLB with a QEMU hardware TLB
implementation. They can now run the same software as the 7400 CPUs,
including the OSes mentioned above.

References:

- https://bugs.launchpad.net/qemu/+bug/812398
  https://gitlab.com/qemu-project/qemu/-/issues/86

- https://lists.nongnu.org/archive/html/qemu-ppc/2021-11/msg00289.html
  message id: 20211119134431.406753-1-farosas@linux.ibm.com

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211130230123.781844-4-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:16 +01:00
Richard Henderson
dedbfda765 target/ppc: Add helper for frsqrtes
There is no double-rounding bug here, because the result is
merely an estimate to within 1 part in 32, but perform the
operation with float64r32_div for consistency.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-33-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:16 +01:00
Richard Henderson
7f87214e3b target/ppc: Add helper for fmuls
Use float64r32_mul.  Fixes a double-rounding issue with performing
the compuation in float64 and then rounding afterward.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-32-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:16 +01:00
Richard Henderson
d9e792a1c1 target/ppc: Add helpers for fadds, fsubs, fdivs
Use float64r32_{add,sub,div}.  Fixes a double-rounding issue with
performing the compuation in float64 and then rounding afterward.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-31-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:16 +01:00
Richard Henderson
41ae890d08 target/ppc: Add helper for fsqrts
Use float64r32_sqrt.  Fixes a double-rounding issue with performing
the compuation in float64 and then rounding afterward.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-30-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:16 +01:00
Richard Henderson
d04ca895dc target/ppc: Add helpers for fmadds et al
Use float64r32_muladd.  Fixes a double-rounding issue with performing
the compuation in float64 and then rounding afterward.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-29-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:16 +01:00
Lucas Mateus Castro (alqotel)
c3a824b0cf target/ppc: Fixed call to deferred exception
mtfsf, mtfsfi and mtfsb1 instructions call helper_float_check_status
after updating the value of FPSCR, but helper_float_check_status
checks fp_status and fp_status isn't updated based on FPSCR and
since the value of fp_status is reset earlier in the instruction,
it's always 0.

Because of this helper_float_check_status would change the FI bit to 0
as this bit checks if the last operation was inexact and
float_flag_inexact is always 0.

These instructions also don't throw exceptions correctly since
helper_float_check_status throw exceptions based on fp_status.

This commit created a new helper, helper_fpscr_check_status that checks
FPSCR value instead of fp_status and checks for a larger variety of
exceptions than do_float_check_status.

Since fp_status isn't used, gen_reset_fpstatus() was removed.

The hardware used to compare QEMU's behavior to was a Power9.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Message-Id: <20211201163808.440385-2-lucas.araujo@eldorado.org.br>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17 17:57:12 +01:00
Matheus Ferst
788c63998c target/ppc: Implement xxblendvb/xxblendvh/xxblendvw/xxblendvd instructions
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bruno Larsen (billionai) <bruno.larsen@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20211104123719.323713-24-matheus.ferst@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:53 +11:00
Matheus Ferst
28110b72a8 target/ppc: Implement Vector Extract Double to VSR using GPR index insns
Implement the following PowerISA v3.1 instructions:
vextdubvlx: Vector Extract Double Unsigned Byte to VSR using
            GPR-specified Left-Index
vextduhvlx: Vector Extract Double Unsigned Halfword to VSR using
            GPR-specified Left-Index
vextduwvlx: Vector Extract Double Unsigned Word to VSR using
            GPR-specified Left-Index
vextddvlx: Vector Extract Double Doubleword to VSR using
           GPR-specified Left-Index
vextdubvrx: Vector Extract Double Unsigned Byte to VSR using
            GPR-specified Right-Index
vextduhvrx: Vector Extract Double Unsigned Halfword to VSR using
            GPR-specified Right-Index
vextduwvrx: Vector Extract Double Unsigned Word to VSR using
            GPR-specified Right-Index
vextddvrx: Vector Extract Double Doubleword to VSR using
           GPR-specified Right-Index

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20211104123719.323713-10-matheus.ferst@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Matheus Ferst
b422c2cb52 target/ppc: Move vinsertb/vinserth/vinsertw/vinsertd to decodetree
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20211104123719.323713-9-matheus.ferst@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Matheus Ferst
2cc12af399 target/ppc: Implement Vector Insert from GPR using GPR index insns
Implements the following PowerISA v3.1 instructions:
vinsblx: Vector Insert Byte from GPR using GPR-specified Left-Index
vinshlx: Vector Insert Halfword from GPR using GPR-specified Left-Index
vinswlx: Vector Insert Word from GPR using GPR-specified Left-Index
vinsdlx: Vector Insert Doubleword from GPR using GPR-specified
         Left-Index
vinsbrx: Vector Insert Byte from GPR using GPR-specified Right-Index
vinshrx: Vector Insert Halfword from GPR using GPR-specified
         Right-Index
vinswrx: Vector Insert Word from GPR using GPR-specified Right-Index
vinsdrx: Vector Insert Doubleword from GPR using GPR-specified
         Right-Index

The helpers and do_vinsx receive i64 to allow code sharing with the
future implementation of Vector Insert from VSR using GPR Index.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20211104123719.323713-6-matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Matheus Ferst
00a16569eb target/ppc: Implement vpdepd/vpextd instruction
pdepd and pextd helpers are moved out of #ifdef (TARGET_PPC64) to allow
them to be reused as GVecGen3.fni8.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20211104123719.323713-4-matheus.ferst@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Matheus Ferst
6e0bbc4048 target/ppc: Move vcfuged to vmx-impl.c.inc
There's no reason to keep vector-impl.c.inc separate from
vmx-impl.c.inc. Additionally, let GVec handle the multiple calls to
helper_cfuged for us.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20211104123719.323713-2-matheus.ferst@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Luis Pires
a23297479c target/ppc: Move ddedpd[q],denbcd[q],dscli[q],dscri[q] to decodetree
Move the following instructions to decodetree:
ddedpd:  DFP Decode DPD To BCD
ddedpdq: DFP Decode DPD To BCD Quad
denbcd:  DFP Encode BCD To DPD
denbcdq: DFP Encode BCD To DPD Quad
dscli:   DFP Shift Significand Left Immediate
dscliq:  DFP Shift Significand Left Immediate Quad
dscri:   DFP Shift Significand Right Immediate
dscriq:  DFP Shift Significand Right Immediate Quad

Also deleted dfp-ops.c.inc, now that all PPC DFP instructions were
moved to decodetree.

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211029192417.400707-16-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Luis Pires
c8ef4d1ec0 target/ppc: Move dct{dp,qpq},dr{sp,dpq},dc{f,t}fix[q],dxex[q] to decodetree
Move the following instructions to decodetree:
dctdp:   DFP Convert To DFP Long
dctqpq:  DFP Convert To DFP Extended
drsp:    DFP Round To DFP Short
drdpq:   DFP Round To DFP Long
dcffix:  DFP Convert From Fixed
dcffixq: DFP Convert From Fixed Quad
dctfix:  DFP Convert To Fixed
dctfixq: DFP Convert To Fixed Quad
dxex:    DFP Extract Biased Exponent
dxexq:   DFP Extract Biased Exponent Quad

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211029192417.400707-15-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Luis Pires
a8f4bce6f8 target/ppc: Move dqua[q], drrnd[q] to decodetree
Move the following instructions to decodetree:
dqua:   DFP Quantize
dquaq:  DFP Quantize Quad
drrnd:  DFP Reround
drrndq: DFP Reround Quad

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211029192417.400707-14-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Luis Pires
78464edb8f target/ppc: Move dquai[q], drint{x,n}[q] to decodetree
Move the following instructions to decodetree:
dquai:   DFP Quantize Immediate
dquaiq:  DFP Quantize Immediate Quad
drintx:  DFP Round to FP Integer With Inexact
drintxq: DFP Round to FP Integer With Inexact Quad
drintn:  DFP Round to FP Integer Without Inexact
drintnq: DFP Round to FP Integer Without Inexact Quad

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211029192417.400707-13-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Luis Pires
85c38a460c target/ppc: Move dcmp{u,o}[q],dts{tex,tsf,tsfi}[q] to decodetree
Move the following instructions to decodetree:
dcmpu:    DFP Compare Unordered
dcmpuq:   DFP Compare Unordered Quad
dcmpo:    DFP Compare Ordered
dcmpoq:   DFP Compare Ordered Quad
dtstex:   DFP Test Exponent
dtstexq:  DFP Test Exponent Quad
dtstsf:   DFP Test Significance
dtstsfq:  DFP Test Significance Quad
dtstsfi:  DFP Test Significance Immediate
dtstsfiq: DFP Test Significance Immediate Quad

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211029192417.400707-12-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Luis Pires
afdc931013 target/ppc: Move d{add,sub,mul,div,iex}[q] to decodetree
Move the following instructions to decodetree:
dadd:  DFP Add
daddq: DFP Add Quad
dsub:  DFP Subtract
dsubq: DFP Subtract Quad
dmul:  DFP Multiply
dmulq: DFP Multiply Quad
ddiv:  DFP Divide
ddivq: DFP Divide Quad
diex:  DFP Insert Biased Exponent
diexq: DFP Insert Biased Exponent Quad

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211029192417.400707-11-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Luis Pires
87bc8e52b1 target/ppc: Move dtstdc[q]/dtstdg[q] to decodetree
Move the following instructions to decodetree:
dtstdc:  DFP Test Data Class
dtstdcq: DFP Test Data Class Quad
dtstdg:  DFP Test Data Group
dtstdgq: DFP Test Data Group Quad

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211029192417.400707-10-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Luis Pires
328747f32f target/ppc: Implement DCTFIXQQ
Implement the following PowerISA v3.1 instruction:
dctfixqq: DFP Convert To Fixed Quadword Quad

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211029192417.400707-8-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Luis Pires
d39b2cc7d0 target/ppc: Implement DCFFIXQQ
Implement the following PowerISA v3.1 instruction:
dcffixqq: DFP Convert From Fixed Quadword Quad

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211029192417.400707-5-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Matheus Ferst
8bdb760606 target/ppc: Implement pextd instruction
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20211029202424.175401-11-matheus.ferst@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Matheus Ferst
21ba6e5873 target/ppc: Implement pdepd instruction
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20211029202424.175401-10-matheus.ferst@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
Richard Henderson
7319d83a73 tcg: Combine dh_is_64bit and dh_is_signed to dh_typecode
We will shortly be interested in distinguishing pointers
from integers in the helper's declaration, as well as a
true void return.  We currently have two parallel 1 bit
fields; merge them and expand to a 3 bit field.

Our current maximum is 7 helper arguments, plus the return
makes 8 * 3 = 24 bits used within the uint32_t typemask.

Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-19 08:51:11 -07:00
Matheus Ferst
89ccd7dc3f target/ppc: Implement cfuged instruction
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20210601193528.2533031-12-matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-06-03 18:10:31 +10:00
Richard Henderson
46a0add975 target/ppc: Mark helper_raise_exception* as noreturn
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20210517205025.3777947-8-matheus.ferst@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-05-19 10:30:29 +10:00
Richard Henderson
f43520e5b2 target/ppc: Create helper_scv
Perform the test against FSCR_SCV at runtime, in the helper.

This means we can remove the incorrect set against SCV in
ppc_tr_init_disas_context and do not need to add an HFLAGS bit.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210323184340.619757-6-richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-05-04 11:41:24 +10:00
Lijun Pan
c4b8b49d68 target/ppc: add vmulh{su}d instructions
vmulhsd: Vector Multiply High Signed Doubleword
vmulhud: Vector Multiply High Unsigned Doubleword

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Message-Id: <20200724045845.89976-5-ljp@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00
Lijun Pan
f3e0d864ab target/ppc: add vmulh{su}w instructions
vmulhsw: Vector Multiply High Signed Word
vmulhuw: Vector Multiply High Unsigned Word

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Message-Id: <20200724045845.89976-4-ljp@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00
Lijun Pan
a285ffa680 target/ppc: convert vmuluwm to tcg_gen_gvec_mul
Convert the original implementation of vmuluwm to the more generic
tcg_gen_gvec_mul.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Message-Id: <20200701234344.91843-5-ljp@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-08-12 13:16:27 +10:00
Richard Henderson
3e114acc91 target/ppc: Use tcg_gen_gvec_rotlv
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2020-06-02 08:42:37 -07:00
Nicholas Piggin
3c89b8d6ac target/ppc: Add support for scv and rfscv instructions
POWER9 adds scv and rfscv instructions and the system call vectored
interrupt. Linux does not support this instruction yet but it has
been tested with a modified kernel that runs on real hardware.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20200507115328.789175-1-npiggin@gmail.com>
[dwg: Corrected an overlong line]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-05-27 15:29:24 +10:00
Nicholas Piggin
0418bf78fe target/ppc: Fix ISA v3.0 (POWER9) slbia implementation
The new ISA v3.0 slbia variants have not been implemented for TCG,
which can lead to crashing when a POWER9 machine boots Linux using
the hash MMU, for example ("disable_radix" kernel command line).

Add them.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20200319064439.1020571-1-npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[dwg: Fixed compile error for USER_ONLY builds]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-03-24 11:56:14 +11:00
Cédric Le Goater
5ba7ba1da0 target/ppc: Add privileged message send facilities
The Processor Control facility for POWER8 processors and later
provides a mechanism for the hypervisor to send messages to other
threads in the system (msgsnd instruction) and cause hypervisor-level
exceptions. Privileged non-hypervisor programs can also send messages
(msgsndp instruction) but are restricted to the threads of the same
subprocessor and cause privileged-level exceptions.

The Directed Privileged Doorbell Exception State (DPDES) register
reflects the state of pending privileged doorbell exceptions and can
be used to modify that state. The register can be used to read and
modify the state of privileged doorbell exceptions for all threads of
a subprocessor and thus is a shared facility for that subprocessor.
The register can be read/written by the hypervisor and read by the
supervisor if enabled in the HFSCR, otherwise a hypervisor facility
unavailable exception is generated.

The privileged message send and clear instructions (msgsndp & msgclrp)
are used to generate and clear the presence of a directed privileged
doorbell exception, respectively. The msgsndp instruction can be used
to target any thread of the current subprocessor, msgclrp acts on the
thread issuing the instruction. These instructions are privileged, but
will generate a hypervisor facility unavailable exception if not
enabled in the HFSCR and executed in privileged non-hypervisor
state. The HV facility unavailable exception will be addressed in
other patch.

Add and implement this register and instructions by reading or
modifying the pending interrupt state of the cpu.

Note that TCG only supports one thread per core and so we only need to
worry about the cpu making the access.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20200120104935.24449-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-02-02 14:07:57 +11:00
Suraj Jitindar Singh
f0ec31b1e2 target/ppc: Add SPR TBU40
The spr TBU40 is used to set the upper 40 bits of the timebase
register, present on POWER5+ and later processors.

This register can only be written by the hypervisor, and cannot be read.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191128134700.16091-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Suraj Jitindar Singh
5cc7e69f6d target/ppc: Work [S]PURR implementation and add HV support
The Processor Utilisation of Resources Register (PURR) and Scaled
Processor Utilisation of Resources Register (SPURR) provide an estimate
of the resources used by the thread, present on POWER7 and later
processors.

Currently the [S]PURR registers simply count at the rate of the
timebase.

Preserve this behaviour but rework the implementation to store an offset
like the timebase rather than doing the calculation manually. Also allow
hypervisor write access to the register along with the currently
available read access.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[ clg: rebased on current ppc tree ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191128134700.16091-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Suraj Jitindar Singh
5d62725b2f target/ppc: Implement the VTB for HV access
The virtual timebase register (VTB) is a 64-bit register which
increments at the same rate as the timebase register, present on POWER8
and later processors.

The register is able to be read/written by the hypervisor and read by
the supervisor. All other accesses are illegal.

Currently the VTB is just an alias for the timebase (TB) register.

Implement the VTB so that is can be read/written independent of the TB.
Make use of the existing method for accessing timebase facilities where
by the compensation is stored and used to compute the value on reads/is
updated on writes.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[ clg: rebased on current ppc tree ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191128134700.16091-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-12-17 10:39:48 +11:00
Mark Cave-Ayland
d9acba3130 target/ppc: update {get,set}_dfp{64,128}() helper functions to read/write DFP numbers correctly
Since commit ef96e3ae96 "target/ppc: move FP and VMX registers into aligned vsr
register array" FP registers are no longer stored consecutively in memory and so
the current method of combining FP register pairs into DFP numbers is incorrect.

Firstly update the definition of the dh_*_fprp defines in helper.h to reflect
that FP registers are now stored as part of an array of ppc_vsr_t elements
rather than plain uint64_t elements, and then introduce a new ppc_fprp_t type
which conceptually represents a DFP even-odd register pair to be consumed by the
DFP helper functions.

Finally update the new DFP {get,set}_dfp{64,128}() helper functions to convert
between DFP numbers and DFP even-odd register pairs correctly, making use of the
existing VsrD() macro to access the correct elements regardless of host endian.

Fixes: ef96e3ae96 "target/ppc: move FP and VMX registers into aligned vsr register array"
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190926185801.11176-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-04 19:08:21 +10:00
Stefan Brankovic
1872588ede target/ppc: Optimize emulation of vclzw instruction
Optimize Altivec instruction vclzw (Vector Count Leading Zeros Word).
This instruction counts the number of leading zeros of each word element
in source register and places result in the appropriate word element of
destination register.

Counting is to be performed in four iterations of for loop(one for each
word elemnt of source register vB). Every iteration consists of loading
appropriate word element from source register, counting leading zeros
with tcg_gen_clzi_i32, and saving the result in appropriate word element
of destination register.

Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1563200574-11098-7-git-send-email-stefan.brankovic@rt-rk.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:11 +10:00
Stefan Brankovic
b8313f0d91 target/ppc: Optimize emulation of vclzd instruction
Optimize Altivec instruction vclzd (Vector Count Leading Zeros Doubleword).
This instruction counts the number of leading zeros of each doubleword element
in source register and places result in the appropriate doubleword element of
destination register.

Using tcg-s count leading zeros instruction two times(once for each
doubleword element of source register vB) and placing result in
appropriate doubleword element of destination register vD.

Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1563200574-11098-6-git-send-email-stefan.brankovic@rt-rk.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:11 +10:00
Stefan Brankovic
083b3f012f target/ppc: Optimize emulation of vgbbd instruction
Optimize altivec instruction vgbbd (Vector Gather Bits by Bytes by Doubleword)
All ith bits (i in range 1 to 8) of each byte of doubleword element in
source register are concatenated and placed into ith byte of appropriate
doubleword element in destination register.

Following solution is done for both doubleword elements of source register
in parallel, in order to reduce the number of instructions needed(that's why
arrays are used):
First, both doubleword elements of source register vB are placed in
appropriate element of array avr. Bits are gathered in 2x8 iterations(2 for
loops). In first iteration bit 1 of byte 1, bit 2 of byte 2,... bit 8 of
byte 8 are in their final spots so avr[i], i={0,1} can be and-ed with
tcg_mask. For every following iteration, both avr[i] and tcg_mask variables
have to be shifted right for 7 and 8 places, respectively, in order to get
bit 1 of byte 2, bit 2 of byte 3.. bit 7 of byte 8 in their final spots so
shifted avr values(saved in tmp) can be and-ed with new value of tcg_mask...
After first 8 iteration(first loop), all the first bits are in their final
places, all second bits but second bit from eight byte are in their places...
only 1 eight bit from eight byte is in it's place). In second loop we do all
operations symmetrically, in order to get other half of bits in their final
spots. Results for first and second doubleword elements are saved in
result[0] and result[1] respectively. In the end those results are saved in
appropriate doubleword element of destination register vD.

Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1563200574-11098-5-git-send-email-stefan.brankovic@rt-rk.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:11 +10:00
Stefan Brankovic
4e6d0920e7 target/ppc: Optimize emulation of vsl and vsr instructions
Optimization of altivec instructions vsl and vsr(Vector Shift Left/Rigt).
Perform shift operation (left and right respectively) on 128 bit value of
register vA by value specified in bits 125-127 of register vB. Lowest 3
bits in each byte element of register vB must be identical or result is
undefined.

For vsl instruction, the first step is bits 125-127 of register vB have
to be saved in variable sh. Then, the highest sh bits of the lower
doubleword element of register vA are saved in variable shifted,
in order not to lose those bits when shift operation is performed on
the lower doubleword element of register vA, which is the next
step. After shifting the lower doubleword element shift operation
is performed on higher doubleword element of vA, with replacement of
the lowest sh bits(that are now 0) with bits saved in shifted.

For vsr instruction, firstly, the bits 125-127 of register vB have
to be saved in variable sh. Then, the lowest sh bits of the higher
doubleword element of register vA are saved in variable shifted,
in odred not to lose those bits when the shift operation is
performed on the higher doubleword element of register vA, which is
the next step. After shifting higher doubleword element, shift operation
is performed on lower doubleword element of vA, with replacement of
highest sh bits(that are now 0) with bits saved in shifted.

Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1563200574-11098-3-git-send-email-stefan.brankovic@rt-rk.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:11 +10:00
Stefan Brankovic
1cc792698e target/ppc: Optimize emulation of lvsl and lvsr instructions
Adding simple macro that is calling tcg implementation of appropriate
instruction if altivec support is active.

Optimization of altivec instruction lvsl (Load Vector for Shift Left).
Place bytes sh:sh+15 of value 0x00 || 0x01 || 0x02 || ... || 0x1E || 0x1F
in destination register. Sh is calculated by adding 2 source registers and
getting bits 60-63 of result.

First, the bits [28-31] are placed from EA to variable sh. After that,
the bytes are created in the following way:
sh:(sh+7) of X(from description) by multiplying sh with 0x0101010101010101
followed by addition of the result with 0x0001020304050607. Value obtained
is placed in higher doubleword element of vD.
(sh+8):(sh+15) by adding the result of previous multiplication with
0x08090a0b0c0d0e0f. Value obtained is placed in lower doubleword element
of vD.

Optimization of altivec instruction lvsr (Load Vector for Shift Right).
Place bytes 16-sh:31-sh of value 0x00 || 0x01 || 0x02 || ... || 0x1E ||
0x1F in destination register. Sh is calculated by adding 2 source
registers and getting bits 60-63 of result.

First, the bits [28-31] are placed from EA to variable sh. After that,
the bytes are created in the following way:
sh:(sh+7) of X(from description) by multiplying sh with 0x0101010101010101
followed by substraction of the result from 0x1011121314151617. Value
obtained is placed in higher doubleword element of vD.
(sh+8):(sh+15) by substracting the result of previous multiplication from
0x18191a1b1c1d1e1f. Value obtained is placed in lower doubleword element
of vD.

Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1563200574-11098-2-git-send-email-stefan.brankovic@rt-rk.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-21 17:17:11 +10:00
Mark Cave-Ayland
c9f4e4d8b6 target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro
Introduce a new GEN_VSX_HELPER_VSX_MADD macro for the generator function which
enables the source and destination registers to be decoded at translation time.

This enables the determination of a or m form to be made at translation time so
that a single helper function can now be used for both variants.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190616123751.781-16-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Mark Cave-Ayland
5ba5335d93 target/ppc: decode target register in VSX_EXTRACT_INSERT at translation time
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190616123751.781-15-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Mark Cave-Ayland
2aba168e50 target/ppc: decode target register in VSX_VECTOR_LOAD_STORE_LENGTH at translation time
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190616123751.781-14-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Mark Cave-Ayland
6ae4a57ab0 target/ppc: introduce GEN_VSX_HELPER_R2_AB macro to fpu_helper.c
Rather than perform the VSR register decoding within the helper itself,
introduce a new GEN_VSX_HELPER_R2_AB macro which performs the decode based
upon rA and rB at translation time.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190616123751.781-13-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Mark Cave-Ayland
9922962011 target/ppc: introduce GEN_VSX_HELPER_R2 macro to fpu_helper.c
Rather than perform the VSR register decoding within the helper itself,
introduce a new GEN_VSX_HELPER_R2 macro which performs the decode based
upon rD and rB at translation time.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190616123751.781-12-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Mark Cave-Ayland
23d0766bd9 target/ppc: introduce GEN_VSX_HELPER_R3 macro to fpu_helper.c
Rather than perform the VSR register decoding within the helper itself,
introduce a new GEN_VSX_HELPER_R3 macro which performs the decode based
upon rD, rA and rB at translation time.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190616123751.781-11-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Mark Cave-Ayland
8d830485fc target/ppc: introduce GEN_VSX_HELPER_X1 macro to fpu_helper.c
Rather than perform the VSR register decoding within the helper itself,
introduce a new GEN_VSX_HELPER_X1 macro which performs the decode based
upon xB at translation time.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190616123751.781-10-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Mark Cave-Ayland
033e1fcd97 target/ppc: introduce GEN_VSX_HELPER_X2_AB macro to fpu_helper.c
Rather than perform the VSR register decoding within the helper itself,
introduce a new GEN_VSX_HELPER_X2_AB macro which performs the decode based
upon xA and xB at translation time.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190616123751.781-9-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Mark Cave-Ayland
75cf84cbee target/ppc: introduce GEN_VSX_HELPER_X2 macro to fpu_helper.c
Rather than perform the VSR register decoding within the helper itself,
introduce a new GEN_VSX_HELPER_X2 macro which performs the decode based
upon xT and xB at translation time.

With the previous change to the xscvqpdp generator and helper functions the
opcode parameter is no longer required in the common case and can be
removed.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190616123751.781-8-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Mark Cave-Ayland
e0d6a362be target/ppc: introduce separate generator and helper for xscvqpdp
Rather than perform the VSR register decoding within the helper itself,
introduce a new generator and helper function which perform the decode based
upon xT and xB at translation time.

The xscvqpdp helper is the only 2 parameter xT/xB implementation that requires
the opcode to be passed as an additional parameter, so handling this separately
allows us to optimise the conversion in the next commit.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190616123751.781-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Mark Cave-Ayland
99125c7499 target/ppc: introduce GEN_VSX_HELPER_X3 macro to fpu_helper.c
Rather than perform the VSR register decoding within the helper itself,
introduce a new GEN_VSX_HELPER_X3 macro which performs the decode based
upon xT, xA and xB at translation time.

With the previous changes to the VSX_CMP generator and helper macros the
opcode parameter is no longer required in the common case and can be
removed.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190616123751.781-6-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Mark Cave-Ayland
00084a25ad target/ppc: introduce separate VSX_CMP macro for xvcmp* instructions
Rather than perform the VSR register decoding within the helper itself,
introduce a new VSX_CMP macro which performs the decode based upon xT, xA
and xB at translation time.

Subsequent commits will make the same changes for other instructions however
the xvcmp* instructions are different in that they return a set of flags to be
optionally written back to the crf[6] register. Move this logic from the
helper function to the generator function, along with the float_status update.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190616123751.781-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Richard Henderson
571fbe6ccd target/ppc: Use vector variable shifts for VSL, VSR, VSRA
The gvec expanders take care of masking the shift amount
against the element width.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190518191430.21686-2-richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:45 +10:00
Benjamin Herrenschmidt
c4dae9cd37 target/ppc: Flush the TLB locally when the LPIDR is written
Our TCG TLB only tags whether it's a HV vs a guest access, so it must
be flushed when the LPIDR is changed.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215170029.15641-10-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-26 09:21:25 +11:00
Richard Henderson
73e14c6a9c target/ppc: convert vmin* and vmax* to vector operations
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190215100058.20015-18-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18 11:00:44 +11:00
Richard Henderson
fb11ae7daa target/ppc: convert vadd*s and vsub*s to vector operations
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190215100058.20015-17-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18 11:00:44 +11:00
Richard Henderson
cc2b90d725 target/ppc: Add helper_mfvscr
This is required before changing the representation of the register.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190215100058.20015-13-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18 11:00:44 +11:00
Richard Henderson
dedfaac74e target/ppc: Pass integer to helper_mtvscr
We can re-use this helper elsewhere if we're not passing
in an entire vector register.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190215100058.20015-10-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18 11:00:44 +11:00
Richard Henderson
0f6a6d5db8 target/ppc: convert vsplt[bhw] to use vector operations
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190215100058.20015-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18 11:00:44 +11:00
Richard Henderson
471ff3d025 target/ppc: convert vspltis[bhw] to use vector operations
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190215100058.20015-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18 11:00:44 +11:00
Mark Cave-Ayland
3e942a1a80 target/ppc: convert vaddu[b,h,w,d] and vsubu[b,h,w,d] over to use vector operations
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190215100058.20015-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-02-18 11:00:44 +11:00
Roman Kapl
50728199c5 target/ppc: add external PID support
External PID is a mechanism present on BookE 2.06 that enables application to
store/load data from different address spaces. There are special version of some
instructions, which operate on alternate address space, which is specified in
the EPLC/EPSC regiser.

This implementation uses two additional MMU modes (mmu_idx) to provide the
address space for the load and store instructions. The QEMU TLB fill code was
modified to recognize these MMU modes and use the values in EPLC/EPSC to find
the proper entry in he PPC TLB. These two QEMU TLBs are also flushed on each
write to EPLC/EPSC.

Following instructions are implemented: dcbfep dcbstep dcbtep dcbtstep dcbzep
dcbzlep icbiep lbepx ldepx lfdepx lhepx lwepx stbepx stdepx stfdepx sthepx
stwepx.

Following vector instructions are not: evlddepx evstddepx lvepx lvepxl stvepx
stvepxl.

Signed-off-by: Roman Kapl <rka@sysgo.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-11-08 12:04:40 +11:00
Richard Henderson
f34ec0f6d7 target/ppc: Convert to HAVE_CMPXCHG128 and HAVE_ATOMIC128
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-18 19:46:53 -07:00
Richard Henderson
86c0cab11a target/ppc: Use non-arithmetic conversions for fp load/store
Memory operations have no side effects on fp state.
The use of a "real" conversions between float64 and float32
would raise exceptions for SNaN and out-of-range inputs.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21 14:28:45 +10:00
Richard Henderson
49ab52ef69 target/ppc: Tidy helper_fsqrt
Tidy the invalid exception checking so that we rely on softfloat for
initial argument validation, and select the kind of invalid operand
exception only when we know we must.  Pass and return float64 values
directly rather than bounce through the CPU_DoubleU union.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21 14:28:45 +10:00
Richard Henderson
ac43cec37e target/ppc: Tidy helper_fadd, helper_fsub
Tidy the invalid exception checking so that we rely on softfloat for
initial argument validation, and select the kind of invalid operand
exception only when we know we must.  Pass and return float64 values
directly rather than bounce through the CPU_DoubleU union.

Note that because we know float_flag_invalid was set, we do not have
to re-check the signs of the infinities.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21 14:28:45 +10:00
Richard Henderson
79f916331d target/ppc: Tidy helper_fmul
Tidy the invalid exception checking so that we rely on softfloat for
initial argument validation, and select the kind of invalid operand
exception only when we know we must.  Pass and return float64 values
directly rather than bounce through the CPU_DoubleU union.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21 14:28:45 +10:00
Richard Henderson
ae13018d79 target/ppc: Honor fpscr_ze semantics and tidy fdiv
Divide by zero, exception taken, leaves the destination register
unmodified.  Therefore we must raise the exception before returning
from helper_fdiv.  Move the check from do_float_check_status into
helper_fdiv.

At the same time, tidy the invalid exception checking so that we
rely on softfloat for initial argument validation, and select the
kind of invalid operand exception only when we know we must.

At the same time, pass and return float64 values directly rather
than bounce through the CPU_DoubleU union.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21 14:28:45 +10:00
Richard Henderson
4a9b3c5dd3 target/ppc: Use atomic cmpxchg for STQCX
When running in a parallel context, we must use a helper in order
to perform the 128-bit atomic operation.  When running in a serial
context, do the compare before the store.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-07-03 09:56:52 +10:00
Richard Henderson
f89ced5f55 target/ppc: Use atomic store for STQ
Section 1.4 of the Power ISA v3.0B states that this insn is
single-copy atomic.  As we cannot (yet) issue 128-bit stores
within TCG, use the generic helpers provided.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-07-03 09:56:52 +10:00
Richard Henderson
94bf265867 target/ppc: Use atomic load for LQ and LQARX
Section 1.4 of the Power ISA v3.0B states that both of these
instructions are single-copy atomic.  As we cannot (yet) issue
128-bit loads within TCG, use the generic helpers provided.

Since TCG cannot (yet) return a 128-bit value, add a slot within
CPUPPCState for returning the high half of a 128-bit return value.
This solution is preferred to the helper assigning to architectural
registers directly, as it avoids clobbering all TCG live values.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-07-03 09:56:52 +10:00
Joel Stanley
6b37554458 target/ppc: Allow privileged access to SPR_PCR
The powerpc Linux kernel[1] and skiboot firmware[2] recently gained changes
that cause the Processor Compatibility Register (PCR) SPR to be cleared.

These changes cause Linux to fail to boot on the Qemu powernv machine
with an error:

 Trying to write privileged spr 338 (0x152) at 0000000030017f0c

With this patch Qemu makes this register available as a hypervisor
privileged register.

Note that bits set in this register disable features of the processor.
Currently the only register state that is supported is when the register
is zeroed (enable all features). This is sufficient for guests to
once again boot.

[1] https://lkml.kernel.org/r/20180518013742.24095-1-mikey@neuling.org
[2] https://patchwork.ozlabs.org/patch/915932/

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12 09:33:52 +10:00
Cédric Le Goater
4a7518e0fd target/ppc: add basic support for PTCR on POWER9
The Partition Table Control Register (PTCR) is a hypervisor privileged
SPR. It contains the host real address of the Partition Table and its
size.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-05-04 09:56:27 +10:00
Cédric Le Goater
7af1e7b022 target/ppc: add support for hypervisor doorbells on book3s CPUs
The hypervisor doorbells are used by skiboot and Linux on POWER9
processors to wake up secondaries.

This adds processor control support to the Server architecture by
reusing the Embedded support. They are very similar, only the bits
definition of the CPU identifier differ.

Still to be done is message broadcast to all threads of the same
processor.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-20 17:15:05 +11:00
Suraj Jitindar Singh
31b2b0f846 target/ppc: Flush TLB on write to PIDR
The PIDR (process id register) is used to store the id of the currently
running process, which is used to select the process table entry used to
perform address translation. This means that when we write to this register
all the translations in the TLB become outdated as they are for a
previously running process. Thus when this register is written to we need
to invalidate the TLB entries to ensure stale entries aren't used to
to perform translation for the new process, which would result in at best
segfaults or alternatively just random memory being accessed.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Fixed compile error for 32-bit targets]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Bharata B Rao
e0aee726bf target-ppc: Add xscvqpudz and xscvqpuwz instructions
xscvqpudz: VSX Scalar truncate & Convert Quad-Precision format to
           Unsigned Doubleword format
xscvqpuwz: VSX Scalar truncate & Convert Quad-Precision format to
           Unsigned Word format

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-02-22 11:28:28 +11:00
Nikunj A Dadhania
a63f1dfc62 target-ppc: add slbieg instruction
slbieg: SLB Invalidate Entry Global

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-02-22 11:28:28 +11:00
Bharata B Rao
d4ccd87e68 target-ppc: Add xsmaxjdp and xsminjdp instructions
xsmaxjdp: VSX Scalar Maximum Type-J Double-Precision
xsminjdp: VSX Scalar Minimum Type-J Double-Precision

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-02-22 11:28:27 +11:00
Bharata B Rao
2770deede0 target-ppc: Add xsmaxcdp and xsmincdp instructions
xsmaxcdp: VSX Scalar Maximum Type-C Double-Precision
xsmincdp: VSX Scalar Minimum Type-C Double-Precision

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-02-22 11:28:27 +11:00
Jose Ricardo Ziviani
f6b99afdc3 ppc: implement xssubqp instruction
xssubqp: VSX Scalar Subtract Quad-Precision.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-02-22 11:28:27 +11:00
Jose Ricardo Ziviani
a4a68476de ppc: implement xssqrtqp instruction
xssqrtqp: VSX Scalar Square Root Quad-Precision.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-02-22 11:28:27 +11:00
Jose Ricardo Ziviani
917950d7f5 ppc: implement xsrqpxp instruction
xsrqpxp: VSX Scalar Round Quad-Precision to Double-Extended Precision.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-02-22 11:28:27 +11:00
Jose Ricardo Ziviani
be07ad5842 ppc: implement xsrqpi[x] instruction
xsrqpi[x]: VSX Scalar Round to Quad-Precision Integer
[with Inexact].

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-02-22 11:28:27 +11:00
Nikunj A Dadhania
78241762c4 target-ppc: Add xststdc[sp, dp, qp] instructions
xststdcsp: VSX Scalar Test Data Class Single-Precision
xststdcdp: VSX Scalar Test Data Class Double-Precision
xststdcqp: VSX Scalar Test Data Class Quad-Precision

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-02-02 09:30:06 +11:00
Nikunj A Dadhania
403a884a40 target-ppc: Add xvtstdc[sp,dp] instructions
xvtstdcsp: VSX Vector Test Data Class Single-Precision
xvtstdcdp: VSX Vector Test Data Class Double-Precision

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-02-02 09:30:06 +11:00
Nikunj A Dadhania
8b920d8abc target-ppc: Add xvcv[hpsp, sphp] instructions
xvcvhpsp: VSX Vector Convert Half Precision to Single Precision
xvcvsphp: VSX Vector Convert Single Precision to Half Precision

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-01-31 10:10:14 +11:00
Bharata B Rao
a811ec0491 target-ppc: Add xsmulqp instruction
xsmulqp: VSX Scalar Multiply Quad-Precision

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-01-31 10:10:14 +11:00
Bharata B Rao
314c116347 target-ppc: Add xsdivqp instruction
xsdivqp: VSX Scalar Divide Quad-Precision

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-01-31 10:10:14 +11:00
Bharata B Rao
48ef23cb26 target-ppc: Add xscvsdqp and xscvudqp instructions
xscvsdqp: VSX Scalar Convert Signed Doubleword format to
          Quad-Precision format
xscvudqp: VSX Scalar Convert Unsigned Doubleword format to
          Quad-Precision format

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-01-31 10:10:14 +11:00
Jose Ricardo Ziviani
5c32e2e4a0 ppc: Implement bcdutrunc. instruction
bcdutrunc. Decimal unsigned truncate. Works like bcdtrunc. with
unsigned BCD numbers.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-01-31 10:10:14 +11:00
Jose Ricardo Ziviani
31bc4d114a ppc: Implement bcdtrunc. instruction
bcdtrunc.: Decimal integer truncate. Given a BCD number in vrb and the
number of bytes to truncate in vra, the return register will have vrb
with such bits truncated.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-01-31 10:10:14 +11:00
Bharata B Rao
05590b9252 target-ppc: Add xscvqps[d,w]z instructions
xscvqpsdz: VSX Scalar truncate & Convert Quad-Precision format to
           Signed Doubleword format
xscvqpswz: VSX Scalar truncate & Convert Quad-Precision format to
           Signed Word format

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-01-31 10:10:14 +11:00
Nikunj A Dadhania
c5969d2eb1 target-ppc: Add xvxsigsp instruction
xvxsigsp: VSX Vector Extract Significand Single Precision

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-01-31 10:10:14 +11:00
Jose Ricardo Ziviani
a54238adac ppc: Implement bcdsr. instruction
bcdsr.: Decimal shift and round. This instruction works like bcds.
however, when performing right shift, 1 will be added to the
result if the last digit was >= 5.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-01-31 10:10:14 +11:00
Jose Ricardo Ziviani
a49a95e9e4 ppc: Implement bcdus. instruction
bcdus.: Decimal unsigned shift. This instruction works like bcds. but
considers only unsigned BCDs (no sign in least meaning 4 bits).

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-01-31 10:10:14 +11:00