Commit Graph

93 Commits

Author SHA1 Message Date
Peter Maydell
3394116f47 target/arm: Implement MVE vector shift right by immediate insns
Implement the MVE vector shift right by immediate insns VSHRI and
VRSHRI.  As with Neon, we implement these by using helper functions
which perform left shifts but allow negative shift counts to indicate
right shifts.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-9-peter.maydell@linaro.org
2021-07-02 11:48:37 +01:00
Peter Maydell
2c0286dba4 target/arm: Use asimd_imm_const for A64 decode
The A64 AdvSIMD modified-immediate grouping uses almost the same
constant encoding that A32 Neon does; reuse asimd_imm_const() (to
which we add the AArch64-specific case for cmode 15 op 1) instead of
reimplementing it all.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-5-peter.maydell@linaro.org
2021-07-02 11:48:36 +01:00
Peter Maydell
dfd66bc0f3 target/arm: Make asimd_imm_const() public
The function asimd_imm_const() in translate-neon.c is an
implementation of the pseudocode AdvSIMDExpandImm(), which we will
also want for MVE.  Move the implementation to translate.c, with a
prototype in translate.h.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210628135835.6690-4-peter.maydell@linaro.org
2021-07-02 11:48:36 +01:00
Peter Maydell
1d2386f70a target/arm: Implement MVE VMLALDAV
Implement the MVE VMLALDAV insn, which multiplies pairs of integer
elements, accumulating them into a 64-bit result in a pair of
general-purpose registers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-20-peter.maydell@linaro.org
2021-06-21 17:12:51 +01:00
Peter Maydell
5138bd0143 target/arm: Add handling for PSR.ECI/ICI
On A-profile, PSR bits [15:10][26:25] are always the IT state bits.
On M-profile, some of the reserved encodings of the IT state are used
to instead indicate partial progress through instructions that were
interrupted partway through by an exception and can be resumed.

These resumable instructions fall into two categories:

(1) load/store multiple instructions, where these bits are called
"ICI" and specify the register in the ldm/stm list where execution
should resume.  (Specifically: LDM, STM, VLDM, VSTM, VLLDM, VLSTM,
CLRM, VSCCLRM.)

(2) MVE instructions subject to beatwise execution, where these bits
are called "ECI" and specify which beats in this and possibly also
the following MVE insn have been executed.

There are also a few insns (LE, LETP, and BKPT) which do not use the
ICI/ECI bits but must leave them alone.

Otherwise, we should raise an INVSTATE UsageFault for any attempt to
execute an insn with non-zero ICI/ECI bits.

So far we have been able to ignore ECI/ICI, because the architecture
allows the IMPDEF choice of "always restart load/store multiple from
the beginning regardless of ICI state", so the only thing we have
been missing is that we don't raise the INVSTATE fault for bad guest
code.  However, MVE requires that we honour ECI bits and do not
rexecute beats of an insn that have already been executed.

Add the support in the decoder for handling ECI/ICI:
 * identify the ECI/ICI case in the CONDEXEC TB flags
 * when a load/store multiple insn succeeds, it updates the ECI/ICI
   state (both in DisasContext and in the CPU state), and sets a flag
   to say that the ECI/ICI state was handled
 * if we find that the insn we just decoded did not handle the
   ECI/ICI state, we delete all the code that we just generated for
   it and instead emit the code to raise the INVFAULT.  This allows
   us to avoid having to update every non-MVE non-LDM/STM insn to
   make it check for "is ECI/ICI set?".

We continue with our existing IMPDEF choice of not caring about the
ICI state for the load/store multiples and simply restarting them
from the beginning.  Because we don't allow interrupts in the middle
of an insn, the only way we would see this state is if the guest set
ICI manually on return from an exception handler, so it's a corner
case which doesn't merit optimisation.

ICI update for LDM/STM is simple -- it always zeroes the state.  ECI
update for MVE beatwise insns will be a little more complex, since
the ECI state may include information for the following insn.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210614151007.4545-5-peter.maydell@linaro.org
2021-06-16 14:33:52 +01:00
Peter Maydell
9194a9cbc7 target/arm: Move NeonGenThreeOpEnvFn typedef to translate.h
Move the NeonGenThreeOpEnvFn typedef to translate.h together
with the other similar typedefs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210430132740.10391-12-peter.maydell@linaro.org
2021-05-10 13:24:09 +01:00
Peter Maydell
d9318a5f9c target/arm: Share unallocated_encoding() and gen_exception_insn()
The unallocated_encoding() function is the same in both
translate-a64.c and translate.c; make the translate.c function global
and drop the translate-a64.c version.  To do this we need to also
share gen_exception_insn(), which currently exists in two slightly
different versions for A32 and A64: merge those into a single
function that can work for both.

This will be useful for splitting up translate.c, which will require
unallocated_encoding() to no longer be file-local.  It's also
hopefully less confusing to have only one version of the function
rather than two.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210430132740.10391-3-peter.maydell@linaro.org
2021-05-10 13:24:09 +01:00
Peter Maydell
b5aa664679 target/arm: Move constant expanders to translate.h
Some of the constant expanders defined in translate.c are generically
useful and will be used by the separate C files for VFP and Neon once
they are created; move the expander definitions to translate.h.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210430132740.10391-2-peter.maydell@linaro.org
2021-05-10 13:24:09 +01:00
Richard Henderson
a8502b37f6 target/arm: Enforce alignment for VLDn (all lanes)
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210419202257.161730-23-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-04-30 11:16:51 +01:00
Richard Henderson
9d486b40e8 target/arm: Adjust gen_aa32_{ld, st}_i32 for align+endianness
Create a finalize_memop function that computes alignment and
endianness and returns the final MemOp for the operation.

Split out gen_aa32_{ld,st}_internal_i32 which bypasses any special
handling of endianness or alignment.  Adjust gen_aa32_{ld,st}_i32
so that s->be_data is not added by the callers.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210419202257.161730-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-04-30 11:16:50 +01:00
Richard Henderson
4479ec30c9 target/arm: Add ALIGN_MEM to TBFLAG_ANY
Use this to signal when memory access alignment is required.
This value comes from the CCR register for M-profile, and
from the SCTLR register for A-profile.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210419202257.161730-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-04-30 11:16:50 +01:00
Richard Henderson
a378206a20 target/arm: Move mode specific TB flags to tb->cs_base
Now that we have all of the proper macros defined, expanding
the CPUARMTBFlags structure and populating the two TB fields
is relatively simple.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210419202257.161730-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-04-30 11:16:50 +01:00
Richard Henderson
3902bfc6f0 target/arm: Introduce CPUARMTBFlags
In preparation for splitting tb->flags across multiple
fields, introduce a structure to hold the value(s).
So far this only migrates the one uint32_t and fixes
all of the places that require adjustment to match.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210419202257.161730-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-04-30 11:16:50 +01:00
Richard Henderson
8a40fe5f1b target/arm: Rearrange {sve,fp}_check_access assert
We want to ensure that access is checked by the time we ask
for a specific fp/vector register.  We want to ensure that
we do not emit two lots of code to raise an exception.

But sometimes it's difficult to cleanly organize the code
such that we never pass through sve_check_access exactly once.
Allow multiple calls so long as the result is true, that is,
no exception to be raised.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200815013145.539409-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-08-28 10:02:47 +01:00
Peter Maydell
aaae563bc7 target/arm: Implement FPST_STD_F16 fpstatus
Architecturally, Neon FP16 operations use the "standard FPSCR" like
all other Neon operations.  However, this is defined in the Arm ARM
pseudocode as "a fixed value, except that FZ16 (and AHP) follow the
FPSCR bits". In QEMU, the softfloat float_status doesn't include
separate flush-to-zero for FP16 operations, so we must keep separate
fp_status for "Neon non-FP16" and "Neon fp16" operations, in the
same way we do already for the non-Neon "fp_status" vs "fp_status_f16".

Add the extra float_status field to the CPU state structure,
ensure it is correctly initialized and updated on FPSCR writes,
and make fpstatus_ptr(FPST_STD_F16) return a pointer to it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200806104453.30393-4-peter.maydell@linaro.org
2020-08-24 10:15:11 +01:00
Peter Maydell
cdfb22bb73 target/arm: Replace A64 get_fpstatus_ptr() with generic fpstatus_ptr()
We currently have two versions of get_fpstatus_ptr(), which both take
an effectively boolean argument:
 * the one for A64 takes "bool is_f16" to distinguish fp16 from other ops
 * the one for A32/T32 takes "int neon" to distinguish Neon from other ops

This is confusing, and to implement ARMv8.2-FP16 the A32/T32 one will
need to make a four-way distinction between "non-Neon, FP16",
"non-Neon, single/double", "Neon, FP16" and "Neon, single/double".
The A64 version will then be a strict subset of the A32/T32 version.

To clean this all up, we want to go to a single implementation which
takes an enum argument with values FPST_FPCR, FPST_STD,
FPST_FPCR_F16, and FPST_STD_F16.  We rename the function to
fpstatus_ptr() so that unconverted code gets a compilation error
rather than silently passing the wrong thing to the new function.

This commit implements that new API, and converts A64 to use it:
 get_fpstatus_ptr(false) -> fpstatus_ptr(FPST_FPCR)
 get_fpstatus_ptr(true) -> fpstatus_ptr(FPST_FPCR_F16)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200806104453.30393-2-peter.maydell@linaro.org
2020-08-24 10:15:11 +01:00
Richard Henderson
5f716a8238 target/arm: Implement the LDGM, STGM, STZGM instructions
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200626033144.790098-20-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-06-26 14:31:12 +01:00
Richard Henderson
81ae05fa2d target/arm: Add MTE bits to tb_flags
Cache the composite ATA setting.

Cache when MTE is fully enabled, i.e. access to tags are enabled
and tag checks affect the PE.  Do this for both the normal context
and the UNPRIV context.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200626033144.790098-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-06-26 14:31:12 +01:00
Richard Henderson
329833286d target/arm: Add DISAS_UPDATE_NOCHAIN
Add an option that writes back the PC, like DISAS_UPDATE_EXIT,
but does not exit back to the main loop.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200626033144.790098-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-06-26 14:31:12 +01:00
Richard Henderson
14407ec200 target/arm: Rename DISAS_UPDATE to DISAS_UPDATE_EXIT
Emphasize that the is_jmp option exits to the main loop.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200626033144.790098-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-06-26 14:31:12 +01:00
Peter Maydell
3e96b20528 target/arm: Convert simple fp Neon 2-reg-misc insns
Convert the Neon 2-reg-misc insns which are implemented with
simple calls to functions that take the input, output and
fpstatus pointer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200616170844.13318-16-peter.maydell@linaro.org
2020-06-23 11:39:47 +01:00
Peter Maydell
8966808205 target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree
Convert the VREV32 and VREV16 insns in the Neon 2-reg-misc group
to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200616170844.13318-13-peter.maydell@linaro.org
2020-06-23 11:39:46 +01:00
Peter Maydell
5de3fd045b target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs
All the other typedefs like these spell "Op" with a lowercase 'p';
remane the NeonGenTwoSingleOPFn and NeonGenTwoDoubleOPFn typedefs to
match.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200616170844.13318-11-peter.maydell@linaro.org
2020-06-23 11:39:46 +01:00
Peter Maydell
039f4e809a target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn
The NeonGenOneOpFn typedef breaks with the pattern of the other
NeonGen*Fn typedefs, because it is a TCGv_i64 -> TCGv_i64 operation
but it does not have '64' in its name. Rename it to NeonGenOne64OpFn,
so that the old name is available for a TCGv_i32 -> TCGv_i32 operation
(which we will need in a subsequent commit).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200616170844.13318-10-peter.maydell@linaro.org
2020-06-23 11:39:46 +01:00
Peter Maydell
f5b2840120 target/arm: Convert Neon 3-reg-diff VABAL, VABDL to decodetree
Convert the Neon 3-reg-diff insns VABAL and VABDL to decodetree.
Like almost all the remaining insns in this group, these are
a combination of a two-input operation which returns a double width
result and then a possible accumulation of that double width
result into the destination.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2020-06-16 10:32:25 +01:00
Richard Henderson
cfdb2c0c95 target/arm: Vectorize SABA/UABA
Include 64-bit element size in preparation for SVE2.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-14 15:03:08 +01:00
Richard Henderson
50c160d44e target/arm: Vectorize SABD/UABD
Include 64-bit element size in preparation for SVE2.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-16-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-14 15:03:08 +01:00
Richard Henderson
146aa66ce5 target/arm: Create gen_gvec_{qrdmla,qrdmls}
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-14 15:03:08 +01:00
Richard Henderson
c7715b6b51 target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub}
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-14 15:03:08 +01:00
Richard Henderson
8161b75357 target/arm: Create gen_gvec_{cmtst,ushl,sshl}
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-14 15:03:08 +01:00
Richard Henderson
271063206a target/arm: Create gen_gvec_{mla,mls}
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-14 15:03:08 +01:00
Richard Henderson
69d5e2bf8c target/arm: Create gen_gvec_{ceq,clt,cle,cgt,cge}0
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Macro-ize the 5 nearly identical comparisons.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-14 15:03:08 +01:00
Richard Henderson
893ab0542a target/arm: Create gen_gvec_{sri,sli}
The functions eliminate duplication of the special cases for
this operation.  They match up with the GVecGen2iFn typedef.

Add out-of-line helpers.  We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-14 15:03:08 +01:00
Richard Henderson
6ccd48d4ea target/arm: Create gen_gvec_{u,s}{rshr,rsra}
Create vectorized versions of handle_shri_with_rndacc
for shift+round and shift+round+accumulate.  Add out-of-line
helpers in preparation for longer vector lengths from SVE.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-14 15:03:08 +01:00
Richard Henderson
631e565450 target/arm: Create gen_gvec_[us]sra
The functions eliminate duplication of the special cases for
this operation.  They match up with the GVecGen2iFn typedef.

Add out-of-line helpers.  We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-05-14 15:03:08 +01:00
Peter Maydell
9aefc6cf9b target/arm: Move gen_ function typedefs to translate.h
We're going to want at least some of the NeonGen* typedefs
for the refactored 32-bit Neon decoder, so move them all
to translate.h since it makes more sense to keep them in
one group.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-23-peter.maydell@linaro.org
2020-05-04 12:59:26 +01:00
Peter Maydell
a4e143ac5b target/arm: Convert Neon 3-reg-same VADD/VSUB to decodetree
Convert the Neon 3-reg-same VADD and VSUB insns to decodetree.

Note that we don't need the neon_3r_sizes[op] check here because all
size values are OK for VADD and VSUB; we'll add this when we convert
the first insn that has size restrictions.

For this we need one of the GVecGen*Fn typedefs currently in
translate-a64.h; move them all to translate.h as a block so they
are visible to the 32-bit decoder.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200430181003.21682-15-peter.maydell@linaro.org
2020-05-04 12:57:56 +01:00
Richard Henderson
6b375d3546 target/arm: Vectorize integer comparison vs zero
These instructions are often used in glibc's string routines.
They were the final uses of the 32-bit at a time neon helpers.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200418162808.4680-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-04-30 15:35:41 +01:00
Richard Henderson
87b74e8b6e target/arm: Vectorize USHL and SSHL
These instructions shift left or right depending on the sign
of the input, and 7 bits are significant to the shift.  This
requires several masks and selects in addition to the actual
shifts to form the complete answer.

That said, the operation is still a small improvement even for
two 64-bit elements -- 13 vector operations instead of 2 * 7
integer operations.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200216214232.4230-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-21 16:07:02 +00:00
Richard Henderson
cc28fc30e3 target/arm: Update get_a64_user_mem_index for VHE
The EL2&0 translation regime is affected by Load Register (unpriv).

The code structure used here will facilitate later changes in this
area for implementing UAO and NV.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-36-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-07 14:04:26 +00:00
Richard Henderson
fba37aedec target/arm: Rename ARMMMUIdx_S1SE[01] to ARMMMUIdx_SE10_[01]
This is part of a reorganization to the set of mmu_idx.
This emphasizes that they apply to the Secure EL1&0 regime.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200206105448.4726-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-07 14:04:23 +00:00
Marc Zyngier
5bb0a20b74 target/arm: Handle AArch32 CP15 trapping via HSTR_EL2
HSTR_EL2 offers a way to trap ranges of CP15 system register
accesses to EL2, and it looks like this register is completely
ignored by QEMU.

To avoid adding extra .accessfn filters all over the place (which
would have a direct performance impact), let's add a new TB flag
that gets set whenever HSTR_EL2 is non-zero and that QEMU translates
a context where this trap has a chance to apply, and only generate
the extra access check if the hypervisor is actively using this feature.

Tested with a hand-crafted KVM guest accessing CBAR.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191201122018.25808-5-maz@kernel.org
[PMM: use is_a64(); fix comment syntax]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-12-16 10:46:35 +00:00
Peter Maydell
9de65783e1 Allow page table bit to swap endianness.
Reorganize watchpoints out of i/o path.
 Return host address from probe_write / probe_access.
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAl1uiyYdHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8AuwgAnYLQQbL8kjSqzp7q
 gRlj0M2SX41ZW3fMkI794RwsljD9Z0QS7YGnpzHolig9XUYrGnip7STrMvlCr/1L
 CIMWNHlgitgBMszLqg42/TB+6RxXn+DMX/ShUzTagC6xQhinCIpdEjoLaTKSgeP+
 foIyJ2uoJLKOBP8cPTQp8evongtoQIljpsZZ0K8a4sreO1d6ytH+olkuoGiROft+
 VoJkA+kNHd9cE+LPCva8UFGu1QE6uCySvhepzOpnvOtK+SXKUm2yLOFGu7RWP1pT
 RkE0oRyRnImtg+cViHfUUFogIffFROdL5tuYMQVuqbINeROPUgJPav+R1Nz1P60a
 xM2HEw==
 =bLLU
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20190903' into staging

Allow page table bit to swap endianness.
Reorganize watchpoints out of i/o path.
Return host address from probe_write / probe_access.

# gpg: Signature made Tue 03 Sep 2019 16:47:50 BST
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-tcg-20190903: (36 commits)
  tcg: Factor out probe_write() logic into probe_access()
  tcg: Make probe_write() return a pointer to the host page
  s390x/tcg: Pass a size to probe_write() in do_csst()
  hppa/tcg: Call probe_write() also for CONFIG_USER_ONLY
  mips/tcg: Call probe_write() for CONFIG_USER_ONLY as well
  tcg: Enforce single page access in probe_write()
  tcg: Factor out CONFIG_USER_ONLY probe_write() from s390x code
  s390x/tcg: Fix length calculation in probe_write_access()
  s390x/tcg: Use guest_addr_valid() instead of h2g_valid() in probe_write_access()
  tcg: Check for watchpoints in probe_write()
  cputlb: Handle watchpoints via TLB_WATCHPOINT
  cputlb: Remove double-alignment in store_helper
  cputlb: Fix size operand for tlb_fill on unaligned store
  exec: Factor out cpu_watchpoint_address_matches
  cputlb: Fold TLB_RECHECK into TLB_INVALID_MASK
  exec: Factor out core logic of check_watchpoint()
  exec: Move user-only watchpoint stubs inline
  target/sparc: sun4u Invert Endian TTE bit
  target/sparc: Add TLB entry with attributes
  cputlb: Byte swap memory transaction attribute
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-09-04 16:29:18 +01:00
Tony Nguyen
14776ab5a1 tcg: TCGMemOp is now accelerator independent MemOp
Preparation for collapsing the two byte swaps, adjust_endianness and
handle_bswap, along the I/O path.

Target dependant attributes are conditionalized upon NEED_CPU_H.

Signed-off-by: Tony Nguyen <tony.nguyen@bt.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <81d9cd7d7f5aaadfa772d6c48ecee834e9cf7882.1566466906.git.tony.nguyen@bt.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2019-09-03 08:30:38 -07:00
Richard Henderson
429a71d67e Revert "target/arm: Use unallocated_encoding for aarch32"
This reverts commit 3cb3663715.

Despite the fact that the text for the call to gen_exception_insn
is identical for aarch64 and aarch32, the implementation inside
gen_exception_insn is totally different.

This fixes exceptions raised from aarch64.

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20190826151536.6771-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-09-03 16:20:34 +01:00
Richard Henderson
3cb3663715 target/arm: Use unallocated_encoding for aarch32
Promote this function from aarch64 to fully general use.
Use it to unify the code sequences for generating illegal
opcode exceptions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-08-16 14:02:50 +01:00
Richard Henderson
a04159166b target/arm: Replace s->pc with s->base.pc_next
We must update s->base.pc_next when we return from the translate_insn
hook to the main translator loop.  By incrementing s->base.pc_next
immediately after reading the insn word, "pc_next" contains the address
of the next instruction throughout translation.

All remaining uses of s->pc are referencing the address of the next insn,
so this is now a simple global replacement.  Remove the "s->pc" field.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-08-16 14:02:49 +01:00
Richard Henderson
43722a6d4f target/arm: Introduce pc_curr
Add a new field to retain the address of the instruction currently
being translated.  The 32-bit uses are all within subroutines used
by a32 and t32.  This will become less obvious when t16 support is
merged with a32+t32, and having a clear definition will help.

Convert aarch64 as well for consistency.  Note that there is one
instance of a pre-assert fprintf that used the wrong value for the
address of the current instruction.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-08-16 14:02:49 +01:00
Peter Maydell
8bd587c106 target/arm: Fix routing of singlestep exceptions
When generating an architectural single-step exception we were
routing it to the "default exception level", which is to say
the same exception level we execute at except that EL0 exceptions
go to EL1. This is incorrect because the debug exception level
can be configured by the guest for situations such as single
stepping of EL0 and EL1 code by EL2.

We have to track the target debug exception level in the TB
flags, because it is dependent on CPU state like HCR_EL2.TGE
and MDCR_EL2.TDE. (That we were previously calling the
arm_debug_target_el() function to determine dc->ss_same_el
is itself a bug, though one that would only have manifested
as incorrect syndrome information.) Since we are out of TB
flag bits unless we want to expand into the cs_base field,
we share some bits with the M-profile only HANDLER and
STACKCHECK bits, since only A-profile has this singlestep.

Fixes: https://bugs.launchpad.net/qemu/+bug/1838913
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190805130952.4415-3-peter.maydell@linaro.org
2019-08-16 14:02:49 +01:00
Peter Maydell
c1d5f50f09 target/arm: Factor out 'generate singlestep exception' function
Factor out code to 'generate a singlestep exception', which is
currently repeated in four places.

To do this we need to also pull the identical copies of the
gen-exception() function out of translate-a64.c and translate.c
into translate.h.

(There is a bug in the code: we're taking the exception to the wrong
target EL.  This will be simpler to fix if there's only one place to
do it.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190805130952.4415-2-peter.maydell@linaro.org
2019-08-16 14:02:48 +01:00