Many of the Neon "2 register misc" instruction forms require invalid
size fields to cause the instruction to UNDEF. Pull this information
out into an array; this simplifies the code and also means we can do
the check early and avoid the problem of leaking TCG temporaries in
the illegal_op case.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add missing checks for cases which must UNDEF in the Neon "2 registers and
a scalar" data processing instruction space.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add missing UNDEF checks for instructions in the Neon "3 registers of
different widths" data processing space.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
For Neon "one register and a modified immediate value" forms, the
combination op=1 cmode=1111 is unallocated and should UNDEF.
All instructions of this form also UNDEF if Q == 1 and Vd<0> == 1.
We also add a comment on the only UNPREDICTABLE in this space.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Collapse some switch cases for VSRI into those for VSHL, VSLI,
since the bodies are the same. (This is not completely obvious
for the size < 3 case, but since for VSRI we know U=1 the
GEN_NEON_INTEGER_OP() expansion is equivalent to the open-coded
VSHL/VSLI case.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Correctly handle all the UNDEF cases for Neon instructions of the
"2 registers and shift" form, and make sure that we check for these
cases early enough not to leak TCG temporaries.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Since we know that the case of (pairwise && q) has been caught
earlier, we can simplify the register setup code for each pass
in the three-register-same-size Neon loop.
Signed-off-by: Juha Riihimäki <juha.riihimaki@nokia.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Correct the handling of UNDEF cases for the NEON "3 registers same
size" forms, by adding missing checks and rationalising some others
so they are done early enough to avoid leaking TCG temporaries.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Simplify the checks for invalid size values for the Neon "three registers
of the same size" instruction forms (and add them where they were missing)
by using a lookup table.
This includes adding symbolic constants for the op values in this space,
since we now use them in multiple places.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Currently target-arm/ assumes at least ARMv5 core. Add support for
handling also ARMv4/ARMv4T. This changes the following instructions:
BX(v4T and later)
BKPT, BLX, CDP2, CLZ, LDC2, LDRD, MCRR, MCRR2, MRRC, MCRR, MRC2, MRRC,
MRRC2, PLD QADD, QDADD, QDSUB, QSUB, STRD, SMLAxy, SMLALxy, SMLAWxy,
SMULxy, SMULWxy, STC2 (v5 and later)
All instructions that are "v5TE and later" are also bound to just v5, as
that's how it was before.
This patch doesn _not_ include disabling of cp15 access and base-updated
data abort model (that will be required to emulate chips based on a
ARM7TDMI), because:
* no ARM7TDMI chips are currently emulated (or planned)
* those features aren't strictly necessary for my purposes (SA-1 core
emulation).
All v5 models are handled as they are v5T. Internally we still have a
check if the model is a v5(T) or v5TE, but as all emulated cores are
v5TE, those two cases are simply aliased (for now).
Patch is heavily based on patch by Filip Navara <filip.navara@gmail.com>
which in turn is based on work by Ulrich Hecht <uli@suse.de> and Vincent
Sanders <vince@kyllikki.org>.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
tcg_gen_exit_tb takes a parameter of type tcg_target_long,
so the type casts of pointer to long should be replaced by
type casts of pointer to tcg_target_long (suggested by Blue Swirl).
These changes are needed for build environments where
sizeof(long) != sizeof(void *), especially for w64.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use the global 'env' variable in the helper functions in iwmmxt_helper.c.
This means we don't need to pass env as an argument to them any more.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use the global 'env' variable in the helper functions in neon_helper.c.
This means we don't need to pass env as an argument to them any more.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implementing the floating-point versions of VCLE #0 and VCLT #0 by
doing a GT comparison and inverting the result gives the wrong
result if the input is a NaN. Implement as a GT comparison with the
operands swapped instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Move the allocation and freeing of the TCG temp used for the address for
Neon load/store instructions so that we don't allocate the temporary
until we've done enough decoding to know that the instruction is not
an UNDEF pattern; this avoids leaking the TCG temp in these cases.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Fix several bugs in VLD of single element to all lanes:
The "single element to all lanes" form of VLD1 differs from those for
VLD2, VLD3 and VLD4 in that bit 5 indicates whether the loaded element
should be written to one or two Dregs (rather than being a register
stride). Handle this by special-casing VLD1 rather than trying to
have one loop which deals with both VLD1 and 2/3/4.
Handle VLD4.32 with 16 byte alignment specified, rather than UNDEFfing.
UNDEF for the invalid size and alignment combinations.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
SMUAD and SMLAD are supposed to set the Q bit if the addition of
the two 16x16 multiply products and optional accumulator overflows
considered as a signed value. However we were only doing this check
for the addition of the accumulator, not when adding the products,
with the effect that we were mishandling the edge case where
both inputs are 0x80008000.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Decode of Thumb load/store was merging together the cases of 'bit 11==0'
(reg+reg LSL imm) and 'bit 11==1' (reg+imm). This happens to work for
valid instruction patterns but meant that we would not UNDEF for the
cases the architecture mandates that we must. Make the decode actually
look at bit 11 as well as [10..8] so that we UNDEF in the right places.
This change also removes what was a spurious unreachable 'case 8',
and correctly frees TCG temporaries on the illegal-insn codepaths.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Newer ARM kernels try to probe for whether the CPU has hardware breakpoint
support. For this to work QEMU has to implement a minimal set of the cp14
debug registers. The architecture requires v7 cores to implement debug
and so there is no defined way to report its absence; however in practice
returning a zero DBGDIDR (ie with a reserved value for "debug architecture
version") should cause well-written hw debug users to do the right thing.
We also implement DBGDRAR and DBGDSAR as RAZ, indicating no memory mapped
debug components.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use the new TCG temporary leak debugging facilities to
check that each ARM instruction does not leak temporaries.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This commit removes the ad-hoc resource leak checking code from
target-arm. This includes replacing all uses of new_tmp() with
tcg_temp_new_i32() and all uses of dead_tmp() with
tcg_temp_free_i32().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The code for Thumb2 ORNS (or negated and set flags) was trashing
a TCG input register which was needed later for use in calculating
flags, with the effect that the carry flag was always set with
the wrong sense. Fix this by using the TCG orc op instead of
separate not and or ops.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Fix two bugs in the translation of the instructions VMOV sa,sb,rx,ry and
VMOV rx,ry,sa,sb (which copy between a pair of ARM core registers and a
pair of VFP single precision registers):
* An incorrect condition meant these instruction patterns were being
treated as load/store multiple, which resulted in the generation
of bad code and a runtime segfault
* The order of the core register pair was reversed so the values would
go to the wrong registers
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
In v7 of the ARM architecture, WFI (wait for interrupt) is a first-class
instruction, but in previous versions this functionality was provided
via a cp15 coprocessor register. Add correct feature checks to the
decoding of the cp15 WFI instructions so that they behave correctly
for newer cores. In particular, the old 0,c7,c8,2 encoding used on
ARM940 has been reused for VA-to-PA translation in v6 and v7.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
For Neon shifts by immediate and narrow, correctly handle the case
where the source registers and the destination registers overlap
(the second pass should use the original register contents, not the
results of the first pass).
This includes a refactoring to pull the size check outside the
loop rather than inside, since there is now very little common
code between the size == 3 and size != 3 case.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Pull the code which decodes narrowing operations as being either
signed/unsigned saturate or plain out into its own function.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Call the normal shift helpers instead of the rounding ones.
Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Move the implementation of the Neon VUZP unzip instruction from inline
code to helper functions. (At 50+ TCG ops it was well over the
recommended limit for coding inline.) The helper implementations also
give the correct answers where the inline implementation did not.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Move the implementation of the Neon VUZP unzip instruction from inline
code to helper functions. (At 50+ TCG ops it was well over the
recommended limit for coding inline.) The helper implementations also
fix the handling of the quadword version of the instruction.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
We handle Thumb Neon data processing instructions by converting them
into the equivalent ARM encoding, as the two are very close. However
the ARM encoding should have bit 28 set, not clear. This wasn't causing
any problems because we don't actually look at that bit during decode;
however it is better to do the conversion correctly to avoid problems
later if we add checks to UNDEF on SBZ/SBO bits.
Signed-off-by: Juha Riihimäki <juha.riihimaki@nokia.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
For VQDMLSL, negation has to occur after saturation, not before.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Refactor the handling of VQDMULL so that it is dealt with in
its own if() case rather than together with the accumulating
instructions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implement VMULL.P8 (the 32x32->64 version of the polynomial multiply
instruction).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The Neon half-precision conversion operations (VCVT.F16.F32 and
VCVT.F32.F16) use ARM standard floating-point arithmetic, unlike
the VFP versions (VCVTB and VCVTT).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Fix bit mask used when widening the result of shift on narrow input.
Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
VQMOVUN does a signed-to-unsigned saturating conversion. This is
different from both the signed-to-signed and unsigned-to-unsigned
conversions already implemented, so we need a new set of helper
functions (neon_unarrow_sat*).
Signed-off-by: Juha Riihimäki <juha.riihimaki@nokia.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Refine the decoding of the Thumb preload and hint space, so we
UNDEF on the patterns that are supposed to UNDEF rather than NOP.
We also move the tests for this space earlier, so we don't emit
harmless but unnecessary address generation code for preload
hints (which by their nature are likely to be in hot code paths).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Correct the decoding of the ARM preload and memory hint space,
by adding decoding of PLI, PLDW and the v7MP unallocated hint
space. This commit also corrects a slightly overexuberant
decoding of PLD(register) which was not checking that bit 4
was one.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This patch fixes the errors reported by my tests in VSRA.
Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Fix the register and part of register we get the scalar from in
the various "multiply vector by scalar" ops (VMUL by scalar
and friends).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This patch fixes corner-case saturations, when the target range is
zero. It merely removes the guard against (sh == 0), and makes:
__ssat(0x87654321, 1) return 0xffffffff and set the saturation flag
__usat(0x87654321, 0) return 0 and set the saturation flag
Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add support for logging the start of instructions in TCG
code debug dumps for ARM targets.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
We were not correctly restoring the IT bits when resuming execution
after taking an unexpected exception in the middle of an IT block.
Fix this by tracking them along with PC changes and restoring in
gen_pc_load().
This fixes bug https://bugs.launchpad.net/qemu/+bug/581335
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Create a new function which does the common sequence of gen_set_condexec,
gen_set_pc_im, gen_exception, set is_jmp to DISAS_JUMP.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Remove a redundant call to gen_set_condexec() in the translation of Thumb
mode SWI. (SWI and WFI generate "exceptions" which happen after the
execution of the instruction, ie when PC and IT bits have updated.
So the condexec bits at this point are not correct. However, the code
that handles finishing the translation of the TB will write the correct
value of the condexec bits later, so the only effect was that a conditional
Thumb SWI would generate slightly worse code than necessary.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
When translating, get the user/priv state from the TB flags, not
the CPUState.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
When translating, the condexec bits for the TB are in the TB flags;
the CPUState condexec bits may be different.
This patch fixes https://bugs.launchpad.net/bugs/604872 where we might
segfault if we took an exception in the middle of a TB with an IT
block, because when we came to retranslate in cpu_restore_state()
the CPUState condexec bits would have advanced compared to the start
of the TB and we would generate different (wrong) code.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The Thumb/ARM state for the TB being translated should come from
the TB flags, not the CPUState.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
When translating, the VFP vector length and stride for this TB are encoded
in the TB flags; the CPUState copies may be different and must not be used.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
When translating code, whether the VFP unit is enabled for this TB
is stored in a bit in the TB flags. Use this rather than incorrectly
reading the FPEXC from the CPUState passed to translation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
When translating the SRS instruction, handle the "store registers
to stack of current mode" case in the helper function rather than
inline. This means the generated code does not make assumptions
about the current CPU mode which might not be valid when the TB
is executed later.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Fix errors in the decoding of ARM VQSHL/VQSHLU immediate forms,
including using the new VQSHLU helper functions where appropriate.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
SMMLA and SMMLS are broken on both in normal and thumb mode, that is
both (different) implementations are wrong. They try to avoid a 64-bit
add for the rounding, which is not trivial if you want to support both
SMMLA and SMMLS with the same code.
The code below uses the same implementation for both modes, using the
code from the ARM manual. It also fixes the thumb decoding that was a
mix between normal and thumb mode.
This fixes the issues reported in
https://bugs.launchpad.net/qemu/+bug/629298
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
UMAAL should use unsigned multiply instead of signed.
This patch fixes this issue by handling UMAAL separately from
UMULL/UMLAL/SMULL/SMLAL as these instructions are different
enough. It also explicitly list instructions in case and catch
nonexistent instruction as illegal. Also fixes a few style issues.
This fixes the issues reported in
https://bugs.launchpad.net/qemu/+bug/696015
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Correct the arguments passed when generating neon qshl_{u,s}64()
helpers so that we use the correct registers.
Signed-off-by: Juha Riihimäki <juha.riihimaki@nokia.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The translation of REVSH shifted the low byte 8 steps left before performing
an 8-bit sign extend, causing this part of the expression to alwas be 0.
Reported-by: Johan Bengtsson <teofrastius@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Fix errors in the decoding of the Neon forms of fixed-point VCVT:
* fixed-point VCVT is op 14 and 15, not 15 and 16
* the fbits immediate field was being misinterpreted
* the sense of the to_fixed bit was inverted
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
Correct the decoding of source and destination registers
for the VFP forms of the VCVT instructions which convert
between floating point and integer or fixed-point.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
Correct ldrexd and strexd code to always read and write the
high word of the 64-bit value from addr+4.
Also make ldrexd and strexd agree that for a 64 bit value the
address in env->exclusive_addr is that of the low word.
This fixes the issues reported in
https://bugs.launchpad.net/qemu/+bug/670883
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
Refine check on bkpt so that smc and undefined instruction encodings are
handled as an undefined instruction and trap.
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
The thumb2 decoder contained a mixup between the bit controlling
doubling and the bit controlling if the operation was an add or a sub.
Signed-off-by: Johan Bengtsson <teofrastius@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
The PKHxx instructions were not recognized by the thumb2 decoder. The
solution provided in this changeset is identical to the arm-mode
implementation.
Signed-off-by: Johan Bengtsson <teofrastius@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
fprintf_function uses format checking with GCC_FMT_ATTR.
Format errors were fixed in
* target-i386/helper.c
* target-mips/translate.c
* target-ppc/translate.c
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
When combining multiple values as part of a NEON array load, do explcit
shift/or rather than using gen_bfi. This voids redundant mask
operations.
Signed-off-by: Paul Brook <paul@codesourcery.com>
This patch fixes few resource leaks in the iwmmxt disassemble.
Signed-off-by: Lars Munch <lars@segv.dk>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Do not try to insert a conditional jump over next instruction when the
condition code is AL as this will trigger an internal error.
Signed-off-by: Johan Bengtsson <teofrastius@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The rfe instruction can be used with any register, not just sp. Adjust the
condition check accordingly.
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
There's a return missing in the srs handling which leads to srs always being
treated an an invalid op.
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
implementation only widened the 32bit source vector elements into a
64bit destination vector but forgot to perform the actual shifting
operation.
Signed-off-by: Juha Riihimäki <juha.riihimaki@nokia.com>
Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
Acked-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The rounding/truncating options were inverted. truncating
was done when rounding was meant and vice verse.
Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
Acked-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Support the "subs pc, lr" Thumb-2 exception return instruction.
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Paul Brook <paul@codesourcery.com>
The Thumb CPS currently does not work correctly: CPSID touches more bits
than the instruction wants to, and CPSIE does nothing. Fix it by
passing the correct mask (the "affect" bits) and value.
Signed-off-by: Rabin Vincent <rabin@rab.in>
Access the cp15.c13 TLS registers directly with TCG ops instead of with
a slow helper. If the the cp15 read/write was not TLS register access,
fall back to the cp15 helper.
This makes accessing __thread variables in linux-user when apps are compiled
with -mtp=cp15 possible. legal cp15 register to acces from linux-user are
already checked in cp15_user_ok.
While at it, make the cp15.c13 Thread ID registers available only on
ARMv6K and newer.
Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
Shift by immediate value is incorrectly overwritten by a temporary
variable in the processing of NEON vsri, vshl and vsli instructions.
This patch has been revised to also include a fix for the special
case where the code would previously try to shift an integer value
over 31 bits left/right.
Signed-off-by: Juha Riihimäki <juha.riihimaki@nokia.com>
Acked-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
All other bits except for the EN in the VFP FPEXC register are defined
as subarchitecture specific and real functionality for any of the
other bits has not been implemented in QEMU. However, current code
allows modifying all bits in the VFP FPEXC register leading to
problems when guest code is writing 1's to the subarchitecture
specific bits and checking whether the bits stay up to verify the
existence of functionality which in fact does not exist in QEMU.
This patch has been revised to include the same behavior change in
the gdb register write function.
Signed-off-by: Juha Riihimäki <juha.riihimaki@nokia.com>
Acked-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add support for NEON vld1.64 and vst1.64 instructions. This patch is
revised to follow more closely the specification and raises
undefined exception if 64bit element size is used for vld2/vst2 or
vld4/vst4 instructions.
Signed-off-by: Juha Riihimäki <juha.riihimaki@nokia.com>
Acked-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
In the existing code shift value is clobbered during the pass loop.
This patch changes the code so that it stores the intermediate
result in the target neon register directly and eliminates the need
to use a temporary to hold the intermediate value thus leaving the
shift value in the temporary variable intact. This is a new patch
in this version of the patch series.
Signed-off-by: Juha Riihimäki <juha.riihimaki@nokia.com>
Acked-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
tmp4 and tmp5 temporary variables are allocated using tcg_const_i32
but incorrectly released using dead_tmp which will cause resource
leak tracking to report false leaks.
Signed-off-by: Juha Riihimäki <juha.riihimaki@nokia.com>
Acked-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Current implementation of thumb mul instruction is implemented as a
32x32->64 multiply which then uses only 32 least significant bits of
the result. Replace that with a simple 32x32->32 multiply.
Signed-off-by: Juha Riihimäki <juha.riihimaki@nokia.com>
Acked-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Revised patch for getting rid of tcg temporary variable leaks in
target-arm/translate.c. This version also includes the leak patch for
gen_set_cpsr macro, now converted as a static inline function, which I
sent earlier as a separate patch on top of this patch.
Signed-off-by: Juha Riihimäki <juha.riihimaki@nokia.com>
Acked-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>