Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190904193059.26202-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
By shifting the 16-bit input left by 16, we can align the desired
portion of the 48-bit product and use tcg_gen_muls2_i32.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190904193059.26202-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since all of the inputs and outputs are i32, dispense with
the intermediate promotion to i64 and use tcg_gen_add2_i32.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190904193059.26202-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190904193059.26202-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190904193059.26202-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since all of the inputs and outputs are i32, dispense with
the intermediate promotion to i64 and use tcg_gen_mulu2_i32
and tcg_gen_add2_i32.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190904193059.26202-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190904193059.26202-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Convert the modified immediate form of the data processing insns.
For A32, we can finally remove any code that was intertwined with
the register and register-shifted-register forms.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190904193059.26202-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Convert the register shifted by register form of the data
processing insns. For A32, we cannot yet remove any code
because the legacy decoder intertwines the immediate form.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190904193059.26202-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Convert the register shifted by immediate form of the data
processing insns. For A32, we cannot yet remove any code
because the legacy decoder intertwines the reg-shifted-reg
and immediate forms.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190904193059.26202-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add the infrastructure that will become the new decoder.
No instructions adjusted so far.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190904193059.26202-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This function already includes the test for an interworking write
to PC from a load. Change the T32 LDM implementation to match the
A32 LDM implementation.
For LDM, the reordering of the tests does not change valid
behaviour because the only case that differs is has rn == 15,
which is UNPREDICTABLE.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190904193059.26202-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reorganize watchpoints out of i/o path.
Return host address from probe_write / probe_access.
-----BEGIN PGP SIGNATURE-----
iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAl1uiyYdHHJpY2hhcmQu
aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8AuwgAnYLQQbL8kjSqzp7q
gRlj0M2SX41ZW3fMkI794RwsljD9Z0QS7YGnpzHolig9XUYrGnip7STrMvlCr/1L
CIMWNHlgitgBMszLqg42/TB+6RxXn+DMX/ShUzTagC6xQhinCIpdEjoLaTKSgeP+
foIyJ2uoJLKOBP8cPTQp8evongtoQIljpsZZ0K8a4sreO1d6ytH+olkuoGiROft+
VoJkA+kNHd9cE+LPCva8UFGu1QE6uCySvhepzOpnvOtK+SXKUm2yLOFGu7RWP1pT
RkE0oRyRnImtg+cViHfUUFogIffFROdL5tuYMQVuqbINeROPUgJPav+R1Nz1P60a
xM2HEw==
=bLLU
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20190903' into staging
Allow page table bit to swap endianness.
Reorganize watchpoints out of i/o path.
Return host address from probe_write / probe_access.
# gpg: Signature made Tue 03 Sep 2019 16:47:50 BST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-tcg-20190903: (36 commits)
tcg: Factor out probe_write() logic into probe_access()
tcg: Make probe_write() return a pointer to the host page
s390x/tcg: Pass a size to probe_write() in do_csst()
hppa/tcg: Call probe_write() also for CONFIG_USER_ONLY
mips/tcg: Call probe_write() for CONFIG_USER_ONLY as well
tcg: Enforce single page access in probe_write()
tcg: Factor out CONFIG_USER_ONLY probe_write() from s390x code
s390x/tcg: Fix length calculation in probe_write_access()
s390x/tcg: Use guest_addr_valid() instead of h2g_valid() in probe_write_access()
tcg: Check for watchpoints in probe_write()
cputlb: Handle watchpoints via TLB_WATCHPOINT
cputlb: Remove double-alignment in store_helper
cputlb: Fix size operand for tlb_fill on unaligned store
exec: Factor out cpu_watchpoint_address_matches
cputlb: Fold TLB_RECHECK into TLB_INVALID_MASK
exec: Factor out core logic of check_watchpoint()
exec: Move user-only watchpoint stubs inline
target/sparc: sun4u Invert Endian TTE bit
target/sparc: Add TLB entry with attributes
cputlb: Byte swap memory transaction attribute
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Preparation for collapsing the two byte swaps, adjust_endianness and
handle_bswap, along the I/O path.
Target dependant attributes are conditionalized upon NEED_CPU_H.
Signed-off-by: Tony Nguyen <tony.nguyen@bt.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <81d9cd7d7f5aaadfa772d6c48ecee834e9cf7882.1566466906.git.tony.nguyen@bt.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
An attempt to do an exception-return (branch to one of the magic
addresses) in linux-user mode for M-profile should behave like
a normal branch, because linux-user mode is always going to be
in 'handler' mode. This used to work, but we broke it when we added
support for the M-profile security extension in commit d02a8698d7.
In that commit we allowed even handler-mode calls to magic return
values to be checked for and dealt with by causing an
EXCP_EXCEPTION_EXIT exception to be taken, because this is
needed for the FNC_RETURN return-from-non-secure-function-call
handling. For system mode we added a check in do_v7m_exception_exit()
to make any spurious calls from Handler mode behave correctly, but
forgot that linux-user mode would also be affected.
How an attempted return-from-non-secure-function-call in linux-user
mode should be handled is not clear -- on real hardware it would
result in return to secure code (not to the Linux kernel) which
could then handle the error in any way it chose. For QEMU we take
the simple approach of treating this erroneous return the same way
it would be handled on a CPU without the security extensions --
treat it as a normal branch.
The upshot of all this is that for linux-user mode we should never
do any of the bx_excret magic, so the code change is simple.
This ought to be a weird corner case that only affects broken guest
code (because Linux user processes should never be attempting to do
exception returns or NS function returns), except that the code that
assigns addresses in RAM for the process and stack in our linux-user
code does not attempt to avoid this magic address range, so
legitimate code attempting to return to a trampoline routine on the
stack can fall into this case. This change fixes those programs,
but we should also look at restricting the range of memory we
use for M-profile linux-user guests to the area that would be
real RAM in hardware.
Cc: qemu-stable@nongnu.org
Reported-by: Christophe Lyon <christophe.lyon@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190822131534.16602-1-peter.maydell@linaro.org
Fixes: https://bugs.launchpad.net/qemu/+bug/1840922
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The function neon_store_reg32() doesn't free the TCG temp that it
is passed, so the caller must do that. We got this right in most
places but forgot to free the TCG temps in trans_VMOV_64_sp().
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190827121931.26836-1-peter.maydell@linaro.org
The previous simplification got the order of operands to the
subtraction wrong. Since the 64-bit product is the subtrahend,
we must use a 64-bit subtract to properly compute the borrow
from the low-part of the product.
Fixes: 5f8cd06ebc ("target/arm: Simplify SMMLA, SMMLAR, SMMLS, SMMLSR")
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20190829013258.16102-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The translation table walk for an ATS instruction can result in
various faults. In general these are just reported back via the
PAR_EL1 fault status fields, but in some cases the architecture
requires that the fault is turned into an exception:
* synchronous stage 2 faults of any kind during AT S1E0* and
AT S1E1* instructions executed from NS EL1 fault to EL2 or EL3
* synchronous external aborts are taken as Data Abort exceptions
(This is documented in the v8A Arm ARM DDI0487A.e D5.2.11 and
G5.13.4.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20190816125802.25877-3-peter.maydell@linaro.org
Currently the only part of an ARMCPRegInfo which is allowed to cause
a CPU exception is the access function, which returns a value indicating
that some flavour of UNDEF should be generated.
For the ATS system instructions, we would like to conditionally
generate exceptions as part of the writefn, because some faults
during the page table walk (like external aborts) should cause
an exception to be raised rather than returning a value.
There are several ways we could do this:
* plumb the GETPC() value from the top level set_cp_reg/get_cp_reg
helper functions through into the readfn and writefn hooks
* add extra readfn_with_ra/writefn_with_ra hooks that take the GETPC()
value
* require the ATS instructions to provide a dummy accessfn,
which serves no purpose except to cause the code generation
to emit TCG ops to sync the CPU state
* add an ARM_CP_ flag to mark the ARMCPRegInfo as possibly
throwing an exception in its read/write hooks, and make the
codegen sync the CPU state before calling the hooks if the
flag is set
This patch opts for the last of these, as it is fairly simple
to implement and doesn't require invasive changes like updating
the readfn/writefn hook function prototype signature.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20190816125802.25877-2-peter.maydell@linaro.org
Make this a static function private to translate.c.
Thus we can use the same idiom between aarch64 and aarch32
without actually sharing function implementations.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20190826151536.6771-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This reverts commit 3cb3663715.
Despite the fact that the text for the call to gen_exception_insn
is identical for aarch64 and aarch32, the implementation inside
gen_exception_insn is totally different.
This fixes exceptions raised from aarch64.
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20190826151536.6771-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190709152053.16670-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[Rebased onto merge commit 95a9457fd44; missed instances of qom/cpu.h
in comments replaced]
Prior patch resets can_do_io flag at the TB entry. Therefore there is no
need in resetting this flag at the end of the block.
This patch removes redundant gen_io_end calls.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <156404429499.18669.13404064982854123855.stgit@pasha-Precision-3630-Tower>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
* target/arm: generate a custom MIDR for -cpu max
* hw/misc/zynq_slcr: refactor to use standard register definition
* Set ENET_BD_BDU in I.MX FEC controller
* target/arm: Fix routing of singlestep exceptions
* refactor a32/t32 decoder handling of PC
* minor optimisations/cleanups of some a32/t32 codegen
* target/arm/cpu64: Ensure kvm really supports aarch64=off
* target/arm/cpu: Ensure we can use the pmu with kvm
* target/arm: Minor cleanups preparatory to KVM SVE support
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAl1WrIsZHHBldGVyLm1h
eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3tJ/D/9I0ccyciHwuekySUHs+Wq6
2grX8t6RFzlhA1ULoAaEO4x8uWWGnbiGTeSGM819T3nj1a7neQV12Xe5RRGG0j7n
aeVseYnZF96oshKPkDSVTcGQisVfmmHIJ0oqx2k1aUGrmyFJlTuLWQBZCCiZKhxA
zA6YzUbOA2apfi9nun6SbbjysiRD2lp2i9vI79nVlo+ca77v/1sdFUwzg0hRE//X
IondHeWtCZScmc/GwABv4EdNzQ4Aerfe10v/pOKXEC59rPwEiaiSGBPu6SRUaGWH
qHlwjVU2+BFGkz9Oy/7+tDTBk6saPi4taZF8SxxiC/QTyNV2ijyKV5iy9KOYAFw7
E41fhv4+Kch569/SX7fiyAxL0gAS2HGFtegByuQEgjjioOCRugFcX275NXvuW06j
jfOP/zSD9P39WA0jCJaNj5FdJTcLmIuFxKjBUEX3Cdb+3igIq1BW0ZFd/OOBoo1W
GHcEmO6tLyx35kigOb3TkayQpkqCoaGCcgzJ0g2Oy06rKwlcci+BfCfc3aG+uSSY
+TuGjRhpQxQJJt880d7tBqeH9R5FABvQ0TEwGuACylDEZM5bN7BpZxCxCVN/bFG+
pzvzs/QtOq0FN7LK4L4rbuJui4nBhAyalbiIXQ8ihWQgmMqaYQSK8mXFgSZgizFl
qATcYIr/q2gL4wHRos3XdA==
=8BAF
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20190816' into staging
target-arm queue:
* target/arm: generate a custom MIDR for -cpu max
* hw/misc/zynq_slcr: refactor to use standard register definition
* Set ENET_BD_BDU in I.MX FEC controller
* target/arm: Fix routing of singlestep exceptions
* refactor a32/t32 decoder handling of PC
* minor optimisations/cleanups of some a32/t32 codegen
* target/arm/cpu64: Ensure kvm really supports aarch64=off
* target/arm/cpu: Ensure we can use the pmu with kvm
* target/arm: Minor cleanups preparatory to KVM SVE support
# gpg: Signature made Fri 16 Aug 2019 14:15:55 BST
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20190816: (29 commits)
target/arm: Use tcg_gen_extrh_i64_i32 to extract the high word
target/arm: Simplify SMMLA, SMMLAR, SMMLS, SMMLSR
target/arm: Use tcg_gen_rotri_i32 for gen_swap_half
target/arm: Use ror32 instead of open-coding the operation
target/arm: Remove redundant shift tests
target/arm: Use tcg_gen_deposit_i32 for PKHBT, PKHTB
target/arm: Use tcg_gen_extract_i32 for shifter_out_im
target/arm/kvm64: Move the get/put of fpsimd registers out
target/arm/kvm64: Fix error returns
target/arm/cpu: Use div-round-up to determine predicate register array size
target/arm/helper: zcr: Add build bug next to value range assumption
target/arm/cpu: Ensure we can use the pmu with kvm
target/arm/cpu64: Ensure kvm really supports aarch64=off
target/arm: Remove helper_double_saturate
target/arm: Use unallocated_encoding for aarch32
target/arm: Remove offset argument to gen_exception_bkpt_insn
target/arm: Replace offset with pc in gen_exception_internal_insn
target/arm: Replace offset with pc in gen_exception_insn
target/arm: Replace s->pc with s->base.pc_next
target/arm: Remove redundant s->pc & ~1
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Separate shift + extract low will result in one extra insn
for hosts like RISC-V, MIPS, and Sparc.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190808202616.13782-8-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
All of the inputs to these instructions are 32-bits. Rather than
extend each input to 64-bits and then extract the high 32-bits of
the output, use tcg_gen_muls2_i32 and other 32-bit generator functions.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190808202616.13782-7-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Rotate is the more compact and obvious way to swap 16-bit
elements of a 32-bit word.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190808202616.13782-6-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The helper function is more documentary, and also already
handles the case of rotate by zero.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190808202616.13782-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The immediate shift generator functions already test for,
and eliminate, the case of a shift by zero.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190808202616.13782-4-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use deposit as the composit operation to merge the
bits from the two inputs.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190808202616.13782-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Extract is a compact combination of shift + and.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190808202616.13782-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Move the getting/putting of the fpsimd registers out of
kvm_arch_get/put_registers() into their own helper functions
to prepare for alternatively getting/putting SVE registers.
No functional change.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A couple return -EINVAL's forgot their '-'s.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Unless we're guaranteed to always increase ARM_MAX_VQ by a multiple of
four, then we should use DIV_ROUND_UP to ensure we get an appropriate
array size.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The current implementation of ZCR_ELx matches the architecture, only
implementing the lower four bits, with the rest RAZ/WI. This puts
a strict limit on ARM_MAX_VQ of 16. Make sure we don't let ARM_MAX_VQ
grow without a corresponding update here.
Suggested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We first convert the pmu property from a static property to one with
its own accessors. Then we use the set accessor to check if the PMU is
supported when using KVM. Indeed a 32-bit KVM host does not support
the PMU, so this check will catch an attempt to use it at property-set
time.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If -cpu <cpu>,aarch64=off is used then KVM must also be used, and it
and the host must support running the vcpu in 32-bit mode. Also, if
-cpu <cpu>,aarch64=on is used, then it doesn't matter if kvm is
enabled or not.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replace x = double_saturate(y) with x = add_saturate(y, y).
There is no need for a separate more specialized helper.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Promote this function from aarch64 to fully general use.
Use it to unify the code sequences for generating illegal
opcode exceptions.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Unlike the other more generic gen_exception{,_internal}_insn
interfaces, breakpoints always refer to the current instruction.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The offset is variable depending on the instruction set.
Passing in the actual value is clearer in intent.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The offset is variable depending on the instruction set, whereas
we have stored values for the current pc and the next pc. Passing
in the actual value is clearer in intent.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We must update s->base.pc_next when we return from the translate_insn
hook to the main translator loop. By incrementing s->base.pc_next
immediately after reading the insn word, "pc_next" contains the address
of the next instruction throughout translation.
All remaining uses of s->pc are referencing the address of the next insn,
so this is now a simple global replacement. Remove the "s->pc" field.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The thumb bit has already been removed from s->pc, and is always even.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-6-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Provide a common routine for the places that require ALIGN(PC, 4)
as the base address as opposed to plain PC. The two are always
the same for A32, but the difference is meaningful for thumb mode.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We currently have 3 different ways of computing the architectural
value of "PC" as seen in the ARM ARM.
The value of s->pc has been incremented past the current insn,
but that is all. Thus for a32, PC = s->pc + 4; for t32, PC = s->pc;
for t16, PC = s->pc + 2. These differing computations make it
impossible at present to unify the various code paths.
With the newly introduced s->pc_curr, we can compute the correct
value for all cases, using the formula given in the ARM ARM.
This changes the behaviour for load_reg() and load_reg_var()
when called with reg==15 from a 32-bit Thumb instruction:
previously they would have returned the incorrect value
of pc_curr + 6, and now they will return the architecturally
correct value of PC, which is pc_curr + 4. This will not
affect well-behaved guest software, because all of the places
we call these functions from T32 code are instructions where
using r15 is UNPREDICTABLE. Using the architectural PC value
here is more consistent with the T16 and A32 behaviour.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-4-richard.henderson@linaro.org
[PMM: added commit message note about UNPREDICTABLE T32 cases]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a new field to retain the address of the instruction currently
being translated. The 32-bit uses are all within subroutines used
by a32 and t32. This will become less obvious when t16 support is
merged with a32+t32, and having a clear definition will help.
Convert aarch64 as well for consistency. Note that there is one
instance of a pre-assert fprintf that used the wrong value for the
address of the current instruction.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This function is used in two different contexts, and it will be
clearer if the function is given the address to which it applies.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190807045335.1361-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When generating an architectural single-step exception we were
routing it to the "default exception level", which is to say
the same exception level we execute at except that EL0 exceptions
go to EL1. This is incorrect because the debug exception level
can be configured by the guest for situations such as single
stepping of EL0 and EL1 code by EL2.
We have to track the target debug exception level in the TB
flags, because it is dependent on CPU state like HCR_EL2.TGE
and MDCR_EL2.TDE. (That we were previously calling the
arm_debug_target_el() function to determine dc->ss_same_el
is itself a bug, though one that would only have manifested
as incorrect syndrome information.) Since we are out of TB
flag bits unless we want to expand into the cs_base field,
we share some bits with the M-profile only HANDLER and
STACKCHECK bits, since only A-profile has this singlestep.
Fixes: https://bugs.launchpad.net/qemu/+bug/1838913
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190805130952.4415-3-peter.maydell@linaro.org
Factor out code to 'generate a singlestep exception', which is
currently repeated in four places.
To do this we need to also pull the identical copies of the
gen-exception() function out of translate-a64.c and translate.c
into translate.h.
(There is a bug in the code: we're taking the exception to the wrong
target EL. This will be simpler to fix if there's only one place to
do it.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190805130952.4415-2-peter.maydell@linaro.org
While most features are now detected by probing the ID_* registers
kernels can (and do) use MIDR_EL1 for working out of they have to
apply errata. This can trip up warnings in the kernel as it tries to
work out if it should apply workarounds to features that don't
actually exist in the reported CPU type.
Avoid this problem by synthesising our own MIDR value.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190726113950.7499-1-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
sysemu/sysemu.h is a rather unfocused dumping ground for stuff related
to the system-emulator. Evidence:
* It's included widely: in my "build everything" tree, changing
sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600
objects (not counting tests and objects that don't depend on
qemu/osdep.h, down from 5400 due to the previous two commits).
* It pulls in more than a dozen additional headers.
Split stuff related to run state management into its own header
sysemu/runstate.h.
Touching sysemu/sysemu.h now recompiles some 850 objects. qemu/uuid.h
also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400
to 4200. Touching new sysemu/runstate.h recompiles some 500 objects.
Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also
add qemu/main-loop.h.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-30-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
[Unbreak OS-X build]
In my "build everything" tree, changing sysemu/sysemu.h triggers a
recompile of some 5400 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).
Almost a third of its inclusions are actually superfluous. Delete
them. Downgrade two more to qapi/qapi-types-run-state.h, and move one
from char/serial.h to char/serial.c.
hw/semihosting/config.c, monitor/monitor.c, qdev-monitor.c, and
stubs/semihost.c define variables declared in sysemu/sysemu.h without
including it. The compiler is cool with that, but include it anyway.
This doesn't reduce actual use much, as it's still included into
widely included headers. The next commit will tackle that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-27-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
hw/boards.h pulls in almost 60 headers. The less we include it into
headers, the better. As a first step, drop superfluous inclusions,
and downgrade some more to what's actually needed. Gets rid of just
one inclusion into a header.
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-23-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
In my "build everything" tree, changing qemu/main-loop.h triggers a
recompile of some 5600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h). It includes block/aio.h,
which in turn includes qemu/event_notifier.h, qemu/notify.h,
qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h,
qemu/thread.h, qemu/timer.h, and a few more.
Include qemu/main-loop.h only where it's needed. Touching it now
recompiles only some 1700 objects. For block/aio.h and
qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the
others, they shrink only slightly.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-21-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
In my "build everything" tree, changing hw/hw.h triggers a recompile
of some 2600 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).
The previous commits have left only the declaration of hw_error() in
hw/hw.h. This permits dropping most of its inclusions. Touching it
now recompiles less than 200 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
We declare incomplete struct VMStateDescription in a couple of places
so we don't have to include migration/vmstate.h for the typedef.
That's fine with me. However, the next commit will drop
migration/vmstate.h from a massive number of compiles. Move the
typedef to qemu/typedefs.h now, so I don't have to insert struct in
front of VMStateDescription all over the place then.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-15-armbru@redhat.com>
In my "build everything" tree, changing hw/irq.h triggers a recompile
of some 5400 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).
hw/hw.h supposedly includes it for convenience. Several other headers
include it just to get qemu_irq and.or qemu_irq_handler.
Move the qemu_irq and qemu_irq_handler typedefs from hw/irq.h to
qemu/typedefs.h, and then include hw/irq.h only where it's still
needed. Touching it now recompiles only some 500 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-13-armbru@redhat.com>
In Arm v8.0 M-profile CPUs without the Security Extension and also in
v7M CPUs, there is no NSACR register. However, the code we have to handle
the FPU does not always check whether the ARM_FEATURE_M_SECURITY bit
is set before testing whether env->v7m.nsacr permits access to the
FPU. This means that for a CPU with an FPU but without the Security
Extension we would always take a bogus fault when trying to stack
the FPU registers on an exception entry.
We could fix this by adding extra feature bit checks for all uses,
but it is simpler to just make the internal value of nsacr 0xcff
("all non-secure accesses allowed"), since this is not guest
visible when the Security Extension is not present. This allows
us to continue to follow the Arm ARM pseudocode which takes a
similar approach. (In particular, in the v8.1 Arm ARM the register
is documented as reading as 0xcff in this configuration.)
Fixes: https://bugs.launchpad.net/qemu/+bug/1838475
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Message-id: 20190801105742.20036-1-peter.maydell@linaro.org
Most Arm architectural debug exceptions (eg watchpoints) are ignored
if the configured "debug exception level" is below the current
exception level (so for example EL1 can't arrange to get debug exceptions
for EL2 execution). Exceptions generated by the BRK or BPKT instructions
are a special case -- they must always cause an exception, so if
we're executing above the debug exception level then we
must take them to the current exception level.
This fixes a bug where executing BRK at EL2 could result in an
exception being taken at EL1 (which is strictly forbidden by the
architecture).
Fixes: https://bugs.launchpad.net/qemu/+bug/1838277
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190730132522.27086-1-peter.maydell@linaro.org
In arm_cpu_realizefn() we make several assertions about the values of
guest ID registers:
* if the CPU provides AArch32 v7VE or better it must advertise the
ARM_DIV feature
* if the CPU provides AArch32 A-profile v6 or better it must
advertise the Jazelle feature
These are essentially consistency checks that our ID register
specifications in cpu.c didn't accidentally miss out a feature,
because increasingly the TCG emulation gates features on the values
in ID registers rather than using old-style checks of ARM_FEATURE_FOO
bits.
Unfortunately, these asserts can cause problems if we're running KVM,
because in that case we don't control the values of the ID registers
-- we read them from the host kernel. In particular, if the host
kernel is older than 4.15 then it doesn't expose the ID registers via
the KVM_GET_ONE_REG ioctl, and we set up dummy values for some
registers and leave the rest at zero. (See the comment in
target/arm/kvm64.c kvm_arm_get_host_cpu_features().) This set of
dummy values is not sufficient to pass our assertions, and so on
those kernels running an AArch32 guest on AArch64 will assert.
We could provide a more sophisticated set of dummy ID registers in
this case, but that still leaves the possibility of a host CPU which
reports bogus ID register values that would cause us to assert. It's
more robust to only do these ID register checks if we're using TCG,
as that is the only case where this is truly a QEMU code bug.
Reported-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190718125928.20147-1-peter.maydell@linaro.org
Fixes: https://bugs.launchpad.net/qemu/+bug/1830864
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reported by GCC9 when building with -Wimplicit-fallthrough=2:
target/arm/helper.c: In function ‘arm_cpu_do_interrupt_aarch32_hyp’:
target/arm/helper.c:7958:14: error: this statement may fall through [-Werror=implicit-fallthrough=]
7958 | addr = 0x14;
| ~~~~~^~~~~~
target/arm/helper.c:7959:5: note: here
7959 | default:
| ^~~~~~~
cc1: all warnings being treated as errors
Fixes: b9bc21ff9f
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reported-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190719111451.12406-1-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In the M-profile architecture, when we do a vector table fetch and it
fails, we need to report a HardFault. Whether this is a Secure HF or
a NonSecure HF depends on several things. If AIRCR.BFHFNMINS is 0
then HF is always Secure, because there is no NonSecure HardFault.
Otherwise, the answer depends on whether the 'underlying exception'
(MemManage, BusFault, SecureFault) targets Secure or NonSecure. (In
the pseudocode, this is handled in the Vector() function: the final
exc.isSecure is calculated by looking at the exc.isSecure from the
exception returned from the memory access, not the isSecure input
argument.)
We weren't doing this correctly, because we were looking at
the target security domain of the exception we were trying to
load the vector table entry for. This produces errors of two kinds:
* a load from the NS vector table which hits the "NS access
to S memory" SecureFault should end up as a Secure HardFault,
but we were raising an NS HardFault
* a load from the S vector table which causes a BusFault
should raise an NS HardFault if BFHFNMINS == 1 (because
in that case all BusFaults are NonSecure), but we were raising
a Secure HardFault
Correct the logic.
We also fix a comment error where we claimed that we might
be escalating MemManage to HardFault, and forgot about SecureFault.
(Vector loads can never hit MPU access faults, because they're
always aligned and always use the default address map.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190705094823.28905-1-peter.maydell@linaro.org
The ARMv5 architecture didn't specify detailed per-feature ID
registers. Now that we're using the MVFR0 register fields to
gate the existence of VFP instructions, we need to set up
the correct values in the cpu->isar structure so that we still
provide an FPU to the guest.
This fixes a regression in the arm926 and arm1026 CPUs, which
are the only ones that both have VFP and are ARMv5 or earlier.
This regression was introduced by the VFP refactoring, and more
specifically by commits 1120827fa1 and 266bd25c48,
which accidentally disabled VFP short-vector support and
double-precision support on these CPUs.
Fixes: 1120827fa1
Fixes: 266bd25c48
Fixes: https://bugs.launchpad.net/qemu/+bug/1836192
Reported-by: Christophe Lyon <christophe.lyon@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Christophe Lyon <christophe.lyon@linaro.org>
Message-id: 20190711131241.22231-1-peter.maydell@linaro.org
When we converted to using feature bits in 602f6e42cf we missed out
the fact (dp && arm_dc_feature(s, ARM_FEATURE_V8)) was supported for
-cpu max configurations. This caused a regression in the GCC test
suite. Fix this by setting the appropriate bits in mvfr1.FPHP to
report ARMv8-A with FP support (but not ARMv8.2-FP16).
Fixes: https://bugs.launchpad.net/qemu/+bug/1836078
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190711103737.10017-1-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
At present we have a potential error in that helper_retaddr contains
data for handle_cpu_signal, but we have not ensured that those stores
will be scheduled properly before the operation that may fault.
It might be that these races are not in practice observable, due to
our use of -fno-strict-aliasing, but better safe than sorry.
Adjust all of the setters of helper_retaddr.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In commit e9d652824b we extracted the vfp_set_fpscr_to_host()
function but failed at calling it in the correct place, we call
it after xregs[ARM_VFP_FPSCR] is modified.
Fix by calling this function before we update FPSCR.
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20190705124318.1075-1-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Off by one error in the EL2 and EL3 tests. Remove the test
against EL3 entirely, since it must always be true.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190702104732.31154-1-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The global smp variables in arm are replaced with smp machine properties.
The init_cpus() and *_create_rpu() are refactored to pass MachineState.
A local variable of the same name would be introduced in the declaration
phase if it's used widely in the context OR replace it on the spot if it's
only used once. No semantic changes.
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190518205428.90532-9-like.xu@linux.intel.com>
[ehabkost: Fix hw/arm/sbsa-ref.c and hw/arm/aspeed.c]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Coverity points out (CID 1402195) that the loop in trans_VMOV_imm_dp()
that iterates over the destination registers in a short-vector VMOV
accidentally throws away the returned updated register number
from vfp_advance_dreg(). Add the missing assignment. (We got this
correct in trans_VMOV_imm_sp().)
Fixes: 18cf951af9
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190702105115.9465-1-peter.maydell@linaro.org
Thumb instructions in an IT block are set up to be conditionally
executed depending on a set of condition bits encoded into the IT
bits of the CPSR/XPSR. The architecture specifies that if the
condition bits are 0b1111 this means "always execute" (like 0b1110),
not "never execute"; we were treating it as "never execute". (See
the ConditionHolds() pseudocode in both the A-profile and M-profile
Arm ARM.)
This is a bit of an obscure corner case, because the only legal
way to get to an 0b1111 set of condbits is to do an exception
return which sets the XPSR/CPSR up that way. An IT instruction
which encodes a condition sequence that would include an 0b1111 is
UNPREDICTABLE, and for v8A the CONSTRAINED UNPREDICTABLE choices
for such an IT insn are to NOP, UNDEF, or treat 0b1111 like 0b1110.
Add a comment noting that we take the latter option.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190617175317.27557-7-peter.maydell@linaro.org
In the various helper functions for v7M/v8M instructions, use
the _ra versions of cpu_stl_data() and friends. Otherwise we
may get wrong behaviour or an assert() due to not being able
to locate the TB if there is an exception on the memory access
or if it performs an IO operation when in icount mode.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190617175317.27557-5-peter.maydell@linaro.org
In preparation for supporting TCG disablement on ARM, we move most
of TCG related v7m/v8m helpers and APIs into their own file.
Note: It is easier to review this commit using the 'histogram'
diff algorithm:
$ git diff --diff-algorithm=histogram ...
or
$ git diff --histogram ...
Suggested-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190702144335.10717-2-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: updated qapi #include to match recent changes there]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Per Peter Maydell:
Semihosting hooks either SVC or HLT instructions, and inside KVM
both of those go to EL1, ie to the guest, and can't be trapped to
KVM.
Let check_for_semihosting() return False when not running on TCG.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701194942.10092-3-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
These routines are TCG specific.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701194942.10092-2-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190619201050.19040-14-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Move commands query-cpu-definitions, query-cpu-model-baseline,
query-cpu-model-comparison, and query-cpu-model-expansion with their
types from target.json to machine-target.json. Also move types
CpuModelInfo, CpuModelExpansionType, and CpuModelCompareResult from
misc.json there. Add machine-target.json to MAINTAINERS section
"Machine core".
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190619201050.19040-13-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[Commit message typo fixed]
In the next commit we will split the M-profile functions from this
file. Some function will be called out of helper.c. Declare them in
the "internals.h" header.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-22-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In few commits we will split the M-profile functions from this
file, and this function will also be called in the new file.
Declare it in the "internals.h" header.
Since it is in the middle of a block of M profile functions,
move it previous to this block to ease the later refactor.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-21-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Under KVM, the kernel gets the HVC call and handle the PSCI requests.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-20-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This code is specific to the SoftFloat floating-point
implementation, which is only used by TCG.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-18-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The vfp_set_fpscr() helper contains code specific to the host
floating point implementation (here the SoftFloat library).
Extract this code to vfp_set_fpscr_from_host().
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-17-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The vfp_set_fpscr() helper contains code specific to the host
floating point implementation (here the SoftFloat library).
Extract this code to vfp_set_fpscr_to_host().
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-16-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
To ease the review of the next commit,
move the vfp_exceptbits_to_host() function directly after
vfp_exceptbits_from_host(). Amusingly the diff shows we
are moving vfp_get_fpscr().
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-15-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
These routines are TCG specific.
The arm_deliver_fault() function is only used within the new
helper. Make it static.
Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-13-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In the next commit we will split the TLB related routines of
this file, and this function will also be called in the new
file. Declare it in the "internals.h" header.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-12-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Suggested-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-11-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Those helpers are a software implementation of the ARM v8 memory zeroing
op code. They should be moved to the op helper file, which is going to
eventually be built only when TCG is enabled.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Robert Bradford <robert.bradford@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-10-philmd@redhat.com
[PMD: Rebased]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since we'll move this code around, fix its style first.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-9-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since commit 8c06fbdf36 checkpatch.pl enforce a new multiline
comment syntax. Since we'll move this code around, fix its style
first.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-8-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-7-philmd@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Robert Bradford <robert.bradford@intel.com>
Reviewed-by: Samuel Ortiz <sameo@linux.intel.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-6-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Group SOFTMMU objects together.
Since PSCI is TCG specific, keep it separate.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-5-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Group KVM rules together.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-4-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Group ARM objects together, TCG related ones at the bottom.
This will help when restricting TCG-only objects.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-3-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Group Aarch64 rules together, TCG related ones at the bottom.
This will help when restricting TCG-only objects.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190701132516.26392-2-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Simiar to how kvm_init_vcpu() calls kvm_arch_init_vcpu() to perform
arch-dependent initialisation, introduce kvm_arch_destroy_vcpu()
to be called from kvm_destroy_vcpu() to perform arch-dependent
destruction.
This was added because some architectures (Such as i386)
currently do not free memory that it have allocated in
kvm_arch_init_vcpu().
Suggested-by: Maran Wilson <maran.wilson@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-3-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In commit 1120827fa1 we accidentally put the
"UNDEF unless FPU has double-precision support" check in
the single-precision VFM function. Put it in the dp
function where it belongs.
Fixes: 1120827fa1
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190617160130.3207-1-peter.maydell@linaro.org
The architecture permits FPUs which have only single-precision
support, not double-precision; Cortex-M4 and Cortex-M33 are
both like that. Add the necessary checks on the MVFR0 FPDP
field so that we UNDEF any double-precision instructions on
CPUs like this.
Note that even if FPDP==0 the insns like VMOV-to/from-gpreg,
VLDM/VSTM, VLDR/VSTR which take double precision registers
still exist.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190614104457.24703-3-peter.maydell@linaro.org
In several places cut and paste errors meant we were using the wrong
type for the 'arg' struct in trans_ functions called by the
decodetree decoder, because we were using the _sp version of the
struct in the _dp function. These were harmless, because the two
structs were identical and so decodetree made them typedefs of the
same underlying structure (and we'd have had a compile error if they
were not harmless), but we should clean them up anyway.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190614104457.24703-2-peter.maydell@linaro.org
Remove the now unused TCG globals cpu_F0s, cpu_F0d, cpu_F1s, cpu_F1d.
cpu_M0 is still used by the iwmmxt code, and cpu_V0 and
cpu_V1 are used by both iwmmxt and Neon.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190613163917.28589-13-peter.maydell@linaro.org
Remove some old constructns from NEON_2RM_VCVT_F16_F32 code:
* don't use CPU_F0s
* don't use tcg_gen_st_f32
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190613163917.28589-12-peter.maydell@linaro.org
Remove some old constructs from NEON_2RM_VCVT_F16_F32 code:
* don't use cpu_F0s
* don't use tcg_gen_ld_f32
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190613163917.28589-11-peter.maydell@linaro.org
Stop using cpu_F0s in the Neon VCVT fixed-point operations.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190613163917.28589-10-peter.maydell@linaro.org
Stop using cpu_F0s for the Neon f32/s32 VCVT operations.
Since this is the last user of cpu_F0s in the Neon 2rm-op
loop, we can remove the handling code for it too.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190613163917.28589-9-peter.maydell@linaro.org
Stop using cpu_F0s for NEON_2RM_VRECPE_F and NEON_2RM_VRSQRTE_F.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190613163917.28589-8-peter.maydell@linaro.org
Stop using cpu_F0s for the NEON_2RM_VCVT[ANPM][US] ops.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190613163917.28589-7-peter.maydell@linaro.org
Switch NEON_2RM_VRINT* away from using cpu_F0s.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190613163917.28589-6-peter.maydell@linaro.org
Switch NEON_2RM_VABS_F away from using cpu_F0s.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190613163917.28589-5-peter.maydell@linaro.org
Where Neon instructions are floating point operations, we
mostly use the old VFP utility functions like gen_vfp_abs()
which work on the TCG globals cpu_F0s and cpu_F1s. The
Neon for-each-element loop conditionally loads the inputs
into either a plain old TCG temporary for most operations
or into cpu_F0s for float operations, and similarly stores
back either cpu_F0s or the temporary.
Switch NEON_2RM_VABS_F away from using cpu_F0s, and
update neon_2rm_is_float_op() accordingly.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190613163917.28589-4-peter.maydell@linaro.org
The AArch32 VMOV (immediate) instruction uses the same VFP encoded
immediate format we already handle in vfp_expand_imm(). Use that
function rather than hand-decoding it.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190613163917.28589-3-peter.maydell@linaro.org
We want to use vfp_expand_imm() in the AArch32 VFP decode;
move it from the a64-only header/source file to the
AArch32 one (which is always compiled even for AArch64).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190613163917.28589-2-peter.maydell@linaro.org
Allow the DSP extension to be disabled via a CPU property for
M-profile CPUs. (A and R-profile CPUs don't have this extension
as a defined separate optional architecture extension, so
they don't need the property.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190517174046.11146-3-peter.maydell@linaro.org
Allow VFP and neon to be disabled via a CPU property. As with
the "pmu" property, we only allow these features to be removed
from CPUs which have it by default, not added to CPUs which
don't have it.
The primary motivation here is to be able to optionally
create Cortex-M33 CPUs with no FPU, but we provide switches
for both VFP and Neon because the two interact:
* AArch64 can't have one without the other
* Some ID register fields only change if both are disabled
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20190517174046.11146-2-peter.maydell@linaro.org
For VFP short vectors, the VFP registers are divided into a
series of banks: for single-precision these are s0-s7, s8-s15,
s16-s23 and s24-s31; for double-precision they are d0-d3,
d4-d7, ... d28-d31. Some banks are "scalar" meaning that
use of a register within them triggers a pure-scalar or
mixed vector-scalar operation rather than a full vector
operation. The scalar banks are s0-s7, d0-d3 and d16-d19.
When using a bank as part of a vector operation, we
iterate through it, increasing the register number by
the specified stride each time, and wrapping around to
the beginning of the bank.
Unfortunately our calculation of the "increment" part of this
was incorrect:
vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask)
will only do the intended thing if bank_mask has exactly
one set high bit. For instance for doubles (bank_mask = 0xc),
if we start with vd = 6 and delta_d = 2 then vd is updated
to 12 rather than the intended 4.
This only causes problems in the unlikely case that the
starting register is not the first in its bank: if the
register number doesn't have to wrap around then the
expression happens to give the right answer.
Fix this bug by abstracting out the "check whether register
is in a scalar bank" and "advance register within bank"
operations to utility functions which use the right
bit masking operations.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the float-to-integer VCVT instructions to decodetree.
Since these are the last unconverted instructions, we can
delete the old decoder structure entirely now.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VCVT (between floating-point and fixed-point) instructions
to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VJCVT instruction to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VCVT integer-to-float instructions to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VCVT double/single precision conversion insns to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VFP round-to-integer instructions VRINTR, VRINTZ and
VRINTX to decodetree.
These instructions were only introduced as part of the "VFP misc"
additions in v8A, so we check this. The old decoder's implementation
was incorrectly providing them even for v7A CPUs.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VCVTT and VCVTB instructions which convert from
f32 and f64 to f16 to decodetree.
Since we're no longer constrained to the old decoder's style
using cpu_F0s and cpu_F0d we can perform a direct 16 bit
store of the right half of the input single-precision register
rather than doing a load/modify/store sequence on the full
32 bits.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VCVTT, VCVTB instructions that deal with conversion
from half-precision floats to f32 or 64 to decodetree.
Since we're no longer constrained to the old decoder's style
using cpu_F0s and cpu_F0d we can perform a direct 16 bit
load of the right half of the input single-precision register
rather than loading the full 32 bits and then doing a
separate shift or sign-extension.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VFP comparison instructions to decodetree.
Note that comparison instructions should not honour the VFP
short-vector length and stride information: they are scalar-only
operations. This applies to all the 2-operand instructions except
for VMOV, VABS, VNEG and VSQRT. (In the old decoder this is
implemented via the "if (op == 15 && rn > 3) { veclen = 0; }" check.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VSQRT instruction to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VNEG instruction to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VFP VABS instruction to decodetree.
Unlike the 3-op versions, we don't pass fpst to the VFPGen2OpSPFn or
VFPGen2OpDPFn because none of the operations which use this format
and support short vectors will need it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VFP VMOV (immediate) instruction to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VFP fused multiply-add instructions (VFNMA, VFNMS,
VFMA, VFMS) to decodetree.
Note that in the old decode structure we were implementing
these to honour the VFP vector stride/length. These instructions
were introduced in VFPv4, and in the v7A architecture they
are UNPREDICTABLE if the vector stride or length are non-zero.
In v8A they must UNDEF if stride or length are non-zero, like
all VFP instructions; we choose to UNDEF always.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VDIV instruction to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VSUB instruction to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VADD instruction to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VNMUL instruction to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VMUL instruction to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VFP VNMLA instruction to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VFP VNMLS instruction to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VFP VMLS instruction to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VFP VMLA instruction to decodetree.
This is the first of the VFP 3-operand data processing instructions,
so we include in this patch the code which loops over the elements
for an old-style VFP vector operation. The existing code to do this
looping uses the deprecated cpu_F0s/F0d/F1s/F1d TCG globals; since
we are going to be converting instructions one at a time anyway
we can take the opportunity to make the new loop use TCG temporaries,
which means we can do that conversion one operation at a time
rather than needing to do it all in one go.
We include an UNDEF check which was missing in the old code:
short-vector operations (with stride or length non-zero) were
deprecated in v7A and must UNDEF in v8A, so if the MVFR0 FPShVec
field does not indicate that support for short vectors is present
we UNDEF the operations that would use them. (This is a change
of behaviour for Cortex-A7, Cortex-A15 and the v8 CPUs, which
previously were all incorrectly allowing short-vector operations.)
Note that the conversion fixes a bug in the old code for the
case of VFP short-vector "mixed scalar/vector operations". These
happen where the destination register is in a vector bank but
but the second operand is in a scalar bank. For example
vmla.f64 d10, d1, d16 with length 2 stride 2
is equivalent to the pair of scalar operations
vmla.f64 d10, d1, d16
vmla.f64 d8, d3, d16
where the destination and first input register cycle through
their vector but the second input is scalar (d16). In the
old decoder the gen_vfp_F1_mul() operation uses cpu_F1{s,d}
as a temporary output for the multiply, which trashes the
second input operand. For the fully-scalar case (where we
never do a second iteration) and the fully-vector case
(where the loop loads the new second input operand) this
doesn't matter, but for the mixed scalar/vector case we
will end up using the wrong value for later loop iterations.
In the new code we use TCG temporaries and so avoid the bug.
This bug is present for all the multiply-accumulate insns
that operate on short vectors: VMLA, VMLS, VNMLA, VNMLS.
Note 2: the expression used to calculate the next register
number in the vector bank is not in fact correct; we leave
this behaviour unchanged from the old decoder and will
fix this bug later in the series.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Expand out the sequences in the new decoder VLDR/VSTR/VLDM/VSTM trans
functions which perform the memory accesses by going via the TCG
globals cpu_F0s and cpu_F0d, to use local TCG temps instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VFP load/store multiple insns to decodetree.
This includes tightening up the UNDEF checking for pre-VFPv3
CPUs which only have D0-D15 : they now UNDEF for any access
to D16-D31, not merely when the smallest register in the
transfer list is in D16-D31.
This conversion does not try to share code between the single
precision and the double precision versions; this looks a bit
duplicative of code, but it leaves the door open for a future
refactoring which gets rid of the use of the "F0" registers
by inlining the various functions like gen_vfp_ld() and
gen_mov_F0_reg() which are hiding "if (dp) { ... } else { ... }"
conditionalisation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VFP single load/store insns VLDR and VSTR to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VFP two-register transfer instructions to decodetree
(in the v8 Arm ARM these are the "Advanced SIMD and floating-point
64-bit move" encoding group).
Again, we expand out the sequences involving gen_vfp_msr() and
gen_msr_vfp().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the "single-precision" register moves to decodetree:
* VMSR
* VMRS
* VMOV between general purpose register and single precision
Note that the VMSR/VMRS conversions make our handling of
the "should this UNDEF?" checks consistent between the two
instructions:
* VMSR to MVFR0, MVFR1, MVFR2 now UNDEF from EL0
(previously was a nop)
* VMSR to FPSID now UNDEFs from EL0 or if VFPv3 or better
(previously was a nop)
* VMSR to FPINST and FPINST2 now UNDEF if VFPv3 or better
(previously would write to the register, which had no
guest-visible effect because we always UNDEF reads)
We also tighten up the decode: we were previously underdecoding
some SBZ or SBO bits.
The conversion of VMOV_single includes the expansion out of the
gen_mov_F0_vreg()/gen_vfp_mrs() and gen_mov_vreg_F0()/gen_vfp_msr()
sequences into the simpler direct load/store of the TCG temp via
neon_{load,store}_reg32(): we know in the new function that we're
always single-precision, we don't need to use the old-and-deprecated
cpu_F0* TCG globals, and we don't happen to have the declaration of
gen_vfp_msr() and gen_vfp_mrs() at the point in the file where the
new function is.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the "double-precision" register moves to decodetree:
this covers VMOV scalar-to-gpreg, VMOV gpreg-to-scalar and VDUP.
Note that the conversion process has tightened up a few of the
UNDEF encoding checks: we now correctly forbid:
* VMOV-to-gpr with U:opc1:opc2 == 10x00 or x0x10
* VMOV-from-gpr with opc1:opc2 == 0x10
* VDUP with B:E == 11
* VDUP with Q == 1 and Vn<0> == 1
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
The accesses of elements < 32 bits could be improved by doing
direct ld/st of the right size rather than 32-bit read-and-shift
or read-modify-write, but we leave this for later cleanup,
since this series is generally trying to stick to fixing
the decode.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The current VFP code has two different idioms for
loading and storing from the VFP register file:
1 using the gen_mov_F0_vreg() and similar functions,
which load and store to a fixed set of TCG globals
cpu_F0s, CPU_F0d, etc
2 by direct calls to tcg_gen_ld_f64() and friends
We want to phase out idiom 1 (because the use of the
fixed globals is a relic of a much older version of TCG),
but idiom 2 is quite longwinded:
tcg_gen_ld_f64(tmp, cpu_env, vfp_reg_offset(true, reg))
requires us to specify the 64-bitness twice, once in
the function name and once by passing 'true' to
vfp_reg_offset(). There's no guard against accidentally
passing the wrong flag.
Instead, let's move to a convention of accessing 64-bit
registers via the existing neon_load_reg64() and
neon_store_reg64(), and provide new neon_load_reg32()
and neon_store_reg32() for the 32-bit equivalents.
Implement the new functions and use them in the code in
translate-vfp.inc.c. We will convert the rest of the VFP
code as we do the decodetree conversion in subsequent
commits.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Move the trans_*() functions we've just created from translate.c
to translate-vfp.inc.c. This is pure code motion with no textual
changes (this can be checked with 'git show --color-moved').
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Convert the VCVTA/VCVTN/VCVTP/VCVTM instructions to decodetree.
trans_VCVT() is temporarily left in translate.c.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>