qemu/target/arm
Peter Maydell e6b2fa1b81 target/arm: Fix SVE SDOT/UDOT/USDOT (4-way, indexed)
Our implementation of the indexed version of SVE SDOT/UDOT/USDOT got
the calculation of the inner loop terminator wrong.  Although we
correctly account for the element size when we calculate the
terminator for the first iteration:
   intptr_t segend = MIN(16 / sizeof(TYPED), opr_sz_n);
we don't do that when we move it forward after the first inner loop
completes.  The intention is that we process the vector in 128-bit
segments, which for a 64-bit element size should mean (1, 2), (3, 4),
(5, 6), etc.  This bug meant that we would iterate (1, 2), (3, 4, 5,
6), (7, 8, 9, 10) etc and apply the wrong indexed element to some of
the operations, and also index off the end of the vector.

You don't see this bug if the vector length is small enough that we
don't need to iterate the outer loop, i.e.  if it is only 128 bits,
or if it is the 64-bit special case from AA32/AA64 AdvSIMD.  If the
vector length is 256 bits then we calculate the right results for the
elements in the vector but do index off the end of the vector. Vector
lengths greater than 256 bits see wrong answers. The instructions
that produce 32-bit results behave correctly.

Fix the recalculation of 'segend' for subsequent iterations, and
restore a version of the comment that was lost in the refactor of
commit 7020ffd656 that explains why we only need to clamp segend to
opr_sz_n for the first iteration, not the later ones.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2595
Fixes: 7020ffd656 ("target/arm: Macroize helper_gvec_{s,u}dot_idx_{b,h}")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241101185544.2130972-1-peter.maydell@linaro.org
2024-11-05 10:09:58 +00:00
..
hvf hvf: arm: Implement and use hvf_get_physical_address_range 2024-09-13 15:31:47 +01:00
tcg target/arm: Fix SVE SDOT/UDOT/USDOT (4-way, indexed) 2024-11-05 10:09:58 +00:00
arch_dump.c target/arm: Move feature test functions to their own header 2023-10-27 11:44:32 +01:00
arm-powerctl.c target/arm: Expose arm_cpu_mp_affinity() in 'multiprocessing.h' header 2024-01-26 11:30:48 +00:00
arm-powerctl.h target/arm/arm-powerctl: Add new arm_set_cpu_on_and_reset() 2019-02-28 11:03:04 +00:00
arm-qmp-cmds.c target: Improve error reporting for CpuModelInfo member @props 2024-03-12 14:03:00 +01:00
common-semi-target.h target/arm/common-semi-target.h: Remove unnecessary boot.h include 2023-10-19 14:32:13 +01:00
cortex-regs.c target/arm: Saturate L2CTLR_EL1 core count field rather than overflowing 2023-05-18 11:39:33 +01:00
cpregs.h target/arm/cpregs: Include missing 'kvm-consts.h' header 2024-01-26 11:30:48 +00:00
cpu64.c target/arm/tcg: refine cache descriptions with a wrapper 2024-09-13 15:31:47 +01:00
cpu-features.h target/arm/tcg: refine cache descriptions with a wrapper 2024-09-13 15:31:47 +01:00
cpu-param.h license: Update deprecated SPDX tag LGPL-2.0+ to LGPL-2.0-or-later 2024-09-20 10:11:59 +03:00
cpu-qom.h target/arm: Add support for Non-maskable Interrupt 2024-04-25 10:21:04 +01:00
cpu.c target/arm: Explicitly set 2-NaN propagation rule 2024-11-05 10:09:53 +00:00
cpu.h target/arm: Add new MMU indexes for AArch32 Secure PL1&0 2024-11-05 10:09:58 +00:00
debug_helper.c target/arm: Mark up VNCR offsets (offsets 0x100..0x160) 2024-01-09 14:44:45 +00:00
gdbstub64.c gdbstub: Add support for MTE in system mode 2024-09-10 23:33:51 +01:00
gdbstub.c gdbstub: Re-factor gdb command extensions 2024-07-22 09:37:44 +01:00
gtimer.h target/arm: Move GTimer definitions to new 'gtimer.h' header 2024-01-26 11:30:49 +00:00
helper.c target/arm: Add new MMU indexes for AArch32 Secure PL1&0 2024-11-05 10:09:58 +00:00
helper.h target/arm: Widen NeonGenNarrowEnvFn return to 64 bits 2024-09-19 12:58:58 +01:00
hvf_arm.h hvf: arm: Implement and use hvf_get_physical_address_range 2024-09-13 15:31:47 +01:00
hyp_gdbstub.c target/arm: remove break after g_assert_not_reached() 2024-09-24 13:53:35 +02:00
idau.h Use DECLARE_*CHECKER* macros 2020-09-09 09:27:09 -04:00
internals.h target/arm: Add new MMU indexes for AArch32 Secure PL1&0 2024-11-05 10:09:58 +00:00
Kconfig kconfig: express dependency of individual boards on libfdt 2024-05-10 15:45:15 +02:00
kvm_arm.h target/arm: kvm: require KVM_CAP_DEVICE_CTRL 2024-10-29 15:04:47 +00:00
kvm-consts.h exec: Rename NEED_CPU_H -> COMPILING_PER_TARGET 2024-04-26 09:49:51 +02:00
kvm-stub.c target/arm: Avoid bare abort() or assert(0) 2022-05-05 09:35:51 +01:00
kvm.c target/arm: kvm: require KVM_CAP_DEVICE_CTRL 2024-10-29 15:04:47 +00:00
machine.c target/arm: Rename FPSR_MASK and FPCR_MASK and define them symbolically 2024-07-11 11:41:33 +01:00
meson.build target/arm: Move v7m-related code from cpu32.c into a separate file 2024-03-08 14:45:03 +00:00
multiprocessing.h target/arm: Expose arm_cpu_mp_affinity() in 'multiprocessing.h' header 2024-01-26 11:30:48 +00:00
op_addsub.h
ptw.c target/arm: Add new MMU indexes for AArch32 Secure PL1&0 2024-11-05 10:09:58 +00:00
syndrome.h target/arm: fix exception syndrome for AArch32 bkpt insn 2024-02-02 13:51:57 +00:00
tcg-stubs.c target/arm: Move hflags code into the tcg directory 2023-02-27 13:27:04 +00:00
trace-events target/arm: Implement FEAT_ECV CNTPOFF_EL2 handling 2024-03-07 12:19:03 +00:00
trace.h trace: switch position of headers to what Meson requires 2020-08-21 06:18:24 -04:00
vfp_helper.c target/arm: Store FPSR cumulative exception bits in env->vfp.fpsr 2024-10-29 15:04:46 +00:00