qemu/target at staging-9.0 - qemu

Peter Maydell 39ef8286e6 target/arm: Fix SVE SDOT/UDOT/USDOT (4-way, indexed)

Our implementation of the indexed version of SVE SDOT/UDOT/USDOT got the calculation of the inner loop terminator wrong. Although we correctly account for the element size when we calculate the terminator for the first iteration:

    intptr_t segend = MIN(16 / sizeof(TYPED), opr_sz_n);

we don't do that when we move it forward after the first inner loop completes. The intention is that we process the vector in 128-bit segments, which for a 64-bit element size should mean (1, 2), (3, 4), (5, 6), etc. This bug meant that we would iterate (1, 2), (3, 4, 5, 6), (7, 8, 9, 10) etc and apply the wrong indexed element to some of the operations, and also index off the end of the vector.

You don't see this bug if the vector length is small enough that we don't need to iterate the outer loop, i.e. if it is only 128 bits, or if it is the 64-bit special case from AA32/AA64 AdvSIMD. If the vector length is 256 bits then we calculate the right results for the elements in the vector but do index off the end of the vector. Vector lengths greater than 256 bits see wrong answers. The instructions that produce 32-bit results behave correctly.

Fix the recalculation of 'segend' for subsequent iterations, and restore a version of the comment that was lost in the refactor of commit `7020ffd656` that explains why we only need to clamp segend to opr_sz_n for the first iteration, not the later ones.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2595
Fixes: `7020ffd656` ("target/arm: Macroize helper_gvec_{s,u}dot_idx_{b,h}")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241101185544.2130972-1-peter.maydell@linaro.org
(cherry picked from commit `e6b2fa1b81`)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

2024-11-08 13:02:58 +03:00
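The segment arithmetic described in the commit message can be sketched in isolation. This is a simplified model, not the QEMU helper itself: `walk_segments`, `SEGMENT_BYTES` and the `step` parameter are hypothetical names introduced here, and the buggy advance of 4 elements is an assumption inferred from the (3, 4, 5, 6) pattern quoted in the message. The function only tracks loop bounds and never touches vector data, so the overrun shows up as a recorded end index rather than a real out-of-bounds access.

```c
#include <assert.h>
#include <stddef.h>

/* Indexed forms select one multiplier element per 128-bit segment. */
#define SEGMENT_BYTES 16

/*
 * Hypothetical model of the nested loop: walk n_elems elements of
 * elem_size bytes, advancing the inner-loop terminator by `step`
 * elements after each segment. The end index of each segment is
 * stored in seg_ends[] (up to max_segs entries). Returns the number
 * of segments visited.
 */
static size_t walk_segments(size_t elem_size, size_t n_elems, size_t step,
                            size_t *seg_ends, size_t max_segs)
{
    assert(elem_size > 0 && SEGMENT_BYTES % elem_size == 0);
    assert(n_elems > 0 && step > 0);

    size_t per_seg = SEGMENT_BYTES / elem_size;
    /* Clamp only the first terminator, as the commit message describes. */
    size_t segend = per_seg < n_elems ? per_seg : n_elems;
    size_t i = 0, nsegs = 0;

    do {
        do {
            /* Element i would be processed with this segment's index. */
        } while (++i < segend);
        if (nsegs < max_segs) {
            seg_ends[nsegs] = i;
        }
        nsegs++;
        segend = i + step;   /* correct only when step == per_seg */
    } while (i < n_elems);

    return nsegs;
}
```

For a 512-bit vector of 64-bit elements (8 elements), a step of `SEGMENT_BYTES / 8 = 2` yields segment ends 2, 4, 6, 8. A fixed step of 4 yields 2, 6, 10: the second segment spans two 128-bit segments, and the last runs past the 8-element vector. With 32-bit elements a step of 4 happens to equal the segment size, which is consistent with the message's note that the 32-bit result forms were unaffected.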
..
alpha	target/alpha: Prefer fast cpu_env() over slower CPU QOM cast macro	2024-03-12 11:46:16 +01:00
arm	target/arm: Fix SVE SDOT/UDOT/USDOT (4-way, indexed)	2024-11-08 13:02:58 +03:00
avr	target/avr: Prefer fast cpu_env() over slower CPU QOM cast macro	2024-03-12 11:46:17 +01:00
cris	target/cris: Prefer fast cpu_env() over slower CPU QOM cast macro	2024-03-12 11:46:17 +01:00
hexagon	target/hexagon: don't look for static glib	2024-08-28 08:37:29 +03:00
hppa	target/hppa: Fix PSW V-bit packaging in cpu_hppa_get for hppa64	2024-09-05 22:59:09 +03:00
i386	target/i386: Use probe_access_full_mmu in ptw_translate	2024-11-08 13:02:57 +03:00
loongarch	target/loongarch: Fix helper_lddir() a CID INTEGER_OVERFLOW issue	2024-07-26 13:12:12 +03:00
m68k	target/m68k: Always return a temporary from gen_lea_mode	2024-10-10 21:03:54 +03:00
microblaze	target/microblaze: Use insn_start from DisasContextBase	2024-04-09 07:45:09 -10:00
mips	target/mips: Prefer fast cpu_env() over slower CPU QOM cast macro	2024-03-12 12:04:24 +01:00
nios2	target/nios2: Prefer fast cpu_env() over slower CPU QOM cast macro	2024-03-12 12:04:24 +01:00
openrisc	target/openrisc: Prefer fast cpu_env() over slower CPU QOM cast macro	2024-03-12 12:04:24 +01:00
ppc	target/ppc: Fix mtDPDES targeting SMT siblings	2024-11-08 13:02:58 +03:00
riscv	target/riscv: Fix vcompress with rvv_ta_all_1s	2024-11-08 13:02:58 +03:00
rx	target/rx: Use target_ulong for address in LI	2024-08-28 08:37:28 +03:00
s390x	target/s390x: Use insn_start from DisasContextBase	2024-04-09 07:45:09 -10:00
sh4	target/sh4: Update DisasContextBase.insn_start	2024-05-09 16:48:26 +03:00
sparc	target/sparc: Restrict STQF to sparcv9	2024-08-28 08:37:29 +03:00
tricore	target/tricore/helper: Use correct string format in cpu_tlb_fill()	2024-03-26 14:24:06 +01:00
xtensa	target/xtensa: Prefer fast cpu_env() over slower CPU QOM cast macro	2024-03-12 12:04:25 +01:00
Kconfig
meson.build	target: Make qemu_target_page_mask() available for *-user	2024-01-29 21:04:10 +10:00
target-common.c	target: Make qemu_target_page_mask() available for *-user	2024-01-29 21:04:10 +10:00