Richard Henderson 22437b4de9 util/bufferiszero: Add simd acceleration for aarch64
Because non-embedded aarch64 is expected to have AdvSIMD enabled, merely
double-check with the compiler flags for __ARM_NEON and don't bother with
a runtime check.  Otherwise, model the loop after the x86 SSE2 function.

Use UMAXV for the vector reduction.  This is 3 cycles on cortex-a76 and
2 cycles on neoverse-n1.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-03 08:03:35 -07:00
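
The approach described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: the function name `buffer_is_zero_sketch`, the macro `SKETCH_HAVE_NEON`, and the 64-byte block size are hypothetical. The compile-time guard follows the message's idea of relying on `__ARM_NEON` instead of a runtime check, and `vmaxvq_u32` is the intrinsic that typically maps to UMAXV. On other targets only the scalar loop is compiled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compile-time check only, no runtime CPU detection (assumption: AArch64). */
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
#else
#define SKETCH_HAVE_NEON 0
#endif

/* Hypothetical helper: returns true if all len bytes at buf are zero. */
static bool buffer_is_zero_sketch(const void *buf, size_t len)
{
    const uint8_t *p = buf;

    assert(buf != NULL || len == 0);

#if SKETCH_HAVE_NEON
    /* Process 64 bytes per iteration; vld1q_u8 tolerates unaligned input. */
    while (len >= 64) {
        uint8x16_t v0 = vld1q_u8(p);
        uint8x16_t v1 = vld1q_u8(p + 16);
        uint8x16_t v2 = vld1q_u8(p + 32);
        uint8x16_t v3 = vld1q_u8(p + 48);

        /* OR the four vectors: the result is zero only if all inputs are. */
        uint8x16_t t = vorrq_u8(vorrq_u8(v0, v1), vorrq_u8(v2, v3));

        /* Horizontal max across the four 32-bit lanes (UMAXV reduction). */
        if (vmaxvq_u32(vreinterpretq_u32_u8(t)) != 0) {
            return false;
        }
        p += 64;
        len -= 64;
    }
#endif

    /* Scalar tail; also the whole loop when NEON is unavailable. */
    for (; len != 0; p++, len--) {
        if (*p != 0) {
            return false;
        }
    }
    return true;
}
```

Reducing once per 64-byte block keeps the relatively expensive cross-lane reduction off the per-vector path while still allowing an early exit on the first non-zero block.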