Richard Henderson 22437b4de9 util/bufferiszero: Add simd acceleration for aarch64
Because non-embedded aarch64 is expected to have AdvSIMD enabled, merely
double-check with the compiler flags for __ARM_NEON and don't bother with
a runtime check.  Otherwise, model the loop after the x86 SSE2 function.

Use UMAXV for the vector reduction.  This is 3 cycles on cortex-a76 and
2 cycles on neoverse-n1.
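
As an illustration of the approach described above, here is a minimal sketch, not the code added by this commit. It assumes a compile-time check of __aarch64__ and __ARM_NEON, a 64-byte loop stride, and a scalar fallback. The name buffer_is_zero_sketch is hypothetical. Four 16-byte vectors are ORed together, and the result is reduced with vmaxvq_u8, which compiles to UMAXV.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
#endif

/* Hypothetical helper: returns true if all len bytes at buf are zero. */
bool buffer_is_zero_sketch(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    size_t i = 0;

#ifdef SKETCH_HAVE_NEON
    /* Assumed stride: 64 bytes (four 16-byte vectors) per iteration. */
    for (; i + 64 <= len; i += 64) {
        uint8x16_t v0 = vld1q_u8(p + i);
        uint8x16_t v1 = vld1q_u8(p + i + 16);
        uint8x16_t v2 = vld1q_u8(p + i + 32);
        uint8x16_t v3 = vld1q_u8(p + i + 48);

        /* OR the vectors: any nonzero input byte leaves a nonzero lane. */
        uint8x16_t acc = vorrq_u8(vorrq_u8(v0, v1), vorrq_u8(v2, v3));

        /* UMAXV: the maximum across lanes is nonzero iff any lane is. */
        if (vmaxvq_u8(acc) != 0) {
            return false;
        }
    }
#endif

    /* Scalar tail; also the whole loop when AdvSIMD is unavailable. */
    for (; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}
```

Because the check is made at compile time, the same source still builds on hosts without AdvSIMD and simply takes the scalar path there.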

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-05-03 08:03:35 -07:00