78 Commits

Author SHA1 Message Date
rin d754abaff4 sys/crypto: Introduce arch/{arm,x86} to share common MD headers
Dedup between aes and chacha. No binary changes.
2023-08-07 01:07:35 +00:00
rin 8ee3d6ae37 sys/crypto/{aes,chacha}/arch/arm/arm_neon.h: Sync (whitespace fix)
No binary changes.
2023-08-07 00:58:35 +00:00
jmcneill 3f729ba586 Make aes and chacha prints debug only. 2022-11-05 17:36:33 +00:00
riastradh 57b54f5ca4 arm/aes_neon: Fix formatting of self-test failure message.
Discovered by code inspection.  Remarkably, a combination of errors
made this fail to be a stack buffer overrun.  Verified by booting
with ARMv8.0-AES disabled and with the self-test artificially made to
fail.
2022-06-26 17:52:54 +00:00
msaitoh 68c4a8e200 s/folllowing/following/ 2021-12-05 04:48:35 +00:00
jmcneill 129a3690a1 Upgrade self-test passed messages from verbose to debug. 2021-10-17 14:45:45 +00:00
rin d50adbc140 Fix build with clang for earmv7hf; loadroundkey() is used only for __aarch64__. 2020-11-21 08:09:21 +00:00
jmcneill 4a48ef14f2 Fix detection of NEON features. ID_AA64PFR0_EL1_ADV_SIMD_NONE means SIMD
is not available, and any other value means it is.
2020-10-10 08:24:10 +00:00
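The detection rule in the commit above can be sketched in C. This is a minimal illustration, not the kernel's code: the helper name `has_adv_simd` and the two macros are hypothetical, and the field position (bits 23:20 of ID_AA64PFR0_EL1, with 0xf meaning "not implemented") is taken from the Arm architecture definition of that register.

```c
#include <stdbool.h>
#include <stdint.h>

/* AdvSIMD field of ID_AA64PFR0_EL1 lives in bits [23:20] (hypothetical names). */
#define ADV_SIMD_SHIFT	20
#define ADV_SIMD_NONE	0xfu	/* 0b1111: Advanced SIMD not implemented */

/* Return true if the register value says Advanced SIMD (NEON) is present. */
static bool
has_adv_simd(uint64_t id_aa64pfr0)
{
	unsigned field = (unsigned)((id_aa64pfr0 >> ADV_SIMD_SHIFT) & 0xf);

	/* Only NONE means absent; 0, 1, and any later value mean present. */
	return field != ADV_SIMD_NONE;
}
```

Testing for "not NONE" rather than for one specific "present" value keeps the check correct when the field reports additional capabilities.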
riastradh ea2d112d7c aes neon: Gather mc_forward/backward so we can load 256 bits at once. 2020-09-10 11:31:03 +00:00
riastradh 3e1dd6a02d aes neon: Hoist dsbd/dsbe address calculation out of loop. 2020-09-10 11:30:28 +00:00
riastradh db39c37e7d aes neon: Tweak register usage.
- Call r12 by its usual name, ip.
- No need for r7 or r11=fp at the moment.
2020-09-10 11:30:08 +00:00
riastradh b5c99049d3 aes neon: Write vtbl with {qN} rather than {d(2N)-d(2N+1)}.
Cosmetic; no functional change.
2020-09-10 11:29:43 +00:00
riastradh 8bfafdf5aa aes neon: Issue 256-bit loads rather than pairs of 128-bit loads.
Not sure why I didn't realize you could do this before!

Saves some temporary registers that can now be allocated to shave off
a few cycles.
2020-09-10 11:29:02 +00:00
riastradh c71abd7388 aesarmv8: Reallocate registers to shave off unnecessary MOV. 2020-09-08 23:58:09 +00:00
riastradh f70af73535 aesarmv8: Issue two 4-register ld/st, not four 2-register ld/st. 2020-09-08 23:57:43 +00:00
riastradh ab19f80d4d aesarmv8: Adapt aes_armv8_64.S to big-endian.
Patch mainly from (and tested by) jakllsch@ with minor tweaks by me.
2020-09-08 23:57:13 +00:00
riastradh 0fc796c545 aes(9): Fix edge case in bitsliced SSE2 AES-CBC decryption.
Make sure self-tests exercise this edge case.

Discovered by confusion over code inspection of jak's adaptation of
aes_armv8_64.S for big-endian.
2020-09-08 22:48:24 +00:00
jakllsch 3eade4a405 Acknowledge clang warning for NEON cipher code on aarch64eb
We've already made the nonportable vector initializations portable; the
code works on aarch64eb.
2020-09-08 17:35:27 +00:00
jakllsch 9cb9f9bc98 Fix vgetq_lane_u32 for aarch64eb with GCC
Fixes NEON AES on aarch64eb
2020-09-07 18:06:13 +00:00
maxv 60236c8c49 x86: fix several CPUID flags
- Rename: CPUID_PN      -> CPUID_PSN
           CPUID_CFLUSH  -> CPUID_CLFSH
           CPUID_SBF     -> CPUID_PBE
           CPUID_LZCNT   -> CPUID_ABM
           CPUID_P1GB    -> CPUID_PAGE1GB
           CPUID2_PCLMUL -> CPUID2_PCLMULQDQ
           CPUID2_CID    -> CPUID2_CNXTID
           CPUID2_xTPR   -> CPUID2_XTPR
           CPUID2_AES    -> CPUID2_AESNI
   To match the x86 specification and the other OSes.

 - Remove: CPUID_B10, CPUID_B20, CPUID_IA64. They do not exist.
2020-09-05 07:45:44 +00:00
riastradh 613921b5b8 Fix AES NEON code for big-endian softfp ARM.
...which is how the kernel runs.  Switch to using __SOFTFP__ for
consistency with how it gets exposed to C, although I'm not sure how
to get it defined automagically in the toolchain for .S files so
that's set manually in files.aesneon for now.
2020-08-16 18:02:03 +00:00
rin 0d644b585e Add hack to compile aes_ccm_tag() with -O0 for m68k for GCC8.
GCC 8 miscompiles aes_ccm_tag() for m68k with optimization level -O[12],
which results in failure in aes_ccm_selftest():

| aes_ccm_selftest: tag 0: 8 bytes @ 0x4d3e38
| 03 80 5f 08 22 6f cb fe                          | .._."o..
| aes_ccm_selftest: verify 0 failed
| ...
| WARNING: module error: built-in module aes_ccm failed its MODULE_CMD_INIT, error 5

This is observed for amiga (A1200, 68060), mac68k (Quadra 840AV, 68040),
and luna68k (nono, 68030 emulator). However, it is not for sun3 (TME, 68020
emulator) and sun2 (TME, 68010 emulator). At the moment, it is unclear
whether this is due to differences b/w 68010-20 vs 68030-60, or something
wrong with TME.
2020-08-10 06:27:29 +00:00
riastradh 062ecd5ff2 Fix some clang neon intrinsics.
Compile-tested only, with -Wno-nonportable-vector-initializers.  Need
to address -- and test -- this stuff properly but this is progress.
2020-08-09 02:49:38 +00:00
riastradh 6e727d4c03 Use vshlq_n_s32 rather than vsliq_n_s32 with zero destination.
Not sure why I reached for vsliq_n_s32 at first -- probably so I
wouldn't have to deal with a new intrinsic in arm_neon.h!
2020-08-09 02:48:38 +00:00
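Why the two intrinsics are interchangeable here can be seen from a scalar model of a single 32-bit lane. The sketch below is an assumption-laden illustration rather than the NEON code itself: `sli32` and `shl32` are hypothetical helpers, and they use unsigned arithmetic to avoid undefined behaviour in C, whereas the intrinsics named in the commit operate on signed lanes with the same bit pattern.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scalar model of one lane of "shift left and insert":
 * the low n bits of dst are kept, the remaining bits come from src << n.
 */
static uint32_t
sli32(uint32_t dst, uint32_t src, unsigned n)
{
	assert(n < 32);
	uint32_t keep = (n == 0) ? 0 : (UINT32_MAX >> (32 - n));

	return (dst & keep) | (src << n);
}

/* Scalar model of one lane of a plain left shift by a constant. */
static uint32_t
shl32(uint32_t src, unsigned n)
{
	assert(n < 32);
	return src << n;
}
```

With a zero destination the kept bits are all zero, so `sli32(0, x, n)` and `shl32(x, n)` agree for every `x` and every `n` below 32.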
riastradh da4b946081 Nix outdated comment.
I implemented this parallelism a couple weeks ago.
2020-08-09 02:00:57 +00:00
riastradh 43f5649092 Fix mistake in big-endian arm clang.
Swapped the two halves (only gcc does that, I think) and wrote j,i
backwards, oops.

(I don't have a big-endian arm clang build handy to test; hoping this
works.)
2020-08-09 01:59:04 +00:00
riastradh 18ff0ad8d5 Fix ARM NEON implementations of AES and ChaCha on big-endian ARM.
New macros such as VQ_N_U32(a,b,c,d) for NEON vector initializers.
Needed because GCC and Clang disagree on the ordering of lanes,
depending on whether it's 64-bit big-endian, 32-bit big-endian, or
little-endian -- and, bizarrely, both of them disagree with the
architectural numbering of lanes.

Experimented with using

static const uint8_t x8[16] = {...};

        uint8x16_t x = vld1q_u8(x8);

which doesn't require knowing anything about the ordering of lanes,
but this generates considerably worse code and apparently confuses
GCC into not recognizing the constant value of x8.

Fix some clang mistakes while here too.
2020-08-08 14:47:01 +00:00
riastradh 1dd279420f Draft 2x vectorized neon vpaes for aarch64.
Gives a modest speed boost on rk3399 (Cortex-A53/A72), around 20% in
cgd tests, for parallelizable operations like CBC decryption; same
improvement should probably carry over to rpi4 CPU which lacks
ARMv8.0-AES.
2020-07-28 20:11:09 +00:00
riastradh 9c455bb20f Initialize authctr in both branches.
I guess I didn't test the unaligned case, weird.
2020-07-28 14:01:35 +00:00
riastradh 74648be169 Add RCSIDs to the AES and ChaCha .S sources. 2020-07-27 20:57:23 +00:00
riastradh 0b26be56cd Issue aese/aesmc and aesd/aesimc in pairs.
Advised by the aarch64 optimization guide; increases cgd throughput
by about 10%.
2020-07-27 20:54:11 +00:00
riastradh 57324de2aa Align critical-path loops in AES and ChaCha. 2020-07-27 20:53:22 +00:00
riastradh 6bb4d9815b PIC for aes_neon_32.S.
Without this, tests/sys/crypto/aes/t_aes fails to start on armv7
because of R_ARM_ABS32 relocations in a nonwritable text segment for
a PIE -- which atf quietly ignores in the final report!  Yikes.
2020-07-27 20:52:10 +00:00
riastradh 15cabcb36d New sysctl subtree kern.crypto.
kern.crypto.aes.selected (formerly hw.aes_impl)
kern.crypto.chacha.selected (formerly hw.chacha_impl)

XXX Should maybe deduplicate creation of kern.crypto.
2020-07-27 20:45:15 +00:00
riastradh e73a779a1d Gather auth[16] and ctr[16] into one authctr[32].
Should appease clang.
2020-07-27 20:44:30 +00:00
riastradh 93e8ddd146 Ensure aes_ccm module init runs after aes module init.
Otherwise the AES implementation might not be selected early enough.
2020-07-26 04:44:47 +00:00
riastradh 82803013ac Add some Intel intrinsics for ChaCha.
_mm_load1_ps
_mm_loadu_si128
_mm_movelh_ps
_mm_slli_epi32
_mm_storeu_si128
_mm_unpackhi_epi32
_mm_unpacklo_epi32
2020-07-25 22:45:10 +00:00
riastradh e9ec169980 Fix target attribute on _mm_movehl_ps, fix clang _mm_unpacklo_epi64.
- _mm_movehl_ps is available in SSE2, no need for SSSE3.
- _mm_unpacklo_epi64 operates on v2di, not v4si; fix.
2020-07-25 22:44:32 +00:00
riastradh a3e3d8cd1e Add 32-bit load, store, and shift intrinsics.
vld1q_u32
vst1q_u32
vshlq_n_u32
vshrq_n_u32
2020-07-25 22:43:01 +00:00
riastradh e5d58aafe6 Fix missing clang big-endian case. 2020-07-25 22:42:31 +00:00
riastradh 8c023d7fa6 Make aes boot message verbose-only. 2020-07-25 22:42:03 +00:00
riastradh 7e7db0caf8 Remove now-unused legacy rijndael API. 2020-07-25 22:40:08 +00:00
riastradh fe314d33a5 Remove now-needless AES-CCM fallback logic.
These paths are no longer exercised because all of the aes_impls now
do the AES-CCM operations.
2020-07-25 22:36:42 +00:00
riastradh f8ae41373c Implement AES-CCM with NEON. 2020-07-25 22:36:06 +00:00
riastradh 36d44afd5a Implement AES-CCM with ARMv8.5-AES. 2020-07-25 22:33:04 +00:00
riastradh 9fab18571b Invert some loops to save a branch instruction on every iteration. 2020-07-25 22:32:09 +00:00
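The loop inversion mentioned above can be illustrated in C. The commit does not show its code, so this is only a sketch of the general technique with hypothetical function names: a top-tested loop usually needs a conditional branch at the top and an unconditional jump back, while the inverted form checks the empty case once and then branches only at the bottom.

```c
#include <stddef.h>
#include <stdint.h>

/* Top-tested loop: a test at the top plus a jump back on each iteration. */
static uint32_t
sum_top_tested(const uint32_t *p, size_t n)
{
	uint32_t s = 0;

	while (n != 0) {
		s += *p++;
		n--;
	}
	return s;
}

/* Inverted loop: one guard up front, then a single bottom-tested branch. */
static uint32_t
sum_inverted(const uint32_t *p, size_t n)
{
	uint32_t s = 0;

	if (n == 0)
		return s;
	do {
		s += *p++;
	} while (--n != 0);
	return s;
}
```

Optimizing C compilers often perform this transformation on their own; it matters most in hand-written assembly, where nothing does it for you.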
riastradh 71992d371e Implement AES-CCM with VIA ACE. 2020-07-25 22:31:32 +00:00
riastradh 2619efef58 Implement AES-CCM with SSSE3. 2020-07-25 22:31:04 +00:00
riastradh 081e09b4c1 Implement AES-CCM with SSE2. 2020-07-25 22:29:56 +00:00
riastradh 6054d74828 Implement AES-CCM with x86 AES-NI. 2020-07-25 22:29:06 +00:00