It is true that cgd_crypto.c depends on sys/crypto/adiantum now, and
transitively on sys/crypto/aes.
However, there's a problem with the cgd module having a formal
(transitive) module dependency on the aes module.
Yesterday I thought the problem was that fpu_kern_enter was
artificially restricted while cold. To detect reentrance and crash
noisily on it, fpu_kern_enter raises the IPL to IPL_VM, asserts that
the IPL was not already _higher_ (so it can't be re-entered by an
IPL_SCHED or IPL_HIGH interrupt), and asserts that the FPU is not
currently in use on the current CPU.
Early at boot, the IPL is at IPL_HIGH, and no interrupts are possible
anyway, so the assertions tripped for artificial reasons, which I
fixed in:
https://mail-index.netbsd.org/source-changes/2022/04/01/msg137840.html
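The check described above can be modelled in user space roughly as
follows. This is only a sketch: struct cpu_model, the MODEL_IPL_*
values, model_fpu_enter, and the relax_while_cold flag are all
hypothetical names made up for illustration, and the relaxation shown
is one plausible approach, not necessarily what the linked commit does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model, not the NetBSD implementation. */
enum model_ipl { MODEL_IPL_NONE, MODEL_IPL_VM, MODEL_IPL_SCHED, MODEL_IPL_HIGH };

struct cpu_model {
	enum model_ipl cur_ipl;	/* current interrupt priority level */
	bool fpu_in_use;	/* kernel FPU section active on this CPU */
	bool cold;		/* still in early boot */
};

/*
 * Returns true if entry is allowed, false where the kernel would trip
 * an assertion.  relax_while_cold is an assumed knob for illustration.
 */
static bool
model_fpu_enter(struct cpu_model *ci, bool relax_while_cold)
{
	assert(ci != NULL);

	/* While cold no interrupts can arrive, so reentrance is impossible. */
	if (relax_while_cold && ci->cold)
		return true;

	/* The IPL must not already be above IPL_VM. */
	if (ci->cur_ipl > MODEL_IPL_VM)
		return false;

	/* The FPU must not already be in use on this CPU. */
	if (ci->fpu_in_use)
		return false;

	ci->cur_ipl = MODEL_IPL_VM;	/* block interrupts up to IPL_VM */
	ci->fpu_in_use = true;
	return true;
}
```

In this model, a cold CPU at MODEL_IPL_HIGH fails the strict check even
though nothing could actually re-enter, which is the artificial failure
described above.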
However, I had forgotten that there's a deeper problem for the cgd
module dependency on aes. The ordering of events is:
1. Initialize builtin MODULE_CLASS_DRIVER modules -- including cgd.
2. Run configure -- including detecting CPUs, which on aarch64 is
where the decision of which AES (and ChaCha) implementation to use
is made, based on supported CPU features.
3. Initialize builtin MODULE_CLASS_MISC modules -- including aes,
_if_ there are no driver-class modules that depend on it.
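The three steps above can be simulated with a toy model to show how a
driver-class dependency drags aes ahead of configure. Everything here
is a hypothetical sketch: struct mod, mod_init, init_class,
model_configure, and model_boot are invented for illustration, each
module has at most one dependency, and the real module framework is
considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum mod_class { CLASS_DRIVER, CLASS_MISC };

struct mod {
	const char *name;
	enum mod_class mclass;
	struct mod *dep;	/* at most one dependency, for brevity */
	bool inited;
	bool saw_cpu_features;	/* was CPU detection done at init time? */
};

static bool cpu_features_detected;

/* Initialize a module, pulling in its dependency first. */
static void
mod_init(struct mod *m)
{
	if (m->inited)
		return;
	if (m->dep != NULL)
		mod_init(m->dep);
	m->saw_cpu_features = cpu_features_detected;
	m->inited = true;
}

/* Initialize every not-yet-initialized module of the given class. */
static void
init_class(struct mod **mods, size_t n, enum mod_class c)
{
	for (size_t i = 0; i < n; i++)
		if (mods[i]->mclass == c)
			mod_init(mods[i]);
}

/* Stand-in for configure: only CPU feature detection is modelled. */
static void
model_configure(void)
{
	cpu_features_detected = true;
}

/* Run the three boot steps; report whether aes saw CPU features. */
static bool
model_boot(bool cgd_depends_on_aes)
{
	struct mod aes = { "aes", CLASS_MISC, NULL, false, false };
	struct mod adiantum = { "adiantum", CLASS_MISC, &aes, false, false };
	struct mod cgd = { "cgd", CLASS_DRIVER,
	    cgd_depends_on_aes ? &adiantum : NULL, false, false };
	struct mod *mods[] = { &cgd, &adiantum, &aes };

	cpu_features_detected = false;
	init_class(mods, 3, CLASS_DRIVER);	/* step 1 */
	model_configure();			/* step 2 */
	init_class(mods, 3, CLASS_MISC);	/* step 3 */

	assert(aes.inited);
	return aes.saw_cpu_features;
}
```

In this model, model_boot(true) returns false because aes is
initialized during step 1 as a dependency of cgd, before CPU features
are known, while model_boot(false) returns true because aes waits
until step 3.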
There's a tangle of ordering dependencies here:
- MODULE_CLASS_DRIVER modules providing _autoconf_ drivers generally
have to be initialized _before_ configure, because you need the
driver to be initialized before configure can attach its devices.
- configure must run _before_ aes is initialized because the decision
of which AES implementation to choose depends on CPU features
detected in configure, and the prospect of dynamically changing the
AES implementation is too painful to contemplate (it may change the
key schedule, so it would invalidate any existing key schedules
precomputed by callers like uvm_swap or configured cgd devices,
which raises a host of painful concurrency issues to invalidate
these cached key schedules on all CPUs in all subsystems using
them).
- cgd doesn't figure into the configure stage of autoconf, but it
nevertheless has to be MODULE_CLASS_DRIVER because specfs autoloads
MODULE_CLASS_DRIVER modules in case they provide _devsw_ drivers
(i.e., /dev nodes), as cgd does. And we don't have a mechanism for
identifying `autoconf driver modules' separately from `devsw driver
modules' because some modules provide both and each module can have
only one class.
For now, this is breaking boot on several tier I architectures, so
let's nix the cgd->adiantum->aes module dependency as a stop-gap
measure.