No other errors are allowed -- other errors must be transmitted by
crypto_done. All drivers in tree (sun8i_crypto, glxsb, via_padlock,
mvcesa, mvxpsec, hifn, qat, ubsec, cryptosoft) have been audited for
this.
I'm pretty sure this never worked reliably based on code inspection,
and it's unlikely to have ever been tested because it only applies
when unregistering a driver -- but we have no crypto drivers for
removable devices, so it would only apply if we went out of our way
to trigger detach with drvctl.
Instead, just make the operation fail with ENODEV, and remove all the
callback logic to resubmit the request on EAGAIN. (Maybe this should
be ENXIO, but crypto_kdispatch already does ENODEV.)
- For crypto_getreq this makes downstream reasoning easier: on
success, crp_desc is guaranteed to be nonnull.
- For crypto_kgetreq, this was already assumed, just silently
ignored and not checked by anything.
This is extremely unlikely because I don't think we have any drivers
for removable crypto decelerators^Waccelerators...but if we were to
sprout one, and someone ran crypto_dispatch concurrently with
crypto_unregister, cryptodev_cb/mcb would enter with crp->crp_etype =
EAGAIN and with CRYPTO_F_DONE set in crp->crp_flags. In this case,
cryptodev_cb/mcb would issue crypto_dispatch but -- since nothing
clears CRYPTO_F_DONE -- it would _also_ consider the request done and
notify the ioctl thread of that.
With this change, we return early if crypto_dispatch succeeds. No
need to consult CRYPTO_F_DONE: if the callback is invoked it's done,
and if we try to redispatch it on EAGAIN but crypto_dispatch fails,
it's done. (Soon we'll get rid of the possibility of crypto_dispatch
failing synchronously, but not just yet.)
XXX This path could really use some testing!
Previously, crypto_newsession could sometimes return 0 as the
driver-specific part of the session id, and 0 as the hid, for sid=0.
But netipsec assumes that it is always safe to free sid=0 from
zero-initialized memory even if crypto_newsession has never
succeeded. So it was up to every driver in tree to gracefully handle
sid=0, if it happened to get assigned hid=0. And, as long as the
freesession callback was expected to just return an error code when
given a bogus session id, that worked out fine...because nothing ever
used the error code.
That was a terrible fragile system that should never have been
invented. Instead, let's just ensure that valid session ids are
nonzero, and make crypto_freesession with sid=0 be a no-op.
This deliberately ignores the error code returned by crypto_dispatch,
but that error code is fundamentally incoherent and the issue will be
mooted by subsequent changes to make it return void and always pass
the error through the callback, as well as subsequent changes to rip
out the EAGAIN logic anyway.
The condvar may be destroyed by the time we got here, and nothing
waits on it anyway -- instead the caller is expected to select/poll
for completion in userland.
The bug was already here, but the recent change to eliminate
CRYPTO_F_CBIMM made it happen more often by causing the callback to
_always_ be run asynchronously instead of sometimes being run
synchronously.
These should only ever have been potentially called from hard
interrupt context by CRYPTO_F_CBIMM callbacks (CBIMM = call back
immediately). CRYPTO_F_CBIMM is no more, so there is no more need to
allow this case of call from hard interrupt context.
CRYPTO_F_USER is no longer needed. It was introduced in 2008 by
darran@ in crypto.c 1.30, cryptodev.c 1.45 in an attempt to avoid
double-free between the issuing thread and asynchronous callback.
But the `fix' didn't work. In 2017, knakahara@ fixed it properly in
cryptodev.c 1.87 by distinguishing `the crypto operation has
completed' (CRYPTO_F_DONE) from `the callback is done touching the
crp object' (CRYPTO_F_DQRETQ, now renamed to CRYPTODEV_F_RET).
CRYPTO_F_CBIMM formerly served to invoke the callback synchronously
from the driver's interrupt completion routine, to reduce contention
on what was once a single cryptoret thread. Now, there is a per-CPU
queue and softint for much cheaper processing, so there is less
motivation for this in the first place. So let's remove the
complicated logic. This means the callbacks never run in hard
interrupt context, which means we don't need to worry about recursion
into crypto_dispatch in hard interrupt context.
interfaces, make sure that initialization and destruction
follow the proper sequence. This is triggered by the recent
changes to the devsw stuff; per riastradh@ the required call
sequence is:
devsw_attach()
config_init_component() or config_cf*_attach()
...
config_fini_component() or config_cf*_detach()
devsw_detach()
While here, add a few missing calls to some of the detach
routines.
Testing of these changes has been limited to:
1. compile without build break
2. no related test failures from atf
3. modload/modunload work as well as
before.
No functional device testing done, since I don't have any
of these devices. Let me know of any damage I might cause
here!
XXX Some of the modules affected by this commit are already
XXX broken; see kern/56772. This commit does not break
any additional modules (as far as I know).
This hasn't worked since it was written in 2009; if anyone cared
surely they would have fixed it by now!
(Fixing this properly -- and putting a more reasonable upper bound
than the maximum that size_t arithmetic allows -- left as an exercise
or the reader.)
Reported-by: syzbot+798d4a16bc15ae88526e@syzkaller.appspotmail.com
While here, apply various rijndael->aes renames, reduce the size
of aesxcbc_ctx by 480 bytes, and convert some malloc->kmem.
Leave in the symbol enc_xform_rijndael128 for now, though, so this
doesn't break any kernel ABI.
and pool_prime() (and their pool_cache_* counterparts):
- the pool_set*wat() APIs are supposed to specify thresholds for the count of
free items in the pool before pool pages are automatically allocated or freed
during pool_get() / pool_put(), whereas pool_sethardlimit() and pool_prime()
are supposed to specify minimum and maximum numbers of total items
in the pool (both free and allocated). these were somewhat conflated
in the existing code, so separate them as they were intended.
- change pool_prime() to take an absolute number of items to preallocate
rather than an increment over whatever was done before, and wait for
any memory allocations to succeed. since pool_prime() can no longer fail
after this, change its return value to void and adjust all callers.
- pool_setlowat() is documented as not immediately attempting to allocate
any memory, but it was changed some time ago to immediately try to allocate
up to the lowat level, so just fix the manpage to describe the current
behaviour.
- add a pool_cache_prime() to complete the API set.
automate installation of sysctl nodes.
Note that there are still a number of device and pseudo-device modules
that create entries tied to individual device units, rather than to the
module itself. These are not changed.