295003 Commits

Author SHA1 Message Date
riastradh 8ef4287173 Revert "entropy(9): Nix rnd_trylock_sources."
Not a premature optimization after all -- this is necessary because
entropy_request can run in softint context, where the cv_wait_sig in
rnd_lock_sources is forbidden.  Need to do this another way.
2022-03-20 14:05:41 +00:00
riastradh e98aa0c66a entropy(9): Nix rnd_trylock_sources.
This was a premature optimization that turned out to be bogus.  It's
not harmful to request more than we need from drivers, so let's not
go out of our way to avoid that.
2022-03-20 13:44:18 +00:00
riastradh 718cd90794 ualea(4): Enter the data under the softc lock.
This avoids a race with a concurrent ualea_get updating sc_needed,
which could lead to a hang when requesting more entropy.

ualea(4) now survives

sysctl -w kern.entropy.depletion=1
cat </dev/random >/dev/null &
cat </dev/random >/dev/null &

without hanging for longer (even if yanked and reinserted in the
middle, although the detach path is not relevant to the bug this
change fixes).
2022-03-20 13:18:30 +00:00
riastradh f7b53447aa entropy(9): Fix another new race in entropy_account_cpu.
The consolidation xcall can preempt entropy_enter, between when it
unlocks the per-CPU state and when it calls entropy_account_cpu, with
the effect of setting ec->ec_pending=0.

Previously this was impossible because we called entropy_account_cpu
with the per-CPU state still locked, but that doesn't work now that
the global entropy lock is an adaptive lock, which might sleep, and
sleeping is forbidden while the per-CPU state is locked.
2022-03-20 13:18:11 +00:00
riastradh 5798170187 entropy(9): Shuffle some assertions around.
Tripped over (diff || E->pending == ENTROPY_CAPACITY*NBBY), not sure
why yet, printing values will help.

No functional change intended.
2022-03-20 13:17:44 +00:00
riastradh 260710e2b4 entropy(9): Lock the per-CPU state in entropy_account_cpu.
This was previously called with the per-CPU state locked, which
worked fine as long as the global entropy lock was a spin lock so
acquiring it would never sleep.  Now it's an adaptive lock, so it's
not safe to take with the per-CPU state lock -- but we still need to
prevent reentrant access to the per-CPU entropy pool by interrupt
handlers while we're extracting from it.  So now the logic for
entering a sample is:

- lock per-CPU state
- entpool_enter
- unlock per-CPU state
- if anything pending on this CPU and it's time to consolidate:
  - lock global entropy state
  - lock per-CPU state
  - transfer
  - unlock per-CPU state
  - unlock global entropy state
2022-03-20 13:17:32 +00:00
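The ordering listed in the message above can be sketched in user space as follows. This is only an illustration under stated assumptions: a pthread mutex stands in for the sleepable global entropy lock, a plain flag stands in for the per-CPU state lock, and every name here (cpu_state, enter_sample, CONSOLIDATE_THRESHOLD, global_pool) is hypothetical rather than taken from the NetBSD sources.

```c
/* User-space sketch only; every name here is hypothetical, not NetBSD's. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define CONSOLIDATE_THRESHOLD 4	/* assumed threshold, for illustration */

struct cpu_state {
	bool locked;		/* stands in for the per-CPU state lock */
	unsigned pending;	/* samples entered but not yet transferred */
};

static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned global_pool;	/* stands in for the global entropy state */

static void
cpu_lock(struct cpu_state *cs)
{
	assert(!cs->locked);	/* catch reentrant access */
	cs->locked = true;
}

static void
cpu_unlock(struct cpu_state *cs)
{
	assert(cs->locked);
	cs->locked = false;
}

void
enter_sample(struct cpu_state *cs)
{
	/* Enter the sample while holding only the per-CPU lock. */
	cpu_lock(cs);
	cs->pending++;			/* stands in for entpool_enter */
	cpu_unlock(cs);

	/* Consolidate only if enough is pending on this CPU. */
	if (cs->pending < CONSOLIDATE_THRESHOLD)
		return;

	/* Take the sleepable global lock first, never under cpu_lock. */
	pthread_mutex_lock(&global_lock);
	cpu_lock(cs);
	global_pool += cs->pending;	/* transfer */
	cs->pending = 0;
	cpu_unlock(cs);
	pthread_mutex_unlock(&global_lock);
}
```

The point of the sketch is the nesting: the per-CPU lock is never held across the acquisition of a lock that may sleep, and is retaken only briefly for the transfer.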
riastradh 450311ec18 entropy(9): Factor out logic to lock and unlock per-CPU state.
No functional change intended.
2022-03-20 13:17:09 +00:00
riastradh 961b7b01fb ualea(4): Simplify xfer error branches.
- Avoid going into a loop in case the transfer fails repeatedly --
  just give up immediately if it fails.

- Assert result size is reasonable; no need to assume usbdi(9) is
  malicious.  If it can return ux_actlen > ux_length, that's a bug in
  usbdi(9) that we should fix.
2022-03-20 13:13:10 +00:00
riastradh 149a055e6b ualea(4): Fix detach and error paths.
- Set sc_needed before aborting the pipe to prevent the xfer callback
  from rescheduling itself.

- Make sure all paths out of the xfer callback clear sc_inflight.

While here, use device_printf instead of aprint_* after attach.

Now my system survives repeated insertion and yanking of ualea(4)
during:

sysctl -w kern.entropy.depletion=1
cat </dev/random >/dev/null
2022-03-20 00:41:01 +00:00
riastradh e61314e482 usbdi(9): Make sure aborting a pipe waits for all callbacks.
There may be a callback in flight from an xfer that has already been
taken off the queue by the time usbd_ar_pipe gets to it.  We must
guarantee that even that callback has completed before returning
control to the caller.
2022-03-20 00:40:52 +00:00
riastradh 88beb6d7fa entropy(9): Avoid reentrance to per-CPU state from sleeping on lock.
Changing the global entropy lock from IPL_VM to IPL_SOFTSERIAL meant
it went from being a spin lock, which blocks preemption, to being an
adaptive lock, which might sleep -- and allow other threads to run
concurrently with the softint, even if those threads have softints
blocked with splsoftserial.

This manifested as KASSERT(!ec->ec_locked) triggering in
entropy_consolidate_xc -- presumably entropy_softintr slept on the
global entropy lock while holding the per-CPU state locked with
ec->ec_locked, and then entropy_consolidate_xc ran.

Instead, to protect access to the per-CPU state without taking a
global lock, defer entropy_account_cpu until after ec->ec_locked is
cleared.  This way, we never sleep while holding ec->ec_locked, nor
do we incur any contention on shared memory when entering entropy
unless we're about to distribute it.  To verify this, sprinkle in
assertions that curlwp->l_ncsw hasn't changed by the time we release
ec->ec_locked.
2022-03-20 00:19:11 +00:00
riastradh a14f526cc0 usb: Insert assertion to diagnose ud_cdesc/ud_ifaces inconsistency.
Syzbot found a way to see ud_cdesc=NULL but ud_ifaces!=NULL:

https://syzkaller.appspot.com/bug?id=e6d4449a128e73a9a88100a5cc833e5cae9fecae

Maybe it's a race with two threads somehow doing usbd_free_device at
the same time when only one should, but let's rule this case out
early on to make it easier to prove it has to be a race.
2022-03-19 20:50:32 +00:00
riastradh 4d8f12e265 umidi(4): Parse descriptors a little more robustly.
Reported-by: syzbot+fd58d1d4dd12f8931486@syzkaller.appspotmail.com
2022-03-19 20:44:07 +00:00
riastradh 78aba61cf9 viornd(4): Revert IPL change for lock.
This lock is taken in hard interrupt context, so it needs to remain
at IPL_VM.
2022-03-19 16:20:45 +00:00
rillig 5cb183c368 lint: be more specific in comment about <ctype.h> check
No binary change.
2022-03-19 14:48:31 +00:00
rillig c5b32978c5 Makefile: fix location of postinstall program for MAKEVERBOSE > 2
If MAKEVERBOSE > 2, each shell command from a make target is echoed.
This resulted in two additional words ending up in the variable
_POSTINSTALL.  Noticed by Brad Harder.

Before:
$ make -v _POSTINSTALL MAKEVERBOSE=3
echo .../usr.sbin/postinstall .../usr.sbin/postinstall/postinstall ...

After:
$ make -v _POSTINSTALL MAKEVERBOSE=3
.../usr.sbin/postinstall/postinstall ...
2022-03-19 14:35:13 +00:00
riastradh 66528ec8b6 rnd(9): Delete legacy rnd_initial_entropy symbol.
Use entropy_epoch() instead.

XXX kernel ABI change deleting symbol requires bump
2022-03-19 14:35:08 +00:00
riastradh 084a8413ac kern: Delete kernel_ticks from kernel ABI.
Use getticks() instead.
2022-03-19 14:34:47 +00:00
hannken 9ea3b23daf Remove now unused VV_LOCKSWORK, all file systems support locking.
Remove unused predicates vn_locked() and vn_anylocked().

Welcome to 9.99.95
2022-03-19 13:53:32 +00:00
hannken f9f51a4a90 Remove now unused genfs_nolock(), genfs_nounlock() and genfs_noislocked().
2022-03-19 13:52:45 +00:00
hannken f0222b464b Switch spec_vnodeop vector to real vnode locking, VV_LOCKSWORK now.
2022-03-19 13:52:11 +00:00
hannken afc8fd54cd Fix locking after opendisk(): VOP_IOCTL() needs an unlocked vnode,
and vn_rdwr() needs flag IO_NODELOCKED.
2022-03-19 13:51:35 +00:00
hannken ce218897d7 Lock vnode across VOP_OPEN.
2022-03-19 13:50:02 +00:00
hannken 0c2cd91233 Lock vnode across VOP_OPEN/VOP_IOCTL/VOP_CLOSE.
2022-03-19 13:49:21 +00:00
hannken f38668ff6c Switch MFS device node to real vnode locking, VV_LOCKSWORK now.
2022-03-19 13:48:42 +00:00
hannken 54ba77bc44 As FSTRANS has been part of VOP_*LOCK() since June 4, 2017, the
vdead_check() from union_lock() is no longer needed.

Adapt union_lock() to the recent addition of upgrade or downgrade.

VV_LOCKSWORK now.
2022-03-19 13:48:04 +00:00
riastradh 8c13a1c7ef rnd(9): Omit needless locks in various HWRNG drivers.
Now that the rnd(9) API guarantees serial callbacks, we can simplify
everything a bit more.

(Some drivers like hifn(4) and sun8icrypto(4) still use locks to
coordinate with other parts of the driver to submit requests to and
process responses from the device.)
2022-03-19 11:55:03 +00:00
riastradh 54916e3adb rnd(9): Document the serial use of rndsource callbacks.
This simplifies the rndsource API -- no need to lock, unless you're
also coordinating with other driver logic like concurrent
opencrypto(4) requests that share device requests.
2022-03-19 11:54:53 +00:00
riastradh 574989065b tegra124_car(4): Attach rndsource synchronously.
It looks like the original motivation for deferring to
config_interrupts was to wait until softint_establish worked.  But
this no longer needs to use softints to deliver the entropy, so
that's moot.

Doing this synchronously gives us a better chance for more entropy
earlier.
2022-03-19 11:37:17 +00:00
riastradh 698c20237e rnd(9): Adjust IPL of locks used by rndsource callbacks.
These no longer ever run from hard interrupt context or with a spin
lock held, so there is no longer any need to have them at IPL_VM to
block hard interrupts.  Instead, lower them to IPL_SOFTSERIAL.
2022-03-19 11:37:05 +00:00
riastradh 09845d8340 tegra124_car(4): No need for rnd lock -- delete it.
This only ever reads from a single device register, so no need to
serialize access.

XXX This should really have a hardware-specific health test, but I
can't find any documentation on the underlying physical entropy
source.
2022-03-19 11:36:53 +00:00
riastradh 084d57f7d6 meson_rng(4): No need for lock -- delete it.
We only ever read a single register at a time; no exclusive access or
serialization needed.

XXX This driver should have some kind of hardware-specific health
test -- is there documentation anywhere for what this RNG actually
is?
2022-03-19 11:36:43 +00:00
riastradh 828f07b88f usbdi(9): Fix usbd_get_no_alts.
This incorrectly rejected the configuration as invalid if any
descriptor was not large enough to be an interface descriptor.

Instead, it should reject the configuration only if any descriptor is
not large enough to be a _descriptor_, or if any interface-type
descriptor is not large enough to be an interface descriptor, but
skip over descriptors of other types even if they're smaller than
interface descriptors.

Candidate fix for PR kern/56762.
2022-03-19 10:05:52 +00:00
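The validation rule described in the message above can be sketched like this. It is a stand-alone illustration, not the NetBSD implementation: count_alt_settings and the macro names are hypothetical, while the sizes and type code (2-byte descriptor header, 9-byte interface descriptor, type 0x04, bInterfaceNumber at offset 2) follow the USB descriptor layout.

```c
#include <stddef.h>
#include <stdint.h>

#define DESC_HEADER_SIZE	2	/* bLength + bDescriptorType */
#define DESC_TYPE_INTERFACE	0x04	/* USB interface descriptor type */
#define INTERFACE_DESC_SIZE	9	/* USB interface descriptor length */

/*
 * Hypothetical helper, not the NetBSD function: count the interface
 * descriptors for interface number ifaceno, or return -1 if the
 * buffer is malformed.
 */
int
count_alt_settings(const uint8_t *buf, size_t len, uint8_t ifaceno)
{
	size_t off = 0;
	int n = 0;

	while (off < len) {
		/* Every descriptor must at least hold its own header. */
		if (len - off < DESC_HEADER_SIZE)
			return -1;
		uint8_t blen = buf[off];
		uint8_t btype = buf[off + 1];
		if (blen < DESC_HEADER_SIZE || blen > len - off)
			return -1;
		if (btype == DESC_TYPE_INTERFACE) {
			/* Only interface-type descriptors need the full size. */
			if (blen < INTERFACE_DESC_SIZE)
				return -1;
			if (buf[off + 2] == ifaceno)
				n++;
		}
		/* Other types are skipped, however short they are. */
		off += blen;
	}
	return n;
}
```

A short class-specific descriptor between two interface descriptors is skipped rather than rejected, whereas a truncated interface descriptor still makes the whole buffer invalid.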
skrll 0f687ae99b Improve a VPRINTF
2022-03-19 09:55:30 +00:00
skrll fbc3b9a854 Slight code re-organisation so it better matches the VPRINTF headings
it is under. NFCI.
2022-03-19 09:54:25 +00:00
skrll 8003e477b5 Slight code re-organisation. NFCI.
2022-03-19 09:53:18 +00:00
riastradh 18cd690f6a clockrnd(9): Use atomic_load_relaxed for struct clockrnd::needed.
This may be set concurrently by clockrnd_get, so let's match the
atomic_store_relaxed and avoid the appearance of data races.
2022-03-18 23:37:14 +00:00
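The pairing of a relaxed store with a relaxed load can be illustrated with standard C11 atomics. This is an analogue only: the kernel uses its own atomic_load_relaxed and atomic_store_relaxed macros, and the names needed, request_bytes and bytes_needed below are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a "bytes still needed" counter. */
static atomic_uint needed;

/* Writer side: may run concurrently with the reader. */
void
request_bytes(unsigned n)
{
	atomic_store_explicit(&needed, n, memory_order_relaxed);
}

/* Reader side: a relaxed load matches the relaxed store above. */
unsigned
bytes_needed(void)
{
	return atomic_load_explicit(&needed, memory_order_relaxed);
}
```

Relaxed ordering imposes no ordering on surrounding memory accesses; it only makes the access to the variable itself atomic, so concurrent readers and writers do not form a data race.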
riastradh a0fb64e97e entropy(9): Establish the softint a little earlier.
Just need to wait until softint_establish and high-priority xcalls
will work, no later than that.  Doing this earlier gives us slightly
more of a chance to ensure cprng_fast and ssp get entropy from
hardware RNG devices that rely on interrupts.
2022-03-18 23:37:06 +00:00
riastradh f646e78fb1 sun8icrypto(4): Do self-test and first RNG draw synchronously.
If the self-test fails, disable everything else at boot -- don't just
leave it to the operator to notice and do something.

This way we get entropy earlier at boot, before threads start and
before the first things in the kernel that draw from it (cprng fast
init, ssp init).
2022-03-18 23:36:57 +00:00
riastradh ee48cc646f sun8icrypto(4): Attach rndsource as RND_TYPE_RNG.
Previously this was attached as RND_TYPE_UNKNOWN, at a time when the
kernel assumed _any_ RNG-type rndsource produced independent uniform
random bits and subjected it to automatic tests that would fail with
high probability for many other distributions.  But sun8icrypto(4) is
very nonuniform (probably yields consecutive samples of a ring
oscillator, which are very much not independent).

Now the kernel no longer makes this assumption, so it is valid to
label this as what it is -- a hardware RNG.  We should ideally still
have better information from the vendor about what's going on under
the hood before enabling nonzero entropy for it.  But at least we can
label its type accurately.
2022-03-18 23:36:42 +00:00
riastradh ddaf907b4f sun8icrypto(4): Split out interrupt and thread locks.
No need to block interrupts while we're going through all the data
structures -- only need to block interrupts for the handoff from
interrupt handler to lower-priority logic.
2022-03-18 23:35:48 +00:00
riastradh 5901ceaa11 getrandom(2): Fix return value checks in automatic tests.
The syscall only guarantees up to 256 bytes in a single go -- if
interrupted, it might return short, but if the caller requested at
least 256 bytes it will definitely return 256 bytes.
2022-03-18 23:35:37 +00:00
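One way to express the check described in the message above is a small predicate over the return value. This is a sketch based on a reading of that message, not the code from the automatic tests: getrandom_result_ok and GUARANTEED_BYTES are hypothetical names, and it assumes a request of at most 256 bytes is returned in full while a larger request returns at least 256 bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define GUARANTEED_BYTES 256	/* single-call guarantee described above */

/* Hypothetical test helper: is ret a plausible getrandom() result? */
bool
getrandom_result_ok(ssize_t ret, size_t requested)
{
	if (ret < 0)
		return false;	/* error; the caller inspects errno */
	if ((size_t)ret > requested)
		return false;	/* never more than was asked for */
	if (requested <= GUARANTEED_BYTES)
		return (size_t)ret == requested;
	/* Larger requests may return short, but not below the guarantee. */
	return (size_t)ret >= GUARANTEED_BYTES;
}
```

A test would call getrandom(2) and pass the result to this predicate, instead of insisting that the call always fills the whole buffer.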
riastradh e2caead148 entropy(9): Count dropped or truncated interrupt samples.
2022-03-18 23:35:28 +00:00
riastradh 4b3ca98c58 entropy(9): Reduce global entropy lock from IPL_VM to IPL_SOFTSERIAL.
This is no longer ever taken in hard interrupt context, so there's no
longer any need to block interrupts while doing crypto operations on
the global entropy pool.
2022-03-18 23:35:19 +00:00
riastradh e4ceb72edc entropy(9): Request entropy after the softint is enabled.
Otherwise, there is a window during which interrupts are running, but
the softint is not, so if many interrupts queue (low-entropy) samples
early at boot, they might get dropped on the floor.  This could
happen, for instance, with a PCI RNG like ubsec(4) or hifn(4) which
requests entropy and processes it in its own hard interrupt handler.
2022-03-18 23:35:07 +00:00
riastradh ceeae26ca4 entropy(9): Use the early-entropy path only while cold.
This way, we never take the global entropy lock from interrupt
handlers (no interrupts while cold), so the global entropy lock need
not block interrupts.

There's an annoying ordering issue here: softint_establish doesn't
work until after CPUs have been detected, which happens inside
configure(), which is also what enables interrupts.  So we have no
opportunity to softint_establish the entropy softint _before_
interrupts are enabled.

To work around this, we have to put a conditional into the interrupt
path, and go out of our way to process any queued samples after
establishing the softint.  If we just made softint_establish work
early, like percpu_create does now, this problem would go away and we
could delete a bit of logic here.

Candidate fix for PR kern/56730.
2022-03-18 23:34:56 +00:00
riastradh 0107837f03 entropy(9): Create per-CPU state earlier.
This will make it possible to use it from interrupts as soon as they
start, which means the global entropy pool lock won't have to block
interrupts.
2022-03-18 23:34:44 +00:00
riastradh a9f92eb79d drm: In ww_mutex_unlock, do lockdebug check first.
This way we get a full lockdebug dump when LOCKDEBUG is enabled,
instead of just the panic message (which includes the lock address
you could pass to `show lock' in ddb, but let's get the dump by
default even if you don't enter ddb).

Also, print the mutex in the KASSERT.
2022-03-18 23:33:41 +00:00
riastradh c08c8e5f45 dwc2: Use getticks(), not hardclock_ticks.
2022-03-18 23:32:59 +00:00
riastradh a7f1d9045b drm: Use getticks(), not hardclock_ticks.
2022-03-18 23:32:49 +00:00