Push it all into MD x86 code to keep it simpler, until we have other
examples on other CPUs. Simplify RDSEED-to-RDRAND fallback.
Eliminate cpu_earlyrng in favour of just using entropy_extract, which
is available early now.
Primary goals:
1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.
Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.
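For instance, a minimal sketch of such a device-specific test (the
names are illustrative, not code from this commit) might latch the
previous 256-bit output and fail the source on an exact repeat:

	#include <stdbool.h>
	#include <stdint.h>
	#include <string.h>

	struct hwrng_health {
		uint8_t	prev[32];	/* last 256-bit output seen */
		bool	have_prev;	/* prev[] is valid */
	};

	/* Return true if ok, false if the source looks broken. */
	static bool
	hwrng_health_check(struct hwrng_health *h, const uint8_t sample[32])
	{
		if (h->have_prev && memcmp(h->prev, sample, 32) == 0)
			return false;	/* repeated 256-bit output */
		memcpy(h->prev, sample, 32);
		h->have_prev = true;
		return true;
	}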
ENTROPY POOL
- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.
- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.
- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.  (See the driver sketch at the end of this section.)
- Per-CPU gathering pools avoid contention on a global queue.
- Entropy is occasionally consolidated into the global pool -- as soon
as it's ready, if we've never reached full entropy, and with a rate
limit afterward.  Operators can now force consolidation by running
sysctl -w kern.entropy.consolidate=1.
- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed
(see the epoch sketch at the end of this section).
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.
- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
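To illustrate the driver author's responsibility above, here is a
minimal sketch of a driver feeding in samples with an explicit
entropy estimate via the rndsource(9) API; the device `mydev' and its
functions are hypothetical, and the estimate is presumed to come from
studying the physical process, not the sample values:

	#include <sys/types.h>
	#include <sys/rndsource.h>

	static krndsource_t mydev_rndsource;

	static void
	mydev_rnd_attach(void)
	{
		rnd_attach_source(&mydev_rndsource, "mydev",
		    RND_TYPE_RNG, RND_FLAG_DEFAULT);
	}

	static void
	mydev_rnd_sample(const uint8_t *buf, uint32_t len,
	    uint32_t entropybits)
	{
		/* Assert `entropybits' bits of entropy in these bytes. */
		rnd_add_data(&mydev_rndsource, buf, len, entropybits);
	}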
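And to illustrate the entropy_epoch() usage: cache the epoch when you
seed and compare before use.  A minimal sketch; entropy_epoch() and
entropy_extract() are the interfaces described above, while the state
structure and function names are hypothetical:

	#include <sys/types.h>
	#include <sys/entropy.h>

	struct my_state {
		uint8_t		key[32];
		unsigned	epoch;	/* 0 => reseed on first use */
	};

	static void
	my_seed(struct my_state *S)
	{
		/*
		 * Cache the epoch before extracting, so that a
		 * consolidation landing in between forces another
		 * reseed later rather than being missed.
		 */
		S->epoch = entropy_epoch();
		(void)entropy_extract(S->key, sizeof(S->key), 0);
	}

	static void
	my_use(struct my_state *S)
	{
		if (S->epoch != entropy_epoch())
			my_seed(S);	/* epoch changed: reseed */
		/* ... use S->key ... */
	}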
CPRNG_STRONG
- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.
(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)
- Event counters provide operator visibility into when reseeding
happens.
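A usage sketch for the kernel instance; cprng_strong(9) and
kern_cprng are as described above, and the surrounding function is
hypothetical:

	#include <sys/types.h>
	#include <sys/cprng.h>

	static void
	my_genkey(uint8_t *key, size_t len)
	{
		/*
		 * Draws from the calling CPU's Hash_DRBG; reseeding
		 * happens internally and shows up in the event
		 * counters mentioned above.
		 */
		(void)cprng_strong(kern_cprng, key, len, 0);
	}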
INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)
- Unwired for now; will be rewired in a subsequent commit.
The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Take whichever byte order gives a number that is
reasonable, i.e. lower than the number of bits in the buffer.
Continue to have `rndctl -S' generate it in machine-dependent byte
order for now, so that if you roll back to an older rndctl(8) then
`rndctl -L' on the same machine will still be able to load it with
the right entropy estimate. In a future revision, perhaps we can
change it to be little-endian.
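A sketch of the loading heuristic, treating `reasonable' as at most
the number of bits in the buffer and preferring the little-endian
reading arbitrarily if both happen to fit; the helper name is
hypothetical:

	#include <sys/endian.h>

	#include <stddef.h>
	#include <stdint.h>

	static uint32_t
	seed_entropy_estimate(uint32_t ondisk, size_t buflen)
	{
		uint32_t le = le32toh(ondisk);
		uint32_t be = be32toh(ondisk);
		uint32_t maxbits = 8 * (uint32_t)buflen;

		/* Take whichever byte order gives a reasonable value. */
		if (le <= maxbits)
			return le;
		if (be <= maxbits)
			return be;
		return 0;	/* neither plausible: claim no entropy */
	}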
PR toolchain/54877
GCC 8.4 miscompiles dwarf2expr.c with -O2 or -O1 for earmv7hf{,eb}, which
results in crashes described in the PR. No upstream fix exists so far, so
let us disable optimization for this file.
Note that this affects only earmv7hf{,eb} as far as I can see. Crashes
occur neither for earmv6hf{,eb} nor for earmv7{,eb}.
- Don't use a static buffer for the result.
- kauth_cred_getgroups refuses to return more than the actual number
of groups, so passing NGROUPS_MAX generally doesn't work.
To avoid patching zfs, just expose struct kauth_cred::cr_groups
directly, with __KAUTH_PRIVATE. Unclear why the official API only
exposes it via memcpy or copyout anyway.
This makes unprivileged zfs operations work, by anyone with access to
/dev/zfs (which is conventionally mode 777, and which we should maybe
set by default; zfs has its own ACL system, zfs allow).
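For comparison, the pattern that does work with the official API is
to size the request from kauth_cred_ngroups() first.  A kernel-side
sketch; the helper name is hypothetical:

	#include <sys/param.h>
	#include <sys/kauth.h>
	#include <sys/kmem.h>
	#include <sys/uio.h>

	static gid_t *
	copy_groups(kauth_cred_t cred, u_int *ngroupsp)
	{
		u_int ngroups = kauth_cred_ngroups(cred);
		gid_t *groups;

		if (ngroups == 0) {
			*ngroupsp = 0;
			return NULL;
		}
		groups = kmem_alloc(ngroups * sizeof(*groups), KM_SLEEP);
		/* Asking for more than the cred has would fail. */
		(void)kauth_cred_getgroups(cred, groups, ngroups,
		    UIO_SYSSPACE);
		*ngroupsp = ngroups;
		return groups;
	}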
This avoids leaking NO_STATIC_MODULES into the public header, which
has led to considerable confusion and workarounds in pkgsrc.
PR security/39313
PR security/55216
ok christos
PID to look up a proc. Add a separate proc_find_lwpid() to look up a
proc by the ID of any of its LWPs.
- Add proc_find_lwp_acquire_proc(), which enables looking up the LWP
*and* a proc given the ID of any LWP. Returns with the proc::p_lock
held.  (Usage is sketched after this list.)
- Rewrite lwp_find2() in terms of proc_find_lwp_acquire_proc(), and
allow the proc to be wildcarded, rather than just curproc or a
specific proc.
- lwp_find2() now subsumes the original intent of lwp_getref_lwpid(), but
in a much nicer way, so garbage-collect the remnants of that recently
added mechanism.
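A hypothetical usage sketch of the lookup described above; the exact
signature of proc_find_lwp_acquire_proc() is assumed here rather than
quoted, but the invariant from the text is the point: on success,
proc::p_lock is held and the caller must release it:

	#include <sys/proc.h>
	#include <sys/mutex.h>

	static int
	frob_lwp_by_id(lwpid_t lid)	/* ID of any LWP */
	{
		struct proc *p;
		struct lwp *l;
		int error;

		/* Assumed: (lwpid_t, struct proc **, struct lwp **). */
		error = proc_find_lwp_acquire_proc(lid, &p, &l);
		if (error)
			return error;
		/* ... use l and p while p->p_lock is held ... */
		mutex_exit(p->p_lock);
		return 0;
	}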
The logic in futex_wait assumes there are two paths out:
1. Error (signal or timeout), in which case we take ourselves off the
queue.
2. Wakeup, in which case the waker takes us off the queue.
But if the user does FUTEX_WAIT_BITSET(bitset=0), as in the
futex_wait_pointless_bitset test, then we will never even go to
sleep, so there will be nobody to wake us as in (2), but it's not an
error as in (1) either. As a result, we're left on the queue.
Instead, don't bother with any of the wait machinery in that case.
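The shape of that change, as a sketch with hypothetical names (the
real futex_func_wait plumbing is more involved):

	#include <sys/types.h>
	#include <sys/errno.h>

	static int
	futex_wait_sketch(uint32_t bitset, uint32_t val, uint32_t cmpval)
	{
		if (bitset == 0) {
			/*
			 * No FUTEX_WAKE_BITSET can ever select us, so
			 * skip the wait machinery: we never enqueue,
			 * and neither exit path (1) nor (2) applies.
			 */
			return (val == cmpval) ? 0 : EAGAIN;
		}
		/* ... normal path: enqueue, sleep, dequeue on error ... */
		return 0;
	}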
This does not actually match Linux semantics -- Linux returns EINVAL
if bitset is zero. But let's make sure this passes the releng test
rig as the tests are written now, and then fix both the logic and the
tests -- this is a candidate fix for:
lib/libc/sys/t_futex_ops (277/847): 20 test cases
futex_basic_wait_wake_private: [6.645189s] Passed.
futex_basic_wait_wake_shared: [6.572692s] Passed.
futex_cmp_requeue: [4.624082s] Passed.
futex_requeue: [4.427191s] Passed.
futex_wait_pointless_bitset: [0.202865s] Passed.
futex_wait_timeout_deadline: [ 9074.4164779] panic: TAILQ_INSERT_TAIL 0xffff000056a1ad48 /tmp/bracket/build/2020.04.28.03.00.23-evbarm-aarch64/src/sys/kern/sys_futex.c:826
[ 9074.4340691] cpu0: Begin traceback...
[ 9074.4340691] trace fp ffffc0004ceffb40
[ 9074.4340691] fp ffffc0004ceffb60 vpanic() at ffffc000004aac58 netbsd:vpanic+0x160
[ 9074.4441432] fp ffffc0004ceffbd0 panic() at ffffc000004aad4c netbsd:panic+0x44
[ 9074.4441432] fp ffffc0004ceffc60 futex_wait_enqueue() at ffffc000004b7710 netbsd:futex_wait_enqueue+0x138
[ 9074.4555795] fp ffffc0004ceffc80 futex_func_wait.part.5() at ffffc000004b82f4 netbsd:futex_func_wait.part.5+0x17c
[ 9074.4660518] fp ffffc0004ceffd50 do_futex() at ffffc000004b8cd8 netbsd:do_futex+0x1d0
[ 9074.4660518] fp ffffc0004ceffdf0 sys___futex() at ffffc000004b9078 netbsd:sys___futex+0x50