It looks like we tripped on the new assertion in entropy_account_cpu
when there was pending entropy on cpu0 running lwp0 when xc_broadcast
ran -- since xc_broadcast calls the function directly rather than
calling it through softint_schedule, it's not called via the softint
lwp which would satisfy the assertion.
5c44459c3b
This bug was reported by Danilo Ramos of Eideticom, Inc. It has
lain in wait 13 years before being found! The bug was introduced
in zlib 1.2.2.2, with the addition of the Z_FIXED option. That
option forces the use of fixed Huffman codes. For rare inputs with
a large number of distant matches, the pending buffer into which
the compressed data is written can overwrite the distance symbol
table which it overlays. That results in corrupted output due to
invalid distances, and can result in out-of-bound accesses,
crashing the application.
The fix here combines the distance buffer and literal/length
buffers into a single symbol buffer. Now three bytes of pending
buffer space are opened up for each literal or length/distance
pair consumed, instead of the previous two bytes. This assures
that the pending buffer cannot overwrite the symbol table, since
the maximum fixed code compressed length/distance is 31 bits, and
since there are four bytes of pending space for every three bytes
of symbol space.
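
The arithmetic behind that guarantee can be checked with a small model.
This is only a sketch under stated assumptions, not zlib's code: the
layout (a pending buffer of 4*n bytes whose last 3*n bytes hold n
three-byte symbols) and the names overlay_is_safe, MAX_BITS_PER_SYMBOL
and SYMBOL_BYTES are hypothetical and chosen for illustration.

```c
#include <stddef.h>

/* Assumptions for illustration only (not zlib's actual identifiers). */
#define MAX_BITS_PER_SYMBOL 31	/* worst-case fixed-code length/distance pair */
#define SYMBOL_BYTES        3	/* bytes each symbol occupies in the symbol buffer */

/*
 * Model: the pending buffer is 4*n bytes and the symbol buffer overlays
 * its last 3*n bytes, so the symbols start at offset n.  After k symbols
 * have been consumed, the reader is at offset n + 3*k and the writer has
 * emitted at most ceil(31*k / 8) bytes starting from offset 0.
 *
 * Returns 1 if the writer never passes the reader for any k <= n,
 * 0 otherwise.
 */
static int
overlay_is_safe(size_t n)
{
	size_t k;

	for (k = 0; k <= n; k++) {
		size_t written = (MAX_BITS_PER_SYMBOL * k + 7) / 8;
		size_t reader = n + SYMBOL_BYTES * k;

		if (written > reader)
			return 0;
	}
	return 1;
}
```

In this model the check holds because 31 bits rounds up to at most four
bytes per symbol, while each consumed symbol frees three bytes on top of
the n bytes of headroom the writer starts with.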
Samples added to the entropy pool in hard interrupt context are only
buffered, never processed directly, and if they fill the buffer, the
sample is dropped -- this serves to encourage taking timing samples
in hard interrupt context because it's cheap, and we have no idea how
many samples we really need for full entropy so it's safer to err on
the side of `as many as we can get'.
But for viornd(4), we assume the host has full entropy so we only
need a single 32-byte sample, and we want to avoid dropping it so we
get full entropy ASAP. Entering the sample in a soft interrupt
rather than hard interrupt achieves this.
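
The difference between the two paths can be sketched as follows.  This
is a simplified user-space model, not the kernel code: struct
sample_buf, NSLOTS, enter_hardintr and enter_softintr are hypothetical
names, and the XOR mixing is only a placeholder for real pool input.

```c
#include <stddef.h>
#include <string.h>

#define SAMPLE_BYTES 32	/* one sample; same size as the viornd(4) sample */
#define NSLOTS       4	/* hypothetical buffer capacity */

struct sample_buf {
	unsigned char slot[NSLOTS][SAMPLE_BYTES];
	size_t used;	/* slots currently occupied */
	size_t dropped;	/* samples discarded because the buffer was full */
};

/*
 * Hard-interrupt path in this model: only buffer the sample, and drop
 * it if the buffer is full.  Returns 1 if the sample was kept.
 */
static int
enter_hardintr(struct sample_buf *b, const unsigned char *s)
{
	if (b->used == NSLOTS) {
		b->dropped++;
		return 0;
	}
	memcpy(b->slot[b->used++], s, SAMPLE_BYTES);
	return 1;
}

/*
 * Soft-interrupt path in this model: the sample is processed directly,
 * so there is no buffer to overflow and nothing is dropped.
 */
static void
enter_softintr(unsigned char pool[SAMPLE_BYTES], const unsigned char *s)
{
	size_t i;

	for (i = 0; i < SAMPLE_BYTES; i++)
		pool[i] ^= s[i];	/* placeholder mixing, not a real pool */
}
```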
More fallout from the IPL_VM->IPL_SOFTSERIAL change.
In entropy_enter, there is a window when the lwp can be migrated to
another CPU:
	ec = entropy_cpu_get();
	...
	pending = ec->ec_pending + ...;
	...
	entropy_cpu_put();
	/* lwp migration possible here */
	if (pending)
		entropy_account_cpu(ec);
If this happens, we may trip over any of several problems in
entropy_account_cpu because it assumes ec is the current CPU's state
in order to decide whether we have anything to contribute from the
local pool to the global pool.
No need to do this in entropy_softintr because softints are bound to
the CPU anyway.
Changes to code
Fix bug when mktime gets confused by truncated TZif files with
unspecified local time. (Problem reported by Almaz Mingaleev.)
Fix bug when 32-bit time_t code reads malformed 64-bit TZif data.
(Problem reported by Christos Zoulas.)
When reading a version 2 or later TZif file, the TZif reader now
validates the version 1 header and data block only enough to skip
over them, as recommended by RFC 8536 section 4. Also, the TZif
reader no longer mistakenly attempts to parse a version 1 TZif
file header as a TZ string.
zdump -v now outputs "(localtime failed)" and "(gmtime failed)"
when local time and UT cannot be determined for a timestamp.
Previously, we sampled the time of each _failed_ config_search. I'm
not sure why -- there was no explanation in the comment or the commit
message introducing this in rev. 1.230.2.1 on tls-earlyentropy.
With this change, we sample the time of _every_ search including the
successful ones -- and also measure the time to attach which often
includes things like probing device registers, triggering device
reset and waiting for it to post, &c.
The HD audio specification does not cover PCI config space, and this
driver was unconditionally writing to a vendor specific register. Reduce
scope of config space accesses based on PCI IDs.
With this cleaned up, add support for Intel PCH devices which require
some additional vendor specific configuration to bypass no snoop mode.
- For the main warning message, use less jargon, say `security', and
cite the entropy(7) man page for further reading. Document this in
rnd(4) and entropy(7).
- For the debug-only warning message, say `entropy' only once and omit
it from the rnd(4) man page -- it's not very important unless you're
debugging the kernel in which case you probably know what you're
doing enough to not need the text explained in the man page.
not from ftp://ftp.iana.org/tz/releases/tzdata2022agtz.tar.gz
(2022a comes from ftp://ftp.iana.org/tz/releases/tzdata2022a.tar.gz)
Note that 2022agtz is mechanically derived from 2022a by moving back
zone data from the "backzone" file that had been removed as "redundant"
(because differences to some other zone are all prior to 1970) so that
this pre-1970 data is restored. It isn't necessarily correct in all
cases, but it is usually better than using some other zone's data which
is just as likely to be incorrect for where it applies, and more so elsewhere.
Summary of changes in tzdata2022a (2022-03-15 23:02:01 -0700):
* Palestine will spring forward on 2022-03-27, not 2022-03-26.
* From 1992 through spring 1996, Ukraine's DST transitions were at
02:00 standard time, not at 01:00 UTC.
* Chile's Santiago Mean Time and its LMT precursor have been adjusted
eastward by 1 second to align with past and present law.
* Changes to commentary.
- For synchronous queries from /dev/random, which are waiting for
entropy to be ready, wait for concurrent access -- e.g., concurrent
rnd_detach_source -- to finish, and make sure to request entropy
from all sources (unless we're interrupted by a signal).
- When invoked through softint context (e.g., cprng_fast_intr ->
cprng_strong -> entropy_extract), don't wait, because we're
forbidden from waiting anyway.
- For entropy_bootrequest, wait but don't bother failing on signal
because this only happens in kthread context, not in userland
process context, so there can't be signals.
Nix rnd_trylock_sources; use the same entropy_extract flags
(ENTROPY_WAIT, ENTROPY_SIG) for rnd_lock_sources.
Not a premature optimization after all -- this is necessary because
entropy_request can run in softint context, where the cv_wait_sig in
rnd_lock_sources is forbidden. Need to do this another way.
This was a premature optimization that turned out to be bogus. It's
not harmful to request more than we need from drivers, so let's not
go out of our way to avoid that.
This avoids a race with a concurrent ualea_get updating sc_needed,
which could lead to a hang when requesting more entropy.
ualea(4) now survives
	sysctl -w kern.entropy.depletion=1
	cat </dev/random >/dev/null &
	cat </dev/random >/dev/null &
without hanging for longer (even if yanked and reinserted in the
middle, although the detach path is not relevant to the bug this
change fixes).
The consolidation xcall can preempt entropy_enter, between when it
unlocks the per-CPU state and when it calls entropy_account_cpu, with
the effect of setting ec->ec_pending=0.
Previously this was impossible because we called entropy_account_cpu
with the per-CPU state still locked, but that doesn't work now that
the global entropy lock is an adaptive lock, which might sleep, and
sleeping is forbidden while the per-CPU state is locked.
This was previously called with the per-CPU state locked, which
worked fine as long as the global entropy lock was a spin lock so
acquiring it would never sleep. Now it's an adaptive lock, so it's
not safe to take with the per-CPU state lock -- but we still need to
prevent reentrant access to the per-CPU entropy pool by interrupt
handlers while we're extracting from it. So now the logic for
entering a sample is:
- lock per-CPU state
- entpool_enter
- unlock per-CPU state
- if anything pending on this CPU and it's time to consolidate:
  - lock global entropy state
  - lock per-CPU state
  - transfer
  - unlock per-CPU state
  - unlock global entropy state
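
The sequence above can be sketched as a single-threaded model.  This is
an illustration under assumptions, not the kernel implementation: plain
flags stand in for the real locks, and struct model, enter_sample and
the helper names are hypothetical.  The assertion in global_lock
encodes the rule that the (possibly sleeping) global lock is never
acquired while the per-CPU state is locked.

```c
#include <assert.h>

/* Hypothetical model: flags stand in for the real locks. */
struct model {
	int percpu_locked;	/* per-CPU state lock held? */
	int global_locked;	/* global entropy lock held? */
	unsigned pending;	/* samples pending on this CPU */
	unsigned global;	/* samples transferred to the global pool */
};

static void
percpu_lock(struct model *m)
{
	assert(!m->percpu_locked);
	m->percpu_locked = 1;
}

static void
percpu_unlock(struct model *m)
{
	assert(m->percpu_locked);
	m->percpu_locked = 0;
}

/* May sleep in the real system, so the per-CPU lock must not be held. */
static void
global_lock(struct model *m)
{
	assert(!m->percpu_locked);
	assert(!m->global_locked);
	m->global_locked = 1;
}

static void
global_unlock(struct model *m)
{
	assert(m->global_locked);
	m->global_locked = 0;
}

static void
enter_sample(struct model *m, int time_to_consolidate)
{
	unsigned pending;

	percpu_lock(m);
	m->pending++;		/* stands in for entpool_enter */
	pending = m->pending;
	percpu_unlock(m);

	if (pending && time_to_consolidate) {
		global_lock(m);
		percpu_lock(m);
		m->global += m->pending;	/* transfer */
		m->pending = 0;
		percpu_unlock(m);
		global_unlock(m);
	}
}
```

The transfer step re-reads m->pending under both locks rather than
trusting the earlier snapshot, since the per-CPU state may have changed
in between.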
- Avoid going into a loop in case the transfer fails repeatedly --
just give up immediately if it fails.
- Assert result size is reasonable; no need to assume usbdi(9) is
malicious. If it can return ux_actlen > ux_length, that's a bug in
usbdi(9) that we should fix.
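
A minimal sketch of the two points above, assuming a hypothetical
completion helper xfer_done whose parameters actlen and length play
the roles of ux_actlen and ux_length.  It is not the driver's code and
does not use the usbdi(9) API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical completion helper: `failed' is nonzero if the transfer
 * failed, `actlen' is the number of bytes actually transferred, and
 * `length' is the number of bytes requested.  Returns the number of
 * bytes the caller may use.
 */
static size_t
xfer_done(int failed, size_t actlen, size_t length)
{
	if (failed)
		return 0;		/* give up immediately; no retry loop */
	assert(actlen <= length);	/* larger would be a lower-layer bug */
	return actlen;
}
```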