The recent change to cpu.h and struct cpu_info needs an early definition
for the ARM_INTR_IMPLs that want to inline the spl functions and get
{,set_}curcpl() from cpu.h.
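A minimal sketch of the pattern (assuming ci_cpl is the cpl field in
struct cpu_info; the exact names in the tree may differ):

    /* Early in cpu.h, before the ARM_INTR_IMPL header is included,
     * so that header can build inline spl*() on top of these. */
    static inline int
    curcpl(void)
    {
            return curcpu()->ci_cpl;
    }

    static inline void
    set_curcpl(int pri)
    {
            curcpu()->ci_cpl = pri;
    }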
The old boot_ustar contained a lot of hard-coding and was small (<1KB).
The new xxboot_ustarfs is part of the integrated, larger (<8KB) xxboot.
We used to maintain three similar but not identical bootloaders; now we
have two!
- Rewrite boot.S completely.
boot.S now supports booting from SCSI HD/CD and floppy.
- Use an IOCS call to identify the floppy format, instead of chkfmt.s,
which accesses the hardware directly.
- Import the print_hex() debug function from boot_ufs.
- Import the feature that displays the initial registers (for debugging)
from boot_ufs, and restore it (the one in boot_ufs had been broken).
- Add size-optimized alternatives for some libkern routines (see the
sketch after this list).
- Stop linking libsa, to prevent unexpected objects from being linked in.
- Bump version to 2.0.
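As an illustration of the size-optimized libkern alternatives (a generic
sketch, not the commit's actual code), a memcpy() tuned for size can be a
plain byte loop:

    #include <stddef.h>

    void *
    memcpy(void *dst, const void *src, size_t n)
    {
            char *d = dst;
            const char *s = src;

            /* Trade speed for size: a byte loop assembles to only a
             * few instructions, which matters in an 8KB bootloader. */
            while (n-- > 0)
                    *d++ = *s++;
            return dst;
    }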
1. Give the offset of the rbnode, not some other random member for the
tree code to overwrite with garbage (see the sketch at the end of this
entry).
2. Don't try to unlock a NULL mutex.
3. Make sure all paths out after ugenif_acquire go via
ugenif_release.
This way putting ugenif in kernel config actually works to wire it to
the /dev/ugenN.MM device nodes in userland.
Not a fully fleshed out solution to the ugen problem -- there's no
way for a userland driver to kick out a kernel driver and take over,
but this will let us, e.g., use uhidev(4) for Yubikey OTP/U2F/FIDO2
but ugen(4), with pcscd(8), for Yubikey CCID.
Fix various MP-safety issues while here (still not MPSAFE, but more
progress).
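For item 1, a hypothetical sketch of the rb_tree(3) ops table (the softc
layout here is illustrative): the offset handed to the tree code must be
that of the rb_node member itself, or the tree will scribble its linkage
over whatever field the wrong offset points at.

    #include <sys/rbtree.h>
    #include <stddef.h>

    struct ugenif_softc {
            struct rb_node  sc_node;        /* tree linkage */
            int             sc_unit;        /* key */
    };

    static int
    compare_nodes(void *ctx, const void *a, const void *b)
    {
            const struct ugenif_softc *sa = a, *sb = b;

            return sa->sc_unit - sb->sc_unit;
    }

    static int
    compare_key(void *ctx, const void *n, const void *key)
    {
            const struct ugenif_softc *s = n;

            return s->sc_unit - *(const int *)key;
    }

    static const rb_tree_ops_t ugenif_tree_ops = {
            .rbto_compare_nodes = compare_nodes,
            .rbto_compare_key   = compare_key,
            /* the rbnode itself, not some other random member: */
            .rbto_node_offset   = offsetof(struct ugenif_softc, sc_node),
            .rbto_context       = NULL,
    };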
cv_signal already has this microoptimization.
While here, make the lock cover the relevant things we're issuing
cv_signal about -- progress toward real MP-safety.
It can be shared among ddb, other mips kernel components (see locore),
and an upcoming crash(8) port.
Remove the second copy of kdbpeek() (hidden behind the old DDB_TRACE
option; the two were functionally equivalent).
tested on octeon.
ok simonb@
- Switch to TPIDRPRW_IS_CURLWP, because curlwp is accessed much more often
by MI code.  It also makes curlwp preemption safe.
- Make ASTs operate per-LWP rather than per-CPU, otherwise sometimes LWPs
can see spurious ASTs (which doesn't cause a problem; it just means some
time may be wasted).
- Make sure ASTs are always set on the same CPU as the target LWP, and
delivered via IPI if posted from a remote CPU, so that they are resolved
quickly (see the sketch after this list).
- Add some cache line padding to struct cpu_info.
- Add a memory barrier in a couple of places where ci_curlwp is set. This
is needed whenever an LWP that is resuming on the CPU could hold an
adaptive mutex. The barrier needs to drain the CPU's store buffer, so
that the update to ci_curlwp becomes globally visible before the LWP can
resume and call mutex_exit().
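A hypothetical sketch of the per-LWP AST posting (function and field
names are illustrative, not necessarily what the tree uses):

    void
    cpu_signotify(struct lwp *l)
    {
            /* The pending flag lives in the LWP, not in struct cpu_info. */
            l->l_md.md_astpending = 1;

            /* If the LWP is running on another CPU, kick that CPU with
             * an IPI so the AST is resolved quickly. */
            if (l->l_cpu != curcpu())
                    intr_ipi_send(l->l_cpu->ci_kcpuset, IPI_AST);
    }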
- Use word alignment (rather than longword alignment) for the text/data
sections.
- Use short references.  In the bootloader, all symbols can be reached
with short references.
- Check the size of the resulting file instead of the section sizes.
If there is a gap between sections, the file size limit may be exceeded
even though the section total is not.  Of course that doesn't usually
happen, but I experienced it during development.
Three ways to call:
getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.
getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.
getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.
Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.
- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.
- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.
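A userland usage sketch (illustrative, not part of the change):

    #include <sys/types.h>
    #include <sys/random.h>

    #include <err.h>
    #include <string.h>

    int
    main(void)
    {
            unsigned char key[32];

            /* Blocks at boot until full entropy; a request of up to
             * 256 bytes is then filled completely on success. */
            if (getrandom(key, sizeof(key), 0) == -1)
                    err(1, "getrandom");

            /* ... use key ... */

            explicit_memset(key, 0, sizeof(key));
            return 0;
    }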
As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):
https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html
- Implement pmap_growkernel(), and update the kernel pmap's stats with
atomics (see the sketch after this list).
- Then, pmap_kenter_pa() and pmap_kremove() no longer need to allocate
memory or take pm_lock, because they only modify L3 PTEs.
- Then, pm_lock and pp_lock can be adaptive mutexes at IPL_NONE, which are
cheaper than spin mutexes.
- Take the pmap's lock in pmap_extract() if not the kernel's pmap, otherwise
pmap_extract() might see inconsistent state.
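A minimal sketch of the stats part (assuming the usual pmap_statistics
layout; the helper name is made up):

    #include <sys/atomic.h>

    /* The kernel pmap's counters can be updated locklessly, because
     * pmap_kenter_pa()/pmap_kremove() only touch L3 PTEs and no longer
     * take pm_lock. */
    static inline void
    pmap_kstat_add(struct pmap *pm, long delta)
    {
            atomic_add_long((volatile unsigned long *)
                &pm->pm_stats.resident_count, delta);
    }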
- Use tpidr_el1 to hold curlwp and not curcpu, because curlwp is accessed
much more often by MI code. It also makes curlwp preemption safe and
allows aarch64_curlwp() to be a const function (curcpu must be volatile).
- Make ASTs operate per-LWP rather than per-CPU, otherwise sometimes LWPs
can see spurious ASTs (which doesn't cause a problem; it just means some
time may be wasted).
- Use plain stores to set/clear ASTs. Make sure ASTs are always set on the
same CPU as the target LWP, and delivered via IPI if posted from a remote
CPU so that they are resolved quickly.
- Add some cache line padding to struct cpu_info, to match x86.
- Add a memory barrier in a couple of places where ci_curlwp is set. This
is needed whenever an LWP that is resuming on the CPU could hold an
adaptive mutex. The barrier needs to drain the CPU's store buffer, so
that the update to ci_curlwp becomes globally visible before the LWP can
resume and call mutex_exit(). By my reading of the ARM docs it looks like
the instruction I used will do the right thing, but I'm not 100% sure.
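A sketch of the barrier in question (illustrative; the real change is in
the aarch64 context switch path):

    static inline void
    ci_set_curlwp(struct cpu_info *ci, struct lwp *l)
    {
            ci->ci_curlwp = l;
            /* Drain the store buffer so the new curlwp is globally
             * visible before this LWP can call mutex_exit(). */
            __asm __volatile("dmb ishst" ::: "memory");
    }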
This change fixes a bug: an extra delay() was called once even when
atomic_cas_uint() had not failed, and delay() was not called when
atomic_cas_uint() did fail.
The cause was that I simply converted FreeBSD's atomic_cmpset_int()
directly to atomic_cas_uint(); the return values have different
semantics.
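A sketch of the difference (names illustrative; delay() is the usual
kernel busy-wait):

    #include <sys/atomic.h>

    /*
     * FreeBSD: atomic_cmpset_int(p, old, new) returns nonzero on
     * success.  NetBSD: atomic_cas_uint(p, old, new) returns the
     * previous value, so success means a return equal to 'old'.
     */
    static void
    store_with_backoff(volatile unsigned *p, unsigned new)
    {
            unsigned old = *p;

            /* Wrong: "if (atomic_cas_uint(p, old, new)) delay(1);"
             * treats the previous value as a failure flag. */
            while (atomic_cas_uint(p, old, new) != old) {
                    delay(1);       /* back off only on real failure */
                    old = *p;       /* refresh the expected value */
            }
    }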