1. A race condition (bug) in sys/rump/librump/rumpkern/intr.c since
rumpuser_cv_signal() is called without holding a mutex
2. sleepq is implemented using a single (global) conditional
variable; that should be done per each sleepq separately
uvm_page_unbusy() so that all callers of uvm_page_unbusy() don't need to
handle this flag separately. Split out the pages part of uvm_aio_aiodone()
into uvm_aio_aiodone_pages() in rump just like in the real kernel.
In ZFS functions that can fail to copy data between the ARC and VM pages,
use uvm_aio_aiodone_pages() rather than uvm_page_unbusy() so that we can
handle these "I/O" errors. Fixes PR 55702.
right now. new address-of-packed-member and format-overflow
warnings have new GCC_NO_ADDR_OF_PACKED_MEMBER amd
GCC_NO_FORMAT_OVERFLOW variables to remove these warnings.
apply to a bunch of the tree. mostly, these are real bugs that
should be fixed, but in many cases, only by removing the 'packed'
attribute from some structure that doesn't really need it. (i
looked at many different ones, and while perhaps 60-80% were
already properly aligned, it wasn't clear to me that the uses
were always coming from sane data vs network alignment, so it
doesn't seem safe to remove packed without careful research for
each affect struct.) clang already warned (and was not erroring)
for many of these cases, but gcc picked up dozens more.
The poorly named uvm.h is generally supposed to be for uvm-internal
users only.
- Narrow it to files that actually need it -- mostly files that need
to query whether curlwp is the pagedaemon, which should maybe be
exposed by an external header.
- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
by it. We should split up uvm_extern.h but this will serve for now
to reduce the uvm.h dependencies.
- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
UVMHIST(ubchist), since ubchist is declared in uvm.h but the
reference evaporates if UVMHIST is not defined, so we reduce header
file dependencies.
- Make uvm_device.h and uvm_swap.h independently includable while
here.
ok chs@
Still planning to replace wgconfig(8) and wg-keygen(8) by one wg(8)
tool compatible with wireguard-tools; update wg(4) for the minor
changes from the 2018-06-30 spec to the 2020-06-01 spec; &c. This just
clarifies the current state of affairs as it exists in the development
tree for now.
Mark the man page EXPERIMENTAL for extra clarity.
This may reduce performance by not taking advantage of 64x64->128
multiplications on some platforms, but let's worry about that later
and fix the build on the other platforms instead.
Three ways to call:
getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.
getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.
getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.
Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.
- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.
- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.
As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):
https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html
Adiantum is a wide-block cipher, built out of AES, XChaCha12,
Poly1305, and NH, defined in
Paul Crowley and Eric Biggers, `Adiantum: length-preserving
encryption for entry-level processors', IACR Transactions on
Symmetric Cryptology 2018(4), pp. 39--61.
Adiantum provides better security than a narrow-block cipher with CBC
or XTS, because every bit of each sector affects every other bit,
whereas with CBC each block of plaintext only affects the following
blocks of ciphertext in the disk sector, and with XTS each block of
plaintext only affects its own block of ciphertext and nothing else.
Adiantum generally provides much better performance than
constant-time AES-CBC or AES-XTS software do without hardware
support, and performance comparable to or better than the
variable-time (i.e., leaky) AES-CBC and AES-XTS software we had
before. (Note: Adiantum also uses AES as a subroutine, but only once
per disk sector. It takes only a small fraction of the time spent by
Adiantum, so there's relatively little performance impact to using
constant-time AES software over using variable-time AES software for
it.)
Adiantum naturally scales to essentially arbitrary disk sector sizes;
sizes >=1024-bytes take the most advantage of Adiantum's design for
performance, so 4096-byte sectors would be a natural choice if we
taught cgd to change the disk sector size. (However, it's a
different cipher for each disk sector size, so it _must_ be a cgd
parameter.)
The paper presents a similar construction HPolyC. The salient
difference is that HPolyC uses Poly1305 directly, whereas Adiantum
uses Poly1395(NH(...)). NH is annoying because it requires a
1072-byte key, which means the test vectors are ginormous, and
changing keys is costly; HPolyC avoids these shortcomings by using
Poly1305 directly, but HPolyC is measurably slower, costing about
1.5x what Adiantum costs on 4096-byte sectors.
For the purposes of cgd, we will reuse each key for many messages,
and there will be very few keys in total (one per cgd volume) so --
except for the annoying verbosity of test vectors -- the tradeoff
weighs in the favour of Adiantum, especially if we teach cgd to do
>>512-byte sectors.
For now, everything that Adiantum needs beyond what's already in the
kernel is gathered into a single file, including NH, Poly1305, and
XChaCha12. We can split those out -- and reuse them, and provide MD
tuned implementations, and so on -- as needed; this is just a first
pass to get Adiantum implemented for experimentation.
1. Rip out old variable-time reference implementation.
2. Replace it by BearSSL's constant-time 32-bit logic.
=> Obtained from commit dda1f8a0c46e15b4a235163470ff700b2f13dcc5.
=> We could conditionally adopt the 64-bit logic too, which would
likely give a modest performance boost on 64-bit platforms
without AES-NI, but that's a bit more trouble.
3. Select the AES implementation at boot-time; allow an MD override.
=> Use self-tests to verify basic correctness at boot.
=> The implementation selection policy is rather rudimentary at
the moment but it is isolated to one place so it's easy to
change later on.
This (a) plugs a host of timing attacks on, e.g., cgd, and (b) paves
the way to take advantage of CPU support for AES -- both things we
should've done a decade ago. Downside: Computing AES takes 2-3x the
CPU time. But that's what hardware support will be coming for.
Rudimentary measurement of performance impact done by:
mount -t tmpfs tmpfs /tmp
dd if=/dev/zero of=/tmp/disk bs=1m count=512
vnconfig -cv vnd0 /tmp/disk
cgdconfig -s cgd0 /dev/vnd0 aes-cbc 256 < /dev/zero
dd if=/dev/rcgd0d of=/dev/null bs=64k
dd if=/dev/zero of=/dev/rcgd0d bs=64k
The AES-CBC encryption performance impact is closer to 3x because it
is inherently sequential; the AES-CBC decryption impact is closer to
2x because the bitsliced AES logic can process two blocks at once.
Discussed on tech-kern:
https://mail-index.NetBSD.org/tech-kern/2020/06/18/msg026505.html
x86/x86/tsc.c rev. 1.67 reduced cache problem and got big improvement, but it
still has room. I measured the effect of lfence, mfence, cpuid and rdtscp.
The impact to TSC skew and/or drift is:
AMD: mfence > rdtscp > cpuid > lfence-serialize > lfence = nomodify
Intel: lfence > rdtscp > cpuid > nomodify
So, mfence is the best on AMD and lfence is the best on Intel. If it has no
SSE2, we can use cpuid.
NOTE:
- An AMD's document says DE_CFG_LFENCE_SERIALIZE bit can be used for
serializing, but it's not so good.
- On Intel i386(not amd64), it seems the improvement is very little.
- rdtscp instruct can be used as serializing instruction + rdtsc, but
it's not good as [lm]fence. Both Intel and AMD's document say that
the latency of rdtscp is bigger than rdtsc, so I suspect the difference
of the result comes from it.
This file is OS-agnostic and cannot include OS specific headers.
The fallback for NetBSD specific symbol __BEGIN_DECLS didn't work as it was
misspelled as _BEGIN_DECLS.
This is much better handled by a user-land tool.
Proposed on tech-net here:
https://mail-index.netbsd.org/tech-net/2020/04/22/msg007766.html
Note that the ioctl SIOCGIFINFO_IN6 no longer sets flags. That now
needs to be done using the pre-existing SIOCSIFINFO_FLAGS ioctl.
Compat is fully provided where it makes sense, but trying to turn on
RA handling will obviously throw an error as it no longer exists.
Note that if you use IPv6 temporary addresses, this now needs to be
turned on in dhcpcd.conf(5) rather than in sysctl.conf(5).