As discussed (briefly) on tech-userlevel, fix our quad support to work
correctly on LP64 platforms. This is mostly just s/long/int/ in the
appropriate places.
particular problem on hosts with only wireless interfaces that are
definitely not safe to use as entropy sources.
Add arc4randbytes() which hands out bytes from the same source used
by arc4random(). This is intended to be a _temporary_ interface
until we can design and implement a better general PRNG interface
that is decoupled from the entropy-pool implementation.
Modify key_randomfill() (used only for initialization vectors on
SA creation and via key_sa_stir_iv(), which does not "stir",
despite its name) to use arc4randbytes() instead of pulling bits
directly from the entropy pool. It is my hope that this change
will pose minimal integration problems for the KAME folks as the
random-pool interface is *already* different between each BSD
variant; this just simplifies the NetBSD case and solves a
fairly serious problem.
Note that it is generally considered acceptable cryptographic
practice to use a fast stream cipher to generate IVs for encryption
with stronger block ciphers. For example, the use of "non-Approved"
PRNGs to generate IVs for "Approved" block ciphers is explicitly
sanctioned by FIPS 140-2.
1) Speed up arc4random(). We make arc4randbyte() inline, which makes this
not much slower than, say, the other arc4 implementation in our kernel.
We also replace four calls to arc4randbyte() with a loop, saving about
20% on some processors where the "unrolled" arc4randbyte() calls would
needlessly stomp the cache.
2) Address various problems with the initialization/"stirring" code,
primarily in the area of handling of the source data from the kernel
entropy pool. We used to:
a) Ask the entropy pool for 32 bytes
b) If we got zero bytes, key with junk from the stack (ouch!)
which has some nasty implications, to say the least. For
example, we're most likely to get zero bytes at boot time,
when the stack contents are even more predictable than usual.
c) If we got less than 32 bytes but more than zero bytes, use
however many bytes we got as the arc4 key, copying it
repeatedly as per usual arc4 key setup.
Because of the way NetBSD's entropy pool works, this was
mostly harmless, because if you ask for RND_EXTRACT_ANY,
you always get as many bytes as you ask for. However,
this is probably a security hole in the original FreeBSD
code, where AFAICT you might end up using an 8-bit arc4
key -- not good, much worse than using the output of the
entropy pool hash function even when it thinks it only
has 8 bits of entropy to give you.
One thing this code could do on NetBSD that was not so
good was to replace a key with a lot of entropy with
one with less entropy. That's clearly counterproductive.
The new code, instead:
a) Asks for 32 good bytes. If it gets them, use them as the
arc4 key in the usual way.
b) Tracks how many entropy bytes the key it's replacing had.
If the new entropy request got less bytes, leave the old
key in place. Note that the first time through, the "old
key" had zero bytes, so we'll always replace it.
c) If we get less then 32 bytes but more than we had, request
EXTRACT_ANY bytes from the entropy pool, padding the key
out to 32 bytes which we then use as the arc4 key in the
usual way.
This is still really all rather backwards. Instead of this generator
deciding to rekey itself using a basically arbitrary metric, it should
register a callback so that the entropy pool code could rekey it when
a lot of bits were available. Details at 11.
Finally, rename the "stir" function (which did not stir) to "rekey",
which is what it actually does.
SH-5, meet NetBSD.
Let's hope this is the start of a long and fruitful relationship. :-)
This code, funded by Wasabi Systems, adds initial support for the
Hitachi SuperH(tm) SH-5 cpu architecture to NetBSD.
At the present time, NetBSD/evbsh5 only runs on a SH-5 core simulator
which has no simulated devices other than a simple console. However, it
is good enough to get to the "root device: " prompt.
Device driver support for Real SH-5 Hardware is in place, particularly for
supporting the up-coming Cayman evaluation board, and should be quite
easy to get running when the hardware is available.
There is no in-tree toolchain for this port at this time. Gcc-current has
rudimentary SH-5 support but it is known to be buggy. A working toolchain
was obtained from SuperH to facilitate this port. Gcc-current will be
fixed in due course.
The SH-5 architecture is fully 64-bit capable, although NetBSD/evbsh5 has
currently only been tested in 32-bit mode. It is bi-endian, via a boot-
time option and it also has an "SHcompact" mode in which it will execute
SH-[34] user-land instructions.
For more information on the SH-5, see www.superh.com. Suffice to say it
is *not* just another respin of the SH-[34].
Calculate len separately for icache & dcache in case each has different
cacheline widths. Make the code for both loops the same except for the
dcbst/icbi. Deal with sizes >=2GB properly (like that'll happen but ...)
bcopy.S is no longer needed
memmove and memcpy were both stacking r0 and unstacking it to keep the return value, so push this down into _memcpy.
rename _memcpy.S to memcpy.S.
memmove.S is now just a placeholder otherwise the make system automagically adds a memmove.c file to libkern.
memmove is just another entry point for memcpy.
only 16-bit align the region and use loop mode with longwords
to do the bulk of the work.
On the 68020+, fixed an optimization bug where all regions that start
on odd addresses would not be properly longword aligned.