macros, __BIT, __BITS, SHIFTIN, SHIFTOUT, and __arraycount() from
lib/libkern/libkern.h to sys/cdefs.h. Add a __-prefix to SHIFTIN
and SHIFTOUT, and add a manual page for the bit-twiddling macros,
bits(3).
Make the __BIT and __BITS macros "widthless," as best I can, by
changing their type to uintmax_t from uint32_t. XXX The manual
page lags this change by a bit.
Define __PRIxBIT and __PRIxBITS printf(3) format strings.
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
These sources are ported from FreeBSD/ia64 code.
See individual source files for credits.
In addition, code from NetBSD/alpha NetBSD/sparc64,
NetBSD/i386 and NetBSD/amd64 were used as templates,
along with my own additions.
counting the number of elements in a static array, using the idiom,
sizeof(array)/sizeof(array[0]).
XXX This may move in the future, but this is a safe place to put
XXX it for use in the kernel.
FROMBCD()/TOBCD() macros into wrappers around it, resulting in both
smaller code footprint and elimination of possible issues due to
multiple evaluation of macro arguments.
Suggested by Simon Burge and Anders Gavare on tech-kern.
I had duplicated them. Improve the macros' names. Simplify their
implementation.
A brief description of each macro is below.
BIT(n): Return a bitmask with bit m set, where the least
significant bit is bit 0.
BITS(m, n): Return a bitmask with bits m through n, inclusive,
set. It does not matter whether m>n or m<=n.
The least significant bit is bit 0.
A "bitfield" is a span of consecutive bits defined by a
bitmask, where 1s select the bits in the bitfield. SHIFTIN,
SHIFTOUT, and SHIFTOUT_MASK help read and write bitfields
from device registers.
SHIFTIN(v, mask): Left-shift bits `v' into the bitfield
defined by `mask', and return them. No
side-effects.
SHIFTOUT(v, mask): Extract and return the bitfield selected
by `mask' from `v', right-shifting the
bits so that the rightmost selected bit
is at bit 0. No side-effects.
SHIFTOUT_MASK(mask): Right-shift the bits in `mask' so that
the rightmost non-zero bit is at bit
0. This is useful for finding the
greatest unsigned value that a bitfield
can hold. No side-effects. Note that
SHIFTOUT_MASK(m) = SHIFTOUT(m, m).
Examples:
/*
* Register definitions taken from the RFMD RF3000 manual.
*/
#define RF3000_GAINCTL 0x11 /* TX variable gain control */
#define RF3000_GAINCTL_TXVGC_MASK BITS(7, 2)
#define RF3000_GAINCTL_SCRAMBLER BIT(1)
/*
* Shift the transmit power into the transmit-power field of the
* gain-control register and write it to the baseband processor.
*/
atw_rf3000_write(sc, RF3000_GAINCTL,
SHIFTIN(txpower, RF3000_GAINCTL_TXVGC_MASK));
/*
* Register definitions taken from the ADMtek ADM8211 manual.
*
*/
#define ATW_RXSTAT_OWN BIT(31) /* 1: NIC may fill descriptor */
/* ... */
#define ATW_RXSTAT_DA1 BIT(17) /* DA bit 1, admin'd address */
#define ATW_RXSTAT_DA0 BIT(16) /* DA bit 0, group address */
#define ATW_RXSTAT_RXDR_MASK BITS(15,12) /* RX data rate */
#define ATW_RXSTAT_FL_MASK BITS(11,0) /* RX frame length, last
* descriptor only
*/
/* Extract the frame length from the Rx descriptor's
* status field.
*/
len = SHIFTOUT(rxstat, ATW_RXSTAT_FL_MASK);
(Note: memcmp/memset improvements also benefit non-Xscale).
memcmp() - Compare 32-bits at a time if possible. Special-case 6-byte
comparisons, for the benefit of the network stack.
memset() - More loop unrolling, plus use of 'strd' instruction,
results in > 100% speedup on Xscale.
memcpy() - Big-endian support, unrolled loops, 'strd/ldrd/pld', plus
special-cases for very common length/alignment combinations
(at least in the kernel). Benchmarks show ~50% improvment on
Xscale.
memmove() - Big-endian support. Use fast memcpy(), above, if the regions
don't overlap. Otherwise unchanged.
totally different version that was here. This version, of course, has an
BSD license on it while the old one did not. This one also compiles down
to tighter code--the smaller the better for libkern & libsa.
required. That's what memmove() is for.
This should fix port-powerpc/16889. The backwards copyin can confuse
uiomove/genfs_getpages, resulting in corruption of files written over NFS.
As discussed (briefly) on tech-userlevel, fix our quad support to work
correctly on LP64 platforms. This is mostly just s/long/int/ in the
appropriate places.
particular problem on hosts with only wireless interfaces that are
definitely not safe to use as entropy sources.
Add arc4randbytes() which hands out bytes from the same source used
by arc4random(). This is intended to be a _temporary_ interface
until we can design and implement a better general PRNG interface
that is decoupled from the entropy-pool implementation.
Modify key_randomfill() (used only for initialization vectors on
SA creation and via key_sa_stir_iv(), which does not "stir",
despite its name) to use arc4randbytes() instead of pulling bits
directly from the entropy pool. It is my hope that this change
will pose minimal integration problems for the KAME folks as the
random-pool interface is *already* different between each BSD
variant; this just simplifies the NetBSD case and solves a
fairly serious problem.
Note that it is generally considered acceptable cryptographic
practice to use a fast stream cipher to generate IVs for encryption
with stronger block ciphers. For example, the use of "non-Approved"
PRNGs to generate IVs for "Approved" block ciphers is explicitly
sanctioned by FIPS 140-2.
1) Speed up arc4random(). We make arc4randbyte() inline, which makes this
not much slower than, say, the other arc4 implementation in our kernel.
We also replace four calls to arc4randbyte() with a loop, saving about
20% on some processors where the "unrolled" arc4randbyte() calls would
needlessly stomp the cache.
2) Address various problems with the initialization/"stirring" code,
primarily in the area of handling of the source data from the kernel
entropy pool. We used to:
a) Ask the entropy pool for 32 bytes
b) If we got zero bytes, key with junk from the stack (ouch!)
which has some nasty implications, to say the least. For
example, we're most likely to get zero bytes at boot time,
when the stack contents are even more predictable than usual.
c) If we got less than 32 bytes but more than zero bytes, use
however many bytes we got as the arc4 key, copying it
repeatedly as per usual arc4 key setup.
Because of the way NetBSD's entropy pool works, this was
mostly harmless, because if you ask for RND_EXTRACT_ANY,
you always get as many bytes as you ask for. However,
this is probably a security hole in the original FreeBSD
code, where AFAICT you might end up using an 8-bit arc4
key -- not good, much worse than using the output of the
entropy pool hash function even when it thinks it only
has 8 bits of entropy to give you.
One thing this code could do on NetBSD that was not so
good was to replace a key with a lot of entropy with
one with less entropy. That's clearly counterproductive.
The new code, instead:
a) Asks for 32 good bytes. If it gets them, use them as the
arc4 key in the usual way.
b) Tracks how many entropy bytes the key it's replacing had.
If the new entropy request got less bytes, leave the old
key in place. Note that the first time through, the "old
key" had zero bytes, so we'll always replace it.
c) If we get less then 32 bytes but more than we had, request
EXTRACT_ANY bytes from the entropy pool, padding the key
out to 32 bytes which we then use as the arc4 key in the
usual way.
This is still really all rather backwards. Instead of this generator
deciding to rekey itself using a basically arbitrary metric, it should
register a callback so that the entropy pool code could rekey it when
a lot of bits were available. Details at 11.
Finally, rename the "stir" function (which did not stir) to "rekey",
which is what it actually does.