Just in case of uninitialized padding which would lead to kernel
stack disclosure. If the compiler can prove the memset redundant
then it can optimize it away; otherwise better safe than sorry.
Time granularity in microseconds is sometimes not enough to provide
accurate bandwidth control; actual bandwidth on a capped class
can exceed its limit considerably in some cases.
Treating time in nanoseconds requires the following changes:
- Use timespec instead of timeval
- Use nanotime(9) instead of microtime(9)
- Change the type of some variables, especially *idle, to long
Also the type of variables of it is changed to u_long from int.
This change provides fine-grain resolution of bandwidth. For example
750 Mbps was treated as 800 Mbps internally because bandwidth was
represented as nanoseconds per byte. Converting the representation
to picoseconds per byte enables to treat 750 Mbps as-is.
PR kern/56319
A delay period was calculated based on a suspended class even if the
class can borrow bandwidth from another class. In that case a delay
period should be calculated based on a borrowed class. Otherwise
a borrowing class can't borrow bandwidth correctly.
then only clean up required to to free cl. delete all the
code at err_ret that was probably used in the past, and move
the free + return upon failure earlier.
(Would be nice to change the types too, and it's *probably* safe to
replace int by u_int, but I'm reluctant to touch the ioctl
definitions without at least a modicum more thought. Also one of
them is a u_long, because why not?)
From Ilja Van Sprundel.
Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.
The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.
The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.
Proposed on tech-kern and tech-net.
The change ensures that ifnet objects in the ifnet list aren't freed during
list iterations by using pserialize(9) and psref(9).
Note that the change adds a pslist(9) for ifnet but doesn't remove the
original ifnet list (ifnet_list) to avoid breaking kvm(3) users. We
shouldn't use the original list in the kernel anymore.
- Introduce NET_MPSAFE
- not defined by default
- Add ifq_lock to protect ifnet#if_snd
- Initialize ifq_lock and lock IFQ operations
when NET_MPSAFE
When NET_MPSAFE isn't defined, this modification
doesn't change its behavior and adds trivial
performance overheads.
Discussed with matt@ on tech-net
designated initializers.
I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:
An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.
A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.
The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.
An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.
A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.
An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.
In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.
The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.
The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.
The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.
Manual pages for the new kernel interfaces are forthcoming.
- it is not infinity
- it is exposed to userland
- it causes confusion when #include <math.h> follows
(I don't know why GCC does not object to redefining the symbol)
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)