This way it is no longer necessary to mark variables __diagused if
they are used in KASSERT conditions.
Fix fallout from this by removing now-unnecessary and `#ifdef
DIAGNOSTIC'.
Don't do the same for KDASSERT if !DEBUG -- unlike KASSERT and
DIAGNOSTIC, variables needed by KDASSERT and DEBUG are likely to be
expensive to compute (and potentially difficult for a compiler to
prove flushable), so we don't want to require them under !DEBUG.
that change had no effect because vpanic() is marked __dead / noreturn
and thus the compiler would optimize away everything after a call to vpanic().
the original problem has now been fixed differently (but only for x86 so far).
Normally, functions called from mcount.c are expected to be expanded inline,
so _mcount() will never be called recursively. But when build with COPTS=-O0,
`static inline' functions aren't inlined, and _mcount() will be called
recursively.
Even if _mcount() has `__attribute__((__no_ instrument_function__))',
it has no effect on the calling external (no-inlined) function.
To avoid this, PROF.<fn> is added can be set the profiling flag of any file.
"PROF.mcount.c" is set to blank by default, mcount.c itself is compiled
without -pg.
Fix ABI mismatch for armhf runtime routines for floating-point arithmetics;
For hard-float arm variants, provide
(1) generic runtime routines with correct calling convention, and
(2) EABI runtime routines at the same time.
I've confirmed that no binary changes for kernels.
LGTM by skrll
- dkcksum() and dkcksum_sized() move from subr_disk.c and from
libsa into libkern/dkcksum.c (which is missing _sized() version),
using the version from usr.sbin/disklabel.
- swap_disklabel() moves from subr_disk_mbr.c into libkern, now called
disklabel_swap(). (the sh3 version should be updated to use this.)
- DISKLABEL_EI becomes a first-class option with opt_disklabel.h.
- add libkern.h to libsa/disklabel.c.
this enables future work for bi-endian libsa/ufs.c (relevant for ffsv1,
ffsv2, lfsv1, and lfsv2), as well as making it possible for ports not
using subr_disk_mbr.c to include bi-endian disklabel support (which,
afaict, includes any disk on mbr-supporting platforms that do not have
an mbr as well as disklabel.)
builds successsfully on: alpha, i386, amd64, sun2, sun3, evbarm64,
evbarm64-eb, sparc, and sparc64. tested in anita on i386 and sparc,
testing in hardware on evbarm64*.
architectures.
Notes:
- On alpha and ia64 the function is kept but gets renamed locally to avoid
symbol collision. This is because on these two arches, I am not sure
whether the ASM callers do not rely on fixed registers, so I prefer to
keep the ASM body for now.
- On Vax, only the symbol is removed, because the body is used from other
functions.
- On RISC-V, this change fixes a bug: copystr() was just a wrapper around
strlcpy(), but strlcpy() makes the operation less safe (strlen on the
source beyond its size).
- The kASan, kCSan and kMSan wrappers are removed, because now that
copystr() is in C, the compiler transformations are applied to it,
without the need for manual wrappers.
Could test on amd64 only, but should be fine.
This is intended for 68060:
- GCC does not emit __muldi3() for 68020-40, that have 32 * 32 --> 64 mulul
- mulsl (and moveml), used in this code, are not implemented for 68010
In comparison with that from compiler_rt, this version saves:
- 12% of processing time
- 12 bytes of stack
- 50 bytes of code size
Also, slightly faster, memory saving, and smaller than libgcc version.
By examining with evcnt(9), __muldi3() is invoked more than 1000 times per
sec by kernel, which should justify to introduce assembler version of this
function.
Primary goals:
1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.
Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.
ENTROPY POOL
- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.
- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.
- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.
- Per-CPU gathering pools avoid contention on a global queue.
- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.
- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.
- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
CPRNG_STRONG
- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.
(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)
- Event counters provide operator visibility into when reseeding
happens.
INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)
- Unwired for now; will be rewired in a subsequent commit.
codes generated by gcc (confirmed not only for kernel, but also all
executables and libraries for userland).
Fix build failure for MODULAR kernel, noticed by mlelstv.
kernel and bootloader for 68010.
They requires a special calling convention to udivsi3, and cannot to be
mixed up in normal routines provided by libgcc or compiler_rt. Although,
there's no problem for using in a controlled situation, i.e., kernel and
standalone programs.
Note that this does not affect m68k ports other than sun2, since codes
generated by gcc do not call these routines.
Assembler files are moved from common/lib/libc/arch/m68k/gen to
sys/lib/libkern/arch/m68k in order not to be compiled in libc.
Revert hack introduced to lib/libc/compiler_rt/Makefile.inc rev 1.37:
http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/compiler_rt/Makefile.inc#rev1.37
Proposed on port-sun2@ with no response...
(Again, this does not affect m68k ports other than sun2.)
http://mail-index.netbsd.org/port-sun2/2020/03/10/msg000102.html
http://mail-index.netbsd.org/source-changes/2020/04/16/msg116278.html
The reasoning turned out to be wrong; __KERNEL_RCSID() in header files
does *not* overwrite RCSID in main source files. The real problem is that
it inserts its RCSID into *every* object files. However, it can be still
useful even if heavily duplicated.
overwriting RCSID in main source files.
XXX
The first argument of __KERNEL_RCSID() is neglected for ELF. If we wish
to have RCSID of header files in kernel binary, we need something like
__FBSDID() macro in FreeBSD.
memory used by the kernel at run time, and just like kASan and kCSan, it
is an excellent feature. It has already detected 38 uninitialized variables
in the kernel during my testing, which I have since discreetly fixed.
We use two shadows:
- "shad", to track uninitialized memory with a bit granularity (1:1).
Each bit set to 1 in the shad corresponds to one uninitialized bit of
real kernel memory.
- "orig", to track the origin of the memory with a 4-byte granularity
(1:1). Each uint32_t cell in the orig indicates the origin of the
associated uint32_t of real kernel memory.
The memory consumption of these shadows is consequent, so at least 4GB of
RAM is recommended to run kMSan.
The compiler inserts calls to specific __msan_* functions on each memory
access, to manage both the shad and the orig and detect uninitialized
memory accesses that change the execution flow (like an "if" on an
uninitialized variable).
We mark as uninit several types of memory buffers (stack, pools, kmem,
malloc, uvm_km), and check each buffer passed to copyout, copyoutstr,
bwrite, if_transmit_lock and DMA operations, to detect uninitialized memory
that leaves the system. This allows us to detect kernel info leaks in a way
that is more efficient and also more user-friendly than KLEAK.
Contrary to kASan, kMSan requires comprehensive coverage, ie we cannot
tolerate having one non-instrumented function, because this could cause
false positives. kMSan cannot instrument ASM functions, so I converted
most of them to __asm__ inlines, which kMSan is able to instrument. Those
that remain receive special treatment.
Contrary to kASan again, kMSan uses a TLS, so we must context-switch this
TLS during interrupts. We use different contexts depending on the interrupt
level.
The orig tracks precisely the origin of a buffer. We use a special encoding
for the orig values, and pack together in each uint32_t cell of the orig:
- a code designating the type of memory (Stack, Pool, etc), and
- a compressed pointer, which points either (1) to a string containing
the name of the variable associated with the cell, or (2) to an area
in the kernel .text section which we resolve to a symbol name + offset.
This encoding allows us not to consume extra memory for associating
information with each cell, and produces a precise output, that can tell
for example the name of an uninitialized variable on the stack, the
function in which it was pushed on the stack, and the function where we
accessed this uninitialized variable.
kMSan is available with LLVM, but not with GCC.
The code is organized in a way that is similar to kASan and kCSan, so it
means that other architectures than amd64 can be supported.
to detect race conditions at runtime. It is a variation of TSan that is
easy to implement and more suited to kernel internals, albeit theoretically
less precise than TSan's happens-before.
We do basically two things:
- On every KCSAN_NACCESSES (=2000) memory accesses, we create a cell
describing the access, and delay the calling CPU (10ms).
- On all memory accesses, we verify if the memory we're reading/writing
is referenced in a cell already.
The combination of the two means that, if for example cpu0 does a read that
is selected and cpu1 does a write at the same address, kCSan will fire,
because cpu1's write collides with cpu0's read cell.
The coverage of the instrumentation is the same as that of kASan. Also, the
code is organized in a way similar to kASan, so it is easy to add support
for more architectures than amd64. kCSan is compatible with KCOV.
Reviewed by Kamil.