PR kern/58111 .
It would be extremely unlikely to trip this bug on NetBSD, as we don't
expose SEEK_DATA and SEEK_HOLE and you need to call ioctl(2) with
FIOSEEKDATA and FIOSEEKHOLE directly which no currently known code does,
and even then be unlucky enough to trip a race condition.
With a reproducer based on that in https://www.illumos.org/issues/16087,
I saw 11 groups of failures over 8 hours. With this patch, no
failures in 10 hours. The repro for NetBSD will be attached to
https://gnats.netbsd.org/58111 .
Original FreeBSD commit message:
--------------------------------
dnode_is_dirty: check dnode and its data for dirtiness
Over its history this the dirty dnode test has been changed between
checking for a dnodes being on `os_dirty_dnodes` (`dn_dirty_link`) and
`dn_dirty_record`.
It turns out both are actually required.
In the case of appending data to a newly created file, the dnode proper
is dirtied (at least to change the blocksize) and dirty records are
added. Thus, a single logical operation is represented by separate
dirty indicators, and must not be separated.
The incorrect dirty check becomes a problem when the first block of a
file is being appended to while another process is calling lseek to skip
holes. There is a small window where the dnode part is undirtied while
there are still dirty records. In this case, `lseek(fd, 0, SEEK_DATA)`
would not know that the file is dirty, and would go to
`dnode_next_offset()`. Since the object has no data blocks yet, it
returns `ESRCH`, indicating no data found, which results in `ENXIO`
being returned to `lseek()`'s caller.
This change simply updates the dirty check to check both types of dirty.
If there's anything dirty at all, we immediately go to the "wait for
sync" stage, It doesn't really matter after that; both changes are on
disk, so the dirty fields should be correct.
Sponsored by: Klara, Inc.
Sponsored by: Wasabi Technology, Inc.
1. For tools that use elftoolchain: always use elftoolchain's
elfdefinitions.h. Don't even think about looking at the host's
sys/exec_elf.h, which makes no sense and should never happen.
(ELF tools that don't use elftoolchain, like m68k-elf2coff,
continue to use nbincludes/sys/exec_elf.h. But no more nbincludes
hacks in elftoolchain.)
2. For kernel components (solaris, zfs, dtrace): always use
sys/exec_elf.h, even in Solaris components via sys/elf.h.
elfdefinitions.h is not wired up in the kernel build at all.
3. For most userland components that involve libelf: use
elfdefinitions.h via libelf header files (libelf.h, gelf.h).
libdtrace in particular requires _all_ R_* reloc type definitions,
but sys/exec_elf.h brings in only the _current machine's_ R_*
reloc type definitions. (While here: Use uintptr_t instead of
Elf_Addr for pointer-to-integer cast, since Elf_Addr is MD and
provided only by sys/exec_elf.h, not by elfdefinitions.h.)
And most userland components using libelf don't rely on any
properties of the current machine from sys/exec_elf.h, so they can
use libelf's elfdefinition.h.
Exceptions:
- dtrace drti.c relies on link.h -> link_elf.h -> sys/exec_elf.h,
but it also relies on sys/dtrace.h -> sys/elf.h ->
elfdefinitions.h like other userland components using sys/elf.h.
- kdump-ioctl.c uses sys/exec_elf.h directly and sys/dtrace.h ->
sys/elf.h -> elfdefinitions like other userland components using
sys/elf.h.
- t_ptrace_wait.c (via t_ptrace_core_wait.h) uses libelf to parse
core files, but relies on sys/exec_elf.h for struct
netbsd_elfcore_procinfo.
None of these exceptions needs all R_* reloc type definitions, so
as a workaround, we can just suppress libelf's elfdefinitions.h by
defining _SYS_ELFDEFINITIONS_H_ and use sys/exec_elf.h in these
exceptions.
And undo the whole BUILTIN_ELF_HEADERS mistake. This was:
- half bogus workarounds for missing build_install dependencies in
tools/Makefile, which are no longer missing now, and
- half futile attempt to use src/sys/sys/exec_elf.h via nbincludes in
tools involving libelf instead of libelf's elfdefinitions.h, which
collides.
Longer-term, we may wish to unify sys/exec_elf.h and libelf's
elfdefinitions.h, so we don't have to play these games.
But at least now the games are limited to three .c files (one of
which is generated by Makefile.ioctl-c), rather than haphazardly
applied tree-wide by monstrous kludges in widely used .h files with
broken hackarounds to get the tools build lurching to completion.
Add support in dtrace for SMAP, so that actions like copyinstr() work.
It would be better if dtrace could use the SMAP_* hotpatch macros directly,
but the hotpatching code does not currently operate on kernel modules,
so we'll use some tiny functions in the base kernel for now.
This is used only by dump and swap, which won't work safely on zvols
anyway. We should make swap work eventually, but right now it's
leading unwary ussers into deadlock scenarios, so let's make it fail
early instead.
pool_cache_invalidate invalidates cached objects, but doesn't return
any backing pages to the underlying page allocator.
pool_cache_reclaim does pool_cache_invalidate _and_ reutrns backing
pages to the underlying page alloator, so it is actually useful for
the page daemon to do when trying to free memory.
PR kern/57558
XXX pullup-10
XXX pullup-9
XXX pullup-8 (by patch to kmem.h instead of kmem.c)
Use ${CC_WNO_MAYBE_UNINITIALIZED} instead of
the older style more complex expressions.
Remove workarounds if they were for a specific
version of gcc < 10.
Rename compiler-warning-disable variables from
GCC_NO_warning
to
CC_WNO_warning
where warning is the full warning name as used by the compiler.
GCC_NO_IMPLICIT_FALLTHRU is CC_WNO_IMPLICIT_FALLTHROUGH
Using the convention CC_compilerflag, where compilerflag
is based on the full compiler flag name.
macOS/x86_64 defines boolean_t as 'unsigned int' not 'int',
which causes a build issue with tools/ctfmerge on that host
after my recent fixes for macOS semaphores.
So use the <mach/boolean.h> instead of a local typedef ifdef __APPLE__.
May fix a macOS/x86_64 build issue reported by cjep@.
Builds fine on NetBSD/amd64 or macOS/arm.
Note: this compat stuff is clunky, and based on the commit log,
annoyingly error prone. A newer sync of osnet from upstream /may/
improve a lot of these compat typedef workarounds for solaris types...
dispatch_semaphore_signal() doesn't return an error, just an
indicator of whether a thread was woken or not, so there's
no need to fail on non-zero return.
Use dispatch_semaphore_create() if present instead of sem_init().
macOS doesn't actually implement sem_init() (et al)
(even though it provides the prototypes as deprecated).
This was detected by the previous commit to ctfmerge
that added error handling.
Implement ctfmerge's barrier operations in terms of
dispatch(3) APIs such as dispatch_semaphore_create() (et al).
Update tools/compat/configure.ac to find dispatch_semaphore_create().
Fixes ctfmerge on macOS hosts.
Inspired by https://stackoverflow.com/a/27847103.
terminate() if sem_*() returns -1 or pthread_*() returns != 0.
(Set errno from pthread_*() so terminate() prints the strerror message).
Note: Failing on errors instead of ignoring them helps identify
reasons for intermittent failures, such as those on macOS host builds:
ERROR: nbctfmerge: barrier_init: sem_init(bar_sem): Function not implemented
arm is a little more complicated because it has three cases:
- big-endian data, big-endian instructions
- big-endian data, little-endian instructions
- little-endian data, little-endian instructions
Rename VM_KERNEL_IO_ADDRESS to VM_KERNEL_IO_BASE to match RISC-V
It's less letters, matches other similar variables and will help with
sharing code between the two architectures.
NFCI.
There is no valid reason to use this except in assertions of the form
KASSERT(mutex_owner(lock) == curlwp),
which is more obviously spelled as
KASSERT(mutex_owned(lock)).
Exception: There's one horrible kludge in zfs that abuses this, which
should be eliminated.
XXX kernel revbump -- deleting symbol
PR kern/47114
Apply this commit from FreeBSD:
commit f339a3ef6369b368f3a2455792a7a3a4c28f92c4
Author: Chuck Silvers <chs@FreeBSD.org>
Date: Wed Feb 9 17:09:26 2022 -0800
dtrace: remove unnecessary fflush()
This call was added back in the early days of dtrace porting and
no one knows why anymore. The extra flushing causes lots of
unnecessary CPU overhead when a script produces lots of output,
as well as easily losing output because the command can't keep up.
Sponsored by: Netflix
Reviewed by: imp, markj
Differential Revision: https://reviews.freebsd.org/D34216
Reapply the fix to dt_status() from rev 1.10
("Don't return success when the target CPU is offline")
which was lost in rev 1.12 ("sync with FreeBSD").
The FreeBSD version that we have been using since then does run on NetBSD
but always reports that CPU 0 is online and all other CPUs are offline,
because the sysctl that it uses does not exist on NetBSD.
This was never relevant on FreeBSD and I don't think it is relevant on
NetBSD either. The FreeBSD change to lift this restriction had the
following comment:
r306570 | markj | 2016-10-02 00:35:00 +0000 (Sun, 02 Oct 2016) | 7 lines
Allow tracing of functions prefixed by "__".
This restriction was inherited from upstream but is not relevant on FreeBSD.
Furthermore, it hindered the tracing of locking primitive subroutines.
there's an extra check that we inherited from FreeBSD that tries to
detect KVA exhaustion on platforms with limited KVA, but the condition
that decided whether to use the extra check was using a FreeBSDism
that doesn't exist on NetBSD, resulting in this check being used on
all platforms. on amd64 systems with lots of memory, this extra check
would result in the ARC thinking that it constantly needed to reclaim memory,
resulting in all the xcall threads running all the time but not doing
anything useful. change this condition so that this extra check for
KVA exhaustion is only used on 32-bit platforms. fixes PR 55707.