Got the diagnostic information I needed from this, and it's holding
up releng tests of everything else, so let's back this out until I
need more diagnostics or track down the original source of the
problem.
We seem to have a poltergeist that occasionally messes with the
biglock depth, but it's very hard to reproduce and only manifests as
some other CPU spinning out on the kernel lock which is no good for
diagnostics.
of active members in the pool cache (plus some slop) instead of looking
in all the free buffer list. Should reduce CPU usage for "systat vm"
to << 1% especially for machines with a larger number of buffers.
also, rather than pull in the FreeBSD V_NORMAL/V_ALT flags to
vinvalbuf() and the buf b_xflags field and BX_ALTDATA flag,
add a binvalbuf() function to invalid a specific buffer
and use that to invalidate the two possible exattr bufs
during IO_EXT truncations.
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.
this is required for Xen xbd(4) in order to not have to use bounce buffers
the alignment is implicitly provided when POOL_REDZONE is not active,
this change makes it also aligned when POOL_REDZONE _is_ active - that is
when (!KMSAN && (DIAGNOSTIC || KASAN))
functions: preempt_point() and preempt_needed().
- preempt(): if the LWP has exceeded its timeslice in kernel, strip it of
any priority boost gained earlier from blocking.
behavior in wapbl_start() which extended int to size_t.
Error message was:
> UBSan: Undefined Behavior in ../../../../kern/vfs_wapbl.c:609:41, signed integer overflow: 3345138 * 1024 cannot be represented in type 'int'
> /* XXX maybe use filesystem fragment size instead of 1024 */
> /* XXX fix actual number of buffers reserved per filesystem. */
> wl->wl_bufcount_max = (buf_nbuf() / 2) * 1024;
Need more work?
in PR kern/52639, as well as some general cleaning-up...
(As proposed on tech-kern@ with additional changes and enhancements.)
Details of changes:
* All history arguments are now stored as uintmax_t values[1], both in
the kernel and in the structures used for exporting the history data
to userland via sysctl(9). This avoids problems on some architectures
where passing a 64-bit (or larger) value to printf(3) can cause it to
process the value as multiple arguments. (This can be particularly
problematic when printf()'s format string is not a literal, since in
that case the compiler cannot know how large each argument should be.)
* Update the data structures used for exporting kernel history data to
include a version number as well as the length of history arguments.
* All [2] existing users of kernhist(9) have had their format strings
updated. Each format specifier now includes an explicit length
modifier 'j' to refer to numeric values of the size of uintmax_t.
* All [2] existing users of kernhist(9) have had their format strings
updated to replace uses of "%p" with "%#jx", and the pointer
arguments are now cast to (uintptr_t) before being subsequently cast
to (uintmax_t). This is needed to avoid compiler warnings about
casting "pointer to integer of a different size."
* All [2] existing users of kernhist(9) have had instances of "%s" or
"%c" format strings replaced with numeric formats; several instances
of mis-match between format string and argument list have been fixed.
* vmstat(1) has been modified to handle the new size of arguments in the
history data as exported by sysctl(9).
* vmstat(1) now provides a warning message if the history requested with
the -u option does not exist (previously, this condition was silently
ignored, with only a single blank line being printed).
* vmstat(1) now checks the version and argument length included in the
data exported via sysctl(9) and exits if they do not match the values
with which vmstat was built.
* The kernhist(9) man-page has been updated to note the additional
requirements imposed on the format strings, along with several other
minor changes and enhancements.
[1] It would have been possible to use an explicit length (for example,
uint64_t) for the history arguments. But that would require another
"rototill" of all the users in the future when we add support for an
architecture that supports a larger size. Also, the printf(3) format
specifiers for explicitly-sized values, such as "%"PRIu64, are much
more verbose (and less aesthetically appealing, IMHO) than simply
using "%ju".
[2] I've tried very hard to find "all [the] existing users of kernhist(9)"
but it is possible that I've missed some of them. I would be glad to
update any stragglers that anyone identifies.
this is needed to avoid name conflicts with ZFS and also
makes it clearer that other code shouldn't be messing with these.
remove the LFS debug code that poked around in bufqueues and
remove the BQ_EMPTY bufqueue since nothing uses it anymore.
provide a function to let LFS and wapbl read the value of nbuf for now.
buffer being written.)
There's some logic here that carefully checks for vp being null, and
other logic that will crash if it is. It appears that it's all
needless paranoia. See tech-kern for more info.
Unless someone sees the assertion go off (in which case a lot more
investigation is needed) I or someone will clean out the logic at some
future point.
Spotted by coypu.