Commit Graph

303 Commits

Author SHA1 Message Date
riastradh 0fa3d1b3d8 Revert "kern: Sprinkle biglock-slippage assertions."
Got the diagnostic information I needed from this, and it's holding
up releng tests of everything else, so let's back this out until I
need more diagnostics or track down the original source of the
problem.
2022-03-30 14:54:29 +00:00
riastradh e93349be5d kern: Sprinkle biglock-slippage assertions.
We seem to have a poltergeist that occasionally messes with the
biglock depth, but it's very hard to reproduce and only manifests as
some other CPU spinning out on the kernel lock which is no good for
diagnostics.
2022-03-30 10:34:14 +00:00
simonb 6947b62866 If we're only doing a count-only kern.buf sysctl, just return the number
of active members in the pool cache (plus some slop) instead of looking
in all the free buffer lists.  Should reduce CPU usage for "systat vm"
to << 1% especially for machines with a larger number of buffers.
2021-07-25 06:06:40 +00:00
simonb 44f4ec8e31 Expose KERN_BUFSLOP in <sys/sysctl.h>. 2021-07-24 13:28:14 +00:00
simonb 89c455884f Pad out the slop for kern.buf based on the passed in element size,
rather than a size of an unrelated struct.
2021-07-24 13:27:39 +00:00
simonb 8fc838fd40 Add a sysctl hashstat collector for bufhash. 2021-04-01 06:25:59 +00:00
chs 1736a0aa97 fix the UFS2 extattr truncate code to play nice with wapbl.
also, rather than pull in the FreeBSD V_NORMAL/V_ALT flags to
vinvalbuf() and the buf b_xflags field and BX_ALTDATA flag,
add a binvalbuf() function to invalidate a specific buffer
and use that to invalidate the two possible extattr bufs
during IO_EXT truncations.
2020-07-31 04:07:30 +00:00
ad 4b8a875ae2 uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched.  It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.
2020-06-11 19:20:42 +00:00
jdolecek 5e7d2b0bdd pass B_PHYS|B_RAW also in nestio_setup(), courtesy to e.g. xbd(4), which
wants to know whether the buf came from user space or bio subsystem
2020-04-27 07:51:02 +00:00
ad 4e73754f15 Rename buf_syncwait() to vfs_syncwait(), and have it wait on v_numoutput
rather than BC_BUSY.  Removes the dependency on bufhash.
2020-04-20 21:39:05 +00:00
jdolecek ed25d94c88 for bmempools set align, not ioff 2020-04-11 14:48:19 +00:00
jdolecek 550ba56bfe explicitly use DEV_BSIZE align for all bmempools
this is required for Xen xbd(4) in order to not have to use bounce buffers

the alignment is implicitly provided when POOL_REDZONE is not active,
this change makes it also aligned when POOL_REDZONE _is_ active - that is
when (!KMSAN && (DIAGNOSTIC || KASAN))
2020-04-11 14:38:26 +00:00
ad 83f44bb607 Remove buffer reference counting, now that it's safe to destroy b_busy after
waking any waiters.
2020-04-10 17:18:04 +00:00
ad 16d4fad635 - Hide the details of SPCF_SHOULDYIELD and related behind a couple of small
functions: preempt_point() and preempt_needed().

- preempt(): if the LWP has exceeded its timeslice in kernel, strip it of
  any priority boost gained earlier from blocking.
2020-03-14 18:08:38 +00:00
riastradh 00772fbc46 OOPS -- fix mistake in previous commit.
bbusy really needs to return the error; otherwise things are very
bad!
2020-02-21 02:04:40 +00:00
riastradh 1cc3b53db1 Buffer cache SDT probes. 2020-02-20 15:48:38 +00:00
ad cdffe17156 biodone2(): don't acquire kernel_lock for anybody anymore. 2020-01-17 19:33:14 +00:00
ad 5c06357c90 Rename uvm_free() -> uvm_availmem(). 2019-12-31 13:07:09 +00:00
msaitoh a0403cde04 s/transfered/transferred/ 2019-12-27 09:41:48 +00:00
ad ddd3a0be1e uvmexp.free -> uvm_free() 2019-12-21 13:00:20 +00:00
ad 0a55e6b767 Add a comment. 2019-12-11 20:50:32 +00:00
ad 0b2f97d6bd For safety, cv_broadcast(&bp->b_busy) in more places where the buffer is
changing identity or moving from one vnode list to another.
2019-12-08 20:35:23 +00:00
ad 51eccb6fe4 Adjustment to previous: if we're going to toss the buffer, then wake
everybody.
2019-12-08 19:49:25 +00:00
ad 8dc9961386 - Avoid thundering herd: cv_broadcast(&bp->b_busy) -> cv_signal(&bp->b_busy)
- Sprinkle __cacheline_aligned.
2019-12-08 19:26:05 +00:00
msaitoh a1f88951ea Change buf_nbuf()'s return value from int to u_int to avoid undefined
behavior in wapbl_start() which extended int to size_t.

Error message was:
> UBSan: Undefined Behavior in ../../../../kern/vfs_wapbl.c:609:41, signed integer overflow: 3345138 * 1024 cannot be represented in type 'int'

>        /* XXX maybe use filesystem fragment size instead of 1024 */
>         /* XXX fix actual number of buffers reserved per filesystem. */
>         wl->wl_bufcount_max = (buf_nbuf() / 2) * 1024;

Need more work?
2019-08-26 10:24:39 +00:00
maxv ab3615448e Fix kernel pointer leaks in sysctl_dobuf. While here constify argument.
Also memset the buffer, to prevent leaks (even if there doesn't seem to
be currently).
2018-11-24 17:52:39 +00:00
hannken 68701f9d85 Make sure getnewbuf() runs bawrite() inside fstrans.
Use fstrans_start_nowait() to skip buffers that would block.
2018-08-29 09:05:17 +00:00
pgoyette cb32a134a5 Update the kernhist(9) kernel history code to address issues identified
in PR kern/52639, as well as some general cleaning-up...

(As proposed on tech-kern@ with additional changes and enhancements.)

Details of changes:

* All history arguments are now stored as uintmax_t values[1], both in
  the kernel and in the structures used for exporting the history data
  to userland via sysctl(9).  This avoids problems on some architectures
  where passing a 64-bit (or larger) value to printf(3) can cause it to
  process the value as multiple arguments.  (This can be particularly
  problematic when printf()'s format string is not a literal, since in
  that case the compiler cannot know how large each argument should be.)

* Update the data structures used for exporting kernel history data to
  include a version number as well as the length of history arguments.

* All [2] existing users of kernhist(9) have had their format strings
  updated.  Each format specifier now includes an explicit length
  modifier 'j' to refer to numeric values of the size of uintmax_t.

* All [2] existing users of kernhist(9) have had their format strings
  updated to replace uses of "%p" with "%#jx", and the pointer
  arguments are now cast to (uintptr_t) before being subsequently cast
  to (uintmax_t).  This is needed to avoid compiler warnings about
  casting "pointer to integer of a different size."

* All [2] existing users of kernhist(9) have had instances of "%s" or
  "%c" format strings replaced with numeric formats; several instances
  of mismatch between format string and argument list have been fixed.

* vmstat(1) has been modified to handle the new size of arguments in the
  history data as exported by sysctl(9).

* vmstat(1) now provides a warning message if the history requested with
  the -u option does not exist (previously, this condition was silently
  ignored, with only a single blank line being printed).

* vmstat(1) now checks the version and argument length included in the
  data exported via sysctl(9) and exits if they do not match the values
  with which vmstat was built.

* The kernhist(9) man-page has been updated to note the additional
  requirements imposed on the format strings, along with several other
  minor changes and enhancements.

[1] It would have been possible to use an explicit length (for example,
    uint64_t) for the history arguments.  But that would require another
    "rototill" of all the users in the future when we add support for an
    architecture that supports a larger size.  Also, the printf(3) format
    specifiers for explicitly-sized values, such as "%"PRIu64, are much
    more verbose (and less aesthetically appealing, IMHO) than simply
    using "%ju".

[2] I've tried very hard to find "all [the] existing users of kernhist(9)"
    but it is possible that I've missed some of them.  I would be glad to
    update any stragglers that anyone identifies.
2017-10-28 00:37:11 +00:00
mrg 65d1d4aa12 normalise a BIOHIST log message 2017-08-04 07:00:17 +00:00
chs ec5ea71a90 move some buffer cache internals declarations from buf.h to vfs_bio.c.
this is needed to avoid name conflicts with ZFS and also
makes it clearer that other code shouldn't be messing with these.
remove the LFS debug code that poked around in bufqueues and
remove the BQ_EMPTY bufqueue since nothing uses it anymore.
provide a function to let LFS and wapbl read the value of nbuf for now.
2017-06-08 01:23:01 +00:00
pgoyette 3b2df19edf When logging a history record for biowait(), include the return address
as a parameter, to identify to which of the many calls to biowait() the
record refers.
2017-05-25 02:28:07 +00:00
jdolecek 6801660c77 expose disk device FUA/DPO support via DIOCGCACHE, and allow the flags
to be set for I/O; implement support in sd(4) and nvme(4)

discussed on tech-kern
2017-04-05 20:15:49 +00:00
skrll fa90529ca0 Use brelsel while the bufcache_lock is held rather than dropping it
and re-taking / dropping it in brelse
2017-03-21 10:46:49 +00:00
riastradh 068914dcb9 Nix trailing whitespace. 2017-03-18 05:45:48 +00:00
skrll c8226a8b4f Fix build 2017-01-20 09:45:13 +00:00
skrll fd0caf00f0 Simplify getiobuf. buf_init already does bp->b_objlock = &buffer_lock 2017-01-20 08:16:31 +00:00
pgoyette c129bbe940 Remove some extraneous whitespace 2016-12-28 06:25:40 +00:00
pgoyette ee1d5b993e Decouple BIOHIST from other users of KERNHIST. 2016-12-27 04:12:34 +00:00
pgoyette 6a7e4606d5 Fix locking so we don't release the lock between the time we check the
tailq (for being non-empty) and the time we remove an entry.
2016-12-26 23:15:15 +00:00
pgoyette 7f0851cee1 Add a BIOHIST option. As mentioned on tech-kern. 2016-12-26 23:12:33 +00:00
dholland b79a953f51 typo in comment 2016-12-18 05:43:20 +00:00
jdolecek 71a8e131fb fixup comment 2016-10-28 20:17:27 +00:00
christos d18e278dd0 Allow sparc kernels to build with SSP by using a constant PAGE_SIZE... 2016-09-29 18:47:35 +00:00
dholland 28ccf570bf In bwrite, add assertion that vp != NULL. (vp is the vnode from the
buffer being written.)

There's some logic here that carefully checks for vp being null, and
other logic that will crash if it is. It appears that it's all
needless paranoia. See tech-kern for more info.

Unless someone sees the assertion go off (in which case a lot more
investigation is needed) I or someone will clean out the logic at some
future point.

Spotted by coypu.
2016-07-31 04:05:32 +00:00
riz 716b7d01f8 Implement the 'io' provider for DTrace. From riastradh@, with
fixes from me.
2016-02-01 05:05:43 +00:00
dholland d59f36443c Whatever the point of this "biodone_vfs" global function pointer is
(something rumpity?) declare it properly in a header file instead of
in secret where its types can diverge.
2016-01-11 01:22:36 +00:00
martin 31584ffedb KASSERT->KASSERTMSG to allow debugging a double-free'd buffer in ddb. 2016-01-01 18:58:58 +00:00
pooka d8e04c9094 to garnish, dust with _KERNEL_OPT 2015-08-24 22:50:32 +00:00
maxv 6e39240181 Remove the 'cred' argument from bread(). Remove a now unused var in
ffs_snapshot.c. Update the man page accordingly.

ok hannken@
2015-03-28 19:24:04 +00:00
maxv bb338d5f26 Remove the 'cred' argument from breadn(), and update the man page
accordingly.

ok hannken@
2015-03-28 17:23:42 +00:00