Commit Graph

133 Commits

Author SHA1 Message Date
mrg
4312dbe036 fix error in previous: UVMHIST_PDHIST_SIZE needs to stay next to pdhistbuf[]. 2021-04-17 21:37:21 +00:00
mrg
f405a45af6 remove KERNHIST_INIT_STATIC(). it straddles the line between usable
early in boot and broken early in boot by requiring a partly static
structure with another structure that must be present by the time
any uses are performed.  theoretically platform code could allocate
a chunk while setting up memory and assign it here, giving a dynamic
sizing for the entry list, but the reality is that all users have
a statically allocated entry list as well.

the existing KERNHIST_LINK_STATIC() is used in conjunction with
KERNHIST_INITIALIZER() instead.

this stops a NULL pointer deref when the _LOG() macro is called
before the storage is linked in, which happens with GCC 10 on OCTEON
with UVMHIST enabled, crashing in very early kernel init.
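
a minimal sketch of the surviving pattern, assuming the macro shapes
in sys/kernhist.h; the history name and size below are illustrative:

    /* entry storage and the history itself are both fully static... */
    #define EXHIST_SIZE 100
    static struct kern_history_ent exhistbuf[EXHIST_SIZE];
    KERNHIST_DEFINE(exhist) = KERNHIST_INITIALIZER(exhist, exhistbuf);

    void
    ex_init(void)
    {
            /* ...so linking it in is safe even in very early boot */
            KERNHIST_LINK_STATIC(exhist);
    }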
2021-04-17 01:53:58 +00:00
chs
9133d44ed0 In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page;
otherwise we skip this page and move on to a different one.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks: even 1 tick is an eternity to sleep
in this context, and that case was easy to trigger in practice.
With this new method the pagedaemon only very rarely blocks
to acquire the lock that it wants, since the object locks are adaptive,
and when the pagedaemon does block, the time it spends
sleeping will generally be much less than 1 tick.
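
sketched in code (not the actual uvmpd_tryownerlock(); the
page_owner_lock() helper is invented for illustration):

    /* called with pg->interlock held */
    static krwlock_t *
    tryownerlock_sketch(struct vm_page *pg)
    {
            krwlock_t *lock = page_owner_lock(pg);

            if (rw_tryenter(lock, RW_WRITER))
                    return lock;            /* fast path: try-lock worked */

            rw_obj_hold(lock);              /* keep the lock object alive */
            mutex_exit(&pg->interlock);
            rw_enter(lock, RW_WRITER);      /* block; no retries, no tick sleep */

            mutex_enter(&pg->interlock);
            if (page_owner_lock(pg) != lock) {
                    /* owner changed while we slept: skip this page */
                    rw_exit(lock);
                    rw_obj_free(lock);
                    return NULL;
            }
            rw_obj_free(lock);              /* drop our hold; lock stays held */
            return lock;
    }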
2020-11-04 01:30:19 +00:00
skrll
f3bd60e230 Consistently use UVMHIST(__func__)
Convert UVMHIST_{CALLED,LOG} into UVMHIST_CALLARGS
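
for example, the converted call-site pattern looks roughly like
(an illustrative call site, not a quote from the tree):

    UVMHIST_FUNC(__func__);
    UVMHIST_CALLARGS(maphist, "map=%#jx flags=%#jx",
        (uintptr_t)map, flags, 0, 0);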
2020-07-09 05:57:15 +00:00
ad
ba90a6ba38 Counter tweaks:
- Don't need to count anonpages+filepages any more; clean+unknown+dirty for
  each kind of page can be summed to get the totals.

- Track the number of free pages with a counter so that it's one less thing
  for the allocator to do, which opens up further options there.

- Remove cpu_count_sync_one().  It has no users and doesn't save a whole lot.
  For the cheap option, give cpu_count_sync() a boolean parameter indicating
  that a cached value is okay, and rate limit the updates for cached values
  to hz.
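
A sketch of the rate-limited cached path; sum_percpu_counters() stands
in for the real cross-CPU summation and is invented for illustration:

    int64_t
    cpu_count_sync_sketch(bool cached)
    {
            static int64_t total;
            static int lastsync;

            if (cached && getticks() - lastsync < hz)
                    return total;           /* a recent total will do */
            lastsync = getticks();
            total = sum_percpu_counters(); /* the expensive path */
            return total;
    }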
2020-06-11 22:21:05 +00:00
ad
4b8a875ae2 uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched.  It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.
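
Illustrative call sites under that scheme (a sketch, not quoted code):

    /* a hot-path heuristic tolerates a slightly stale total */
    if (uvm_availmem(true) < uvmexp.freetarg)
            uvm_kick_pdaemon();

    /* an exact accounting path pays for the fresh summation */
    int nfree = uvm_availmem(false);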
2020-06-11 19:20:42 +00:00
ad
3d8601a9ec uvm_pageout_done(): do nothing when npages is zero. 2020-05-25 19:46:20 +00:00
maxv
983fd9ccfe hardclock_ticks -> getticks() 2020-04-13 15:54:45 +00:00
ad
d2a0ebb67a UVM locking changes, proposed on tech-kern:
- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart.  v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap.  Others to follow later.
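
A sketch of uvm_object locking after this change; since vmobjlock is
now a krwlock_t, lookups can share the lock:

    rw_enter(uobj->vmobjlock, RW_READER);   /* shared: lookup only */
    pg = uvm_pagelookup(uobj, off);
    rw_exit(uobj->vmobjlock);

    rw_enter(uobj->vmobjlock, RW_WRITER);   /* exclusive: modify */
    /* ... insert or remove pages ... */
    rw_exit(uobj->vmobjlock);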
2020-02-23 15:46:38 +00:00
chs
5232c510c9 remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.
2020-02-18 20:23:17 +00:00
ad
05a3457e85 Merge from yamt-pagecache (after much testing):
- Reduce unnecessary page scan in putpages esp. when an object has a ton of
  pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
  precisely in uvm layer.
2020-01-15 17:55:43 +00:00
ad
94843b1390 - Add and use wrapper functions that take and acquire page interlocks,
  and pairs of page interlocks.  Require that the page interlock be held over
  calls to uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state.  Rather than
  updating the global state synchronously, set an intended state on
  individual pages (active, inactive, enqueued, dequeued) while holding the
  page interlock.  After the interlock is released put the pages on a 128
  entry per-CPU queue for their state changes to be made real in batch.
  This results in a ~400-fold decrease in contention on my test system.
  Proposed on tech-kern but modified to use the page interlock rather than
  atomics to synchronise, as it's much easier to maintain that way, and
  cheaper.
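
In outline (a simplified sketch with invented names; the real code
lives in the pdpolicy code):

    #define PDQ_NITEMS 128
    struct pdq { struct vm_page *items[PDQ_NITEMS]; int n; };

    void
    page_set_intent(struct vm_page *pg, int intent)
    {
            struct pdq *pdq;

            KASSERT(mutex_owned(&pg->interlock));
            pg->intent = intent;    /* recorded under the interlock */
            mutex_exit(&pg->interlock);

            /* defer the queue manipulation to a per-CPU batch */
            pdq = percpu_getref(pdq_percpu);
            pdq->items[pdq->n++] = pg;
            if (pdq->n == PDQ_NITEMS)
                    pdq_flush(pdq); /* realize the intents in one pass */
            percpu_putref(pdq_percpu);
    }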
2019-12-31 22:42:50 +00:00
ad
5c06357c90 Rename uvm_free() -> uvm_availmem(). 2019-12-31 13:07:09 +00:00
ad
b78a6618bd Rename uvm_page_locked_p() -> uvm_page_owner_locked_p() 2019-12-31 12:40:27 +00:00
ad
9344a5952a pagedaemon:
- Use marker pages to keep place in the queue when scanning, rather than
  relying on assumptions.

- In uvmpdpol_balancequeue(), lock the object once instead of twice.

- When draining pools, the situation is getting desperate, but try to avoid
  saturating the system with xcall, lock and interrupt activity by sleeping
  for 1 clock tick if being continually awoken and all pools have been
  cycled through at least once.

- Pause & resume the freelist cache during pool draining.

PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
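
The marker technique in outline (a sketch with illustrative list and
field names, not the committed code):

    static struct vm_page marker = { .flags = PG_MARKER };

    mutex_enter(&pdpol_lock);
    pg = TAILQ_FIRST(&pdpol_list);
    while (pg != NULL) {
            if (pg->flags & PG_MARKER) {    /* someone else's marker */
                    pg = TAILQ_NEXT(pg, pdqueue);
                    continue;
            }
            /* keep our place before dropping the lock */
            TAILQ_INSERT_AFTER(&pdpol_list, pg, &marker, pdqueue);
            mutex_exit(&pdpol_lock);
            /* ... lock and examine pg; the list may change under us ... */
            mutex_enter(&pdpol_lock);
            pg = TAILQ_NEXT(&marker, pdqueue);
            TAILQ_REMOVE(&pdpol_list, &marker, pdqueue);
    }
    mutex_exit(&pdpol_lock);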
2019-12-30 18:08:37 +00:00
ad
5056cf4d85 Fix merge error - don't init uvmpd_lock twice. 2019-12-21 16:10:20 +00:00
ad
479b92623d Detangle the pagedaemon from uvm_fpageqlock:
- Have a single lock (uvmpd_lock) to protect pagedaemon state that was
  previously covered by uvmpd_pool_drain_lock plus uvm_fpageqlock.
- Don't require any locks be held when calling uvm_kick_pdaemon().
- Use uvm_free().
2019-12-21 14:50:34 +00:00
ad
2073010ee4 uvm_reclaimable(): need to sum the per-CPU values for filepages/execpages. 2019-12-21 11:41:18 +00:00
ad
7121cb9fbf The uvmexp.pdpending change was incorrect - revert for now. 2019-12-14 21:36:00 +00:00
ad
6c10122b55 Adjust pdpending in uvm_pageout_start() and uvm_pageout_done() to avoid
the value going temporarily negative.
2019-12-14 15:04:47 +00:00
ad
5978ddc663 Break the global uvm_pageqlock into a per-page identity lock and a private
lock for use of the pagedaemon policy code.  Discussed on tech-kern.

PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
2019-12-13 20:10:21 +00:00
ad
221d5f982e - Adjust uvmexp.swpgonly with atomics, and make uvm_swap_data_lock static.
- A bit more __cacheline_aligned on mutexes.
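
For example (a sketch), an adjustment becomes a single atomic op with
no lock held:

    atomic_add_int(&uvmexp.swpgonly, npages);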
2019-12-01 14:40:31 +00:00
chs
e880b3aa3c in uvm_wait(), panic if the pagedaemon thread does not exist.
this avoids a hang if the system runs out of memory before
the mechanisms for reclaiming memory have been set up.
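
a sketch of the check, assuming the pagedaemon's LWP is recorded in
struct uvm:

    if (uvm.pagedaemon_lwp == NULL)
            panic("out of memory before the pagedaemon is running");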
2019-10-01 17:40:22 +00:00
chs
3449aefe5a Draining pools from the pagedaemon thread can deadlock, because draining
a pool can involve taking a lock which can be held by a thread which is
blocked waiting for memory.  Avoid this by moving the pool-draining work
to a separate worker thread.
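
The handoff in outline: the pagedaemon only signals, and a dedicated
kthread does the blocking drain.  A sketch with invented names, not the
committed code:

    static kmutex_t drain_lock;
    static kcondvar_t drain_cv;
    static bool drain_wanted;

    static void
    pool_drain_thread(void *arg)
    {
            struct pool *pp;

            for (;;) {
                    mutex_enter(&drain_lock);
                    while (!drain_wanted)
                            cv_wait(&drain_cv, &drain_lock);
                    drain_wanted = false;
                    mutex_exit(&drain_lock);
                    (void)pool_drain(&pp); /* may block on pool locks safely */
            }
    }

    void
    kick_pool_drain(void)   /* called from the pagedaemon */
    {
            mutex_enter(&drain_lock);
            drain_wanted = true;
            cv_signal(&drain_cv);
            mutex_exit(&drain_lock);
    }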
2019-04-21 15:32:18 +00:00
pgoyette
cb32a134a5 Update the kernhist(9) kernel history code to address issues identified
in PR kern/52639, as well as some general cleaning-up...

(As proposed on tech-kern@ with additional changes and enhancements.)

Details of changes:

* All history arguments are now stored as uintmax_t values[1], both in
  the kernel and in the structures used for exporting the history data
  to userland via sysctl(9).  This avoids problems on some architectures
  where passing a 64-bit (or larger) value to printf(3) can cause it to
  process the value as multiple arguments.  (This can be particularly
  problematic when printf()'s format string is not a literal, since in
  that case the compiler cannot know how large each argument should be.)

* Update the data structures used for exporting kernel history data to
  include a version number as well as the length of history arguments.

* All [2] existing users of kernhist(9) have had their format strings
  updated.  Each format specifier now includes an explicit length
  modifier 'j' to refer to numeric values of the size of uintmax_t.

* All [2] existing users of kernhist(9) have had their format strings
  updated to replace uses of "%p" with "%#jx", and the pointer
  arguments are now cast to (uintptr_t) before being subsequently cast
  to (uintmax_t).  This is needed to avoid compiler warnings about
  casting "pointer to integer of a different size."

* All [2] existing users of kernhist(9) have had instances of "%s" or
  "%c" format strings replaced with numeric formats; several instances
  of mis-match between format string and argument list have been fixed.

* vmstat(1) has been modified to handle the new size of arguments in the
  history data as exported by sysctl(9).

* vmstat(1) now provides a warning message if the history requested with
  the -u option does not exist (previously, this condition was silently
  ignored, with only a single blank line being printed).

* vmstat(1) now checks the version and argument length included in the
  data exported via sysctl(9) and exits if they do not match the values
  with which vmstat was built.

* The kernhist(9) man-page has been updated to note the additional
  requirements imposed on the format strings, along with several other
  minor changes and enhancements.

[1] It would have been possible to use an explicit length (for example,
    uint64_t) for the history arguments.  But that would require another
    "rototill" of all the users in the future when we add support for an
    architecture that supports a larger size.  Also, the printf(3) format
    specifiers for explicitly-sized values, such as "%"PRIu64, are much
    more verbose (and less aesthetically appealing, IMHO) than simply
    using "%ju".

[2] I've tried very hard to find "all [the] existing users of kernhist(9)"
    but it is possible that I've missed some of them.  I would be glad to
    update any stragglers that anyone identifies.
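
To illustrate the conversion pattern described in the list above (an
invented call site, not a quote from the tree):

    before: UVMHIST_LOG(maphist, "map=%p refs=%d", map, refs, 0, 0);
    after:  UVMHIST_LOG(maphist, "map=%#jx refs=%jd",
                (uintptr_t)map, refs, 0, 0);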
2017-10-28 00:37:11 +00:00
martin
419cac9e69 Mark a diagnostic-only variable 2013-10-25 20:28:33 +00:00
matt
fd2366536d -fno-common broke kernhist since it used commons.
Add a KERNHIST_DEFINE which defines the kernel history.
Change UVM to deal with the new usage.
2012-07-30 23:56:48 +00:00
jym
57d7988f76 Now that pool_cache_invalidate() is synchronous and can handle per-CPU
caches, merge together pool_drain_start() and pool_drain_end() into

bool pool_drain(struct pool **ppp);

"bool" value indicates whether reclaiming was fully done (true) or not (false)
"ppp" will contain a pointer to the pool that was drained (optional).

See http://mail-index.netbsd.org/tech-kern/2012/06/04/msg013287.html
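
An illustrative caller (a sketch): keep calling until a full pass over
all pools has completed.

    struct pool *pp = NULL;

    while (pool_drain(&pp) == false)
            continue;       /* pp names the pool drained by each call */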
2012-06-05 22:51:47 +00:00
para
e253ed8e30 allocate uareas and buffers from kernel_map again
add code to drain pools if kmem_arena runs out of space
2012-02-01 23:43:49 +00:00
para
e62ee4d475 extending vmem(9) to be able to allocated resources for it's own needs.
simplifying uvm_map handling (no special kernel entries anymore no relocking)
make malloc(9) a thin wrapper around kmem(9)
(with private interface for interrupt safety reasons)

releng@ acknowledged
2012-01-27 19:48:38 +00:00
rmind
e225b7bd09 Welcome to 5.99.53! Merge rmind-uvmplock branch:
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
  New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
  the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
  Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
  kernel-lock on some ports).  Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
2011-06-12 03:35:36 +00:00
chuck
40ec801a13 update license clauses on my code to match the new-style BSD licenses.
based on second diff that rmind@ sent me.

no functional change with this commit.
2011-02-02 15:25:27 +00:00
pooka
cbed8ae4b7 it's a wonderful static 2010-06-02 15:48:49 +00:00
rmind
40cf6f3659 Remove uarea swap-out functionality:
- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code.  Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.
2009-10-21 21:11:57 +00:00
yamt
f97310f398 whitespace fixes. no functional changes. 2009-08-18 02:43:49 +00:00
haad
b760fc6e71 Add uvm_reclaim_hooks support for reclaiming kernel KVA space and memory.
This is used only by zfs, where a uvm_reclaim hook is added from the ARC cache.

Oked ad@.
2009-08-10 23:17:29 +00:00
ad
9d315c5bc9 PR 40027/pagedaemon loops on memory shortage
uvmpd_scan_queue:

- Fix a bug that prevented the pagedaemon from making forward progress
  if (a) swap was full and (b) the first 16 pages on the inactive list were
  unbusy anons not already backed by swap.

- Remove redundant uvm_swapisfull() check and just try to allocate a slot.
  If it fails we know swap is full.
2008-12-13 11:26:57 +00:00
ad
7a2060a7f5 Make adjustment of uvm_extrapages atomic since it's done without a lock.
XXX This is still a hack.
2008-12-03 11:43:51 +00:00
ad
04b3e89c3f uvmpd_tune: make the adjustments to individual variables atomic. 2008-12-02 10:46:43 +00:00
ad
79d9beffc8 - If the system encounters a severe memory shortage, start unloading
unused kernel modules.
- Try to unload any autoloaded kernel modules 10 seconds after their
  load was successful.
- Keep a counter to track module load/unload events.
2008-11-14 23:06:45 +00:00
ad
3689374b96 - Make free target 0.5%, but limit to between 128k and 1024k.
- Scale free target by number of CPUs.
- Prefer paging to swapping.

Proposed on tech-kern.
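
As arithmetic (an illustrative sketch, not the committed code):

    /* 0.5% of RAM, clamped to [128 KB, 1 MB] worth of pages, per CPU */
    freetarg = physmem / 200;
    freetarg = MAX(freetarg, (128 * 1024) >> PAGE_SHIFT);
    freetarg = MIN(freetarg, (1024 * 1024) >> PAGE_SHIFT);
    freetarg *= ncpu;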
2008-09-23 08:55:52 +00:00
yamt
19b9de9868 uvm_swap_io: if pagedaemon, don't wait for iobuf. 2008-02-29 20:35:23 +00:00
yamt
7608362b58 swapcluster_flush: handle nused==0, which can happen if swapcluster_add failed.
PR/37669 from Andrew Doran.
2008-02-07 12:24:16 +00:00
yamt
52838e34f5 remove a special allocator for uareas, which is no longer necessary.
use pool_cache instead.
2008-01-28 12:22:46 +00:00
ad
4a780c9ae2 Merge vmlocking2 to head. 2008-01-02 11:48:20 +00:00
ad
d18c6ca4de Merge from vmlocking:
- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
2007-11-07 00:23:13 +00:00
ad
4688843d2b Merge unobtrusive locking changes from the vmlocking branch. 2007-07-21 19:21:53 +00:00
ad
88ab7da936 Merge some of the less invasive changes from the vmlocking branch:
- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
2007-07-09 20:51:58 +00:00
ad
6820f4664f Add a sysctl to disable swapout of kernel stacks. Discussed on tech-kern@. 2007-06-15 18:28:39 +00:00
thorpej
b3667ada6d TRUE -> true, FALSE -> false 2007-02-22 06:05:00 +00:00