Commit Graph

77 Commits

Author SHA1 Message Date
ad
8545b637a5 - If the hardware provided NUMA info, then use it to decide how to set up
the allocator's buckets, instead of doing round robin distribution.  There
  are open questions here but this is better than doing nothing.

- Kernel reserve pages are for the kernel not realtime threads.
2020-05-17 15:11:57 +00:00
ad
5972ba1600 Make page waits (WANTED vs BUSY) interlocked by pg->interlock. Gets RW
locks out of the equation for sleep/wakeup, and allows observing+waiting
for busy pages when holding only a read lock.  Proposed on tech-kern.
2020-03-14 20:23:51 +00:00
ad
d2a0ebb67a UVM locking changes, proposed on tech-kern:
- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart.  v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap.  Others to follow later.
2020-02-23 15:46:38 +00:00
chs
5232c510c9 remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.
2020-02-18 20:23:17 +00:00
ad
94843b1390 - Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks.  Require that the page interlock be held over calls to
  uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state.  Rather than
  updating the global state synchronously, set an intended state on
  individual pages (active, inactive, enqueued, dequeued) while holding the
  page interlock.  After the interlock is released put the pages on a 128
  entry per-CPU queue for their state changes to be made real in batch.
  This results in in a ~400 fold decrease in contention on my test system.
  Proposed on tech-kern but modified to use the page interlock rather than
  atomics to synchronise as it's much easier to maintain that way, and
  cheaper.
2019-12-31 22:42:50 +00:00
ad
364cbbd32e Nothing uses uvm.cpus any more, and we can do the same with cpu_lookup(),
so get rid of it.
2019-12-27 13:19:24 +00:00
ad
9b1e2fa25c Redo the page allocator to perform better, especially on multi-core and
multi-socket systems.  Proposed on tech-kern.  While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.
2019-12-27 12:51:56 +00:00
ad
5978ddc663 Break the global uvm_pageqlock into a per-page identity lock and a private
lock for use of the pagedaemon policy code.  Discussed on tech-kern.

PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
2019-12-13 20:10:21 +00:00
ad
221d5f982e - Adjust uvmexp.swpgonly with atomics, and make uvm_swap_data_lock static.
- A bit more __cacheline_aligned on mutexes.
2019-12-01 14:40:31 +00:00
cherry
609524cc7d Move sys/uvm/uvm_physseg.h inclusion to within _KERNEL only. 2017-01-02 20:08:32 +00:00
cherry
8af61ebf0c Use uvm_physseg.h:uvm_page_physload() instead of uvm_extern.h
For this, include uvm_physseg.h in the build and include tree, make a
cosmetic modification to the prototype for uvm_page_physload().
2016-12-22 13:26:24 +00:00
riastradh
f9d2ab17e3 Limit <sys/rndsource.h> include to kernel. 2015-04-13 22:04:44 +00:00
riastradh
67d6ba47fb Convert remaining MI <sys/rnd.h> stragglers. Many MD ones left. 2015-04-13 16:46:33 +00:00
tls
ea6af427bd Merge tls-earlyentropy branch into HEAD. 2014-08-10 16:44:32 +00:00
tls
7b0b7dedd9 Entropy-pool implementation move and cleanup.
1) Move core entropy-pool code and source/sink/sample management code
   to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
   source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
   avoid expensive operations on disabled entropy sources; make the
   rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
   have lead to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
   system events, and skew between clocks, with a sample implementation
   for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files).  Tested with release
builds on amd64 and evbarm and live testing on amd64.
2012-02-02 19:42:57 +00:00
mrg
8169e46991 move and rename the uvm history code out of uvm_stat to "kernhist".
rename "UVMHIST" option to enable the uvm histories.

TODO:
- make UVMHIST properly depend upon KERNHIST
- enable dynamic registration of histories.  this is mostly just
  allocating something in a bitmap, and is only for viewing multiple
  histories in a merged form.


tested on amd64 and sparc64.
2011-05-17 04:18:05 +00:00
chuck
beb929a933 udpate license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.
2011-02-02 17:53:41 +00:00
chuck
3ba477b154 udpate license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.
2011-02-02 15:13:33 +00:00
uebayasi
565a3d3094 Make UVM_PAGE_TRKOWN a real flag. 2010-12-09 01:48:05 +00:00
ad
b746352090 Reduce memory spent on bookkeeping for large values of MAXCPUS. 2010-04-25 15:54:14 +00:00
rmind
40cf6f3659 Remove uarea swap-out functionality:
- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code.  Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.
2009-10-21 21:11:57 +00:00
rmind
5c68e5d0ee Ephemeral mapping (emap) implementation. Concept is based on the idea that
activity of other threads will perform the TLB flush for the processes using
emap as a side effect.  To track that, global and per-CPU generation numbers
are used.  This idea was suggested by Andrew Doran; various improvements to
it by me.  Notes:

- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.

Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.
2009-06-28 15:18:50 +00:00
ad
7a34cb95f0 Replace the global vm_page hash with a per vm_object rbtree.
Proposed on tech-kern@.
2008-06-04 15:06:04 +00:00
ad
cbbf514e2c - vm_page: put listq, pageq into a union alongside a LIST_ENTRY, so we can
use both types of list.

- Make page coloring and idle zero state per-CPU.

- Maintain per-CPU page freelists. When freeing, put pages onto the local
  CPU's lists and the global lists. When allocating, prefer to take pages
  from the local CPU. If none are available take from the global list as
  done now. Proposed on tech-kern@.
2008-06-04 12:45:28 +00:00
ad
4a780c9ae2 Merge vmlocking2 to head. 2008-01-02 11:48:20 +00:00
ad
4688843d2b Merge unobtrusive locking changes from the vmlocking branch. 2007-07-21 19:21:53 +00:00
ad
88ab7da936 Merge some of the less invasive changes from the vmlocking branch:
- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
2007-07-09 20:51:58 +00:00
ad
6820f4664f Add a sysctl to disable swapout of kernel stacks. Discussed on tech-kern@. 2007-06-15 18:28:39 +00:00
thorpej
dd962f8680 Pick up some additional files that were missed before due to conflicts
with newlock2 merge:

Replace the Mach-derived boolean_t type with the C99 bool type.  A
future commit will replace use of TRUE and FALSE with true and false.
2007-02-21 23:48:10 +00:00
thorpej
712239e366 Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
2007-02-21 22:59:35 +00:00
ad
7df3c36a6c uvm_kick_scheduler(): do nothing until the swap subsystem is initialized. 2007-02-19 01:35:19 +00:00
ad
d91014721f Add uvm_kick_scheduler() (MP safe) to replace wakeup(&proc0). 2007-02-15 20:21:13 +00:00
yamt
8bf7662829 merge yamt-splraiseipl branch.
- finish implementing splraiseipl (and makeiplcookie).
	  http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
	- complete workqueue(9) and fix its ipl problem, which is reported
	  to cause audio skipping.
	- fix netbt (at least compilation problems) for some ports.
	- fix PR/33218.
2006-12-21 15:55:21 +00:00
yamt
9d3e3eab23 merge yamt-pdpolicy branch.
- separate page replacement policy from the rest of kernel
	- implement an alternative replacement policy
2006-09-15 15:51:12 +00:00
yamt
a3af4c1530 remove the following options. no objections on tech-kern@.
UVM_PAGER_INLINE
	UVM_AMAP_INLINE
	UVM_PAGE_INLINE
	UVM_MAP_INLINE
2006-02-11 12:45:07 +00:00
yamt
52f0a62851 read-ahead statistics. 2005-11-29 15:45:28 +00:00
yamt
94ce3d822f don't include uvm_*_i.h unless needed,
to reduce bogus header dependencies.
2005-10-30 11:56:51 +00:00
yamt
662ada8f7a allocate anons on-demand, rather than reserving static amount of
them on boot/swapon.
2005-05-11 13:02:25 +00:00
yamt
1207308b90 for in-kernel maps,
- allocate kva for vm_map_entry from the map itsself and
  remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
  PR/24039.
2005-01-01 21:00:06 +00:00
yamt
f25d78c712 introduce UVMHIST_LOANHIST and sprinkle UVMHIST_LOGs. 2004-11-23 04:51:56 +00:00
matt
a78a1b0777 Back out the changes in
http://mail-index.netbsd.org/source-changes/2004/01/29/0027.html
since they don't really fix the problem.

Incorpate one fix:  Mark uvm_map_entry's that were created with
UVM_FLAG_NOMERGE so that they will not be used as future merge
candidates.
2004-02-10 01:30:49 +00:00
yamt
20c5bc5099 - split uvm_map() into two functions for the followings.
- for in-kernel maps, disable map entry merging so that
  unmap operations won't block. (workaround for PR/24039)
- for in-kernel maps, allocate kva for vm_map_entry from
  the map itsself and eliminate MAX_KMAPENT and
  uvm_map_entry_kmem_pool.
2004-01-29 12:06:02 +00:00
matt
00ed0b8fb8 Reorder things so that with multiple inclusion protection that optional
definitions are outside the protection checks.
2002-12-01 22:58:43 +00:00
perry
fbf4988104 gah. reversed a test. 2002-11-02 16:50:18 +00:00
perry
dd07fed86d /*CONTCOND*/, and protect UVMHIST_DECL with #ifdef UVMHIST 2002-11-02 07:37:14 +00:00
thorpej
3479cf6ba9 Protect "struct uvm" with _KERNEL. 2002-09-15 01:01:32 +00:00
chs
64c6d1d2dc a whole bunch of changes to improve performance and robustness under load:
- remove special treatment of pager_map mappings in pmaps.  this is
   required now, since I've removed the globals that expose the address range.
   pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
   no longer any need to special-case it.
 - eliminate struct uvm_vnode by moving its fields into struct vnode.
 - rewrite the pageout path.  the pager is now responsible for handling the
   high-level requests instead of only getting control after a bunch of work
   has already been done on its behalf.  this will allow us to UBCify LFS,
   which needs tighter control over its pages than other filesystems do.
   writing a page to disk no longer requires making it read-only, which
   allows us to write wired pages without causing all kinds of havoc.
 - use a new PG_PAGEOUT flag to indicate that a page should be freed
   on behalf of the pagedaemon when it's unlocked.  this flag is very similar
   to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
   pageout fails due to eg. an indirect-block buffer being locked.
   this allows us to remove the "version" field from struct vm_page,
   and together with shrinking "loan_count" from 32 bits to 16,
   struct vm_page is now 4 bytes smaller.
 - no longer use PG_RELEASED for swap-backed pages.  if the page is busy
   because it's being paged out, we can't release the swap slot to be
   reallocated until that write is complete, but unlike with vnodes we
   don't keep a count of in-progress writes so there's no good way to
   know when the write is done.  instead, when we need to free a busy
   swap-backed page, just sleep until we can get it busy ourselves.
 - implement a fast-path for extending writes which allows us to avoid
   zeroing new pages.  this substantially reduces cpu usage.
 - encapsulate the data used by the genfs code in a struct genfs_node,
   which must be the first element of the filesystem-specific vnode data
   for filesystems which use genfs_{get,put}pages().
 - eliminate many of the UVM pagerops, since they aren't needed anymore
   now that the pager "put" operation is a higher-level operation.
 - enhance the genfs code to allow NFS to use the genfs_{get,put}pages
   instead of a modified copy.
 - clean up struct vnode by removing all the fields that used to be used by
   the vfs_cluster.c code (which we don't use anymore with UBC).
 - remove kmem_object and mb_object since they were useless.
   instead of allocating pages to these objects, we now just allocate
   pages with no object.  such pages are mapped in the kernel until they
   are freed, so we can use the mapping to find the page to free it.
   this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
2001-09-15 20:36:31 +00:00
thorpej
27f66d3bf7 Macro'ize the code that checks the free and inactive thresholds and
wakes the pagedaemon.
2001-06-27 21:18:34 +00:00
chs
821ec03ed9 replace vm_map{,_entry}_t with struct vm_map{,_entry} *. 2001-06-02 18:09:08 +00:00
mrg
67afbd6270 use _KERNEL_OPT 2001-05-30 11:57:16 +00:00