- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.
pages worked great with uvm_pageqlock, but it doesn't buy anything any more,
because now the busy pages are likely in a per-CPU queue somewhere waiting
to be processed, and changing the intent on those queued pages costs next
to nothing. Remove this and get back all the bits in pg->pqflags.
- Reduce unnecessary page scan in putpages esp. when an object has a ton of
pages cached but only a few of them are dirty.
- Reduce the number of pmap operations by tracking page dirtiness more
precisely in uvm layer.
of page interlocks. Require that the page interlock be held over calls to
uvm_pageactivate(), uvm_pagewire() and similar.
- Solve the concurrency problem with page replacement state. Rather than
updating the global state synchronously, set an intended state on
individual pages (active, inactive, enqueued, dequeued) while holding the
page interlock. After the interlock is released put the pages on a 128
entry per-CPU queue for their state changes to be made real in batch.
This results in a ~400 fold decrease in contention on my test system.
Proposed on tech-kern but modified to use the page interlock rather than
atomics to synchronise as it's much easier to maintain that way, and
cheaper.
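
The batching scheme above can be sketched in user-space C. This is only an
illustration under stated assumptions, not the UVM code: the type and function
names (sketch_page, cpu_batch, page_set_intent, batch_flush) are hypothetical,
and locking is shown only as comments where the page interlock and the global
lock would be taken.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_SIZE 128	/* per-CPU queue length, as stated in the text */

/* Hypothetical names; the real pg->pqflags encoding is not shown here. */
enum pg_intent { INTENT_NONE, INTENT_ACTIVATE, INTENT_DEACTIVATE, INTENT_DEQUEUE };
enum pg_state { STATE_NONE, STATE_ACTIVE, STATE_INACTIVE };

struct sketch_page {
	enum pg_intent intent;	/* what the caller wants to happen */
	enum pg_state state;	/* what the global queues currently reflect */
	int queued;		/* already sitting in some per-CPU batch? */
};

struct cpu_batch {
	struct sketch_page *pages[BATCH_SIZE];
	size_t n;
	unsigned flushes;	/* how often the "global lock" was needed */
};

/* Make the queued intents real. A kernel would take the global lock once here. */
static void
batch_flush(struct cpu_batch *b)
{
	size_t i;

	for (i = 0; i < b->n; i++) {
		struct sketch_page *pg = b->pages[i];

		switch (pg->intent) {
		case INTENT_ACTIVATE:	pg->state = STATE_ACTIVE; break;
		case INTENT_DEACTIVATE:	pg->state = STATE_INACTIVE; break;
		case INTENT_DEQUEUE:	pg->state = STATE_NONE; break;
		case INTENT_NONE:	break;
		}
		pg->intent = INTENT_NONE;
		pg->queued = 0;
	}
	b->n = 0;
	b->flushes++;
}

/* Record an intent; a kernel would hold the page interlock around the update. */
static void
page_set_intent(struct cpu_batch *b, struct sketch_page *pg, enum pg_intent in)
{
	assert(in != INTENT_NONE);
	pg->intent = in;
	if (pg->queued)
		return;		/* already queued: changing the intent is free */
	if (b->n == BATCH_SIZE)
		batch_flush(b);	/* queue full: apply the batch first */
	pg->queued = 1;
	b->pages[b->n++] = pg;
}
```

In this sketch, changing the intent of a page that is already queued touches
only the page itself, so the global state is visited at most once per flush.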
uvmpdpol at the start of the structure, so that while under global lock we
need only touch one cache line for each vm_page. There is still the problem
of vm_page not being aligned, but this seems to drop lock wait time for
(a modified) uvmpdpol and the allocator by 20-30% in a quick test.
something else soon and TBH it matches what this macro does better.
- Add inlines to set/get locator values in the unused lower bits of
pg->phys_addr. Begin by using it to cache the freelist index, because
computing it is expensive and that shows up during profiling. Discussed
on tech-kern.
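
A minimal sketch of the idea, assuming 4 KiB pages so that the low 12 bits of
a page-aligned physical address are always zero. The struct, the accessor
names and the 4-bit field width are assumptions made for illustration; they
are not taken from the UVM headers.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12	/* assumed 4 KiB pages */
#define SKETCH_PAGE_MASK	((uint64_t)((1u << SKETCH_PAGE_SHIFT) - 1))
#define FREELIST_BITS		4	/* hypothetical field width */
#define FREELIST_MASK		((uint64_t)((1u << FREELIST_BITS) - 1))

struct sketch_page {
	uint64_t phys_addr;	/* page-aligned address plus cached locator bits */
};

/* Return the real physical address with the locator bits masked off. */
static inline uint64_t
sketch_page_phys_addr(const struct sketch_page *pg)
{
	return pg->phys_addr & ~SKETCH_PAGE_MASK;
}

/* Read the cached freelist index from the low bits. */
static inline unsigned
sketch_page_get_freelist(const struct sketch_page *pg)
{
	return (unsigned)(pg->phys_addr & FREELIST_MASK);
}

/* Store the freelist index in the low bits, leaving the address intact. */
static inline void
sketch_page_set_freelist(struct sketch_page *pg, unsigned fl)
{
	assert(fl <= FREELIST_MASK);
	pg->phys_addr = (pg->phys_addr & ~FREELIST_MASK) | fl;
}
```

Every consumer of the address has to go through the masking accessor, which
is the price of caching the index without growing the structure.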
rbtree page lookup was introduced during the NetBSD 5.0 development cycle to
bypass lock contention problems with the (then) global page hash, and was a
temporary solution to allow us to make progress. radixtree is the intended
replacement.
Ok yamt@.
lock for use of the pagedaemon policy code. Discussed on tech-kern.
PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
of contents of uvm pages without mapping them into kernel, using
direct map or moral equivalent; pmaps supporting the interface need
to provide pmap_direct_process() and define PMAP_DIRECT
implement the new interface for amd64; I hear alpha and mips might be
relatively easy to add too, but I lack the knowledge
part of resolution for PR kern/53124
Introduce uvm_hotplug(9) to the kernel.
Many thanks, in no particular order, to:
TNF, for funding the project.
Chuck Silvers - for multiple API reviews and feedback.
Nick Hudson - for testing on multiple architectures and bugfix patches.
Everyone who helped with boot testing.
KeK (http://www.kek.org.in) for hosting the primary developers.
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.
- Simplify locking in some pmap(9) modules by removing P->V locking.
- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).
- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.
- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.
Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
Maintain an array of pointers to struct vm_physseg, instead of an array
of structs, so that the VM subsystem can safely hold a pointer to each
segment. Pointers to this struct will replace raw paddr_t usage in the future.
Dynamic removal is not supported yet.
Only MD data structure changes, no kernel bump needed.
Tested on i386, amd64, powerpc/ibm40x, arm11.
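
Why an array of pointers makes it safe to hold a segment pointer can be shown
with a small user-space sketch. The names (sketch_physseg, segs,
sketch_physseg_insert) and the fixed capacity are hypothetical; only the
technique is taken from the description above. Inserting into a sorted array
of pointers moves the pointers, never the segment structures themselves.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_SEGS 8	/* hypothetical capacity */

struct sketch_physseg {
	unsigned long start;	/* first page frame of the segment */
	unsigned long end;	/* one past the last page frame */
};

static struct sketch_physseg *segs[MAX_SEGS];	/* sorted by start */
static int nsegs;

/* Insert a segment in sorted order; returns a pointer that stays valid. */
static struct sketch_physseg *
sketch_physseg_insert(unsigned long start, unsigned long end)
{
	struct sketch_physseg *seg;
	int i;

	assert(start < end);
	if (nsegs == MAX_SEGS)
		return NULL;
	seg = malloc(sizeof(*seg));
	if (seg == NULL)
		return NULL;
	seg->start = start;
	seg->end = end;

	/* Shift pointers up to make room; the structures do not move. */
	for (i = nsegs; i > 0 && segs[i - 1]->start > start; i--)
		segs[i] = segs[i - 1];
	segs[i] = seg;
	nsegs++;
	return seg;
}
```

With an array of structures, the same insertion would copy earlier segments
to new slots and leave any previously taken pointer referring to the wrong
entry.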
vm_page *) "reverse" lookup code from uvm_page.h to uvm_page.c, to
help migration to not do that.
Likewise move per-page metadata (struct vm_page *) -> physical
address "forward" conversion code into *.c too. This is called
only by low-layer VM and MD code.
lookup code from uvm_page.h to uvm_page.c.
This code is used by some pmaps to look up per-page state (PV) from
per-segment metadata (struct vm_physseg). This is not needed if
UVM looks up physical segment once in fault handler, then directly
passes it to pmap. This change helps transition to that model.
The only users of vm_physseg_find() are pmap_motorola.c and
powerpc/ibm4xx/pmap.c.
Tested By: Compiling and running powerpc/ibm4xx/pmap.c
(evbppc/conf/OPENBLOCKS266)
in genfs_do_putpages() and uao_put().
Use 'v_uobj.uo_npages' to check for an empty memq.
Put some assertions where these marker pages may not appear.
Ok: YAMAMOTO Takashi <yamt@netbsd.org>
use both types of list.
- Make page coloring and idle zero state per-CPU.
- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available take from the global list as
done now. Proposed on tech-kern@.
- Use high and low water marks to try and reduce power consumption.
- Do trylock on uvm_fpageqlock, and bail if we can't get it.
- Only run on one CPU at a time.
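
The per-CPU freelist scheme described above (free onto both the local CPU's
list and the global list, allocate from the local CPU first, fall back to the
global list) can be sketched as follows. This is a simplified illustration,
not the allocator itself: the names are hypothetical, there is a single
freelist with no colors or zeroed/unknown queues, and locking is omitted. It
uses the LIST macros from <sys/queue.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define SKETCH_NCPU 2	/* hypothetical CPU count */

struct sketch_page {
	int id;
	LIST_ENTRY(sketch_page) global_link;	/* linkage on the global list */
	LIST_ENTRY(sketch_page) cpu_link;	/* linkage on one CPU's list */
};

LIST_HEAD(sketch_pagelist, sketch_page);

/* Static storage is zeroed, which is a valid empty LIST head. */
static struct sketch_pagelist global_free;
static struct sketch_pagelist cpu_free[SKETCH_NCPU];

/* Free a page: it goes onto both the freeing CPU's list and the global list. */
static void
sketch_page_free(struct sketch_page *pg, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NCPU);
	LIST_INSERT_HEAD(&global_free, pg, global_link);
	LIST_INSERT_HEAD(&cpu_free[cpu], pg, cpu_link);
}

/* Allocate: prefer the local CPU's list, otherwise take from the global list. */
static struct sketch_page *
sketch_page_alloc(int cpu)
{
	struct sketch_page *pg;

	assert(cpu >= 0 && cpu < SKETCH_NCPU);
	pg = LIST_FIRST(&cpu_free[cpu]);
	if (pg == NULL)
		pg = LIST_FIRST(&global_free);
	if (pg == NULL)
		return NULL;

	/* A free page is always on both lists, so unlink it from both. */
	LIST_REMOVE(pg, global_link);
	LIST_REMOVE(pg, cpu_link);
	return pg;
}
```

Because every free page carries both linkages, a page taken through the
global fallback can be unlinked from whichever CPU's list it was on without
knowing which CPU that was.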