NetBSD

Commit Graph

Author	SHA1	Message	Date
riastradh	943e1fb0b5	uvm: Eliminate __HAVE_ATOMIC_AS_MEMBAR conditionals. Discussed on tech-kern: https://mail-index.netbsd.org/tech-kern/2023/02/23/msg028729.html	2023-02-24 11:03:13 +00:00
andvar	ff23aff6ad	fix various typos in comments, documentation and messages.	2022-05-31 08:43:13 +00:00
riastradh	ef3476fb57	sys: Use membar_release/acquire around reference drop. This just goes through my recent reference count membar audit and changes membar_exit to membar_release and membar_enter to membar_acquire -- this should make everything cheaper on most CPUs without hurting correctness, because membar_acquire is generally cheaper than membar_enter.	2022-04-09 23:38:31 +00:00
riastradh	122a3e8a60	sys: Membar audit around reference count releases. If two threads are using an object that is freed when the reference count goes to zero, we need to ensure that all memory operations related to the object happen before freeing the object. Using an atomic_dec_uint_nv(&refcnt) == 0 ensures that only one thread takes responsibility for freeing, but it's not enough to ensure that the other thread's memory operations happen before the freeing. Consider: Thread A Thread B obj->foo = 42; obj->baz = 73; mumble(&obj->bar); grumble(&obj->quux); /* membar_exit(); / / membar_exit(); / atomic_dec -- not last atomic_dec -- last / membar_enter(); / KASSERT(invariant(obj->foo, obj->bar)); free_stuff(obj); The memory barriers ensure that obj->foo = 42; mumble(&obj->bar); in thread A happens before KASSERT(invariant(obj->foo, obj->bar)); free_stuff(obj); in thread B. Without them, this ordering is not guaranteed. So in general it is necessary to do membar_exit(); if (atomic_dec_uint_nv(&obj->refcnt) != 0) return; membar_enter(); to release a reference, for the `last one out hit the lights' style of reference counting. (This is in contrast to the style where one thread blocks new references and then waits under a lock for existing ones to drain with a condvar -- no membar needed thanks to mutex(9).) I searched for atomic_dec to find all these. Obviously we ought to have a better abstraction for this because there's so much copypasta. This is a stop-gap measure to fix actual bugs until we have that. It would be nice if an abstraction could gracefully handle the different styles of reference counting in use -- some years ago I drafted an API for this, but making it cover everything got a little out of hand (particularly with struct vnode::v_usecount) and I ended up setting it aside to work on psref/localcount instead for better scalability. I got bored of adding #ifdef __HAVE_ATOMIC_AS_MEMBAR everywhere, so I only put it on things that look performance-critical on 5sec review. We should really adopt membar_enter_preatomic/membar_exit_postatomic or something (except they are applicable only to atomic r/m/w, not to atomic_load/store_, making the naming annoying) and get rid of all the ifdefs.	2022-03-12 15:32:30 +00:00
skrll	e9de112945	Consistently use %#jx instead of 0x%jx or just %jx in UVMHIST_LOG formats	2021-03-13 15:29:55 +00:00
chs	9133d44ed0	In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails then rather than do more try-locks and eventually sleep for a tick, take a hold on the current owner's lock, drop the page interlock, and acquire the lock that we took the hold on in a blocking fashion. After we get the lock, check if the lock that we acquired is still the lock for the owner of the page that we're interested in. If the owner hasn't changed then can proceed with this page, otherwise we will skip this page and move on to a different page. This dramatically reduces the amount of time that the pagedaemon sleeps trying to get locks, since even 1 tick is an eternity to sleep in this context and it was easy to trigger that case in practice, and with this new method the pagedaemon only very rarely actually blocks to acquire the lock that it wants since the object locks are adaptive, and when the pagedaemon does block then the amount of time it spends sleeping will be generally be much less than 1 tick.	2020-11-04 01:30:19 +00:00
chs	a8aa7072a7	in uao_get(), if we unlock the uobj to read a page from swap, we must clear the cached page array because it is now stale. also add a missing call to uvm_page_array_fini() if the I/O fails. fixes PR 55493.	2020-08-19 15:36:41 +00:00
simonb	bf74807839	Remove trailing \n from UVMHIST_LOG() format strings.	2020-08-19 07:29:00 +00:00
skrll	f3bd60e230	Consistently use UVMHIST(__func__) Convert UVMHIST_{CALLED,LOG} into UVMHIST_CALLARGS	2020-07-09 05:57:15 +00:00
skrll	4e62681fb0	Trailing whitespace	2020-07-08 13:26:22 +00:00
ad	a6b947c515	uao_get(): in the PGO_SYNCIO case use uvm_page_array and simplify control flow a little bit.	2020-05-25 22:04:51 +00:00
ad	4bfe043955	- Alter the convention for uvm_page_array slightly, so the basic search parameters can't change part way through a search: move the "uobj" and "flags" arguments over to uvm_page_array_init() and store those with the array. - With that, detect when it's not possible to find any more pages in the tree with the given search parameters, and avoid repeated tree lookups if the caller loops over uvm_page_array_fill_and_peek().	2020-05-25 21:15:10 +00:00
ad	48081d9a3b	PR kern/55300: ubciomove triggers page not dirty assertion If overwriting an existing page, mark it dirty since there may be no managed mapping to track the modification.	2020-05-25 20:13:00 +00:00
ad	0fd3595baf	uao_get(): handle PGO_OVERWRITE.	2020-05-22 19:02:59 +00:00
hannken	b616d2aaa2	Suppress GCC warnings and fix a UVMHIST_LOG() statement. Kernels ALL/amd64 and ALL/i386 and port sparc64 build again.	2020-05-20 12:47:36 +00:00
ad	812b46df75	PR kern/32166: pgo_get protocol is ambiguous Also problems with tmpfs+nfs noted by hannken@. Don't pass PGO_ALLPAGES to pgo_get, and ignore PGO_DONTCARE in the !PGO_LOCKED case. In uao_get() have uvm_pagealloc() take care of page zeroing and release busy pages on error.	2020-05-19 22:22:15 +00:00
ad	ff872804dc	Start trying to reduce cache misses on vm_page during fault processing. - Make PGO_LOCKED getpages imply PGO_NOBUSY and remove the latter. Mark pages busy only when there's actually I/O to do. - When doing COW on a uvm_object, don't mess with neighbouring pages. In all likelyhood they're already entered. - Don't mess with neighbouring VAs that have existing mappings as replacing those mappings with same can be quite costly. - Don't enqueue pages for neighbour faults unless not enqueued already, and don't activate centre pages unless uvmpdpol says its useful. Also: - Make PGO_LOCKED getpages on UAOs work more like vnodes: do gang lookup in the radix tree, and don't allocate new pages. - Fix many assertion failures around faults/loans with tmpfs.	2020-05-17 19:38:16 +00:00
ad	9c9ebb954c	PR kern/55268: tmpfs is slow uao_get(): in the PGO_LOCKED case, we're okay to allocate a new page as long as the caller holds a write lock. PGO_NOBUSY doesn't put a stop to that.	2020-05-15 22:27:04 +00:00
ad	1d7848ad43	Process concurrent page faults on individual uvm_objects / vm_amaps in parallel, where the relevant pages are already in-core. Proposed on tech-kern. Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until adjustments are made to their pmaps.	2020-03-22 18:32:41 +00:00
ad	1912643ff9	Tweak the March 14th change to make page waits interlocked by pg->interlock. Remove unneeded changes and only deal with the PQ_WANTED flag, to exclude possible bugs.	2020-03-17 18:31:38 +00:00
ad	5972ba1600	Make page waits (WANTED vs BUSY) interlocked by pg->interlock. Gets RW locks out of the equation for sleep/wakeup, and allows observing+waiting for busy pages when holding only a read lock. Proposed on tech-kern.	2020-03-14 20:23:51 +00:00
rin	4c1762c6a4	0x%#x --> %#x for non-external codes. Also, stop mixing up 0x%x and %#x in single files as far as possible.	2020-02-24 12:38:57 +00:00
ad	d2a0ebb67a	UVM locking changes, proposed on tech-kern: - Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock. - Break v_interlock and vmobjlock apart. v_interlock remains a mutex. - Do partial PV list locking in the x86 pmap. Others to follow later.	2020-02-23 15:46:38 +00:00
ad	05a3457e85	Merge from yamt-pagecache (after much testing): - Reduce unnecessary page scan in putpages esp. when an object has a ton of pages cached but only a few of them are dirty. - Reduce the number of pmap operations by tracking page dirtiness more precisely in uvm layer.	2020-01-15 17:55:43 +00:00
ad	94843b1390	- Add and use wrapper functions that take and acquire page interlocks, and pairs of page interlocks. Require that the page interlock be held over calls to uvm_pageactivate(), uvm_pagewire() and similar. - Solve the concurrency problem with page replacement state. Rather than updating the global state synchronously, set an intended state on individual pages (active, inactive, enqueued, dequeued) while holding the page interlock. After the interlock is released put the pages on a 128 entry per-CPU queue for their state changes to be made real in batch. This results in in a ~400 fold decrease in contention on my test system. Proposed on tech-kern but modified to use the page interlock rather than atomics to synchronise as it's much easier to maintain that way, and cheaper.	2019-12-31 22:42:50 +00:00
ad	881d12e6f2	Merge from yamt-pagecache: - do gang lookup of pages using radixtree. - remove now unused uvm_object::uo_memq and vm_page::listq.queue.	2019-12-15 21:11:34 +00:00
ad	5978ddc663	Break the global uvm_pageqlock into a per-page identity lock and a private lock for use of the pagedaemon policy code. Discussed on tech-kern. PR kern/54209: NetBSD 8 large memory performance extremely low PR kern/54210: NetBSD-8 processes presumably not exiting PR kern/54727: writing a large file causes unreasonable system behaviour	2019-12-13 20:10:21 +00:00
ad	f36df6629d	Avoid calling pmap_page_protect() while under uvm_pageqlock.	2019-12-01 20:31:40 +00:00
ad	221d5f982e	- Adjust uvmexp.swpgonly with atomics, and make uvm_swap_data_lock static. - A bit more __cacheline_aligned on mutexes.	2019-12-01 14:40:31 +00:00
msaitoh	8ccde42c84	Avoid undefined behavior in uao_pagein_page(). Found by kUBSan. OK'd by riastradh. I think this is a real bug on amd64 at least.	2019-07-28 05:28:53 +00:00
chs	0656708806	allow tmpfs files to be larger than 4GB.	2018-05-28 21:04:35 +00:00
pgoyette	cb32a134a5	Update the kernhist(9) kernel history code to address issues identified in PR kern/52639, as well as some general cleaning-up... (As proposed on tech-kern@ with additional changes and enhancements.) Details of changes: * All history arguments are now stored as uintmax_t values[1], both in the kernel and in the structures used for exporting the history data to userland via sysctl(9). This avoids problems on some architectures where passing a 64-bit (or larger) value to printf(3) can cause it to process the value as multiple arguments. (This can be particularly problematic when printf()'s format string is not a literal, since in that case the compiler cannot know how large each argument should be.) * Update the data structures used for exporting kernel history data to include a version number as well as the length of history arguments. * All [2] existing users of kernhist(9) have had their format strings updated. Each format specifier now includes an explicit length modifier 'j' to refer to numeric values of the size of uintmax_t. * All [2] existing users of kernhist(9) have had their format strings updated to replace uses of "%p" with "%#jx", and the pointer arguments are now cast to (uintptr_t) before being subsequently cast to (uintmax_t). This is needed to avoid compiler warnings about casting "pointer to integer of a different size." * All [2] existing users of kernhist(9) have had instances of "%s" or "%c" format strings replaced with numeric formats; several instances of mis-match between format string and argument list have been fixed. * vmstat(1) has been modified to handle the new size of arguments in the history data as exported by sysctl(9). * vmstat(1) now provides a warning message if the history requested with the -u option does not exist (previously, this condition was silently ignored, with only a single blank line being printed). * vmstat(1) now checks the version and argument length included in the data exported via sysctl(9) and exits if they do not match the values with which vmstat was built. * The kernhist(9) man-page has been updated to note the additional requirements imposed on the format strings, along with several other minor changes and enhancements. [1] It would have been possible to use an explicit length (for example, uint64_t) for the history arguments. But that would require another "rototill" of all the users in the future when we add support for an architecture that supports a larger size. Also, the printf(3) format specifiers for explicitly-sized values, such as "%"PRIu64, are much more verbose (and less aesthetically appealing, IMHO) than simply using "%ju". [2] I've tried very hard to find "all [the] existing users of kernhist(9)" but it is possible that I've missed some of them. I would be glad to update any stragglers that anyone identifies.	2017-10-28 00:37:11 +00:00
chs	b4dc18ca94	add assertions that would have caught the recent audio mmap bugs.	2017-05-30 17:09:17 +00:00
martin	07832dcc0c	PR kern/51371: fix misleading indentation	2016-07-28 07:52:06 +00:00
pooka	d8e04c9094	to garnish, dust with _KERNEL_OPT	2015-08-24 22:50:32 +00:00
riastradh	e951f85b1a	Allow VM_NFREELIST in uao_set_pgfl, meaning any freelist is OK.	2014-05-25 18:55:11 +00:00
riastradh	c20b71f6fe	Add uao_set_pgfl to limit a uvm_aobj's pages to a specified freelist. Brought up on tech-kern: https://mail-index.netbsd.org/tech-kern/2014/05/20/msg017095.html	2014-05-22 14:01:46 +00:00
martin	c9e83a001d	Mark a diagnostic-only variable	2013-10-25 20:22:55 +00:00
matt	76a03311f2	#include <sys/atomic.h>	2012-09-15 06:25:47 +00:00
rmind	1e84067639	- Manage anonymous UVM object reference count with atomic ops. - Fix an old bug of possible lock against oneself (uao_detach_locked() is called from uao_swap_off() with uao_list_lock acquired). Also removes the try-lock dance in uao_swap_off(), since the lock order changes.	2012-09-14 22:20:50 +00:00
rmind	8c26068070	- Describe uvm_aobj and the lock order. - Remove unnecessary uao_dropswap_range1() wrapper. - KNF. Sprinkle some __cacheline_aligned.	2012-09-14 18:56:15 +00:00
matt	3d901f1f5e	Allocate color appropriate pages.	2011-09-06 16:41:55 +00:00
rmind	e225b7bd09	Welcome to 5.99.53! Merge rmind-uvmplock branch: - Reorganize locking in UVM and provide extra serialisation for pmap(9). New lock order: [vmpage-owner-lock] -> pmap-lock. - Simplify locking in some pmap(9) modules by removing P->V locking. - Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs). - Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner. Add TLBSTATS option for x86 to collect statistics about TLB shootdowns. - Unify /dev/mem et al in MI code and provide required locking (removes kernel-lock on some ports). Also, avoid cache-aliasing issues. Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches formed the core changes of this branch.	2011-06-12 03:35:36 +00:00
rmind	c22a36981f	Replace "malloc" in comments, remove unnecessary header inclusions.	2011-04-23 18:14:12 +00:00
rmind	cdc76ff97e	Replace uvm_aobj_cache with kmem(9).	2011-02-11 00:21:18 +00:00
chuck	f9d8cc1a37	udpate license clauses on chuck^2 code to match the new-style BSD licenses. based on diff that rmind@ sent me (and confirmed with chs@ via email). no functional change with this commit.	2011-02-02 15:28:38 +00:00
enami	7f4cb8b53c	Remove nop code; the code is moved to uao_dropswap_range1() when it is introduced in rev. 1.75.	2011-01-25 03:34:29 +00:00
hannken	c84e81cad1	Add vm page flag PG_MARKER and use it to tag dummy marker pages in genfs_do_putpages() and uao_put(). Use 'v_uobj.uo_npages' to check for an empty memq. Put some assertions where these marker pages may not appear. Ok: YAMAMOTO Takashi <yamt@netbsd.org>	2010-07-29 10:54:50 +00:00
rmind	22d67cdea8	uvm_fault_{upper,lower}_done: move drop-swap outside the page-queues lock. Assert for object lock being held (or ref count 0) in uao_set_swslot().	2010-05-28 23:41:14 +00:00
rmind	40cf6f3659	Remove uarea swap-out functionality: - Addresses the issue described in PR/38828. - Some simplification in threading and sleepq subsystems. - Eliminates pmap_collect() and, as a side note, allows pmap optimisations. - Eliminates XS_CTL_DATA_ONSTACK in scsipi code. - Avoids few scans on LWP list and thus potentially long holds of proc_lock. - Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k. - Removes __SWAP_BROKEN cases. Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on acorn26 (thanks to <bjh21>). Discussed on <tech-kern>, reviewed by <ad>.	2009-10-21 21:11:57 +00:00

1 2 3 4

157 Commits