NetBSD

Author	SHA1	Message	Date
yamt	7f20b0c529	bump vnode hold count for page cache as well to resolve unfairness between page cache and traditional buffer cache. pointed out by enami tsugutomo on current-users@.	2004-01-14 11:28:04 +00:00
yamt	0cad61498f	sysctl_vm_updateminmax: fix swapped filemin and execmin. the problem reported by Vesbula on current-users@.	2004-01-11 18:42:25 +00:00
yamt	7266a95907	store a i/o priority hint in struct buf for buffer queue discipline.	2004-01-10 14:39:50 +00:00
yamt	4b651870d9	#if 0 out unused ubc_flush().	2004-01-07 12:18:16 +00:00
yamt	59afac32fe	- get pages to loan out in uvm_loanuobjpages() rather than having caller (nfsd, in this case) do so. - tweak locking so that nfs loaned READ works on layered filesystems.	2004-01-07 12:17:10 +00:00
chs	7662f44874	fix lock initialization in uvm_anon_add(). from PR 23831.	2004-01-06 15:56:49 +00:00
jdolecek	089abdad44	Rearrange process exit path to avoid need to free resources from different process context ('reaper'). From within the exiting process context: * deactivate pmap and free vmspace while we can still block * introduce MD cpu_lwp_free() - this cleans all MD-specific context (such as FPU state), and is the last potentially blocking operation; all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free() * process is now immediately marked as zombie and made available for pickup by parent; the remaining last lwp continues the exit as fully detached * MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same for both 'process' and 'lwp' exit uvm_lwp_exit() is modified to never block; the u-area memory is now always just linked to the list of available u-areas. Introduce (blocking) uvm_uarea_drain(), which is called to release the excessive u-area memory; this is called by parent within wait4(), or by pagedaemon on memory shortage. uvm_uarea_free() is now private function within uvm_glue.c. MD process/lwp exit code now always calls lwp_exit2() immediately after switching away from the exiting lwp. g/c now unneeded routines and variables, including the reaper kernel thread	2004-01-04 11:33:29 +00:00
pk	70f20a1217	Replace the traditional buffer memory management -- based on fixed per buffer virtual memory reservation and a private pool of memory pages -- by a scheme based on memory pools. This allows better utilization of memory because buffers can now be allocated with a granularity finer than the system's native page size (useful for filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation of virtual to physical memory mappings (due to the former fixed virtual address reservation) resulting in better utilization of MMU resources on some platforms. Finally, the scheme is more flexible by allowing run-time decisions on the amount of memory to be used for buffers. On the other hand, the effectiveness of the LRU queue for buffer recycling may be somewhat reduced compared to the traditional method since, due to the nature of the pool based memory allocation, the actual least recently used buffer may release its memory to a pool different from the one needed by a newly allocated buffer. However, this effect will kick in only if the system is under memory pressure.	2003-12-30 12:33:13 +00:00
simonb	2b9ac03f55	No need to break a line - the full line is less than 80 chars long.	2003-12-21 11:38:46 +00:00
simonb	b9fbceaf46	Unindent a code block that doesn't need to be indented.	2003-12-19 06:02:50 +00:00
pk	3c96ae431b	* Introduce uvm_km_kmemalloc1() which allows alignment and preferred offset to be passed to uvm_map(). * Turn all uvm_km_valloc*() macros back into (inlined) functions to retain binary compatibility with any 3rd party modules.	2003-12-18 15:02:04 +00:00
pk	60181444ca	Condense all existing variants of uvm_km_valloc into a single function: uvm_km_valloc1(), and use it to express all of uvm_km_valloc() uvm_km_valloc_wait() uvm_km_valloc_prefer() uvm_km_valloc_prefer_wait() uvm_km_valloc_align() in terms of it by macro expansion.	2003-12-18 08:15:42 +00:00
tsutsui	082caf94ca	Allow sysctl(8) to update vm.{anon,exec,file}{min,max}. XXX needs sysctl(9) man page to confirm this change..	2003-12-07 00:40:43 +00:00
atatat	13f8d2ce5f	Dynamic sysctl. Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(), vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all nodes are registered with the tree, and nodes can be added (or removed) easily, and I/O to and from the tree is handled generically. Since the nodes are registered with the tree, the mapping from name to number (and back again) can now be discovered, instead of having to be hard coded. Adding new nodes to the tree is likewise much simpler -- the new infrastructure handles almost all the work for simple types, and just about anything else can be done with a small helper function. All existing nodes are where they were before (numerically speaking), so all existing consumers of sysctl information should notice no difference. PS - I'm sorry, but there's a distinct lack of documentation at the moment. I'm working on sysctl(3/8/9) right now, and I promise to watch out for buses.	2003-12-04 19:38:21 +00:00
yamt	f7d48e3571	mincore: don't treat an aobj as a device mapping.	2003-11-29 19:06:48 +00:00
chs	e07f0b9362	eliminate uvm_useracc() in favor of checking the return value of copyin() or copyout(). uvm_useracc() tells us whether the mapping permissions allow access to the desired part of an address space, and many callers assume that this is the same as knowing whether an attempt to access that part of the address space will succeed. however, access to user space can fail for reasons other than insufficient permission, most notably that paging in any non-resident data can fail due to i/o errors. most of the callers of uvm_useracc() make the above incorrect assumption. the rest are all misguided optimizations, which optimize for the case where an operation will fail. we'd rather optimize for operations succeeding, in which case we should just attempt the access and handle failures due to insufficient permissions the same way we handle i/o errors. since there appear to be no good uses of uvm_useracc(), we'll just remove it.	2003-11-13 03:09:28 +00:00
chs	709a3b4e52	two changes to improve scalability: (1) split the single list of pages allocated to a pool into three lists: completely full, partially full, and completely empty. there is no longer any need to traverse any list looking for a certain type of page. (2) replace the 8-element hash table for out-of-page page headers with a splay tree. these two changes (together with the recent enhancements to the wait code) give us linear scaling for a fork+exit microbenchmark.	2003-11-13 02:44:01 +00:00
rearnsha	31a79ddeb0	In vm_physseg_find, use u_int for start, len and try when doing a binary search. Avoids the need for signed division by 2. Approved by thorpej.	2003-11-10 16:13:05 +00:00
yamt	8b91732e18	fix wrong assertions. they can be false due to alignment requirements (and PMAP_PREFER).	2003-11-06 12:45:26 +00:00
yamt	4a570157d7	add a missing pmap_update().	2003-11-05 15:45:54 +00:00
yamt	b479cef701	don't move hint backward.	2003-11-05 15:34:50 +00:00
yamt	171053e863	- fix a reversed comparison. - fix "nextgap" case. - make sure don't get addresses behind hint. - deal with integer wraparounds better. - assertions.	2003-11-05 15:09:09 +00:00
yamt	933834a7ae	revert rev.1.70 as it was not needed. uvm_map_lookup_entry() should handle addresses out of the map.	2003-11-03 04:39:11 +00:00
yamt	70538d0c22	add a DEBUG check if freed PG_ZERO pages are really zero-filled.	2003-11-03 03:58:28 +00:00
jdolecek	5e94c73334	kill unneeded SYSVSHM includes; use ANSI C function definition for uvm_lwp_exit()	2003-11-02 16:53:43 +00:00
yamt	c6d9c8814d	fix a wrong assertion. pointed out by Christian Limpach.	2003-11-02 07:58:52 +00:00
yamt	142a2d4058	- update uvm_map::size fewer places. - add related assertions.	2003-11-01 19:56:09 +00:00
yamt	c45bf442f2	commit rest of the previous (rbtree). (i should check .rej files before commit, sorry)	2003-11-01 19:45:13 +00:00
yamt	d6dc30aeba	in uvm_pagefree and friends, if freed pages have been marked by PG_ZERO flag, put them to PGFL_ZEROS queue rather than default one so that we can re-use zero-filled pages efficiently.	2003-11-01 15:18:42 +00:00
yamt	57e554da69	track map entries and free spaces using red-black tree to improve scalability of operations on the map. originally done by Niels Provos for OpenBSD. tweaked for NetBSD by me with some advice from enami tsugutomo. discussed on tech-kern@ and tech-perform@.	2003-11-01 11:09:02 +00:00
yamt	922ad03e28	don't try to lookup addresses out of the map in uvm_coredump_walkmap().	2003-11-01 10:43:27 +00:00
yamt	2022580e89	uvm_loanzero: - after sleeping for memory, re-check if we have a page. - put the allocated page to pageq to appease UVM_PAGE_TRKOWN. - dequeue the page when doing ->K loan.	2003-10-27 12:47:33 +00:00
yamt	bfda434436	whitespace.	2003-10-26 16:04:00 +00:00
jdolecek	4e7d0870dc	update comment - kmem_map is created in kmeminit(), not uvm_km_init()	2003-10-26 08:05:00 +00:00
junyoung	b28a286e6a	KNF.	2003-10-25 23:05:45 +00:00
cl	e30be76fce	simplify tests: The case where l_stat == LSONPROC and l_cpu == curcpu cannot happen because the pagedaemon is the LWP on curcpu and the pagedaemon is a kernel thread and the code is only used by the pagedaemon. See also updated patch in PR kern/23095, which I meant to check in originally.	2003-10-24 13:07:33 +00:00
cl	ed9c2d7075	don't uvm_swapout LWPs which are LSONPROC on another cpu. uvm_swapout_threads will swapout LWPs which are running on another CPU: - uvm_swapout_threads considers LWPs running on another CPU for swapout if their l_swtime is high - uvm_swapout_threads considers LWPs on the runqueue for swapout if their l_swtime is high but these LWPs might be running by the time uvm_swapout is called. symptoms of failure: panic in setrunqueue. fixes PR kern/23095	2003-10-19 17:45:35 +00:00
scw	4355b16f71	In uvm_lwp_fork(), check if PMAP_UAREA() is defined and if so, invoke it with the KVA of the newly-wired uarea. This is useful on some architectures (e.g. xscale) where the uarea mapping can be tweaked to use the mini-data cache instead of the main cache.	2003-10-13 20:43:03 +00:00
enami	57a6593f52	Fix indent.	2003-10-09 03:12:29 +00:00
atatat	d4de28f890	When pulling back an amap to cover the new allocation along with the previous entry, don't add the size to the extension -- it's already been added to the end of the previous entry.	2003-10-09 02:44:54 +00:00
thorpej	8655c7d7eb	Add a MAP_WIRED flag to mmap(2), which causes the new mapping to be wired as if by mlock(2).	2003-10-07 00:17:09 +00:00
enami	ae9b5cba84	Rewrite uvm_map_findspace() to improve readability and to fix a bug that it may return space already in use as free space under some condition. The symptom of the bug is that exec fails if stack is unlimited on topdown VM kernel.	2003-10-02 00:02:10 +00:00
enami	0ca733e759	Some whitespace fixes.	2003-10-01 23:08:32 +00:00
enami	aa87bee0c5	ansi'fy.	2003-10-01 22:50:15 +00:00
chs	066b5091f4	don't dereference a vm_page pointer after we free the page.	2003-09-26 04:03:39 +00:00
drochner	da03a1c8cf	Fix a reversed logic in swap deallocation which could lead to uvm_swap_free() being called with a zero slot; this might have been the reason for crashes with sysvshm and heavy swapping. (PR kern/22752 by Tom Spindler) Confirmed by Chuck Silvers.	2003-09-18 13:48:05 +00:00
enami	a396bb4713	Swap where the vm map's max and min offset are stored so that they can be used during map traversal.	2003-09-10 13:38:20 +00:00
pk	1d113dcde7	Can't rely on side-effects in KASSERT expressions which was pointed out to me by YAMAMOTO Takashi.	2003-09-01 14:20:57 +00:00
yamt	7f7c9a3509	remove an obsolete comment. (we now have only one inactive list.)	2003-09-01 12:16:17 +00:00
pk	9a4aea0127	When retiring a swap device with marked bad blocks on it we should update the `# swap page in use' and `# swap page only' counters. However, at the time of swap device removal we can no longer figure out how many of the bad swap pages are actually also `swap only' pages. So, on swap I/O errors arrange things to not include the bad swap pages in the `swpgonly' counter as follows: uvm_swap_markbad() decrements `swpgonly' by the number of bad pages, and the various VM object deallocation routines do not decrement `swpgonly' for swap slots marked as SWSLOT_BAD.	2003-08-28 13:12:17 +00:00
yamt	91161caf3c	use VM_PAGE_TO_PHYS macro instead of using phys_addr directly.	2003-08-26 15:12:18 +00:00
chs	12f04351ad	fix some indentation.	2003-08-24 18:12:25 +00:00
chs	939df36e55	add support for non-executable mappings (where the hardware allows this) and make the stack and heap non-executable by default. the changes fall into two basic categories: - pmap and trap-handler changes. these are all MD: = alpha: we already track per-page execute permission with the (software) PG_EXEC bit, so just have the trap handler pay attention to it. = i386: use a new GDT segment for %cs for processes that have no executable mappings above a certain threshold (currently the bottom of the stack). track per-page execute permission with the last unused PTE bit. = powerpc/ibm4xx: just use the hardware exec bit. = powerpc/oea: we already track per-page exec bits, but the hardware only implements non-exec mappings at the segment level. so track the number of executable mappings in each segment and turn on the no-exec segment bit iff the count is 0. adjust the trap handler to deal. = sparc (sun4m): fix our use of the hardware protection bits. fix the trap handler to recognize text faults. = sparc64: split the existing unified TSB into data and instruction TSBs, and only load TTEs into the appropriate TSB(s) for the permissions. fix the trap handler to check for execute permission. = not yet implemented: amd64, hppa, sh5 - changes in all the emulations that put a signal trampoline on the stack. instead, we now put the trampoline into a uvm_aobj and map that into the process separately. originally from openbsd, adapted for netbsd by me.	2003-08-24 17:52:28 +00:00
chs	4ffa07757d	mprotect()'s "len" is really a size_t, and we can't do any useful bounds-checking on it.	2003-08-24 16:32:50 +00:00
pk	d022b5caad	uao_pagein_page() & anon_pagein(): * return failure if the page cannot be retrieved. * wakeup any waiters when releasing a page after successful page in.	2003-08-11 16:54:10 +00:00
pk	96f1796f30	Only deactivate pages if their wired count is zero.	2003-08-11 16:48:05 +00:00
pk	3bef941831	Make sure to call uvm_swap_free() and uvm_swap_markbad() with valid (i.e. positive) slot numbers.	2003-08-11 16:44:35 +00:00
pk	5869d91cb9	Introduce uvm_swapisfull(), which computes the available swap space by taking into account swap devices that are in the process of being removed.	2003-08-11 16:33:30 +00:00
agc	aad01611e7	Move UCB-licensed code from 4-clause to 3-clause licence. Patches provided by Joel Baker in PR 22364, verified by myself.	2003-08-07 16:26:28 +00:00
drochner	9c0942bc88	sync comments with reality	2003-08-02 14:12:51 +00:00
mrg	79eaf7449f	de-__P()ify.	2003-07-21 00:54:43 +00:00
christos	b6dc1230b9	PR/22062: Dheeraj S: Don't compare an integral type with NULL.	2003-07-06 16:19:18 +00:00
fvdl	d5aece61d6	Back out the lwp/ktrace changes. They contained a lot of collateral damage, and need to be examined and discussed more.	2003-06-29 22:28:00 +00:00
thorpej	a06b275edc	Undo part of the ktrace/lwp changes. In particular: * Remove the "lwp *" argument that was added to vget(). Turns out that nothing actually used it! * Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(), and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted above, didn't use it). * Remove all of the "lwp *" arguments to internal functions that were added just to appease the above.	2003-06-29 18:43:21 +00:00
darrenr	960df3c8d1	Pass lwp pointers throughout the kernel, as required, so that the lwpid can be inserted into ktrace records. The general change has been to replace "struct proc *" with "struct lwp *" in various function prototypes, pass the lwp through and use l_proc to get the process pointer when needed. Bump the kernel rev up to 1.6V	2003-06-28 14:20:43 +00:00
christos	40e148ef6b	PR/21948: Todd Vierling: Implement MAP_TRYFIXED for linux emulation.	2003-06-23 21:32:33 +00:00
wiz	efa11218e8	Fix typo in panic message. From miod@openbsd.	2003-06-01 09:26:10 +00:00
simonb	88bd53e829	Consistency nit - use parentheses around return argument.	2003-05-25 13:00:40 +00:00
thorpej	36da248c07	Back out the following change: http://mail-index.netbsd.org/source-changes/2003/05/08/0068.html There were some side-effects that I didn't anticipate, and fixing them is proving to be more difficult than I thought, so just eject for now. Maybe one day we can look at this again. Fixes PR kern/21517.	2003-05-10 21:10:23 +00:00
thorpej	b77900c3c2	Simplify the way the bounds of the managed kernel virtual address space is advertised to UVM by making virtual_avail and virtual_end first-class exported variables by UVM. Machine-dependent code is responsible for initializing them before main() is called. Anything that steals KVA must adjust these variables accordingly. This reduces the number of instances of this info from 3 to 1, and simplifies the pmap(9) interface by removing the pmap_virtual_space() function call, and removing two arguments from pmap_steal_memory(). This also eliminates some kludges such as having to burn kernel_map entries on space used by the kernel and stolen KVA. This also eliminates use of VM_{MIN,MAX}_KERNEL_ADDRESS from MI code, thus giving MD code greater flexibility over the bounds of the managed kernel virtual address space if a given port's specific platforms can vary in this regard (this is especially true of the evb* ports).	2003-05-08 18:13:12 +00:00
gmcgarry	9b44363ff5	Don't use overloaded term "comm". From Greg A. Woods in PR#17394.	2003-05-04 01:54:26 +00:00
wiz	6a8058b52a	Misc fixes from jmc@openbsd.	2003-05-03 19:01:05 +00:00
yamt	4f0cb3e45a	fix ubc pager to take care of loan_count.	2003-05-03 18:05:16 +00:00
yamt	719bd826c5	use uvm_loanbreak in uvm_fault.	2003-05-03 17:57:50 +00:00
yamt	dea6f8bc9a	- export raw page loan out routine as uvm_loanuobjpages. (for nfsd) - put code for loan-breaking into a function, uvm_loanbreak.	2003-05-03 17:54:32 +00:00
tls	85c8cfb533	Correct use of MAXBSIZE where MAXPHYS was intended. This is a necessary first step towards per-device MAXPHYS, and has the beneficial side effect of allowing clustering to MAXPHYS even on systems that need to run with a reduced MAXBSIZE to get more metadata buffers.	2003-04-23 00:55:17 +00:00
yamt	d99d457173	correct accounting of {exec,file}pages. they are not updated correctly when breaking loan.	2003-04-22 14:28:15 +00:00
christos	59833f145c	PR/2931: Eric Beltensen: Move boolean_t and TRUE/FALSE from uvm_param.h to types.h	2003-04-19 21:42:46 +00:00
yamt	f8b7159909	unbusy a page after put it on the queue. fix a panic with UVM_PAGE_TRKOWN when doing swapoff.	2003-04-12 14:36:43 +00:00
thorpej	03befad98b	In uvm_map_clean(), only call pgo_put if the object has one. From Quentin Garnier <quatriemek.com!netbsd>.	2003-04-09 21:39:29 +00:00
thorpej	7360657293	Tweak the way the pagesize-related variables are set: * Remove DEFAULT_PAGE_SIZE. We don't use PAGE_SIZE the way Mach did. * In uvm_setpagesize(), if we are called with uvmexp.pagesize == 0, then assert that PAGE_SIZE != 0 (i.e. a constant), and set uvmexp.pagesize accordingly. * Provide defaults for MIN_PAGE_SIZE and MAX_PAGE_SIZE if not defined by <machine/vmparam.h>. If PAGE_SIZE is not a constant, MIN_PAGE_SIZE and MAX_PAGE_SIZE must be provided. * If MIN_PAGE_SIZE and MAX_PAGE_SIZE are not equal (i.e. PAGE_SIZE may not be a constant in all configurations), then ensure that PAGE_SIZE and friends expand to variable references for LKMs.	2003-04-09 16:34:10 +00:00
matt	7876614463	Nuke mem_size global since nothing in the kernel actually refers to it. (mmm lint).	2003-03-14 08:35:05 +00:00
thorpej	2d5e311009	Make PGALLOC_VERBOSE compile where size_t != int.	2003-03-10 19:52:24 +00:00
thorpej	2a493af5b0	For PMAP_CACHE_VIVT platforms, make UBC_RELEASE_UNMAP evaluate to TRUE, and add a comment explaining why. Reviewed by Chuq Silvers.	2003-03-10 15:07:17 +00:00
tsutsui	004ae00514	Use cpu_number() in UVMHIST_LOG() rather than non-public ci_cpuid member in struct cpu_info.	2003-03-08 15:17:23 +00:00
matt	6074e25e08	Add support for mmap(2) to be able to return memory aligned on a 2^n boundary.	2003-03-06 00:41:51 +00:00
thorpej	72dd57106c	Implement a minimal pager for the uvm_loanzero_object, which simply has a "put" method which reactivates or dequeues the page. Need for pager pointed out by enami tsugutomo.	2003-03-05 01:52:41 +00:00
thorpej	d3f54e81dd	Fix the following pathological scenario: * User allocates ZFOD region, but does not actually touch the buffer to fault in the pages. * In a loop, user writes this buffer to a network socket, triggering sosend_loan(). * uvm_loan() calls uvm_loanzero() once for each page in the loaned region (since the pages have not yet faulted in). This causes a page to be allocated and zero'd. The result is the kernel spends a lot of time allocating and zero'ing pages. This fix creates a special object which owns a single zero'd page. This single zero'd page is used to satisfy all loans of non-resident ZFOD mappings. Thanks to Allen Briggs for discovering the problem and for providing an initial patch.	2003-03-04 06:18:54 +00:00
matt	76dd2c90fa	In uvm_map_space, if the current entry is above the new space use the previous entry. (not if the current entry starts at the end of the new space; that case doesn't take into account if the new space had a specified alignment).	2003-03-02 08:57:49 +00:00
matt	d6729b1f53	When finding an aligned block, we need to truncate in topdown, not roundup.	2003-03-02 02:55:03 +00:00
thorpej	eb14e86676	Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and use it. This fixes a few places where either b_dep or b_interlock were not properly initialized.	2003-02-25 20:35:31 +00:00
simonb	a2bdcc915e	Cast result of pgo_put() to (void) as is the style with other calls to pgo_put() in UVM. Pointed out by Andrew Brown.	2003-02-25 00:22:20 +00:00
pk	2931081a79	Make updating a file's reference and use count MP-safe.	2003-02-23 14:37:32 +00:00
simonb	0b2b1cc0cc	Remove assigned-to but not used variable.	2003-02-23 04:53:51 +00:00
matt	23b48be61f	fix a tpyo in a comment.	2003-02-21 16:38:44 +00:00
atatat	df0a9badc6	Introduce "top down" memory management for mmap()ed allocations. This means that the dynamic linker gets mapped in at the top of available user virtual memory (typically just below the stack), shared libraries get mapped downwards from that point, and calls to mmap() that don't specify a preferred address will get mapped in below those. This means that the heap and the mmap()ed allocations will grow towards each other, allowing one or the other to grow larger than before. Previously, the heap was limited to MAXDSIZ by the placement of the dynamic linker (and the process's rlimits) and the space available to mmap was hobbled by this reservation. This is currently only enabled via an option for the i386 platform (though other platforms are expected to follow). Add "options USE_TOPDOWN_VM" to your kernel config file, rerun config, and rebuild your kernel to take advantage of this. Note that the pmap_prefer() interface has not yet been modified to play nicely with this, so those platforms require a bit more work (most notably the sparc) before they can use this new memory arrangement. This change also introduces a VM_DEFAULT_ADDRESS() macro that picks the appropriate default address based on the size of the allocation or the size of the process's text segment accordingly. Several drivers and the SYSV SHM address assignment were changed to use this instead of each one picking their own "default".	2003-02-20 22:16:05 +00:00
perseant	b397c875ae	Add code to UBCify LFS. This is still behind "#ifdef LFS_UBC" for now (there are still some details to work out) but expect that to go away soon. To support these basic changes (creation of lfs_putpages, lfs_gop_write, mods to lfs_balloc) several other changes were made, to wit: * Create a writer daemon kernel thread whose purpose is to handle page writes for the pagedaemon, but which also takes over some of the functions of lfs_check(). This thread is started the first time an LFS is mounted. * Add a "flags" parameter to GOP_SIZE. Current values are GOP_SIZE_READ, meaning that the call should return the size of the in-core version of the file, and GOP_SIZE_WRITE, meaning that it should return the on-disk size. One of GOP_SIZE_READ or GOP_SIZE_WRITE must be specified. * Instead of using malloc(...M_WAITOK) for everything, reserve enough resources to get by and use malloc(...M_NOWAIT), using the reserves if necessary. Use the pool subsystem for structures small enough that this is feasible. This also obsoletes LFS_THROTTLE. And a few that are not strictly necessary: * Moves the LFS inode extensions off onto a separately allocated structure; getting closer to LFS as an LKM. "Welcome to 1.6O." * Unified GOP_ALLOC between FFS and LFS. * Update LFS copyright headers to correct values. * Actually cast to unsigned in lfs_shellsort, like the comment says. * Keep track of which segments were empty before the previous checkpoint; any segments that pass two checkpoints both dirty and empty can be summarily cleaned. Do this. Right now lfs_segclean still works, but this should be turned into an effectless compatibility syscall.	2003-02-17 23:48:08 +00:00
atatat	a57bcda26a	Rework the way in which the map is traversed when dumping core. Now we read-lock the map and call uvm_map_lookup_entry() instead of simply walking from the header to the next and to the next, etc. Dumping from sparsely populated amaps could cause faults that would result in amaps being split, which (in turn) resulted in the core dumping routines dumping some regions of memory twice. This makes the core file too large, the headers not match, gdb not work properly, and so on. Addresses PR 19260.	2003-02-14 16:25:12 +00:00
pk	ff65229410	Include CPU number in UVM history logs.	2003-02-09 22:33:18 +00:00
pk	c7cbbfeead	uvm_fault: case 1B: lock page queue before calling uvm_pageactivate().	2003-02-09 22:32:21 +00:00
pk	9d4b10800c	uao_put: release uvm object's lock only after we're done with its page list.	2003-02-09 22:28:40 +00:00
pk	338f31f581	Make the buffer cache code MP-safe.	2003-02-05 21:38:38 +00:00
thorpej	b193480908	Add extensible malloc types, adapted from FreeBSD. This turns malloc types into a structure, a pointer to which is passed around, instead of an int constant. Allow the limit to be adjusted when the malloc type is defined, or with a function call, as suggested by Jonathan Stone.	2003-02-01 06:23:35 +00:00
pk	ac1bea60c1	amap_copy: remove stray amap_unlock().	2003-01-27 22:14:48 +00:00
enami	c3d0a7a93b	uvm_page_unbusy should skip PGO_DONTCARE page; e.g., locked pgo_getpages request may contain PGO_DONTCARE and nfs_getpages may unbusy them on error. Fix is provided in PR#20028 by YAMAMOTO Takashi. (and same one was approved by chuq a while ago in private mail). It was my fault to forget to commit.	2003-01-27 02:10:20 +00:00
yamt	41ad61ee76	make KSTACK_CHECK_* compile after sa merge.	2003-01-22 12:52:14 +00:00
christos	5c729d909f	finally: step 5: disable a KASSERT() if we are doing_shutdown. now sync from ddb should work as badly as before the nathanw_sa merge.	2003-01-21 00:03:07 +00:00
thorpej	b78f59b443	Merge the nathanw_sa branch.	2003-01-18 08:51:40 +00:00
atatat	84a6247a30	Properly set page references counts at the start of the newly allocated ppref data to zero in the case of an amap that has empty space at the front. Don't set anything in the ppref array if "len" is zero. Many thanks to Sami Kantoluoto for providing gdb access to a machine that would reliably crash with problems related to the above, and to Stephan Thesing for corroborating that the patch properly addressed the problem. Note that the ar_pageoff (and related variables) types must be changed soon. The use of "int" here is not theoretically sufficient.	2002-12-20 18:21:13 +00:00
thorpej	130e5c278b	UVM_KMF_NOWAIT -> UVM_FLAG_NOWAIT	2002-12-11 07:14:28 +00:00
thorpej	8ae922d8a7	Define a UVM_FLAG_NOWAIT, which indicates that we're not allowed to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT. From Manuel Bouyer. Fixes a problem where any mapping with read protection was created in a "nowait" context, causing spurious failures.	2002-12-11 07:10:20 +00:00
matt	00ed0b8fb8	Reorder things so that with multiple inclusion protection that optional definitions are outside the protection checks.	2002-12-01 22:58:43 +00:00
bouyer	d986226518	Change uvm_km_kmemalloc() to accept flag UVM_KMF_NOWAIT and pass it to uvm_map(). Change uvm_map() to honor UVM_KMF_NOWAIT. For this, change amap_extend() to take a flags parameter instead of just boolean for direction, and introduce AMAP_EXTEND_FORWARDS and AMAP_EXTEND_NOWAIT flags (AMAP_EXTEND_BACKWARDS is still defined as 0x0, to keep the code easier to read). Add a flag parameter to uvm_mapent_alloc(). This solves a problem where a pool_get(PR_NOWAIT) could trigger a pool_get(PR_WAITOK) in uvm_mapent_alloc(). Thanks to Chuck Silvers, enami tsugutomo, Andrew Brown and Jason R Thorpe for feedback.	2002-11-30 18:28:04 +00:00
lukem	0635de35a3	Remove KDIR=, since SYS_INCLUDE=symlinks and KDIR are not supported any more.	2002-11-26 23:30:07 +00:00
scw	e591e98c92	Quell uninitialised variable warnings.	2002-11-24 11:50:32 +00:00
chs	4b2625143d	change uvm_uarea_alloc() to indicate whether the returned uarea is already backed by physical pages (ie. because it reused a previously-freed one), so that we can skip a bunch of useless work in that case. this fixes the underlying problem behind PR 18543, and also speeds up fork() quite a bit (eg. 7% on my pc, 1% on my ultra2) when we get a cache hit.	2002-11-17 08:32:43 +00:00
atatat	966f9caaed	Properly free "newppref", instead of "amap->am_ppref" (oops), and delay freeing the old am_ppref so that if we bail early due to malloc() failures, valid ppref data hasn't been freed for no reason. Based on comments from enami.	2002-11-15 17:30:35 +00:00
atatat	42c2fe641b	Implement backwards extension of amaps. There are three cases to deal with: Case #1 -- adjust offset: The slot offset in the aref can be decremented to cover the required size addition. Case #2 -- move pages and adjust offset: The slot offset is not large enough, but the amap contains enough inactive space after the mapped pages to make up the difference, so active slots are slid to the "end" of the amap, and the slot offset is, again, adjusted to cover the required size addition. This optimizes for hitting case #1 again on the next small extension. Case #3 -- reallocate, move pages, and adjust offset: There is not enough inactive space in the amap, so the arrays are reallocated, and the active pages are copied again to the "end" of the amap, and the slot offset is adjusted to cover the required size. This also optimizes for hitting case #1 on the next backwards extension. This provides the missing piece in the "forward extension of vm_map_entries" logic, so the merge failure counters have been removed. Not many applications will make any use of this at this time (except for jvms and perhaps gcc3), but a "top-down" memory allocator will use it extensively.	2002-11-14 17:58:48 +00:00
thorpej	ff114c4a59	Fix signed/unsigned comparison warnings.	2002-11-09 20:06:07 +00:00
enami	b7ac697dae	s/than than/than/.	2002-11-08 02:05:16 +00:00
perry	fbf4988104	gah. reversed a test.	2002-11-02 16:50:18 +00:00
perry	bbad42171f	/*CONSTCOND*/ while (0)'ed macros	2002-11-02 07:40:47 +00:00
perry	e6873029ee	/*CONSTCOND*/	2002-11-02 07:38:42 +00:00
perry	dd07fed86d	/*CONSTCOND*/, and protect UVMHIST_DECL with #ifdef UVMHIST	2002-11-02 07:37:14 +00:00
yamt	3a7bfaf54e	change "uoff" to voff_t from vaddr_t as it's offset within uvm object. fix PR/18855.	2002-10-30 05:24:33 +00:00
simonb	a426ccb272	Fix whitespace bogon.	2002-10-30 02:48:28 +00:00
chs	e60ad901b2	examine the B_ERROR flag instead of the b_error field to determine whether or not an error has occurred. pointed out by Stephan Uphoff.	2002-10-27 16:53:20 +00:00
atatat	68277bb301	In the case of a double amap_extend() (during a forward merge after a back merge), don't abort the allocation if the second extend fails, just abort the forward merge and finish the allocation. Code reviewed by thorpej.	2002-10-24 22:22:28 +00:00
atatat	2d6863ada3	Call amap_extend() a second time in the case of a bimerge (both backwards and forwards) if the previous entry was backed by an amap. Fixes pr kern/18789, where netscape 7 + a java applet actually manage to incur forward and bimerges in userspace. Code reviewed by fvdl and thorpej.	2002-10-24 20:37:59 +00:00
jdolecek	e0cc03a09b	merge kqueue branch into -current kqueue provides a stateful and efficient event notification framework currently supported events include socket, file, directory, fifo, pipe, tty and device changes, and monitoring of processes and signals kqueue is supported by all writable filesystems in NetBSD tree (with exception of Coda) and all device drivers supporting poll(2) based on work done by Jonathan Lemon for FreeBSD initial NetBSD port done by Luke Mewburn and Jason Thorpe	2002-10-23 09:10:23 +00:00
atatat	94ef8e0795	Add an implementation of forward merging of new map entries. Most new allocations can be merged either forwards or backwards, meaning no new entries will be added to the list, and some can even be merged in both directions, resulting in a surplus entry. This code typically reduces the number of map entries in the kernel_map by an order of magnitude or more. It also makes possible recovery from the pathological case of "5000 processes created and then killed", which leaves behind a large number of map entries. The only forward merge case not covered is the instance of an amap that has to be extended backwards (WIP). Note that this only affects processes, not the kernel (the kernel doesn't use amaps), and that merge opportunities like this come up very rarely, if at all. Eg, after being up for eight days, I see only three failures in this regard, and even those are most likely due to programs I'm developing to exercise this case. Code reviewed by thorpej, matt, christos, mrg, chuq, chuck, perry, tls, and probably others. I'd like to thank my mother, the Hollywood Foreign Press...	2002-10-18 13:18:42 +00:00
oster	7eac5bf44e	Garbage collect some leftover (and unneeded) code. OK'ed by chs.	2002-10-05 17:26:06 +00:00
chs	b1e734dd18	uao_find_swslot()'s second argument is in units of pages, not bytes. spotted by Doug Donsbach.	2002-10-01 07:52:30 +00:00
mycroft	3c7847ff41	#if 0 the call to uvm_map_checkprot() in sys_munmap() -- it's not documented, and programs do not expect it. Also fixes memory leaks in dlopen()/dlclose().	2002-09-27 19:13:29 +00:00
provos	0f09ed48a5	remove trailing \n in panic(). approved perry.	2002-09-27 15:35:29 +00:00
chs	94a62d45d6	add a new flag VM_MAP_DYING, which is set before we start tearing down a vm_map. use this to skip the pmap_update() at the end of all the removes, which allows pmaps to optimize pmap tear-down. also, use the new pmap_remove_all() hook to let the pmap implementation know what we're up to.	2002-09-22 07:21:29 +00:00
chs	2b73cf7ece	encapsulate knowledge of uarea allocation in some new functions.	2002-09-22 07:20:29 +00:00
chs	55e1f79335	add pmap_remove_all() hook (empty on most platforms so far).	2002-09-22 07:17:08 +00:00
chs	208b369512	add missing anon lock around call to uvm_anon_lockloanpg().	2002-09-21 06:16:07 +00:00
chs	9672ac098f	add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to return failure if swap is full and there are no free physical pages. have malloc() use this flag if M_CANFAIL is passed to it. use M_CANFAIL to allow amap_extend() to fail when memory is scarce. this should prevent most of the remaining hangs in low-memory situations.	2002-09-15 16:54:26 +00:00
thorpej	3479cf6ba9	Protect "struct uvm" with _KERNEL.	2002-09-15 01:01:32 +00:00
gehenna	77a6b82b27	Merge the gehenna-devsw branch into the trunk. This merge changes the device switch tables from static array to dynamically generated by config(8). - All device switches are defined as a constant structure in device drivers. - The new grammar ``device-major'' is introduced to ``files''. device-major <prefix> char <num> [block <num>] [<rules>] - All device major numbers must be listed up in port dependent majors.<arch> by using this grammar. - Added the new naming convention. The name of the device switch must be <prefix>_[bc]devsw for auto-generation of device switch tables. - The backward compatibility of loading block/character device switch by LKM framework is broken. This is necessary to convert from block/character device major to device name in runtime and vice versa. - The restriction to assign device major by LKM is completely removed. We don't need to reserve LKM entries for dynamic loading of device switch. - In compile time, device major numbers list is packed into the kernel and the LKM framework will refer it to assign device major number dynamically.	2002-09-06 13:18:43 +00:00
thorpej	ae8d1b60df	When breaking an loan due to a page fault, check to see if the other kind of reference-holder (anon or object) is referencing the page. If not, then the page must be removed from the pageq's. Reviewed by Chuck Silvers.	2002-09-02 21:09:50 +00:00
drochner	77944bfa08	call cpu_dumpconf() after dumpdev change, so that the global dumpsize/dumplo get updated	2002-08-31 17:07:59 +00:00
chs	d510857ed4	be sure that the page we allocate to break a loan is put on a paging queue. fixes PR 18037.	2002-08-29 05:03:30 +00:00
matt	78581fe411	In amap_ref, only increment the amap's refcnt after we have established the ppref array. Otherwise, the newly ref'ed pages will be doubly counted and thus never freed because the pprefcnt can't fall to 0.	2002-08-22 23:39:37 +00:00
thorpej	95cb683cfb	Don't pass VM_PROT_EXEC to pmap_kenter_pa().	2002-08-14 15:21:31 +00:00
chs	ebe4c850ef	allocate the bufq after zeroing the swapdev structure, not before.	2002-07-27 14:37:00 +00:00
hannken	7de36862a8	Rename bufq_init() to bufq_alloc(). Add bufq_free() to remove a buffer queue. Avoid MALLOC while holding a spinlock. From Chuck Silvers.	2002-07-21 15:32:17 +00:00
hannken	d4c062b4cc	Convert to new device buffer queue interface.	2002-07-19 16:26:01 +00:00
chs	b1b5af79c2	when dropping a kernel loan, if this was the last loan-to-kernel but the page is still loaned to an anon, we should put the page back on a paging queue. this is because while pages loaned to the kernel really do need to stay resident (since the kernel is accessing the physical memory directly), pages loaned to anons can be paged out just fine. (the page will be paged out twice, first to the object and then again to the anon, but after that the page can be reused.)	2002-07-14 23:53:41 +00:00
yamt	d96bff0e27	add KSTACK_CHECK_MAGIC. discussed on tech-kern.	2002-07-02 20:27:44 +00:00
chs	cfefc92864	rearrange a few lines to appease an assertion.	2002-06-29 18:27:30 +00:00
drochner	e14af78731	Big cleanup and speed improvements to pglist_alloc code: -pass vm_physseg* instead of physseg index, and PFN (int) instead of physical address (could be done even more) -simplify detection of boundary crossing and behave more intelligently in this case -take stuff out of the inner loops, or put into "#ifdef DEBUG" (because we move along physsegs we don't need to check that the pages are physically contiguous) -make the "simple" and "contiguous" branches look more uniform; at least the outer loops might coalesce one day	2002-06-27 18:05:29 +00:00
chs	faab7dbb46	count aobj pages (most notably kernel stack pages) as anon pages for memory usage-balancing purposes.	2002-06-20 15:05:29 +00:00
enami	d76e8e25bc	Shift by PAGE_SHIFT instead of dividing by PAGE_SIZE.	2002-06-20 08:24:22 +00:00
wrstuden	2eb3dc82d8	Fix recent bugs seen on Performa 4400 macppc's by Makoto Fujiwara <makoto@ki.nu> and Manuel Bouyer <bouyer@netbsd.org>. Help from Allen Briggs, Jason Thorpe, and Matt Thomas. We need to call cpu_cache_probe() early in boot (machdep.c). Add 603 info for completeness, and use NBPG not PAGESIZE, as the latter relies on uvm being setup (cpu_subr.c). Let uvm_page_recolor() be called before uvm has been set up; just note the page coloring value (uvm_page.c).	2002-06-19 17:01:18 +00:00
drochner	cb01228cf4	Make the DMA memory allocators (uvm_pglistalloc()) obey the preferences expressed by freelist assignment, to avoid wasting valuable "low memory" to devices which don't really need it. comments: -I'm not sure searching the physsegs within a freelist beginning with the biggest is the right thing. This is what the "memory steal" code in uvm_page.c does, so keep it consistent. -There seems to be some confusion whether the upper address limit passed is inclusive or not. Stays on the safe side, possibly leaving one page out. -The boundary/pagemask check can be simplified, also some arguments passed are only used for diagnostic checks. -Integration with UVM_PAGE_TRKOWN???	2002-06-18 15:49:48 +00:00
drochner	d2b9876081	move initialization of the "struct pglist" returned by uvm_pglistalloc() from the calling code into uvm_pglistalloc() itself for consistency and easier error handling	2002-06-02 14:44:35 +00:00
atatat	6c03c181d2	"offest" -> "offset" in a comment	2002-05-31 16:49:50 +00:00
drochner	f452b252a8	Add another allocator to uvm_pglistalloc() which is used in the case where no alignment / boundary / nsegs restrictions apply. This one doesn't insist on a contiguous range, and it honours the "waitok" flag, thus succeeds in situations which were hopeless with the existing one. (A solution which searches for a minimum number of contiguous ranges using some best-fit or so algorithm would be expensive to implement; I believe the "either-or" done here does reflect the current use by bus_dma quite well.) Now agp memory allocation is robust for me. (tested on i810)	2002-05-29 19:20:11 +00:00
enami	9e1deeab34	Add missing pageq lock while uvm_pagefree() is called (either directly or indirectly). Reviewed by chuq.	2002-05-29 11:04:39 +00:00
enami	2afb4efc4c	Make uvn_findpages to return number of pages found so that caller can easily check if all requested pages are found or not.	2002-05-17 22:00:50 +00:00
matt	357945ce6f	When core dumping a process, don't dump maps backed up by the device pager. (move the pagerops externs to uvm_object.h and out the C files).	2002-05-15 06:57:49 +00:00
enami	694a0fec54	When loaned page become ownerless as a result of freeing, it should be dequeue'ed from pageq. Fix provided by chuq.	2002-05-15 00:19:12 +00:00
fredette	c857df5775	When preparing to swap to a miniroot partition, add a little padding to our estimate of the miniroot's size, to avoid overwriting it.	2002-05-09 21:43:44 +00:00
enami	fabaf9a730	- In genfs_putpages(), no need to restrict the cluster within the given region. - In uvm_aio_aiodone(), remove assertions no longer true.	2002-05-09 07:14:37 +00:00
enami	6ceef3fc14	In uao_put(), if we wait for the busy page owned by someone else, we can't simply reuse the pointer to the page. Instead, we need to acquire it again. So, rearrange the loop like genfs_putpages() does. Reviewed by chuq.	2002-05-09 07:04:23 +00:00
enami	c4e1385f55	Fetch the right page from a file even if it is mapped from middle of it. This makes `tail -<N> <FILE> | cat > file' correctly, where <FILE> is a regular file larger than 10Mbytes (makes tail to map part of file) and <N> is big enough to produce output larger than 8kbytes (makes pipe to use page loan facility). Problem reported by FUKAUMI Naoki on japanese local mailing list.	2002-05-07 02:29:52 +00:00
chs	988df8394c	look in the right flags field for PQ_INACTIVE. make uvmpd_scan_inactive() return void since its return value is ignored.	2002-05-05 16:26:17 +00:00
thorpej	338e636672	Allow pmap_copy_page() and pmap_zero_page() to be #define'd in <machine/pmap.h>.	2002-04-10 00:40:45 +00:00
manu	8645636b04	Updated comment to reflect the creation of uvm_swap_stats()	2002-04-01 12:24:11 +00:00
nathanw	a1be32226e	In amap_pp_adjref(), avoid incorrectly merging the first two chunks in a ppref array when the range being adjusted includes the beginning of the array.	2002-03-28 06:06:29 +00:00
manu	da6e8ccbe8	Don't allocate struct swapent when we only need a struct oswapent.	2002-03-26 11:50:26 +00:00
chs	f80ed5892c	remove PGO_WEAK, it isn't needed anymore.	2002-03-25 02:08:09 +00:00
chs	76cacb8710	when processing PG_RDONLY, mask off VM_PROT_WRITE instead of hard-wiring VM_PROT_READ (since we might have VM_PROT_EXEC too). this fixes problems running binaries out of NFS on macppc. yet another fix courtesy of enami.	2002-03-25 01:56:48 +00:00
darrenr	256089809f	Return EFBIG from mmap() if we try to map too much data and in the fixed address allocation, return EOVERFLOW to match with the non-fixed error.	2002-03-22 11:06:33 +00:00
manu	2debbde786	Move swapctl(SWAP_STATS) implementation to a separate function called uvm_swap_stats(). This is done in order to allow COMPAT_* swapctl() emulation to use it directly without going through sys_swapctl(). The problem with using sys_swapctl() there is that it involves copying the swapent array to the stackgap, and this array's size is not known at build time. Hence it would not be possible to ensure it would fit in the stackgap in any case.	2002-03-18 11:43:01 +00:00
thorpej	1f6482dd1e	Remove PR_MALLOCOK.	2002-03-09 07:28:20 +00:00
chs	87185156fd	a vm_prot_t is a bit-mask, fix an assertion which was treating one more like an enumerated type.	2002-03-09 04:29:03 +00:00
thorpej	a180cee23b	Pool deals fairly well with physical memory shortage, but it doesn't deal with shortages of the VM maps where the backing pages are mapped (usually kmem_map). Try to deal with this: * Group all information about the backend allocator for a pool in a separate structure. The pool references this structure, rather than the individual fields. * Change the pool_init() API accordingly, and adjust all callers. * Link all pools using the same backend allocator on a list. * The backend allocator is responsible for waiting for physical memory to become available, but will still fail if it cannot allocate KVA space for the pages. If this happens, carefully drain all pools using the same backend allocator, so that some KVA space can be freed. * Change pool_reclaim() to indicate if it actually succeeded in freeing some pages, and use that information to make draining easier and more efficient. * Get rid of PR_URGENT. There was only one use of it, and it could be dealt with by the caller. From art@openbsd.org.	2002-03-08 20:48:27 +00:00
thorpej	26b2d2217b	If the bootstrapping process didn't actually use any KVA space, don't reserve size of 0 in kernel_map. From OpenBSD.	2002-03-07 20:15:32 +00:00
simonb	c7339f8919	Include <sys/kernel.h> if UVMHIST is defined - the "cold" variable is used in the UVMHIST_LOG macro. Breakage reported by Chuck Silvers in private mail.	2002-03-05 05:45:54 +00:00
simonb	64c7743a05	Don't "extern int cold;" - this is in <sys/kernel.h>.	2002-03-04 02:19:07 +00:00
christos	894ca870b3	use the <sys/conf.h> macro to get the mmap footprint.	2002-02-28 21:00:23 +00:00
chs	2cce3ebcba	honor the PG_RDONLY flag (so that NFS can clear the PG_NEEDCOMMIT flag when page with it set is modified again). fixes PR 15733.	2002-02-27 16:02:03 +00:00
chs	811c8fad2b	in amap_pp_adjref(), avoid unnecessary fragmentation of the am_ppref array by merging the first changed chunk with the last unchanged chunk if possible.	2002-02-25 00:39:16 +00:00
enami	bb41d19bca	In the function uvm_page_own(), clear owner_tag after assertion so that we can see the owner when assertion failed. Some indentation fix while I'm here.	2002-02-20 07:06:56 +00:00
simonb	fbaba2a978	Add a space after a comma in a few places (KNF).	2002-02-15 17:45:05 +00:00
wiz	b36c0a5406	deamon -> daemon	2002-01-21 14:42:26 +00:00
chs	b263a7eb4d	add a new flag PMAP_CACHE_VIVT for the pmap to inform the MI code that the cache is virtually-indexed and virtually-tagged (such as on the ARM), and use this flag in the UBC code to be more friendly to those caches.	2002-01-19 16:55:20 +00:00
chs	e9a82c88ce	in uvm_fault_unwire_locked(), if we find that a pmap entry is missing, just skip that page. this situation can arise legitimately when a file with a wired mapping is truncated so that a wired page is no longer part of the file.	2002-01-02 01:10:36 +00:00
chs	a7ec5b4144	redo part of the last commit.	2002-01-01 22:18:39 +00:00
chs	43973be0c5	introduce a new UVM fault type, VM_FAULT_WIREMAX. this is different from VM_FAULT_WIRE in that when the pages being wired are faulted in, the simulated fault is at the maximum protection allowed for the mapping instead of the current protection. use this in uvm_map_pageable{,_all}() to fix the problem where writing via ptrace() to shared libraries that are also mapped with wired mappings in another process causes a diagnostic panic when the wired mapping is removed. this is a really obscure problem so it deserves some more explanation. ptrace() writing to another process ends up down in uvm_map_extract(), which for MAP_PRIVATE mappings (such as shared libraries) will cause the amap to be copied or created. then the amap is made shared (ie. the AMAP_SHARED flag is set) between the kernel and the ptrace()d process so that the kernel can modify pages in the amap and have the ptrace()d process see the changes. then when the page being modified is actually faulted on, the object pages (from the shared library vnode) is copied to a new anon page and inserted into the shared amap. to make all the processes sharing the amap actually see the new anon page instead of the vnode page that was there before, we need to invalidate all the pmap-level mappings of the vnode page in the pmaps of the processes sharing the amap, but we don't have a good way of doing this. the amap doesn't keep track of the vm_maps which map it. so all we can do at this point is to remove all the mappings of the page with pmap_page_protect(), but this has the unfortunate side-effect of removing wired mappings as well. removing wired mappings with pmap_page_protect() is a legitimate operation, it can happen when a file with a wired mapping is truncated. so the pmap has no way of knowing whether a request to remove a wired mapping is normal or when it's due to this weird situation. so the pmap has to remove the weird mapping. the process being ptrace()d goes away and life continues. 
then, much later when we go to unwire or remove the wired vm_map mapping, we discover that the pmap mapping has been removed when it should still be there, and we panic. so where did we go wrong? the problem is that we don't have any way to update just the pmap mappings that need to be updated in this scenario. we could invent a mechanism to do this, but that is much more complicated than this change and it doesn't seem like the right way to go in the long run either. the real underlying problem here is that wired pmap mappings just aren't a good concept. one of the original properties of the pmap design was supposed to be that all the information in the pmap could be thrown away at any time and the VM system could regenerate it all through fault processing, but wired pmap mappings don't allow that. a better design for UVM would not require wired pmap mappings, and Chuck C. and I are talking about this, but it won't be done anytime soon, so this change will do for now. this change has the effect of causing MAP_PRIVATE mappings to be copied to anonymous memory when they are mlock()d, so that uvm_fault() doesn't need to copy these pages later when called from ptrace(), thus avoiding the call to pmap_page_protect() and the panic that results from this when the mlock()d region is unlocked or freed. note that this change doesn't help the case where the wired mapping is MAP_SHARED. discussed at great length with Chuck Cranor. fixes PRs 10363, 12554, 12604, 13041, 13487, 14580 and 14853.	2001-12-31 22:34:39 +00:00
chs	23c75a9a98	in uvm_map_clean(), add PGO_CLEANIT to the flags passed to an object's pager. we need to make sure that vnode pages are written to disk at least once, otherwise processes could gain access to whatever data was previously stored in disk blocks which are freshly allocated to a file.	2001-12-31 20:34:01 +00:00
chs	ef57a67ca1	fix locking for loaning. in general we should be looking at the page's uobject and uanon pointers rather than at the PQ_ANON flag to determine which lock to hold, since PQ_ANON can be clear even when the anon's lock is the one which we should hold (if the page was loaned from an object and then freed by the object).	2001-12-31 19:21:36 +00:00
chs	4d069e8517	in uvm_vnp_setsize(), wait for any i/o in progress on pages that we free.	2001-12-31 07:00:15 +00:00
enami	d3efa85632	G/C no longer used saved credential for file i/o.	2001-12-16 04:51:34 +00:00
chs	4923ddfdda	in sys_mincore(), check the return value of uvm_vslock() to determine if the vec pointer is valid rather than using uvm_useracc(). uvm_useracc() just tells you if the permissions of a user mapping allow the desired access, not whether faulting on that mapping will succeed.	2001-12-14 04:21:22 +00:00
thorpej	06920aef28	Move the code that walks the process's VM map during a coredump into uvm_coredump_walkmap(), and use callbacks into the coredump routine to do something with each section.	2001-12-10 01:52:26 +00:00
chs	8e9cdbbd63	replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names.	2001-12-09 03:07:43 +00:00
chs	849c9b2bfd	add {anon,file,exec}max as a upper bound on the amount of memory that will be allocated for the respective usage types when there is contention for memory. replace "vnode" and "vtext" with "file" and "exec" in uvmexp field names and sysctl names.	2001-12-09 03:07:19 +00:00
thorpej	205c159f0e	Make the coredump routine exec-format/emulation specific. Split out traditional NetBSD coredump routines into core_netbsd.c and netbsd32_core.c (for COMPAT_NETBSD32).	2001-12-08 00:35:25 +00:00
enami	76858f7620	When initially allocating or extending arrays in struct uvm_amap, adjust allocation size using malloc_roundup(). This eliminates many unnecessary malloc/memcpy calls.	2001-12-05 01:33:09 +00:00
enami	fbfa7f8e61	No need to zero clear after amap->am_bckptr[amap->am_nslot], since we're clearing corresponding elements in an array amap->am_anon[].	2001-12-05 00:34:05 +00:00
chuck	00168f4ce0	fix bug in amap_wiperange() detected by enami tsugutomo. loop control was wrong in one case.	2001-12-01 22:11:13 +00:00
chs	1b8f294146	disallow mapping negative offsets for both regular files and block devices.	2001-11-25 06:42:47 +00:00
enami	b55b4c7df5	Zero clear an array of vm_page * before passing it to VOP_GETPAGES().	2001-11-19 02:44:27 +00:00
lukem	b616d1ca1d	add RCSIDs, and in some cases, slightly cleanup #include order	2001-11-10 07:36:59 +00:00
chs	1d7213c91a	only acquire the lock for swpgonly if we actually need to adjust it.	2001-11-07 14:07:23 +00:00
chs	ac48df1681	only acquire the lock for swpgonly if we actually need to adjust it.	2001-11-07 08:43:32 +00:00
chs	2ed88fe090	several changes prompted by loaning problems: - fix the loaned case in uvm_pagefree(). - redo uvmexp.swpgonly accounting to work with page loaning. add an assertion before each place we adjust uvmexp.swpgonly. - fix uvm_km_pgremove() to always free any swap space associated with the range being removed. - get rid of UVM_LOAN_WIRED flag. instead, we just make sure that pages loaned to the kernel are never on the page queues. this allows us to assert that pages are not loaned and wired at the same time. - add yet more assertions.	2001-11-06 08:07:49 +00:00
simonb	82649768b7	Change some unsigned int variables and parameters to plain ints so that all usages of those agree on unsigned vs. signed.	2001-11-06 06:31:06 +00:00
simonb	819bb532e6	Remove some variables that are set but never used.	2001-11-06 06:28:22 +00:00
chs	6e1dd2fa31	add an assert and rename some variables.	2001-11-06 05:44:25 +00:00
chs	d8cbdbb0da	in uvm_exit(), don't bother to unwire the uarea before we free it, the pages will be freed anyway.	2001-11-06 05:34:42 +00:00
chs	07d2ec83fe	don't call pmap_copy() from uvmspace_fork(). a new process is very likely to call execve() immediately after fork(), so most of the time copying the pmap mappings is wasted effort.	2001-11-06 05:27:17 +00:00
chs	550caf0ce3	allow SWAP_GETDUMPDEV for all users. use {LIST,TAILQ}_FOREACH where appropriate.	2001-11-01 03:49:30 +00:00
thorpej	f67e15c839	uvm_map_protect(): Don't allow VM_PROT_EXECUTE to be set on entries (either the current protection or the max protection) that reference vnodes associated with a file system mounted with the NOEXEC option. uvm_mmap(): Don't allow PROT_EXEC mappings to be established of vnodes which are associated with a file system mounted with the NOEXEC option.	2001-10-30 19:05:26 +00:00
thorpej	a2cd7623d4	Correct a comment.	2001-10-30 18:52:17 +00:00
thorpej	e8ee04475d	- Add a new vnode flag VEXECMAP, which indicates that a vnode has executable mappings. Stop overloading VTEXT for this purpose (VTEXT also has another meaning). - Rename vn_marktext() to vn_markexec(), and use it when executable mappings of a vnode are established. - In places where we want to set VTEXT, set it in v_flag directly, rather than making a function call to do this (it no longer makes sense to use a function call, since we no longer overload VTEXT with VEXECMAP's meaning). VEXECMAP suggested by Chuq Silvers.	2001-10-30 15:32:01 +00:00
thorpej	7285b2c290	uvm_mmap(): If a vnode mapping is established with PROT_EXEC, mark the vnode as VTEXT. uvm_map_protect(): When VM_PROT_EXECUTE is added to a VA range, mark all the vnodes mapped by the range as VTEXT.	2001-10-29 23:06:03 +00:00
chs	dcd9e4a1ee	add some missing spinlocks.	2001-10-21 00:04:42 +00:00
chs	4b887dad17	it is with great chagrin that I must fix yet another 64-bit math bug.	2001-10-16 05:56:23 +00:00
chs	1c97701b8b	fix an uninitialized-variable problem in an error case. pointed out by Simon Burge.	2001-10-15 00:37:51 +00:00
christos	7e19baba28	protect against traditional macro expansion.	2001-10-03 13:32:23 +00:00
chs	3aea6d69ad	skip the MADV_SEQUENTIAL processing if we refault. fixes PR 14060.	2001-10-03 05:17:58 +00:00
chs	0c3dfee2f8	skip the swap-out code if there's no swap space configured. avoid some hangs in low-memory situations.	2001-09-30 02:57:34 +00:00
chs	80373b7e54	don't depend on other headers to include sys/proc.h for us.	2001-09-28 11:59:51 +00:00
chs	365f4c4313	change the names of the arguments to uvn_put() to match their usage.	2001-09-26 07:23:51 +00:00
chs	e37c6bf037	move call to pool_drain() outside the pageq lock.	2001-09-26 07:08:41 +00:00
chs	a467bddfdc	bump the rusage counter for "swaps" when we swap out a process. addresses PR 6170.	2001-09-23 07:10:08 +00:00
chs	2adcba997b	make pmap_resident_count() non-optional.	2001-09-23 06:35:30 +00:00
sommerfeld	cc8633edd3	VOP_PUTPAGES must release the uobj's lock for us, so ensure it's locked beforehand and unlocked afterwards using LOCK_ASSERT().	2001-09-22 22:33:16 +00:00
jdolecek	8573719e3d	add new UVM_LOAN_WIRED flag - the memory pages loaned in TOPAGE case are only wired if this flag is present (i.e. they are not wired by default now) loaned pages are unloaned via new uvm_unloan(), uvm_unloananon() and uvm_unloanpage() are no longer exported adjust uvm_unloanpage() to unwire the pages if UVM_LOAN_WIRED is specified mark uvm_loanuobj() and uvm_loanzero() static also in function implementation kern/sys_pipe.c: uvm_unloanpage() --> uvm_unloan()	2001-09-22 05:58:04 +00:00
chs	a548bfb584	add an assert.	2001-09-21 07:57:35 +00:00
chs	20a658f0ab	work around swap-space/extent performance problem which causes long pauses when processes with lots of swapped-out pages exit.	2001-09-19 03:41:46 +00:00
chs	64c6d1d2dc	a whole bunch of changes to improve performance and robustness under load: - remove special treatment of pager_map mappings in pmaps. this is required now, since I've removed the globals that expose the address range. pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's no longer any need to special-case it. - eliminate struct uvm_vnode by moving its fields into struct vnode. - rewrite the pageout path. the pager is now responsible for handling the high-level requests instead of only getting control after a bunch of work has already been done on its behalf. this will allow us to UBCify LFS, which needs tighter control over its pages than other filesystems do. writing a page to disk no longer requires making it read-only, which allows us to write wired pages without causing all kinds of havoc. - use a new PG_PAGEOUT flag to indicate that a page should be freed on behalf of the pagedaemon when it's unlocked. this flag is very similar to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the pageout fails due to eg. an indirect-block buffer being locked. this allows us to remove the "version" field from struct vm_page, and together with shrinking "loan_count" from 32 bits to 16, struct vm_page is now 4 bytes smaller. - no longer use PG_RELEASED for swap-backed pages. if the page is busy because it's being paged out, we can't release the swap slot to be reallocated until that write is complete, but unlike with vnodes we don't keep a count of in-progress writes so there's no good way to know when the write is done. instead, when we need to free a busy swap-backed page, just sleep until we can get it busy ourselves. - implement a fast-path for extending writes which allows us to avoid zeroing new pages. this substantially reduces cpu usage. 
- encapsulate the data used by the genfs code in a struct genfs_node, which must be the first element of the filesystem-specific vnode data for filesystems which use genfs_{get,put}pages(). - eliminate many of the UVM pagerops, since they aren't needed anymore now that the pager "put" operation is a higher-level operation. - enhance the genfs code to allow NFS to use the genfs_{get,put}pages instead of a modified copy. - clean up struct vnode by removing all the fields that used to be used by the vfs_cluster.c code (which we don't use anymore with UBC). - remove kmem_object and mb_object since they were useless. instead of allocating pages to these objects, we now just allocate pages with no object. such pages are mapped in the kernel until they are freed, so we can use the mapping to find the page to free it. this allows us to remove splvm() protection in several places. The sum of all these changes improves write throughput on my decstation 5000/200 to within 1% of the rate of NetBSD 1.5 and reduces the elapsed time for "make release" of a NetBSD 1.5 source tree on my 128MB pc to 10% less than a 1.5 kernel took.	2001-09-15 20:36:31 +00:00
chris	0e7661f023	Update pmap_update to now take the updated pmap as an argument. This will allow improvements to the pmaps so that they can more easily defer expensive operations, eg tlb/cache flush, til the last possible moment. Currently this is a no-op on most platforms, so they should see no difference. Reviewed by Jason.	2001-09-10 21:19:08 +00:00
chs	2133049a7c	create a new pool for map entries, allocated from kmem_map instead of kernel_map. use this instead of the static map entries when allocating map entries for kernel_map. this greatly reduces the number of static map entries used and should eliminate the problems with running out.	2001-09-09 19:38:22 +00:00
lukem	53156d96d0	let user know current value of MAX_KMAPENT in panic	2001-09-07 00:50:54 +00:00
chuck	2dec1a929d	handle a locking problem where the second (or later) call in the loanentry loop returns 0. loanentry was returning >0, but was unlocking the maps (because of the zero). reworked to avoid this. problem reported by chuck silvers. also clarify a comment that jdolecek asked about.	2001-08-27 02:34:29 +00:00
chs	a65671c2a9	don't mess with vnode holds or buffer lists for swap i/os. fixes problems with leaked vnode holds.	2001-08-26 00:43:53 +00:00
chs	ed1e153702	use the correct symbol for multi-include protection.	2001-08-25 20:37:46 +00:00
wiz	c52d355d71	"wierd" is weird.	2001-08-20 12:20:01 +00:00
chs	2233d77cac	when fetching an object page to loan out, do so synchronously.	2001-08-18 05:51:44 +00:00
chs	2c441082d4	allow mappings of VBLK vnodes.	2001-08-17 05:53:02 +00:00
chs	37f6c5155d	call VOP_MMAP() before allowing mappings of vnodes to allow filesystems which do not support memory mapped access to cause mmap() of their vnodes to fail.	2001-08-17 05:52:46 +00:00
chs	e9fbc91f95	user maps are always pageable.	2001-08-16 01:37:50 +00:00
matt	cce919e025	Don't include <machine/pmap.h> and <machine/vmparam.h> if _KERNEL isn't defined. Include them explicitly in the few kvm_arch.c that need them.	2001-08-05 03:33:15 +00:00

935 Commits