Commit Graph

269 Commits

Author SHA1 Message Date
petrov 5f4709f782 Replace uvm counters with evcnt, initialize them through __link_set (from Matt Thomas),
disable counters by default and add configuration option UVMMAP_COUNTERS.
2004-05-01 19:40:39 +00:00
junyoung 9262158d3e Fix typo in comments. 2004-04-27 09:50:43 +00:00
junyoung f539f210cc FINDSPACE_FIXED -> UVM_FLAG_FIXED in comment. 2004-04-27 09:45:02 +00:00
simonb b5d0e6bf06 Initialise (most) pools from a link set instead of explicit calls
to pool_init.  Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

 Convert struct session, ucred and lockf to pools.
2004-04-25 16:42:40 +00:00
yamt dabac1bc03 uvm_map_findspace: don't return unaligned address if alignment is specified.
discussed on tech-kern@.
2004-03-30 12:59:09 +00:00
junyoung 7e0c058612 Drop trailing spaces. 2004-03-24 07:47:32 +00:00
mycroft 2fef5d8dfc Something I posted to tech-kern a long time ago...
Slightly simplify uvm_map_extract() by eliminating "oldstart".
2004-03-17 23:58:12 +00:00
pooka c5e500a486 Reflect dropping mappings in map_size.
Avoids panic on DIAGNOSTIC kernels.

ok by chs
2004-03-11 15:03:47 +00:00
matt a78a1b0777 Back out the changes in
http://mail-index.netbsd.org/source-changes/2004/01/29/0027.html
since they don't really fix the problem.

Incorporate one fix:  Mark uvm_map_entry's that were created with
UVM_FLAG_NOMERGE so that they will not be used as future merge
candidates.
2004-02-10 01:30:49 +00:00
yamt 1e18e59746 - borrow vmspace0 in uvm_proc_exit instead of uvmspace_free.
the latter is not an appropriate place to do so and it broke vfork.
- deactivate pmap before calling cpu_exit() to keep a balance of
  pmap_activate/deactivate.
2004-02-09 13:11:21 +00:00
yamt 8fb96e0be4 introduce a new patchable variable, uvm_debug_check_rbtree,
which is zero by default.
perform rbtree sanity checks only when it isn't zero,
because the check is very heavyweight, especially when
there are many entries.
2004-02-07 13:22:19 +00:00
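The patchable-variable idiom described in the entry above amounts to something like the following minimal sketch; only uvm_debug_check_rbtree is from the entry, the helper function and its body are hypothetical:

    /* Patchable from DDB or a kernel debugger; zero disables the check. */
    int uvm_debug_check_rbtree = 0;

    static void
    uvm_map_sanity_check(struct vm_map *map)    /* hypothetical helper */
    {
            if (uvm_debug_check_rbtree == 0)
                    return;         /* heavyweight check is off by default */
            /* ... walk the map's rbtree and KASSERT() its invariants ... */
    }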
yamt a45adbd9c7 don't deactivate pmap in exit1 because we'll touch the pmap later.
instead, borrow vmspace0 immediately before destroying the pmap
in uvmspace_free.
2004-02-07 10:05:52 +00:00
yamt 4124096ea8 uvm_kmapent_alloc:
in the case that there are no cached entries,
if kmem_map is already up, allocate an entry from it
so that we won't try to vm_map_lock recursively.
XXX assuming usage pattern of kmem_map.
2004-02-07 08:02:21 +00:00
he be19fc25f3 Since the playstation2 port still uses a variant of gcc 2.95.2,
change to use a zero-sized array instead of c99 flexible array
member in a struct.

OK'ed by yamt.
2004-02-02 23:13:44 +00:00
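The workaround above boils down to the following; the struct is made up, but it shows the zero-sized-array GNU extension that gcc 2.95.2 accepts versus the C99 flexible array member it rejects:

    struct entry_block {                        /* illustrative only */
            int     nentries;
    #if 0
            struct vm_map_entry entries[];      /* C99 flexible array member */
    #else
            struct vm_map_entry entries[0];     /* zero-sized array: old gcc extension, same layout */
    #endif
    };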
yamt c5ffc97d8e remove wrong assertions.
sparc's alloc_cpuinfo_global_va() partially unmaps kva range in kernel_map.

noted by Juergen Hannken-Illjes on current-users@.
2004-01-30 11:56:39 +00:00
yamt d6e6e2e5c8 some English fixes from Soren Jacobsen. 2004-01-29 12:07:29 +00:00
yamt 20c5bc5099 - split uvm_map() into two functions for the following:
- for in-kernel maps, disable map entry merging so that
  unmap operations won't block. (workaround for PR/24039)
- for in-kernel maps, allocate kva for vm_map_entry from
  the map itself and eliminate MAX_KMAPENT and
  uvm_map_entry_kmem_pool.
2004-01-29 12:06:02 +00:00
simonb b9fbceaf46 Unindent a code block that doesn't need to be indented. 2003-12-19 06:02:50 +00:00
chs 709a3b4e52 two changes to improve scalability:
(1) split the single list of pages allocated to a pool into three lists:
     completely full, partially full, and completely empty.
     there is no longer any need to traverse any list looking for a
     certain type of page.

 (2) replace the 8-element hash table for out-of-page page headers
     with a splay tree.

these two changes (together with the recent enhancements to the wait code)
give us linear scaling for a fork+exit microbenchmark.
2003-11-13 02:44:01 +00:00
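A rough sketch of the page-list split described in (1); the field names are approximations, not the exact struct pool layout (assumes <sys/queue.h>):

    struct pool_pagelists_sketch {
            LIST_HEAD(, pool_item_header) pr_emptypages;   /* no items in use */
            LIST_HEAD(, pool_item_header) pr_partpages;    /* some items in use */
            LIST_HEAD(, pool_item_header) pr_fullpages;    /* every item in use */
    };

An allocation can take the head of pr_partpages (or pr_emptypages) directly, so no list is ever traversed looking for a page with free items.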
yamt 8b91732e18 fix wrong assertions.
they can be false due to alignment requirements (and PMAP_PREFER).
2003-11-06 12:45:26 +00:00
yamt b479cef701 don't move hint backward. 2003-11-05 15:34:50 +00:00
yamt 171053e863 - fix a reversed comparison.
- fix "nextgap" case.
- make sure we don't get addresses behind the hint.
- deal with integer wraparounds better.
- assertions.
2003-11-05 15:09:09 +00:00
yamt c6d9c8814d fix a wrong assertion. pointed by Christian Limpach. 2003-11-02 07:58:52 +00:00
yamt 142a2d4058 - update uvm_map::size fewer places.
- add related assertions.
2003-11-01 19:56:09 +00:00
yamt c45bf442f2 commit rest of the previous (rbtree).
(i should check .rej files before commit, sorry)
2003-11-01 19:45:13 +00:00
yamt 57e554da69 track map entries and free spaces using red-black tree
to improve scalability of operations on the map.

originally done by Niels Provos for OpenBSD.
tweaked for NetBSD by me with some advice from enami tsugutomo.
discussed on tech-kern@ and tech-perform@.
2003-11-01 11:09:02 +00:00
junyoung b28a286e6a KNF. 2003-10-25 23:05:45 +00:00
enami 57a6593f52 Fix indent. 2003-10-09 03:12:29 +00:00
atatat d4de28f890 When pulling back an amap to cover the new allocation along with the
previous entry, don't add the size to the extension -- it's already
been added to the end of the previous entry.
2003-10-09 02:44:54 +00:00
enami ae9b5cba84 Rewrite uvm_map_findspace() to improve readability and to fix a bug where
it may return space already in use as free space under some conditions.
The symptom of the bug is that exec fails if stack is unlimited on
topdown VM kernel.
2003-10-02 00:02:10 +00:00
enami 0ca733e759 Some whitespace fixes. 2003-10-01 23:08:32 +00:00
enami aa87bee0c5 ansi'fy. 2003-10-01 22:50:15 +00:00
yamt 91161caf3c use VM_PAGE_TO_PHYS macro instead of using phys_addr directly. 2003-08-26 15:12:18 +00:00
thorpej 03befad98b In uvm_map_clean(), only call pgo_put if the object has one.
From Quentin Garnier <quatriemek.com!netbsd>.
2003-04-09 21:39:29 +00:00
matt 76dd2c90fa In uvm_map_space, if the current entry is above the new space use the
previous entry.  (not if the current entry starts at the end of the new
space; that case doesn't take into account if the new space had a specified
alignment).
2003-03-02 08:57:49 +00:00
matt d6729b1f53 When finding an aligned block, we need to truncate in topdown, not round up. 2003-03-02 02:55:03 +00:00
simonb 0b2b1cc0cc Remove assigned-to but not used variable. 2003-02-23 04:53:51 +00:00
matt 23b48be61f fix a tpyo in a comment. 2003-02-21 16:38:44 +00:00
atatat df0a9badc6 Introduce "top down" memory management for mmap()ed allocations. This
means that the dynamic linker gets mapped in at the top of available
user virtual memory (typically just below the stack), shared libraries
get mapped downwards from that point, and calls to mmap() that don't
specify a preferred address will get mapped in below those.

This means that the heap and the mmap()ed allocations will grow
towards each other, allowing one or the other to grow larger than
before.  Previously, the heap was limited to MAXDSIZ by the placement
of the dynamic linker (and the process's rlimits) and the space
available to mmap was hobbled by this reservation.

This is currently only enabled via an *option* for the i386 platform
(though other platforms are expected to follow).  Add "options
USE_TOPDOWN_VM" to your kernel config file, rerun config, and rebuild
your kernel to take advantage of this.

Note that the pmap_prefer() interface has not yet been modified to
play nicely with this, so those platforms require a bit more work
(most notably the sparc) before they can use this new memory
arrangement.

This change also introduces a VM_DEFAULT_ADDRESS() macro that picks
the appropriate default address based on the size of the allocation or
the size of the process's text segment accordingly.  Several drivers
and the SYSV SHM address assignment were changed to use this instead
of each one picking their own "default".
2003-02-20 22:16:05 +00:00
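A hedged sketch of what a VM_DEFAULT_ADDRESS()-style macro can look like; the real definitions are machine-dependent and differ in detail:

    #ifdef USE_TOPDOWN_VM
    /* Place the hint below the stack reservation, minus the mapping size. */
    #define VM_DEFAULT_ADDRESS(da, sz) \
            trunc_page(VM_MAXUSER_ADDRESS - MAXSSIZ - (sz))
    #else
    /* Classic bottom-up: just past the data segment's maximum growth. */
    #define VM_DEFAULT_ADDRESS(da, sz) \
            round_page((vaddr_t)(da) + (vsize_t)MAXDSIZ)
    #endif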
thorpej b193480908 Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant.  Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.
2003-02-01 06:23:35 +00:00
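In use, the FreeBSD-derived interface looks roughly like this (M_FOOBUF and struct foo are made-up names; this is the general shape, not the exact declarations of the day):

    MALLOC_DEFINE(M_FOOBUF, "foobuf", "example buffers");

    struct foo *fp = malloc(sizeof(*fp), M_FOOBUF, M_WAITOK);
    /* ... */
    free(fp, M_FOOBUF);

Because M_FOOBUF is a pointer to a structure rather than an index into a fixed array, new types can be added by modules and their limits adjusted at definition time or later.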
christos 5c729d909f finally: step 5: disable a KASSERT() if we are doing_shutdown.
now sync from ddb should work as badly as before the nathanw_sa merge.
2003-01-21 00:03:07 +00:00
thorpej b78f59b443 Merge the nathanw_sa branch. 2003-01-18 08:51:40 +00:00
thorpej 130e5c278b UVM_KMF_NOWAIT -> UVM_FLAG_NOWAIT 2002-12-11 07:14:28 +00:00
bouyer d986226518 Change uvm_km_kmemalloc() to accept flag UVM_KMF_NOWAIT and pass it to
uvm_map(). Change uvm_map() to honor UVM_KMF_NOWAIT. For this, change
amap_extend() to take a flags parameter instead of just boolean for
direction, and introduce AMAP_EXTEND_FORWARDS and AMAP_EXTEND_NOWAIT flags
(AMAP_EXTEND_BACKWARDS is still defined as 0x0, to keep the code easier to
read).
Add a flag parameter to uvm_mapent_alloc().
This solves a problem where a pool_get(PR_NOWAIT) could trigger a pool_get(PR_WAITOK)
in uvm_mapent_alloc().
Thanks to Chuck Silvers, enami tsugutomo, Andrew Brown and Jason R Thorpe
for feedback.
2002-11-30 18:28:04 +00:00
atatat 42c2fe641b Implement backwards extension of amaps. There are three cases to deal
with:

Case #1 -- adjust offset: The slot offset in the aref can be
decremented to cover the required size addition.

Case #2 -- move pages and adjust offset: The slot offset is not large
enough, but the amap contains enough inactive space *after* the mapped
pages to make up the difference, so active slots are slid to the "end"
of the amap, and the slot offset is, again, adjusted to cover the
required size addition.  This optimizes for hitting case #1 again on
the next small extension.

Case #3 -- reallocate, move pages, and adjust offset: There is not
enough inactive space in the amap, so the arrays are reallocated, and
the active pages are copied again to the "end" of the amap, and the
slot offset is adjusted to cover the required size.  This also
optimizes for hitting case #1 on the next backwards extension.

This provides the missing piece in the "forward extension of
vm_map_entries" logic, so the merge failure counters have been
removed.

Not many applications will make any use of this at this time (except
for jvms and perhaps gcc3), but a "top-down" memory allocator will use
it extensively.
2002-11-14 17:58:48 +00:00
perry bbad42171f /*CONSTCOND*/ while (0)'ed macros 2002-11-02 07:40:47 +00:00
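The idiom being applied, in sketch form (the macro and its fields are hypothetical): wrapping a multi-statement macro body in do { ... } while (0) makes it behave as a single statement, and the /*CONSTCOND*/ comment keeps lint quiet about the constant condition:

    #define FROB_ENTRY(e) do {                                      \
            (e)->flags |= FLAG_FROBBED;                             \
            (e)->count++;                                           \
    } while (/*CONSTCOND*/ 0)

    /* Safe even as the body of an un-braced if/else: */
    if (cond)
            FROB_ENTRY(entry);
    else
            do_something_else();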
atatat 68277bb301 In the case of a double amap_extend() (during a forward merge after a
back merge), don't abort the allocation if the second extend fails,
just abort the forward merge and finish the allocation.

Code reviewed by thorpej.
2002-10-24 22:22:28 +00:00
atatat 2d6863ada3 Call amap_extend() a second time in the case of a bimerge (both
backwards and forwards) if the previous entry was backed by an amap.

Fixes pr kern/18789, where netscape 7 + a java applet actually manage
to incur forward and bimerges in userspace.

Code reviewed by fvdl and thorpej.
2002-10-24 20:37:59 +00:00
atatat 94ef8e0795 Add an implementation of forward merging of new map entries. Most new
allocations can be merged either forwards or backwards, meaning no new
entries will be added to the list, and some can even be merged in both
directions, resulting in a surplus entry.

This code typically reduces the number of map entries in the
kernel_map by an order of magnitude or more.  It also makes possible
recovery from the pathological case of "5000 processes created and
then killed", which leaves behind a large number of map entries.

The only forward merge case not covered is the instance of an amap
that has to be extended backwards (WIP).  Note that this only affects
processes, not the kernel (the kernel doesn't use amaps), and that
merge opportunities like this come up *very* rarely, if at all.  Eg,
after being up for eight days, I see only three failures in this
regard, and even those are most likely due to programs I'm developing
to exercise this case.

Code reviewed by thorpej, matt, christos, mrg, chuq, chuck, perry,
tls, and probably others.  I'd like to thank my mother, the Hollywood
Foreign Press...
2002-10-18 13:18:42 +00:00
chs 94a62d45d6 add a new flag VM_MAP_DYING, which is set before we start
tearing down a vm_map.  use this to skip the pmap_update()
at the end of all the removes, which allows pmaps to optimize
pmap tear-down.  also, use the new pmap_remove_all() hook to
let the pmap implementation know what we're up to.
2002-09-22 07:21:29 +00:00
chs 9672ac098f add a new km flag UVM_KMF_CANFAIL, which causes uvm_km_kmemalloc() to
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.
2002-09-15 16:54:26 +00:00
thorpej a180cee23b Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map).  Try to deal with this:

* Group all information about the backend allocator for a pool in a
  separate structure.  The pool references this structure, rather than
  the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
  to become available, but will still fail if it cannot allocate KVA
  space for the pages.  If this happens, carefully drain all pools using
  the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
  some pages, and use that information to make draining easier and more
  efficient.
* Get rid of PR_URGENT.  There was only one use of it, and it could be
  dealt with by the caller.

From art@openbsd.org.
2002-03-08 20:48:27 +00:00
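A sketch of the grouped backend-allocator structure described in the first bullet; field names are approximate (assumes <sys/queue.h>):

    struct pool_allocator_sketch {
            void    *(*pa_alloc)(struct pool *, int);       /* get a page of KVA */
            void    (*pa_free)(struct pool *, void *);      /* return it */
            unsigned int pa_pagesz;                         /* page size used */
            TAILQ_HEAD(, pool) pa_list;                     /* pools sharing this backend */
    };

When pa_alloc() fails for want of KVA, every pool on pa_list can be drained to free mapped pages and recover address space.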
chs 43973be0c5 introduce a new UVM fault type, VM_FAULT_WIREMAX. this is different
from VM_FAULT_WIRE in that when the pages being wired are faulted in,
the simulated fault is at the maximum protection allowed for the mapping
instead of the current protection.  use this in uvm_map_pageable{,_all}()
to fix the problem where writing via ptrace() to shared libraries that
are also mapped with wired mappings in another process causes a
diagnostic panic when the wired mapping is removed.

this is a really obscure problem so it deserves some more explanation.
ptrace() writing to another process ends up down in uvm_map_extract(),
which for MAP_PRIVATE mappings (such as shared libraries) will cause
the amap to be copied or created.  then the amap is made shared
(ie. the AMAP_SHARED flag is set) between the kernel and the ptrace()d
process so that the kernel can modify pages in the amap and have the
ptrace()d process see the changes.  then when the page being modified
is actually faulted on, the object page (from the shared library vnode)
is copied to a new anon page and inserted into the shared amap.
to make all the processes sharing the amap actually see the new anon
page instead of the vnode page that was there before, we need to
invalidate all the pmap-level mappings of the vnode page in the pmaps
of the processes sharing the amap, but we don't have a good way of
doing this.  the amap doesn't keep track of the vm_maps which map it.
so all we can do at this point is to remove all the mappings of the
page with pmap_page_protect(), but this has the unfortunate side-effect
of removing wired mappings as well.  removing wired mappings with
pmap_page_protect() is a legitimate operation, it can happen when a file
with a wired mapping is truncated.  so the pmap has no way of knowing
whether a request to remove a wired mapping is normal or when it's due to
this weird situation.  so the pmap has to remove the weird mapping.
the process being ptrace()d goes away and life continues.  then,
much later when we go to unwire or remove the wired vm_map mapping,
we discover that the pmap mapping has been removed when it should
still be there, and we panic.

so where did we go wrong?  the problem is that we don't have any way
to update just the pmap mappings that need to be updated in this
scenario.  we could invent a mechanism to do this, but that is much
more complicated than this change and it doesn't seem like the right
way to go in the long run either.

the real underlying problem here is that wired pmap mappings just
aren't a good concept.  one of the original properties of the pmap
design was supposed to be that all the information in the pmap could
be thrown away at any time and the VM system could regenerate it all
through fault processing, but wired pmap mappings don't allow that.
a better design for UVM would not require wired pmap mappings,
and Chuck C. and I are talking about this, but it won't be done
anytime soon, so this change will do for now.

this change has the effect of causing MAP_PRIVATE mappings to be
copied to anonymous memory when they are mlock()d, so that uvm_fault()
doesn't need to copy these pages later when called from ptrace(), thus
avoiding the call to pmap_page_protect() and the panic that results
from this when the mlock()d region is unlocked or freed.  note that
this change doesn't help the case where the wired mapping is MAP_SHARED.

discussed at great length with Chuck Cranor.
fixes PRs 10363, 12554, 12604, 13041, 13487, 14580 and 14853.
2001-12-31 22:34:39 +00:00
chs 23c75a9a98 in uvm_map_clean(), add PGO_CLEANIT to the flags passed to an object's pager.
we need to make sure that vnode pages are written to disk at least once,
otherwise processes could gain access to whatever data was previously stored
in disk blocks which are freshly allocated to a file.
2001-12-31 20:34:01 +00:00
chs ef57a67ca1 fix locking for loaning. in general we should be looking at the page's
uobject and uanon pointers rather than at the PQ_ANON flag to determine
which lock to hold, since PQ_ANON can be clear even when the anon's lock
is the one which we should hold (if the page was loaned from an object
and then freed by the object).
2001-12-31 19:21:36 +00:00
lukem b616d1ca1d add RCSIDs, and in some cases, slightly cleanup #include order 2001-11-10 07:36:59 +00:00
chs 07d2ec83fe don't call pmap_copy() from uvmspace_fork().
a new process is very likely to call execve() immediately after fork(),
so most of the time copying the pmap mappings is wasted effort.
2001-11-06 05:27:17 +00:00
thorpej f67e15c839 uvm_map_protect(): Don't allow VM_PROT_EXECUTE to be set on entries
(either the current protection or the max protection) that reference
vnodes associated with a file system mounted with the NOEXEC option.

uvm_mmap(): Don't allow PROT_EXEC mappings to be established of vnodes
which are associated with a file system mounted with the NOEXEC option.
2001-10-30 19:05:26 +00:00
thorpej a2cd7623d4 Correct a comment. 2001-10-30 18:52:17 +00:00
thorpej e8ee04475d - Add a new vnode flag VEXECMAP, which indicates that a vnode has
executable mappings.  Stop overloading VTEXT for this purpose (VTEXT
  also has another meaning).
- Rename vn_marktext() to vn_markexec(), and use it when executable
  mappings of a vnode are established.
- In places where we want to set VTEXT, set it in v_flag directly, rather
  than making a function call to do this (it no longer makes sense to
  use a function call, since we no longer overload VTEXT with VEXECMAP's
  meaning).

VEXECMAP suggested by Chuq Silvers.
2001-10-30 15:32:01 +00:00
thorpej 7285b2c290 uvm_mmap(): If a vnode mapping is established with PROT_EXEC, mark the
vnode as VTEXT.

uvm_map_protect(): When VM_PROT_EXECUTE is added to a VA range, mark
all the vnodes mapped by the range as VTEXT.
2001-10-29 23:06:03 +00:00
chs 2adcba997b make pmap_resident_count() non-optional. 2001-09-23 06:35:30 +00:00
chs a548bfb584 add an assert. 2001-09-21 07:57:35 +00:00
chs 64c6d1d2dc a whole bunch of changes to improve performance and robustness under load:
- remove special treatment of pager_map mappings in pmaps.  this is
   required now, since I've removed the globals that expose the address range.
   pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
   no longer any need to special-case it.
 - eliminate struct uvm_vnode by moving its fields into struct vnode.
 - rewrite the pageout path.  the pager is now responsible for handling the
   high-level requests instead of only getting control after a bunch of work
   has already been done on its behalf.  this will allow us to UBCify LFS,
   which needs tighter control over its pages than other filesystems do.
   writing a page to disk no longer requires making it read-only, which
   allows us to write wired pages without causing all kinds of havoc.
 - use a new PG_PAGEOUT flag to indicate that a page should be freed
   on behalf of the pagedaemon when it's unlocked.  this flag is very similar
   to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
   pageout fails due to eg. an indirect-block buffer being locked.
   this allows us to remove the "version" field from struct vm_page,
   and together with shrinking "loan_count" from 32 bits to 16,
   struct vm_page is now 4 bytes smaller.
 - no longer use PG_RELEASED for swap-backed pages.  if the page is busy
   because it's being paged out, we can't release the swap slot to be
   reallocated until that write is complete, but unlike with vnodes we
   don't keep a count of in-progress writes so there's no good way to
   know when the write is done.  instead, when we need to free a busy
   swap-backed page, just sleep until we can get it busy ourselves.
 - implement a fast-path for extending writes which allows us to avoid
   zeroing new pages.  this substantially reduces cpu usage.
 - encapsulate the data used by the genfs code in a struct genfs_node,
   which must be the first element of the filesystem-specific vnode data
   for filesystems which use genfs_{get,put}pages().
 - eliminate many of the UVM pagerops, since they aren't needed anymore
   now that the pager "put" operation is a higher-level operation.
 - enhance the genfs code to allow NFS to use the genfs_{get,put}pages
   instead of a modified copy.
 - clean up struct vnode by removing all the fields that used to be used by
   the vfs_cluster.c code (which we don't use anymore with UBC).
 - remove kmem_object and mb_object since they were useless.
   instead of allocating pages to these objects, we now just allocate
   pages with no object.  such pages are mapped in the kernel until they
   are freed, so we can use the mapping to find the page to free it.
   this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
2001-09-15 20:36:31 +00:00
chris 0e7661f023 Update pmap_update to now take the updated pmap as an argument.
This will allow improvements to the pmaps so that they can more easily defer expensive operations, e.g. tlb/cache flushes, until the last possible moment.

Currently this is a no-op on most platforms, so they should see no difference.

Reviewed by Jason.
2001-09-10 21:19:08 +00:00
chs 2133049a7c create a new pool for map entries, allocated from kmem_map instead of
kernel_map.  use this instead of the static map entries when allocating
map entries for kernel_map.  this greatly reduces the number of static
map entries used and should eliminate the problems with running out.
2001-09-09 19:38:22 +00:00
lukem 53156d96d0 let user know current value of MAX_KMAPENT in panic 2001-09-07 00:50:54 +00:00
wiz c52d355d71 "wierd" is weird. 2001-08-20 12:20:01 +00:00
chs e9fbc91f95 user maps are always pageable. 2001-08-16 01:37:50 +00:00
wiz a9356936b4 seperate -> separate 2001-07-22 13:33:58 +00:00
chs 821ec03ed9 replace vm_map{,_entry}_t with struct vm_map{,_entry} *. 2001-06-02 18:09:08 +00:00
chs 3845302904 remove trailing whitespace. 2001-05-25 04:06:11 +00:00
ross 892627dd05 Merge the swap-backed and object-backed inactive lists. 2001-05-22 00:44:44 +00:00
thorpej cda7baa0d5 Implement page coloring, using a round-robin bucket selection
algorithm (Solaris calls this "Bin Hopping").

This implementation currently relies on MD code to define a
constant defining the number of buckets.  This will change
reasonably soon (MD code will be able to dynamically size
the bucket array).
2001-04-29 04:23:20 +00:00
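Round-robin ("bin hopping") selection amounts to something like this sketch; the counter and the bucket-count constant are illustrative stand-ins for the MD constant the entry refers to:

    static u_int nextbucket;

    static u_int
    choose_bucket(void)             /* hypothetical helper */
    {
            u_int bucket = nextbucket;

            nextbucket = (nextbucket + 1) % EXAMPLE_NBUCKETS;
            return bucket;          /* try this colored free list first */
    }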
thorpej 1c3a62e066 Sprinkle pmap_update() calls after calls to:
- pmap_enter()
- pmap_remove()
- pmap_protect()
- pmap_kenter_pa()
- pmap_kremove()
as described in pmap(9).

These calls are relatively conservative.  It may be possible to
optimize these a little more.
2001-04-24 04:30:50 +00:00
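The pattern being sprinkled in is roughly the following; note that at the time of this entry pmap_update() took no argument, and the pmap argument shown here was added by the 2001-09-10 change further up:

    pmap_enter(pmap, va, pa, VM_PROT_READ | VM_PROT_WRITE, PMAP_WIRED);
    /* ... possibly more pmap_enter()/pmap_remove() calls ... */
    pmap_update(pmap);              /* flush any MMU/TLB work the pmap deferred */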
chs ac3bc537bd eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS			0
KERN_INVALID_ADDRESS		EFAULT
KERN_PROTECTION_FAILURE		EACCES
KERN_NO_SPACE			ENOMEM
KERN_INVALID_ARGUMENT		EINVAL
KERN_FAILURE			various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE		ENOMEM
KERN_NOT_RECEIVER		<unused>
KERN_NO_ACCESS			<unused>
KERN_PAGES_LOCKED		<unused>
2001-03-15 06:10:32 +00:00
eeh 4589ac3292 When recycling a vm_map, resize it to the new process address space limits. 2001-02-11 01:34:23 +00:00
thorpej b016744976 Don't uvm_deallocate() the address space in exit1(). The address
space is already torn down in uvmspace_free() when the vmspace
reference count reaches 0.  Move the shmexit() call into uvmspace_free().

Note that there is a beneficial side-effect of deferring the unmap
to uvmspace_free() -- on systems where TLB invalidations are
particularly expensive, the unmapping of the address space won't
have to cause TLB invalidations; uvmspace_free() is going to be
run in a context other than the exiting process's, so the "pmap is
active" test will evaluate to FALSE in the pmap module.
2001-02-10 05:05:27 +00:00
eeh 4380259bc7 Specify a process' address space limits for uvmspace_exec(). 2001-02-06 17:01:51 +00:00
chs 4d5451090e in uvm_map_clean(), fix the case where the start offset is within the last
entry in the map.  the old code would walk around the end of the linked list,
through the header entry, and keep going from the first map entry until it
found a gap in the map, at which point it would return an error.  if the map
had no gaps then it would loop forever.  reported by k-abe@cs.utah.edu.
while I'm here, clean up this function a bit.

also, use MIN() instead of min(), since the latter takes arguments of
type "int" but we're passing it values of type "vaddr_t", which can be
a larger size.
2001-02-05 11:29:54 +00:00
thorpej 1779f8f71b Page scanner improvements, behavior is actually a bit more like
Mach VM's now.  Specific changes:
- Pages now need not have all of their mappings removed before being
  put on the inactive list.  They only need to have the "referenced"
  attribute cleared.  This makes putting pages onto the inactive list
  much more efficient.  In order to eliminate redundant clearings of
  "refrenced", callers of uvm_pagedeactivate() must now do this
  themselves.
- When checking the "modified" attribute for a page (for clearing
  PG_CLEAN), make sure to only do it if PG_CLEAN is currently set on
  the page (saves a potentially expensive pmap operation).
- When scanning the inactive list, if a page is referenced, reactivate
  it (this part was actually added in uvm_pdaemon.c,v 1.27).  This
  now works properly now that pages on the inactive list are allowed to
  have mappings.
- When scanning the inactive list and considering a page for freeing,
  remove all mappings, and then check the "modified" attribute if the
  page is marked PG_CLEAN.
- When scanning the active list, if the page was referenced since its
  last sweep by the scanner, don't deactivate it.  (This part was
  actually added in uvm_pdaemon.c,v 1.28.)

These changes greatly improve interactive performance during
moderate to high memory and I/O load.
2001-01-28 23:30:42 +00:00
thorpej f4395a4eae splimp() -> splvm() 2001-01-14 02:10:01 +00:00
enami 4625dcde2e Use a single const char array instead of over 200 string constants. 2000-12-13 08:06:11 +00:00
chs aeda8d3b77 Initial integration of the Unified Buffer Cache project. 2000-11-27 08:39:39 +00:00
chs 2ed28d2c7a lots of cleanup:
use queue.h macros and KASSERT().
address amap offsets in pages instead of bytes.
make amap_ref() and amap_unref() take an amap, offset and length
  instead of a vm_map_entry_t.
improve whitespace and comments.
2000-11-25 06:27:59 +00:00
thorpej 0a2fa5320b Back out rev. 1.83 -- it's causing problems with some pmap
implementations, so we'll have to spend a little more time
working on the problem.
2000-10-16 23:17:54 +00:00
thorpej 76589fafd4 - uvmspace_share(): If p2 has a vmspace already, make sure to deactivate
it and free it as appropriate.  Activate p2's new address space once
  it references p1's.
- uvm_fork(): Make sure the child's vmspace is NULL before calling
  uvmspace_share() (the child doesn't have one already in this case).

These changes do not change the behavior for the current use of
uvmspace_share() (vfork(2)), but make it possible for an already
running process (such as a kernel thread) to properly attach to
another process's address space.
2000-10-11 17:27:58 +00:00
thorpej 47a2016cdc - Change SAVE_HINT() to take a "check" value. This value is compared
to the contents of the hint in the map, and the hint saved in the
  map only if the two values match.  When an unconditional save is
  required, the "check" value passed should be map->hint (and the
  compiler will optimize the test away).  When deleting a map entry,
  the new SAVE_HINT() will only change the hint if the entry being
  deleted was the hint value (thus preserving any meaningful hint
  that may have been there previously, rather than stomping on it).
- Add a missing hint update when deleting the map entry in
  uvm_map_entry_unlink().  This is the fix for kern/11125, from
  ITOH Yasufumi <itohy@netbsd.org>.
2000-10-11 17:21:11 +00:00
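A hedged reconstruction of the SAVE_HINT() behaviour described in the first bullet (locking elided):

    #define SAVE_HINT(map, check, value) do {                       \
            if ((map)->hint == (check))                             \
                    (map)->hint = (value);                          \
    } while (/*CONSTCOND*/ 0)

    /* Unconditional save: pass the current hint as the check value. */
    SAVE_HINT(map, map->hint, new_entry);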
thorpej 72a24b4eae Add an align argument to uvm_map() and some callers of that
routine.  Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
2000-09-13 15:00:15 +00:00
wiz be8ff811b7 Rename VM_INHERIT_* to MAP_INHERIT_* and move them to sys/sys/mman.h as
discussed on tech-kern.
Retire sys/uvm/uvm_inherit.h, update man page for minherit(2).
2000-08-01 00:53:07 +00:00
mrg dea44a9ec4 remove include of <vm/vm.h> 2000-06-27 17:29:17 +00:00
mrg 2f159a1bac remove/move more mach vm header files:
<vm/pglist.h> -> <uvm/uvm_pglist.h>
	<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
	<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
	<vm/vm_object.h> -> nothing
	<vm/vm_pager.h> -> into <uvm/uvm_pager.h>

also includes a bunch of <vm/vm_page.h> include removals (due to redundancy
with <vm/vm.h>), and a scattering of other similar headers.
2000-06-26 14:20:25 +00:00
chs e72214422a initialize aref.ar_pageoff even if there's no amap. 2000-06-13 04:10:47 +00:00
pk 36a1354bc6 Change previous to use `vm_map_min(dstmap)' instead of hard-coding
VM_MIN_KERNEL_ADDRESS.
2000-06-05 07:28:56 +00:00
pk 51ff5f7cd1 Let uvm_map_extract() set the lower bound on the kernel address range
itself, instead of having its callers do that. 2000-06-02 12:02:43 +00:00
2000-06-02 12:02:43 +00:00
thorpej 646555bbd5 Clean up some indentation lossage in uvm_map_extract(). 2000-05-19 17:43:55 +00:00
thorpej 9ec517a68e Changes necessary to implement pre-zero'ing of pages in the idle loop:
- Make page free lists have two actual queues: known-zero pages and
  pages with unknown contents.
- Implement uvm_pageidlezero().  This function attempts to zero up to
  the target number of pages until the target has been reached (currently
  target is `all free pages') or until whichqs becomes non-zero (indicating
  that a process is ready to run).
- Define a new hook for the pmap module for pre-zero'ing pages.  This is
  used to zero the pages using uncached access.  This allows us to zero
  as many pages as we want without polluting the cache.

In order to use this feature, each platform must add the appropriate
glue in their idle loop.
2000-04-24 17:12:00 +00:00
chs d444bb4032 undo rev 1.13, which is to say, don't block interrupts while deactivating
one pmap and activating another.  this isn't actually necessary (since
pmap_activate() and pmap_deactivate() affect only user-level mappings,
which cannot be accessed from interrupts anyway), and pmap_activate()
is very slow on old sun4c sparcs so we can't block interrupts for this long.
this fixes PR 8322.
2000-04-16 20:52:29 +00:00
chs 66014d2dff sparc -> __sparc__
print lock status in uvm_object_printit().
2000-04-10 02:21:26 +00:00
kleink 6e5b64c8a0 Merge parts of chs-ubc2 into the trunk:
Add a new type voff_t (defined as a synonym for off_t) to describe offsets
into uvm objects, and update the appropriate interfaces to use it, the
most visible effect being the ability to mmap() file offsets beyond
the range of a vaddr_t.

Originally by Chuck Silvers; blame me for problems caused by merging this
into non-UBC.
2000-03-26 20:54:45 +00:00
chs f3a668ed84 eliminate the PMAP_NEW option by making it required for all ports.
ports which previously had no support for PMAP_NEW now implement
the pmap_k* interfaces as wrappers around the non-k versions.
1999-09-12 01:16:55 +00:00
thorpej 23e83a7ac7 When handling the MADV_FREE case, if the amap or aobj has more than
one reference, go through the deactivate path; the page may actually
be in use by another process.

Fixes kern/8239.
1999-08-21 02:19:05 +00:00
thorpej 050aaac26e Fix the error recovery in uvm_map_pageable_all(). 1999-08-03 00:38:33 +00:00
thorpej 5310e69363 Fix PR #8023 from Bernd Ernesti: when MADV_FREE'ing a region which spanned
more than one VM map entry, a typo caused amap_unadd() to attempt to
remove anons from the wrong amap.  Fix that typo.
1999-07-19 17:45:23 +00:00
thorpej 5ee6f3960d Rework uvm_map_protect():
- Fix some locking bugs; a couple of places would return an error condition
  without unlocking the map.
- Deal with maps marked WIREFUTURE; if making an entry VM_PROT_NONE ->
  anything else, and it is not already marked as wired, wire it.
1999-07-18 00:41:56 +00:00
thorpej b6f435026c Add a set of "lockflags", which can control the locking behavior
of some functions.  Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.
1999-07-17 21:35:49 +00:00
thorpej 4ef1f3670d Fix a thinko which could cause a NULL pointer deref, in the PGO_FREE
case.
1999-07-07 21:51:35 +00:00
thorpej 62dcdc109b In the PGO_FREE case of uvm_map_clean()'s amap cleaning, skip wired
pages.

XXX This should be handled better in the future, probably by marking the
XXX page as released, and making uvm_pageunwire() free the page when
XXX the wire count on a released page reaches zero.
1999-07-07 21:04:22 +00:00
thorpej 4e398a6ded Add some more meat to madvise(2):
* Implement MADV_DONTNEED: deactivate pages in the specified range,
  semantics similar to Solaris's MADV_DONTNEED.
* Add MADV_FREE: free pages and swap resources associated with the
  specified range, causing the range to be reloaded from backing
  store (vnodes) or zero-fill (anonymous), semantics like FreeBSD's
  MADV_FREE and like Digital UNIX's MADV_DONTNEED (isn't it SO GREAT
  that madvise(2) isn't standardized!?)

As part of this, move the non-map-modifying advice handling out of
uvm_map_advise(), and into sys_madvise().

As another part, implement general amap cleaning in uvm_map_clean(), and
change uvm_map_clean() to only push dirty pages to disk if PGO_CLEANIT
is set in its flags (and update sys___msync13() accordingly).  XXX Add
a patchable global "amap_clean_works", defaulting to 1, which can disable
the amap cleaning code, just in case problems are unearthed; this gives
a developer/user a quick way to recover and send a bug report (e.g. boot
into DDB and change the value).

XXX Still need to implement a real uao_flush().

XXX Need to update the manual page.

With these changes, rebuilding libc will automatically cause the new
malloc(3) to use MADV_FREE to actually release pages and swap resources
when it decides that can be done.
1999-07-07 06:02:21 +00:00
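From userland, the new advice values are used like this (a sketch with minimal error handling, not code from the commit):

    #include <sys/mman.h>
    #include <err.h>

    /* Deactivate the pages; they may be reclaimed but keep their contents. */
    if (madvise(addr, len, MADV_DONTNEED) == -1)
            warn("madvise DONTNEED");

    /* Drop pages and swap; the range reloads from backing store or zero-fill. */
    if (madvise(addr, len, MADV_FREE) == -1)
            warn("madvise FREE");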
thorpej 11c67d01a5 Fix a corner case locking error, which could lead to map corruption in
SMP environments.  See comments in <vm/vm_map.h> for details.
1999-07-01 20:07:05 +00:00
thorpej 9e9f068f43 Add the guts of mlockall(MCL_FUTURE).  This requires passing a process's
"memlock" resource limit to uvm_mmap().  Update all calls accordingly.
1999-06-18 05:13:45 +00:00
thorpej f274deb90a The i386 and pc532 pmaps are officially fixed. 1999-06-17 00:24:10 +00:00
thorpej b861180119 * Rename uvm_fault_unwire() to uvm_fault_unwire_locked(), and require that
the map be at least read-locked to call this function.  This requirement
  will be taken advantage of in a future commit.
* Write a uvm_fault_unwire() wrapper which read-locks the map and calls
  uvm_fault_unwire_locked().
* Update the comments describing the locking constraints of uvm_fault_wire()
  and uvm_fault_unwire().
1999-06-16 22:11:23 +00:00
thorpej 42c671ffba Modify uvm_map_pageable() and uvm_map_pageable_all() to follow POSIX 1003.1b
semantics.  That is, regardless of the number of mlock/mlockall calls,
an munlock/munlockall actually unlocks the region (i.e. sets wiring count
to 0).

Add a comment describing why uvm_map_pageable() should not be used for
transient page wirings (e.g. for physio) -- note, it's currently only
(ab)used in this way by a few pieces of code which are known to be
broken, i.e. the Amiga and Atari pmaps, and i386 and pc532 if PMAP_NEW is
not used.  The i386 GDT code uses uvm_map_pageable(), but in a safe
way, and could be trivially converted to use uvm_fault_wire() instead.
1999-06-16 19:34:24 +00:00
thorpej ee9703dea9 Add a macro to test if a map entry is wired. 1999-06-16 00:29:04 +00:00
thorpej c5a43ae10c Several changes, developed and tested concurrently:
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
  MCL_CURRENT is presently implemented.  MCL_FUTURE is not fully
  implemented.  Also, the same one-unlock-for-every-lock caveat
  currently applies here as it does to mlock(2).  This will be
  addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
  Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
  zero-fill region where none of the pages in that region are resident.
  [ This fix has been submitted for inclusion in 1.4.1 ]
1999-06-15 23:27:47 +00:00
thorpej 5de7bac9b1 Print the maps flags in "show map" from DDB. 1999-06-07 16:31:42 +00:00
thorpej 779ecdd773 Simplify the last change even more; We downgraded to a shared (read) lock, so
setting recursive has no effect!  The kernel lock manager doesn't allow
an exclusive recursion into a shared lock.  This situation must simply
be avoided.  The only place where this might be a problem is the (ab)use
of uvm_map_pageable() in the Utah-derived pmaps for m68k (they should
either toss the iffy scheme they use completely, or use something like
uvm_fault_wire()).

In addition, once we have looped over uvm_fault_wire(), only upgrade to
an exclusive (write) lock if we need to modify the map again (i.e.
wiring a page failed).
1999-06-02 22:40:51 +00:00
thorpej 0723d57281 Clean up the locking mess in uvm_map_pageable() a little... Most importantly,
don't unlock a kernel map (!!!) and then relock it later; a recursive lock,
as it used in the user map case, is fine.  Also, don't change map entries
while only holding a read lock on the map.  Instead, if we fail to wire
a page, clear recursive locking, and upgrade back to a write lock before
dropping the wiring count on the remaining map entries.
1999-06-02 21:23:08 +00:00
mrg 2332079d3f unlock the map for unknown arguments to uvm_map_advise. from Soren S. Jorvang in PR kern/7681 1999-05-31 23:36:23 +00:00
thorpej fb36fe649a A little spring cleaning in the unwire case of uvm_map_pageable(). 1999-05-28 22:54:12 +00:00
thorpej 8d8badbd8f Make uvm_fault_unwire() take a vm_map_t, rather than a pmap_t, for
consistency.  Use this opportunity for checking for intrsafe map use
in this routine (which is illegal).
1999-05-28 20:49:51 +00:00
thorpej 108b13d5a9 Make "intrsafe" maps locked only by exclusive spin locks, never sleep
locks (and thus, never shared locks).  Move the "set/clear recursive"
functions to uvm_map.c, which is the only placed they're used (and
they should go away anyhow).  Delete some unused cruft.
1999-05-28 20:31:42 +00:00
thorpej 80de1e9903 Upon further investigation, in uvm_map_pageable(), entry->protection is the
right access_type to pass to uvm_fault_wire().  This way, if the entry has
VM_PROT_WRITE, and the entry is marked COW, the copy will happen immediately
in uvm_fault(), as if the access were performed.
1999-05-26 23:53:48 +00:00
thorpej 2580d306ab Change the vm_map's "entries_pageable" member to a r/o flags member, which
has PAGEABLE and INTRSAFE flags.  PAGEABLE now really means "pageable",
not "allocate vm_map_entry's from non-static pool", so update all map
creations to reflect that.  INTRSAFE maps are maps that are used in
interrupt context (e.g. kmem_map, mb_map), and thus use the static
map entry pool (XXX as does kernel_map, for now).  This will eventually
change now these maps are locked, as well.
1999-05-26 19:16:28 +00:00
thorpej 7b4db806b6 In uvm_map_pageable(), pass VM_PROT_NONE as access type to uvm_fault_wire()
for now.  XXX This needs to be reexamined.
1999-05-26 00:36:53 +00:00
thorpej 0ff8d3ac1a Define a new kernel object type, "intrsafe", which are used for objects
which can be used in an interrupt context.  Use pmap_kenter*() and
pmap_kremove() only for mappings owned by these objects.

Fixes some locking protocol issues related to MP support, and eliminates
all of the pmap_enter vs. pmap_kremove inconsistencies.
1999-05-25 20:30:08 +00:00
thorpej 85f8d1343c Macro'ize the test for "object is a kernel object". 1999-05-25 00:09:00 +00:00
mrg f1f95c374b implement madvice() for MADV_{NORMAL,RANDOM,SEQUENTIAL}, others are not yet done. 1999-05-23 06:27:13 +00:00
thorpej f311a1c308 Make a slight modification of pmap_growkernel() -- it now returns the
end of the mappable kernel virtual address space.  Previously, it would
get called more often than necessary, because the caller only knew what
was requested.

Also, export uvm_maxkaddr so that uvm_pageboot_alloc() can grow the
kernel pmap if necessary, as well.  Note that pmap_growkernel() must
now be able to handle being called before pmap_init().
1999-05-20 23:03:23 +00:00
thorpej f5108f64e7 Add an optional pmap hook, pmap_fork(), to be called at the end of
uvmspace_fork().

pmap_fork() is used to "fork a pmap", that is copy data from one pmap
to the other that is NOT related to actual mappings in the pmap, but is
otherwise logically coupled to the address space.
1999-05-12 19:11:23 +00:00
mrg e378d35ade remove now-wrong comments. formatting nits. 1999-05-03 08:57:42 +00:00
chs 69ead14e9b in uvm_map_extract(), handle the case where the map entry being extracted
is large enough to cause the end address of the new entry to overflow.
1999-04-19 14:43:46 +00:00
mycroft 4831b815f5 Only turn off VM_PROT_WRITE for COW pages; not VM_PROT_EXECUTE. 1999-03-28 19:53:49 +00:00
mrg a0139bc39d remove now >1 year old pre-release message. 1999-03-25 18:48:49 +00:00
chuck 44f5fc2839 cleanup/reorg:
- break anon related functions out of uvm_amap.c and put them in their own
  file (uvm_anon.c).  this includes breaking up uvm_anon_init into an amap and
  an anon init function
- ensure that only functions within the amap module access amap structure
  fields (add macros to amap api as needed)
1999-01-24 23:53:14 +00:00
chuck 281eb8b87a remove bogus permission check in uvm_map_clean(). fixes mmap/msync
problem discussed/reported by jonathan and Andreas Wrede <andreas@planix.com>.
1998-11-15 04:38:19 +00:00
mrg bba8470ccb KNF a missing bit. remove register. 1998-10-24 13:32:34 +00:00
tron c71ccab136 Defopt SYSVMSG, SYSVSEM and SYSVSHM. 1998-10-19 22:21:19 +00:00
chs 549cd579e5 shift by PAGE_SHIFT instead of multiplying or dividing by PAGE_SIZE. 1998-10-18 23:49:59 +00:00
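For example, the conversion now reads (illustrative):

    npages = size >> PAGE_SHIFT;    /* was: size / PAGE_SIZE */
    bytes  = npages << PAGE_SHIFT;  /* was: npages * PAGE_SIZE */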
chuck 2d4c15ebc9 remove unused share map code from UVM:
- replace map checks with submap checks
 - get rid of unused 'mainonly' arg in uvm_unmap/uvm_unmap_remove, simplify
	code.   update all calls to reflect this.
 - don't worry about unmapping or changing the protection of shared share
	map mappings (is_main_map no longer used).
 - remove unused uvm_map_sharemapcopy() function from fork code.
1998-10-11 23:14:47 +00:00
thorpej d865961d77 Back out previous; I should have instrumented the benefit of this one
first.
1998-08-31 01:54:14 +00:00
thorpej 7338d4e403 Use the pool allocator and the "nointr" pool page allocator for vm_map's. 1998-08-31 01:50:08 +00:00
thorpej be8d09cda3 Use the pool allocator and the "nointr" pool page allocator for dynamically
allocated vm_map_entry's.
1998-08-31 01:10:15 +00:00
thorpej 99626224a7 Use the pool allocator and the "nointr" pool page allocator for vmspace
structures.
1998-08-31 00:20:26 +00:00
eeh a2dd74ed79 Merge paddr_t changes into the main branch. 1998-08-13 02:10:37 +00:00
perry 2c8717021d bzero->memset, bcopy->memcpy, bcmp->memcmp 1998-08-09 22:36:37 +00:00
thorpej 7fd701e0fa Add support for multiple memory free lists. There is at least one
default free list, and 0 - N additional free lists, in order of descending
priority.

A new page allocation function, uvm_pagealloc_strat(), has been added,
providing three page allocation strategies:

	- normal: high -> low priority free list walk, taking the
	  page off the first free list that has one.

	- only: attempt to allocate a page only from the specified free
	  list, failing if that free list has none available.

	- fallback: if `only' fails, fall back on `normal'.

uvm_pagealloc(...) is provided for normal use (and is a synonym for
uvm_pagealloc_strat(..., UVM_PGA_STRAT_NORMAL, 0); the free list argument
is ignored for the `normal' case).

uvm_page_physload() now specified which free list the pages will be
loaded onto.  This means that some platforms which have multiple physical
memory segments may define additional vm_physsegs if they wish to break
individual physical segments into differing priorities.

Machine-dependent code must define _at least_ the following constants
in <machine/vmparam.h>:

	VM_NFREELIST: the number of free lists the system will have

	VM_FREELIST_DEFAULT: the default freelist (should always be 0,
	but is defined in machdep code so that it's with all of the
	other free list-related constants).

Additional free list names may be defined by machine-dependent code, but
they will only be used by machine-dependent code (e.g. for loading the
vm_physsegs).
1998-07-08 04:28:27 +00:00
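Usage of the three strategies would look roughly like this; uobj, off and VM_FREELIST_EXAMPLE are placeholders, the last standing in for a machine-dependent free list name:

    /* normal: same as plain uvm_pagealloc(). */
    pg = uvm_pagealloc_strat(uobj, off, NULL, 0,
        UVM_PGA_STRAT_NORMAL, 0);

    /* only: take a page from the named free list or fail. */
    pg = uvm_pagealloc_strat(uobj, off, NULL, 0,
        UVM_PGA_STRAT_ONLY, VM_FREELIST_EXAMPLE);

    /* fallback: prefer the named free list, else fall back on any. */
    pg = uvm_pagealloc_strat(uobj, off, NULL, 0,
        UVM_PGA_STRAT_FALLBACK, VM_FREELIST_EXAMPLE);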
jonathan 466e784ee1 defopt DDB. 1998-07-04 22:18:13 +00:00
chuck 08a4f7fa4c fix bug in uvm_map_extract, remove case. make sure we update the loop
variable before removing the entry from the map.
[bug was not causing problems because the remove case isn't currently
 being used ...]
1998-05-22 02:01:54 +00:00