allocated ppref data to zero in the case of an amap that has empty
space at the front.
Don't set anything in the ppref array if "len" is zero.
Many thanks to Sami Kantoluoto for providing gdb access to a machine
that would reliably crash with problems related to the above, and to
Stephan Thesing for corroborating that the patch properly addressed
the problem.
Note that the ar_pageoff (and related variables) types must be changed
soon. The use of "int" here is not theoretically sufficient.
to sleep. Define UVM_KMF_NOWAIT in terms of UVM_FLAG_NOWAIT.
From Manuel Bouyer. Fixes a problem where any mapping with
read protection was created in a "nowait" context, causing
spurious failures.
uvm_map(). Change uvm_map() to honor UVM_KMF_NOWAIT. For this, change
amap_extend() to take a flags parameter instead of just boolean for
direction, and introduce AMAP_EXTEND_FORWARDS and AMAP_EXTEND_NOWAIT flags
(AMAP_EXTEND_BACKWARDS is still defined as 0x0, to keep the code easier to
read).
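
A minimal sketch of how a single flags word can carry both the direction
and the no-wait request. Only the three flag names and the 0x0 value of
AMAP_EXTEND_BACKWARDS come from the text above; the numeric values of the
other two flags and both helper functions are assumptions made for
illustration, not the actual amap_extend() code.

```c
#include <assert.h>
#include <stdbool.h>

/* BACKWARDS is 0x0 per the log message; the other two values are assumed. */
#define AMAP_EXTEND_BACKWARDS	0x00
#define AMAP_EXTEND_FORWARDS	0x01	/* assumed value */
#define AMAP_EXTEND_NOWAIT	0x02	/* assumed value */

/* Hypothetical helper: true if the caller asked for a forward extension. */
static bool
extend_is_forwards(int flags)
{
	/* Reject bits this sketch does not know about. */
	assert((flags & ~(AMAP_EXTEND_FORWARDS | AMAP_EXTEND_NOWAIT)) == 0);
	return (flags & AMAP_EXTEND_FORWARDS) != 0;
}

/* Hypothetical helper: true if the allocation must not sleep. */
static bool
extend_is_nowait(int flags)
{
	assert((flags & ~(AMAP_EXTEND_FORWARDS | AMAP_EXTEND_NOWAIT)) == 0);
	return (flags & AMAP_EXTEND_NOWAIT) != 0;
}
```

Because AMAP_EXTEND_BACKWARDS is zero, it cannot be tested with "&"; a
backward extension is simply the absence of AMAP_EXTEND_FORWARDS.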
Add a flag parameter to uvm_mapent_alloc().
This solves a problem where a pool_get(PR_NOWAIT) could trigger a
pool_get(PR_WAITOK) in uvm_mapent_alloc().
Thanks to Chuck Silvers, enami tsugutomo, Andrew Brown and Jason R Thorpe
for feedback.
backed by physical pages (i.e. because it reused a previously-freed one),
so that we can skip a bunch of useless work in that case.
this fixes the underlying problem behind PR 18543, and also speeds up fork()
quite a bit (e.g. 7% on my pc, 1% on my ultra2) when we get a cache hit.
delay freeing the old am_ppref so that if we bail early due to
malloc() failures, valid ppref data hasn't been freed for no reason.
Based on comments from enami.
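
The "allocate first, free last" ordering described above can be sketched
in userland C as follows. The struct, its field names and refmap_grow()
are hypothetical stand-ins, not the real amap structures; the point is
only that the old array is released after the new allocation succeeded.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an amap-like object with a per-slot array. */
struct refmap {
	int	*ppref;		/* per-slot reference data */
	size_t	 nslots;	/* number of valid entries in ppref */
};

/*
 * Grow m->ppref to newslots entries.  Returns 0 on success, -1 on
 * allocation failure.  On failure the old array is left untouched.
 */
static int
refmap_grow(struct refmap *m, size_t newslots)
{
	int *newp;

	assert(newslots > 0 && newslots >= m->nslots);

	newp = calloc(newslots, sizeof(*newp));
	if (newp == NULL)
		return -1;	/* bail early: old data is still valid */

	if (m->nslots > 0)
		memcpy(newp, m->ppref, m->nslots * sizeof(*newp));

	free(m->ppref);		/* only now is it safe to drop the old array */
	m->ppref = newp;
	m->nslots = newslots;
	return 0;
}
```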
with:
Case #1 -- adjust offset: The slot offset in the aref can be
decremented to cover the required size addition.
Case #2 -- move pages and adjust offset: The slot offset is not large
enough, but the amap contains enough inactive space *after* the mapped
pages to make up the difference, so active slots are slid to the "end"
of the amap, and the slot offset is, again, adjusted to cover the
required size addition. This optimizes for hitting case #1 again on
the next small extension.
Case #3 -- reallocate, move pages, and adjust offset: There is not
enough inactive space in the amap, so the arrays are reallocated, and
the active pages are copied again to the "end" of the amap, and the
slot offset is adjusted to cover the required size. This also
optimizes for hitting case #1 on the next backwards extension.
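
The choice between the three cases can be sketched as a small classifier.
This is an illustration under stated assumptions, not the amap_extend()
code: the function name, the enum and the parameter names (maxslot for
the allocated slot count, slotoff for the aref slot offset, slotmapped
for the slots in use, slotadd for the slots to add at the front) are all
made up here.

```c
#include <assert.h>

enum extend_case {
	CASE_ADJUST_OFFSET = 1,	/* case #1 */
	CASE_SLIDE_PAGES = 2,	/* case #2 */
	CASE_REALLOCATE = 3	/* case #3 */
};

/* Hypothetical classifier for a backward extension by slotadd slots. */
static enum extend_case
classify_backward_extend(int maxslot, int slotoff, int slotmapped,
    int slotadd)
{
	assert(slotoff >= 0 && slotmapped >= 0 && slotadd >= 0);
	assert(slotoff + slotmapped <= maxslot);

	/* Case #1: enough unused slots in front of the mapped ones. */
	if (slotoff >= slotadd)
		return CASE_ADJUST_OFFSET;

	/* Case #2: unused slots after the mapped ones cover the shortfall. */
	if (maxslot - (slotoff + slotmapped) >= slotadd - slotoff)
		return CASE_SLIDE_PAGES;

	/* Case #3: not enough room anywhere; the arrays must grow. */
	return CASE_REALLOCATE;
}
```

The case #2 test is equivalent to maxslot >= slotmapped + slotadd: the
existing arrays are large enough overall, only the free space is on the
wrong side.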
This provides the missing piece in the "forward extension of
vm_map_entries" logic, so the merge failure counters have been
removed.
Not many applications will make any use of this at this time (except
for jvms and perhaps gcc3), but a "top-down" memory allocator will use
it extensively.
backwards and forwards) if the previous entry was backed by an amap.
Fixes PR kern/18789, where netscape 7 + a java applet actually manage
to incur forward and bimerges in userspace.
Code reviewed by fvdl and thorpej.
kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals
kqueue is supported by all writable filesystems in the NetBSD tree
(with the exception of Coda) and all device drivers supporting poll(2)
based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
allocations can be merged either forwards or backwards, meaning no new
entries will be added to the list, and some can even be merged in both
directions, resulting in a surplus entry.
This code typically reduces the number of map entries in the
kernel_map by an order of magnitude or more. It also makes possible
recovery from the pathological case of "5000 processes created and
then killed", which leaves behind a large number of map entries.
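
The effect on the entry count can be sketched with a toy sorted list of
address ranges. Everything here (struct range, range_insert(), the return
convention) is hypothetical; the real uvm_map() additionally has to check
protections, flags and backing objects before it may merge.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical map entry covering the half-open range [start, end). */
struct range {
	unsigned long	 start, end;
	struct range	*next;
};

/*
 * Insert ent after prev in a sorted, non-overlapping list.
 * Returns the change in the number of list entries:
 *   +1  no merge possible, ent was linked in
 *    0  ent was absorbed by one neighbour
 *   -1  ent bridged both neighbours, so one old entry became surplus
 */
static int
range_insert(struct range *prev, struct range *ent)
{
	struct range *next = prev->next;
	int merge_prev, merge_next;

	assert(ent->start < ent->end);
	assert(prev->end <= ent->start);
	assert(next == NULL || ent->end <= next->start);

	merge_prev = (prev->end == ent->start);
	merge_next = (next != NULL && ent->end == next->start);

	if (merge_prev && merge_next) {
		/* Merge in both directions: prev absorbs ent and next. */
		prev->end = next->end;
		prev->next = next->next;
		return -1;	/* next is unlinked; the caller may free it */
	}
	if (merge_prev) {
		prev->end = ent->end;
		return 0;
	}
	if (merge_next) {
		next->start = ent->start;
		return 0;
	}
	ent->next = next;
	prev->next = ent;
	return 1;
}
```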
The only forward merge case not covered is the instance of an amap
that has to be extended backwards (WIP). Note that this only affects
processes, not the kernel (the kernel doesn't use amaps), and that
merge opportunities like this come up *very* rarely, if at all. E.g.,
after being up for eight days, I see only three failures in this
regard, and even those are most likely due to programs I'm developing
to exercise this case.
Code reviewed by thorpej, matt, christos, mrg, chuq, chuck, perry,
tls, and probably others. I'd like to thank my mother, the Hollywood
Foreign Press...
tearing down a vm_map. use this to skip the pmap_update()
at the end of all the removes, which allows pmaps to optimize
pmap tear-down. also, use the new pmap_remove_all() hook to
let the pmap implementation know what we're up to.
return failure if swap is full and there are no free physical pages.
have malloc() use this flag if M_CANFAIL is passed to it.
use M_CANFAIL to allow amap_extend() to fail when memory is scarce.
this should prevent most of the remaining hangs in low-memory situations.
This merge changes the device switch tables from static arrays to
tables generated dynamically by config(8).
- All device switches are defined as constant structures in device drivers.
- The new grammar ``device-major'' is introduced to ``files''.
device-major <prefix> char <num> [block <num>] [<rules>]
- All device major numbers must be listed in the port-dependent
majors.<arch> using this grammar.
- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.
- The backward compatibility of loading block/character device
switches by the LKM framework is broken. This is necessary to convert
from block/character device major to device name at runtime and vice versa.
- The restriction on assigning device majors by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switches.
- At compile time, the list of device major numbers is packed into the
kernel, and the LKM framework refers to it to assign device major numbers
dynamically.
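
A rough sketch of what a compiled-in major table with lookups in both
directions could look like. The struct layout, the function names, the
device names "foo" and "bar" and all major numbers are invented for
illustration; the table that config(8) actually generates may be
organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: device name plus its major numbers. */
struct major_entry {
	const char	*name;
	int		 bmajor;	/* block major, -1 if none */
	int		 cmajor;	/* character major */
};

/* Made-up contents; a real table would be generated at build time. */
static const struct major_entry major_table[] = {
	{ "foo",  3,  7 },
	{ "bar", -1, 12 },
};
#define NMAJORS	(sizeof(major_table) / sizeof(major_table[0]))

/* Character major to device name; NULL if the major is unknown. */
static const char *
cmajor_to_name(int cmajor)
{
	size_t i;

	assert(cmajor >= 0);
	for (i = 0; i < NMAJORS; i++)
		if (major_table[i].cmajor == cmajor)
			return major_table[i].name;
	return NULL;
}

/* Device name to character major; -1 if the name is unknown. */
static int
name_to_cmajor(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < NMAJORS; i++)
		if (strcmp(major_table[i].name, name) == 0)
			return major_table[i].cmajor;
	return -1;
}
```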
the page is still loaned to an anon, we should put the page back on a
paging queue. this is because while pages loaned to the kernel really
do need to stay resident (since the kernel is accessing the physical
memory directly), pages loaned to anons can be paged out just fine.
(the page will be paged out twice, first to the object and then again
to the anon, but after that the page can be reused.)
-pass vm_physseg* instead of physseg index, and PFN (int) instead
of physical address (could be done even more)
-simplify detection of boundary crossing and behave more intelligently
in this case
-take stuff out of the inner loops, or put into "#ifdef DEBUG"
(because we move along physsegs we don't need to check that the
pages are physically contiguous)
-make the "simple" and "contiguous" branches look more uniform; at
least the outer loops might coalesce one day
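
One common way to detect a boundary crossing with a single expression is
sketched below. It is not taken from the commit: the function name is
hypothetical, and it assumes that the boundary is a power of two and that
a boundary of 0 means "no constraint".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the range [start, start + size) contains addresses on
 * both sides of a multiple of boundary.
 */
static bool
crosses_boundary(uint64_t start, uint64_t size, uint64_t boundary)
{
	assert(size > 0);

	if (boundary == 0)
		return false;		/* assumed: 0 means unconstrained */
	assert((boundary & (boundary - 1)) == 0);	/* power of two */

	/*
	 * The first and last byte lie in the same boundary-sized window
	 * exactly when they agree in all bits above the window offset.
	 */
	return ((start ^ (start + size - 1)) & ~(boundary - 1)) != 0;
}
```

Comparing only the first and last address avoids a per-page check inside
the loop, which is in the spirit of the cleanup listed above.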
Makoto Fujiwara <makoto@ki.nu> and Manuel Bouyer <bouyer@netbsd.org>.
Help from Allen Briggs, Jason Thorpe, and Matt Thomas.
We need to call cpu_cache_probe() early in boot (machdep.c).
Add 603 info for completeness, and use NBPG not PAGESIZE, as the
latter relies on uvm being setup (cpu_subr.c).
Let uvm_page_recolor() be called before uvm has been set up; just
note the page coloring value (uvm_page.c).
obey the preferences expressed by freelist assignment,
to avoid wasting valuable "low memory" on devices which
don't really need it.
comments:
-I'm not sure searching the physsegs within a freelist
beginning with the biggest is the right thing. This is
what the "memory steal" code in uvm_page.c does, so
keep it consistent.
-There seems to be some confusion whether the upper
address limit passed is inclusive or not. The code stays on
the safe side, possibly leaving one page out.
-The boundary/pagemask check can be simplified, also some
arguments passed are only used for diagnostic checks.
-Integration with UVM_PAGE_TRKOWN???