- Fix some locking bugs; a couple of places would return an error condition
without unlocking the map.
- Deal with maps marked WIREFUTURE; if making an entry VM_PROT_NONE ->
anything else, and it is not already marked as wired, wire it.
of some functions. Use these flags in uvm_map_pageable() to determine
if the map is locked on entry (replaces an already present boolean_t
argument `islocked'), and if the function should return with the map
still locked.
pages.
XXX This should be handled better in the future, probably by marking the
XXX page as released, and making uvm_pageunwire() free the page when
XXX the wire count on a released page reaches zero.
* Implement MADV_DONTNEED: deactivate pages in the specified range,
semantics similar to Solaris's MADV_DONTNEED.
* Add MADV_FREE: free pages and swap resources associated with the
specified range, causing the range to be reloaded from backing
store (vnodes) or zero-fill (anonymous), semantics like FreeBSD's
MADV_FREE and like Digital UNIX's MADV_DONTNEED (isn't it SO GREAT
that madvise(2) isn't standardized!?)
As part of this, move the non-map-modifying advice handling out of
uvm_map_advise(), and into sys_madvise().
As another part, implement general amap cleaning in uvm_map_clean(), and
change uvm_map_clean() to only push dirty pages to disk if PGO_CLEANIT
is set in its flags (and update sys___msync13() accordingly). XXX Add
a patchable global "amap_clean_works", defaulting to 1, which can disable
the amap cleaning code, just in case problems are unearthed; this gives
a developer/user a quick way to recover and send a bug report (e.g. boot
into DDB and change the value).
XXX Still need to implement a real uao_flush().
XXX Need to update the manual page.
With these changes, rebuilding libc will automatically cause the new
malloc(3) to use MADV_FREE to actually release pages and swap resources
when it decides that can be done.
* Nothing currently uses this return value.
* It's arguably an abstraction violation.
Fix amap_unadd()'s API to be consistent w/ amap_add()'s: rather than
take a vm_amap * and a slot number, take a vm_aref * and an offset.
It's now actually possible to use amap_unadd() to remove an anon from
an amap.
> XXX (in)sanity check. We don't do proper datasize checking
> XXX for anonymous (or private writable) mmap(). However,
> XXX know that if we're trying to allocate more than the amount
> XXX remaining under our current data size limit, _that_ should
> XXX be disallowed.
This is one link on the chain of lossage known as PR#7897. It's
definitely not the right fix, but it's better than nothing.
sub-structure malloc() failed, it was quite likely that the function
would return success incorrectly. This is this direct cause of the bug
reported in PR#7897. (Thanks to chs for helping to track it down.)
- rather than treating MAP_COPY like MAP_PRIVATE by sheer virtue of it not
being MAP_SHARED, actually convert the MAP_COPY flag into MAP_PRIVATE.
- return EINVAL if MAP_SHARED and MAP_PRIVATE are both included in flags.
which use uvm_vslock() should now test the return value. If it's not
KERN_SUCCESS, wiring the pages failed, so the operation which is using
uvm_vslock() should error out.
XXX We currently just EFAULT a failed uvm_vslock(). We may want to do
more about translating error codes in the future.
pmap_change_wiring(...,FALSE) unless the map entry claims the address
is unwired. This fixes the following scenario, as described on
tech-kern@netbsd.org on Wed 6/16/1999 12:25:23:
- User mlock(2)'s a buffer, to guarantee it will never become
non-resident while he is using it.
- User then does physio to that buffer. Physio calls uvm_vslock()
to lock down the pages and ensure that page faults do not happen
while the I/O is in progress (possibly in interrupt context).
- Physio does the I/O.
- Physio calls uvm_vsunlock(). This calls uvm_fault_unwire().
>>> HERE IS WHERE THE PROBLEM OCCURS <<<
uvm_fault_unwire() calls pmap_change_wiring(..., FALSE),
which now gives the pmap free reign to recycle the mapping
information for that page, which is illegal; the mapping is
still wired (due to the mlock(2)), but now access of the
page could cause a non-protection page fault (disallowed).
NOTE: This could eventually lead to a panic when the user
subsequently munlock(2)'s the buffer and the mapping info
has been recycled for use by another mapping!
the map be at least read-locked to call this function. This requirement
will be taken advantage of in a future commit.
* Write a uvm_fault_unwire() wrapper which read-locks the map and calls
uvm_fault_unwire_locked().
* Update the comments describing the locking contraints of uvm_fault_wire()
and uvm_fault_unwire().
semantics. That is, regardless of the number of mlock/mlockall calls,
an munlock/munlockall actually unlocks the region (i.e. sets wiring count
to 0).
Add a comment describing why uvm_map_pageable() should not be used for
transient page wirings (e.g. for physio) -- note, it's currently only
(ab)used in this way by a few pieces of code which are known to be
broken, i.e. the Amiga and Atari pmaps, and i386 and pc532 if PMAP_NEW is
not used. The i386 GDT code uses uvm_map_pageable(), but in a safe
way, and could be trivially converted to use uvm_fault_wire() instead.
* Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls.
MCL_CURRENT is presently implemented. MCL_FUTURE is not fully
implemented. Also, the same one-unlock-for-every-lock caveat
currently applies here as it does to mlock(2). This will be
addressed in a future commit.
* Provide the mincore(2) system call, with the same semantics as
Solaris.
* Clean up the error recovery in uvm_map_pageable().
* Fix a bug where a process would hang if attempting to mlock a
zero-fill region where none of the pages in that region are resident.
[ This fix has been submitted for inclusion in 1.4.1 ]
looking up a kernel address, check to see if the address is on this
"interrupt-safe" list. If so, return failure immediately. This prevents
a locking screw if a page fault is taken on an interrupt-safe map in or
out of interrupt context.
setting recursive has no effect! The kernel lock manager doesn't allow
an exclusive recursion into a shared lock. This situation must simply
be avoided. The only place where this might be a problem is the (ab)use
of uvm_map_pageable() in the Utah-derived pmaps for m68k (they should
either toss the iffy scheme they use completely, or use something like
uvm_fault_wire()).
In addition, once we have looped over uvm_fault_wire(), only upgrade to
an exclusive (write) lock if we need to modify the map again (i.e.
wiring a page failed).
don't unlock a kernel map (!!!) and then relock it later; a recursive lock,
as it used in the user map case, is fine. Also, don't change map entries
while only holding a read lock on the map. Instead, if we fail to wire
a page, clear recursive locking, and upgrade back to a write lock before
dropping the wiring count on the remaining map entries.