- remove special treatment of pager_map mappings in pmaps. this is
required now, since I've removed the globals that expose the address range.
pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
no longer any need to special-case it.
- eliminate struct uvm_vnode by moving its fields into struct vnode.
- rewrite the pageout path. the pager is now responsible for handling the
high-level requests instead of only getting control after a bunch of work
has already been done on its behalf. this will allow us to UBCify LFS,
which needs tighter control over its pages than other filesystems do.
writing a page to disk no longer requires making it read-only, which
allows us to write wired pages without causing all kinds of havoc.
- use a new PG_PAGEOUT flag to indicate that a page should be freed
on behalf of the pagedaemon when it's unlocked. this flag is very similar
to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
pageout fails due to eg. an indirect-block buffer being locked.
this allows us to remove the "version" field from struct vm_page,
and together with shrinking "loan_count" from 32 bits to 16,
struct vm_page is now 4 bytes smaller.
- no longer use PG_RELEASED for swap-backed pages. if the page is busy
because it's being paged out, we can't release the swap slot to be
reallocated until that write is complete, but unlike with vnodes we
don't keep a count of in-progress writes so there's no good way to
know when the write is done. instead, when we need to free a busy
swap-backed page, just sleep until we can get it busy ourselves.
- implement a fast-path for extending writes which allows us to avoid
zeroing new pages. this substantially reduces cpu usage.
- encapsulate the data used by the genfs code in a struct genfs_node,
which must be the first element of the filesystem-specific vnode data
for filesystems which use genfs_{get,put}pages().
- eliminate many of the UVM pagerops, since they aren't needed anymore
now that the pager "put" operation is a higher-level operation.
- enhance the genfs code to allow NFS to use the genfs_{get,put}pages
instead of a modified copy.
- clean up struct vnode by removing all the fields that used to be used by
the vfs_cluster.c code (which we don't use anymore with UBC).
- remove kmem_object and mb_object since they were useless.
instead of allocating pages to these objects, we now just allocate
pages with no object. such pages are mapped in the kernel until they
are freed, so we can use the mapping to find the page to free it.
this allows us to remove splvm() protection in several places.
The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
adjusted via sysctl. file systems that have hash tables which are
sized based on the value of this variable now resize those hash tables
using the new value. the max number of FFS softdeps is also recalculated.
convert various file systems to use the <sys/queue.h> macros for
their hash tables.
the lower layer needs to have control over that flag.
that didn't solve the whole problem that it was trying to solve anyway.
(the issue is that if we have create mappings to the lower layer,
we need to get rid of those when we copy the file to the upper layer.)
we'll have to figure out some other way to handle this.
- We need to skip PGO_DONTCARE page also.
- ``npages'' returned by VOP_GETPAGES for lower vp doesn't count
those pages in this case. So, just loop ``npages'' times is
insufficient. Loop while there is real pages instead.
in vfs_detach(). vfs_done may free global filesystem's resources,
typically those allocated in respective filesystem's init function.
Needed so those filesystems which went in via LKM have a chance to
clean after themselves before unloading. This fixes random panics
when LKM for filesystem using pools was loaded and unloaded several
times.
For each leaf filesystem, add appropriate vfs_done routine.
not set, unlock the vnode before calling the device's close routine and
relock it after it returns. tty close routines will sleep waiting for
buffers to drain, which won't happen often times as the other side needs
to grab the vnode lock first.
Make all unmount routines lock the device vnode before calling VOP_CLOSE().
exists is bogus. The goal here is to produce a synthetic link count
which won't confuse fts and similar routines which "know" that
directories with a link count of 2 don't have subdirectories (and
thus, they can avoid having to stat every entry in the directory
looking for subdirectories which aren't there).
We know that non-UNIX filesystem implementations may return a link
count of `1' for directories with an indeterminate number of
subdirectories; if either the upper or lower layer returns a link
count of `1', return a link count of 1. If both layers return a link
count of 2, return a link count of 2; otherwise, return the sum of the
link count of both layers.
Also, fix PR7430: unionfs ignores read-only mounts. Check for
MNT_RDONLY in union_lookup (more-or-less as in layer_lookup) as well
as union_access() and union_setattr().
Note that a read-only union layer may still cause side effects on the
underlying filesystems... Most notably, we'll still attempt to create
shadow directories in the upper layer. Also, of course, we'll
side-effect atimes in the lower layer.
"panic: lockmgr: using decommisioned lock"
(only if DIAGNOSTIC)
The problem turned out to be due to the way LK_DRAIN was (not) handled
in union_lock; it just got passed through to the lock on the upper
vnode (which got marked as decommissioned, instead of that happening
to the union vnode. When the upper vnode was next locked (typically
when it was released), it went kaboom.
getnewvnode now checks this bit, and it if's set makes sure a vnode's not
locked before removing it from the free list.
Closes PR 7954 by Alan Barrett <apb@iafrica.com>.
Update coda to new struct lock in struct vnode.
make fdescfs, kernfs, portalfs, and procfs actually lock their vnodes.
It's not that hard.
Make unionfs set v_vnlock = NULL so any overlayed fs will call its
VOP_LOCK.
deadlock in VOP_FSYNC() if the unreferenced vnode picked for
reclamation happened to be stacked on top of a vnode the process
already had locked. This could happen if the same filesystem was
accessed both through a union mount and directly; it seemed to happen
most frequently when the direct access was through NFS.
Avoid this deadlock by changing vinvalbuf to pass a new FSYNC_RECLAIM
flag bit to VOP_FSYNC() to indicate that a reclaim is in progress and
only a `shallow' fsync is necessary.
Do nothing in *_fsync() in umapfs, nullfs, and unionfs when
FSYNC_RECLAIM is set; the underlying vnodes will shortly be released
in *_reclaim and may be reclaimed (and fsync'ed) later.
to pass down a locked node. Modify union_copyup() to call VOP_CLOSE
locked nodes.
Also fix a bug in union_copyup() where a lock on the lower vnode would
only be released if VOP_OPEN didn't fail.
as with user-land programs, include files are installed by each directory
in the tree that has includes to install. (This allows more flexibility
as to what gets installed, makes 'partial installs' easier, and gives us
more options as to which machines' includes get installed at any given
time.) The old SYS_INCLUDES={symlinks,copies} behaviours are _both_
still supported, though at least one bug in the 'symlinks' case is
fixed by this change. Include files can't be build before installation,
so directories that have includes as targets (e.g. dev/pci) have to move
those targets into a different Makefile.
(1) Fix a typo that caused a NULL pointer deref.
(2) union_copyup() locks the vnode, so unlock it before calling relookup().
PR #5272, MINOURA, Makoto <minoura@kw.netlaputa.ne.jp>.
vn, with a 0 component. If the upper fs was a unionfs,
union_whiteout() would deref compnent to get a struct proc, and panic.
struct proc was only being passed to FIXUP, which never used it. It
turns out this happened a lot. I ripped most of the unneeded code
out, and left in the few places that really did need the proc handle.