find.
The filesystem ones all call genfs_eopnotsupp - right now I am only
implementing the plumbing and we can implement fallocate and/or
fdiscard for files later.
The device ones call spec_fallocate (which is also genfs_eopnotsupp)
and spec_fdiscard, which dispatches to the device-level op.
The fifo ones all call vn_fifo_bypass, which also ends up being
EOPNOTSUPP.
the vnode operations vector for active vnodes is unsafe because it
is not known whether deadfs or the original file system will be
called.
- Pass down LK_RETRY to the lock operation (hint for deadfs only).
- Change deadfs lock operation to return ENOENT if LK_RETRY is unset.
- Change all other lock operations to check for dead vnode once
the vnode is locked and unlock and return ENOENT in this case.
With these changes in place vnode lock operations will never succeed
after vclean() has marked the vnode as VI_XLOCK and before vclean()
has changed the operations vector.
Adresses PR kern/37706 (Forced unmount of file systems is unsafe)
Discussed on tech-kern.
Welcome to 6.99.33
- fix writing to coda files. this is probably not the right way to do
this, but it satisfies the locking protocol:
1. Sometimes coda_open() is called with an unlocked vnode which
does not satisfy the locking protocol. Lock it for now. We
need to find out why this happens
2. VFS_VGET sometimes returns the container vnode unlocked. What
is the locking protocol for VFS_VGET? We also lock it here.
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.
- Simplify locking in some pmap(9) modules by removing P->V locking.
- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).
- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.
- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.
Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
pathbuf object passed to namei as work space instead. (For now a pnbuf
pointer appears in struct nameidata, to support certain unclean things
that haven't been fixed yet, but it will be going away in the future.)
This removes the need for the SAVENAME and HASBUF namei flags.
in the vnode. All LK_* flags move from sys/lock.h to sys/vnode.h. Calls
to vlockmgr() in file systems get replaced with VOP_LOCK() or VOP_UNLOCK().
Welcome to 5.99.34.
Discussed on tech-kern.
- VOP_LOCK(vp, flags): Limit the set of allowed flags to LK_EXCLUSIVE,
LK_SHARED and LK_NOWAIT. LK_INTERLOCK is no longer allowed as it
makes no sense here.
- VOP_ISLOCKED(vp): Remove the for some time unused return value
LK_EXCLOTHER. Mark this operation as "diagnostic only".
Making a lock decision based on this operation is no longer allowed.
Discussed on tech-kern.
check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr
All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.
XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.
quick consensus on tech-kern
v_interlock. They are actually the same lock, but the former protects
the uvm object associated with the vnode, and the latter vnode
reference counts. Explained to me by chs@.
obtaining interlock on container vnode in coda_{get,put}pages. This
is the only functional change in this commit.
Improve many comments. In particular, note that the relationship
between VOP_OPEN and obtaining a container file (e.g. for getpages for
executables) is messy.
Add printfs for 'internal open' cases in coda_rdwr. These have not
been triggered in my testing. Note an apparent vref leak.
does not trigger assertions in uvm_fault, and executing files from
coda works as well.
Code very lightly reviewed by wrstuden@; scrutiny by those who
understand vnode and especially {get,put}pages would be appreciated.
Re-enable mmap. The problem is how uvm_fault handles page faults from
coda vnodes via container files, and executing a program caused the
same problem so disabling mmap only helped cp(1).
coda_open:
rename variables to match vnode_if.src
better comments about lock/reference state of vnodes
keep lock on container file until after VOP_OPEN, which requires locked vp
remove #if 0'd code to PNBUF_PUT
coda_link:
rename variables to match vnode_if.src
error out early if vp == dvp
check return value on vn_lock, and add comment questoining the lock
clarify lock handling, but unchanged logic
remove #if 0'd code to PNBUF_PUT
coda_rmdir:
error out early if vp == dvp
remove #if 0'd code to PNBUF_PUT
coda_grab_vnode:
add comments, and in particular question undocumented VFS_VGET semantics
coda_getpages:
question calling VOP_OPEN, which requires a locked vnode, with the
vnode we got (vop_getpages does not guarantee a locked vnode)
coda_putpages:
remove inexplicable simple_unlock(&vp->v_interlock);
add printf so we notice if this is ever called
add comment explaining that the implementation will lead to trouble,
because vnode_if.src says putpages is called with v_uobj.vmobjlock
held and is supposed to unlock it
With these changes and an uncommitted change to uvm_fault not to panic
if uvm objects are not equal, coda seems stable again.
got a panic in uvm_fault from ffs_write. I believe this is because cp
used mmap, the container file page was not in core, and uvm_fault
objected to the container file vnode and the coda vnode not matching.
I have long been plagued by crashes on cp from coda, and this was the
first time I got and understood a backtrace.
Clean up old comments that are no longer accurate.
Document refcounting better.
Note some questionable behaviors with XXX.
Clean up PNBUF_PUT and SAVESTART. Only do this where vnodeops(9) says
we should, and do it on error also.
In symlink, vput parent and free namebuf even in error cases.
the unlock parent, lock child, lock parent in the ISDOTDOT case.
Clean up and rewrite comments to match more closely current reality.
Sprinkle XXX where I'm not sure the current rules are being followed.
Reviewed by wrstuden@, who agreed that this is an improvement over the
current code, with concerns about LK_RETRY and whether the ISDOTDOT
locking is done soon enough.
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
these now always return the parent vnode locked. namei() works as before.
lookup() and various other paths no longer acquire vnode locks in the
wrong order via vrele(). fixes PR 32535.
as a nice side effect, path lookup is also up to 25% faster.
- the above allows us to get rid of PDIRUNLOCK.
- also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
- remove an assumption in layer_node_find() that all file systems implement
a recursive VOP_LOCK() (unionfs doesn't).
- require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
fill in eopnotsupp() for file systems that don't support being exported
and remove the checks for NULL. (layerfs calls these without checking.)
- in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
adjust which vnode is locked. fixes PR 33374.
- apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().