- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.
- Simplify locking in some pmap(9) modules by removing P->V locking.
- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).
- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.
- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.
Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
add do_sys_sync() that takes an 'lwp' (for l_cred) as an argument.
Explicitly pass &lwp0 rather than NULL and expecting sys_sync to
substitute some random lwp.
already prevented). File systems are no longer responsible to check this.
Clean up and add asserts (note that dvp == vp cannot happen in vop_link).
OK dholland@
- Add fd_set_exclose() to encapsulate uses of FIO{,N}CLEX, O_CLOEXEC, F{G,S}ETFD
- Add a pipe1() function to allow passing flags to the fd's that pipe(2)
opens to ease implementation of linux pipe2(2)
- Factor out fp handling code from open(2) and fhopen(2)
to store disk quota usage and limits, integrated with ffs
metadata. Usage is checked by fsck_ffs (no more quotacheck)
and is covered by the WAPBL journal. Enabled with kernel
option QUOTA2 (added where QUOTA was enabled in kernel config files),
turned on with tunefs(8) on a per-filesystem
basis. mount_mfs(8) can also turn quotas on.
See http://mail-index.netbsd.org/tech-kern/2011/02/19/msg010025.html
for details.
calling VOP_ACCESS with flags 0 and something asserts deep in the
bowels of kauth. PR 44648 from Taylor Campbell. (I moved the check
earlier relative to the suggested patch.)
Pullup candidate.
parent dir) associated with SAVESTART in relookup().
Check all call sites to make sure that SAVESTART wasn't set while
calling relookup(); if it was, adjust the refcount behavior. Remove
related references to SAVESTART.
The only code that was reaching the extra ref was msdosfs_rename,
where the refcount behavior was already fairly broken and/or gross;
repair it.
Add a dummy 4th argument to relookup to make sure code that hasn't
been inspected won't compile. (This will go away next time the
relookup semantics change, which they will.)
pathbuf object passed to namei as work space instead. (For now a pnbuf
pointer appears in struct nameidata, to support certain unclean things
that haven't been fixed yet, but it will be going away in the future.)
This removes the need for the SAVENAME and HASBUF namei flags.
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.
Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).
The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.
after the file system is setup by VFS_MOUNT(). This way recursive vnode
locks are no longer needed here and mounts on null mounts no longer fail
as described in PR #43439 (mount_null panic: lockdebug_wantlock: locking
against myself).
Based on a proposal from and
Reviewed by: David A. Holland <dholland@netbsd.org>
(for an unprivileged user) to force vfs modules to remain loaded
forever. Also, it's possible for an admin with fat fingers to have
to curse out loud (a lot) and reboot.
.. or at least fix things as much as seems to be possible without
involving 1000 zorkmids. do_sys_mount() takes either struct vfsops
(which hopefully came properly referenced) or a userspace string
for file system type. The standard in-kernel calling convention
of "do_sys_mount(l, vfs_getopsbyname("nfs"), NULL," is not to be
considered healthy, kosher, or even tasty (although if vfs_getopsbyname()
fails the whole thing *currently* fails without the program counter
pointing to hyperspace).
years ago when the kernel was modified to not alter ABI based on
DIAGNOSTIC, and now just call the respective function interfaces
(in lowercase). Plenty of mix'n match upper/lowercase has creeped
into the tree since then. Nuke the macros and convert all callsites
to lowercase.
no functional change
Factor out common code of chroot-like syscalls into change_root() and export
that function for use in other parts of the kernel.
Rename change_dir() to chdir_lookup() as the latter describes better what
the function does. While there, move the namei_data initialisation into
chdir_lookup(), too. And export chdir_lookup().
check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr
All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.
XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.
a new struct mount-allocation routine, vfs_mountalloc(9). Documentation
updates will follow.
Attention: Synchronization Oversight Committee! In mount_domount(),
I postpone the call mutex_enter(&mp->mnt_updating) until right before
the VFS_MOUNT(9) call because (1) that looks to me like the earliest
possible opportunity for mp to become visible to any other LWP, because
it was just kmem_zalloc(9)'d and (2) it made extracting the common code
much easier. Tell me if my reasoning is faulty.
PR kern/16942 panic with softdep and quotas
PR kern/19565 panic: softdep_write_inodeblock: indirect pointer #1 mismatch
PR kern/26274 softdep panic: allocdirect_merge: ...
PR kern/26374 Long delay before non-root users can write to softdep partitions
PR kern/28621 1.6.x "vp != NULL" panic in ffs_softdep.c:4653 while unmounting a softdep (+quota) filesystem
PR kern/29513 FFS+Softdep panic with unfsck-able file-corruption
PR kern/31544 The ffs softdep code appears to fail to write dirty bits to disk
PR kern/31981 stopping scsi disk can cause panic (softdep)
PR kern/32116 kernel panic in softdep (assertion failure)
PR kern/32532 softdep_trackbufs deadlock
PR kern/37191 softdep: locking against myself
PR kern/40474 Kernel panic after remounting raid root with softdep
Retire softdep, pass 2. As discussed and later formally announced on the
mailing lists.
- vfs_syscalls.c rev. 1.342 fails to invert condition correcly when
then-clause and else-clause is swapped. Since then, revoke(2) fails
if it is issued by file owner.
- Probably since rev. 1.160 of genfs_vnops.c, revoke(2) fails if it is
applied to non-device file and drops kernel into ddb.
into modules. By and large this commit:
- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime