Commit Graph

127 Commits

Author SHA1 Message Date
rmind e225b7bd09 Welcome to 5.99.53! Merge rmind-uvmplock branch:
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
  New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
  the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
  Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
  kernel-lock on some ports).  Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
2011-06-12 03:35:36 +00:00
rmind 4a4e52516e Remove cache_purge(9) calls from reclamation routines in the file systems,
as vclean(9) performs it for us since Lite2 merge.
2011-05-19 03:11:55 +00:00
hannken fb62bef947 Make holding v_interlock mandatory for callers of vget().
Announced some time ago on tech-kern.
2010-07-21 17:52:09 +00:00
hannken 245651a23d Remove vlockmgr(). Generic vnode lock operations now use a rwlock located
in the vnode.  All LK_* flags move from sys/lock.h to sys/vnode.h.  Calls
to vlockmgr() in file systems get replaced with VOP_LOCK() or VOP_UNLOCK().

Welcome to 5.99.34.

Discussed on tech-kern.
2010-07-01 13:00:54 +00:00
hannken 1423e65b26 Clean up vnode lock operations pass 2:
VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
2010-06-24 12:58:48 +00:00
pooka 242bf1c3e7 Stop exposing fifofs internals and leave only fifo_vnodeop_p visible. 2010-03-29 13:11:32 +00:00
mlelstv 61ec757e43 Drop two uses of disk label data.
msdosfs and cd9660 are the only filesystems that verify the filesystem
type in the label. This is the wrong place, sanity checks should only
rely on the inner structure of the filesystem (like signatures or
magic numbers).

msdosfs also used the device type information from the label to
deduce a filesystem parameter heuristically for the gemdos variant.
If there is no information inside the filesystem data itself, this
should be an explicit mount option.
2010-01-26 21:29:48 +00:00
pooka c3183f3251 The VATTR_NULL/VREF/VHOLD/HOLDRELE() macros lost their will to live
years ago when the kernel was modified to not alter ABI based on
DIAGNOSTIC, and now just call the respective function interfaces
(in lowercase).  Plenty of mix'n match upper/lowercase has creeped
into the tree since then.  Nuke the macros and convert all callsites
to lowercase.

no functional change
2010-01-08 11:35:07 +00:00
tsutsui 3ef39e3a6a Apply a similar fix for mount function from ffs_vfsops.c rev 1.186:
Change cd9660_mount, in MNT_UPDATE case, to check dev_t's for equality
 instead of just vnode pointers.  Fixes erroneous "Invalid argument"
 errors from mount(8) with -u against cd9660 root in the presence of
 mfs or tmpfs /dev prepared after initial mountroot.

Tested on QEMU running cobalt Restore CD.
2009-10-19 17:53:36 +00:00
elad 009f5d2f88 Where possible, extract the file-system's access() routine to two internal
functions: the first checking if the operation is possible (regardless of
permissions), the second checking file-system permissions, ACLs, etc.

Mailing list reference:

	http://mail-index.netbsd.org/tech-kern/2009/06/21/msg005311.html
2009-07-03 21:17:40 +00:00
dholland effcf1af5c Convert 67 namei call sites to use namei_simple, in these functions:
check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.
2009-06-29 05:08:15 +00:00
elad 870920260d Move the implementation of vaccess() to genfs_can_access(), in line with
the other routines of the same spirit.

Adjust file-system code to use it.

Keep vaccess() for KPI compatibility and to keep element of least
surprise. A "diagnostic" message warning that vaccess() is deprecated will
be printed when it's used (obviously, only in DIAGNOSTIC kernels).

No objections on tech-kern@:

	http://mail-index.netbsd.org/tech-kern/2009/06/21/msg005310.html
2009-06-23 19:36:38 +00:00
elad 9670d2e41d Add genfs_can_mount() and use it to prevent some more code duplication of
the security checks when mounting a device (VOP_ACCESS() + kauth(9) call)).

Proposed with no objections on tech-kern@:

	http://mail-index.netbsd.org/tech-kern/2009/04/20/msg004859.html

The vnode is always expected to be locked, so no locking is done outside
the file-system code.
2009-04-25 18:53:43 +00:00
cegger 636f235e0d buildfix: re-adapt for major/minor returning 32bit value again. 2009-01-22 16:05:03 +00:00
cegger 08ebead94e make this compile 2009-01-11 09:51:38 +00:00
cegger 9b87d582bd kill MALLOC and FREE macros. 2008-12-17 20:51:31 +00:00
pooka b4099c3e1d Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.
2008-11-26 20:17:33 +00:00
rumble 28f5ebd853 Create sysctl entries during module initialisation and destroy them
appropriately.

Many of these file systems are now ready for modularisation.
2008-06-28 01:34:05 +00:00
hannken 5d2bff060a Make sure all cached buffers with valid, not yet written data have been
run through copy-on-write.  Call fscow_run() with valid data where possible.

The LP_UFSCOW hack is no longer needed to protect ffs_copyonwrite() against
endless recursion.

- Add a flag B_MODIFY to bread(), breada() and breadn().  If set the caller
  intends to modify the buffer returned.

- Always run copy-on-write on buffers returned from ffs_balloc().

- Add new function ffs_getblk() that gets a buffer, assigns a new blkno,
  may clear the buffer and runs copy-on-write.  Process possible errors
  from getblk() or fscow_run().  Part of PR kern/38664.

Welcome to 4.99.63

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
2008-05-16 09:21:59 +00:00
ad 42d0626726 PR kern/38141 lookup/vfs_busy acquire rwlock recursively
Simplify the mount locking. Remove all the crud to deal with recursion on
the mount lock, and crud to deal with unmount as another weirdo lock.

Hopefully this will once and for all fix the deadlocks with this. With this
commit there are two locks on each mount:

- krwlock_t mnt_unmounting. This is used to prevent unmount across critical
  sections like getnewvnode(). It's only ever read locked with rw_tryenter(),
  and is only ever write locked in dounmount(). A write hold can't be taken
  on this lock if the current LWP could hold a vnode lock.

- kmutex_t mnt_updating. This is taken by threads updating the mount, for
  example when going r/o -> r/w, and is only present to serialize updates.
  In order to take this lock, a read hold must first be taken on
  mnt_unmounting, and the two need to be held across the operation.

One effect of this change: previously if an unmount failed, we would make a
half hearted attempt to back out of it gracefully, but that was unlikely to
work in a lot of cases. Now while an unmount that will be aborted is in
progress, new file operations within the mount will fail instead of being
delayed. That is unlikely to be a problem though, because if the admin
requests unmount of a file system then s(he) has made a decision to deny
access to the resource.
2008-05-06 18:43:44 +00:00
ad e071d39c84 - Convert hashinit() to use kmem_alloc(). The hash tables can be large
and it's better to not have them in kmem_map.
- Convert a couple of minor items along the way to kmem_alloc().
- Fix some memory leaks.
2008-05-05 17:11:16 +00:00
ad dae796ef8b Convert cd9660 to attach as a module. 2008-05-03 15:57:41 +00:00
ad 928a6b2096 PR kern/38135 vfs_busy/vfs_trybusy confusion
The previous fix worked, but it opened a window where mounts could have
disappeared from mountlist while the caller was traversing it using
vfs_trybusy(). Fix that.
2008-04-30 12:49:16 +00:00
ad baa3395f8f PR kern/38057 ffs makes assuptions about devvp file system
PR kern/33406 softdeps get stuck in endless loop

Introduce VFS_FSYNC() and call it when syncing a block device, if it
has a mounted file system.
2008-04-29 18:18:08 +00:00
matt 8014191a2e Convert to ansi definitions from old-style definitons. 2008-02-27 19:43:36 +00:00
ad 25153c3ec9 PR kern/37706 (forced unmount of file systems is unsafe):
- Do reference counting for 'struct mount'. Each vnode associated with a
  mount takes a reference, and in turn the mount takes a reference to the
  vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount
  locking inherited from 4.4BSD with a recursable rwlock.
2008-01-30 11:46:59 +00:00
ad 3490efcc63 Replace struct lock on vnodes with a simpler lock object built on
krwlock_t. This is a step towards removing lockmgr and simplifying
vnode locking. Discussed on tech-kern.
2008-01-30 09:50:19 +00:00
dholland 717e1785a5 Fix some race conditions in rename.
Introduce a per-FS rename lock and new vfsops to manipulate it.
Get this lock while renaming. Also add another relookup() in do_sys_rename,
which is a hack to kludge around some of the worst deficiencies of
ufs_rename.
reviewed-by: pooka (and an earlier rev by ad)
posted on tech-kern with no objections.
2008-01-28 14:31:15 +00:00
ad 1997a1e1f4 Remove VOP_LEASE. Discussed on tech-kern. 2008-01-25 14:32:11 +00:00
ad 703069c0e9 specfs changes for PR kern/37717 (raidclose() is no longer called on
shutdown). There are still problems with device access and a PR will be
filed.

- Kill checkalias(). Allow multiple vnodes to reference a single device.

- Don't play dangerous tricks with block vnodes to ensure that only one
  vnode can describe a block device. Instead, prohibit concurrent opens of
  block devices. As a bonus remove the unreliable code that prevents
  multiple file system mounts on the same device. It's no longer needed.

- Track opens by vnode and by device. Issue cdev_close() when the last open
  goes away, instead of abusing vnode::v_usecount to tell if the device is
  open.
2008-01-24 17:32:52 +00:00
ad 42c90ece4c Fix dodgy tests of v_usecount. 2008-01-17 10:39:14 +00:00
ad 4a780c9ae2 Merge vmlocking2 to head. 2008-01-02 11:48:20 +00:00
pooka db06a930e6 Remove cn_lwp from struct componentname. curlwp should be used
from on.  The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
2007-12-08 19:29:36 +00:00
ad 6a3b582fe3 Merge locking changes + fixes from the vmlocking branch. 2007-12-08 14:41:11 +00:00
pooka 61e8303e9d Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start.  In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
2007-11-26 19:01:26 +00:00
ad 7dad9f7391 Merge from vmlocking:
- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
2007-10-10 20:42:20 +00:00
ad 9f56dfa520 Merge brelse() changes from the vmlocking branch. 2007-10-08 18:02:53 +00:00
pooka 8d1f899239 * nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
  use VFS_PROTOS() instead of manually prototyping the methods
2007-07-31 21:14:15 +00:00
rumble 3ea6a6534e Use _DIRENT_MINSIZE when determining the number of NFS cookies to allocate,
rather than hard-coding 16.
2007-07-29 21:17:41 +00:00
ad a0d1fd8d0c It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
2007-07-29 13:31:07 +00:00
pooka d9970c8066 Use eopnotsupp() instead of vfs_stdsuspendctl() and retire the latter. 2007-07-26 22:57:36 +00:00
pooka 606670f3e8 Initialize size and/or writesize when creating a vnode. 2007-07-23 11:27:45 +00:00
pooka e24b0872a4 Make set_statvfs_info() take a parameter for the vfs name instead
of always retrieving it from mp->mnt_op->vfs_name

christos ok
2007-07-17 11:19:31 +00:00
dsl 2721ab6c7b Change the VFS_MOUNT() interface so that the 'data' buffer passed to the
fs code is a kernel buffer, pass though the length of the buffer as well.
Since the length of the userspace buffer isn'it (yet) passed through the mount
system call, add a field to the vfsops structure containing the default length.
Split sys_mount() for calls from compat code.
Ride one of the recent kernel version changes - old fs LKMs will load, but
sys_mount() will reject any attempt to use them.
2007-07-12 19:35:32 +00:00
pooka 4b8065ff1e mntflushbuf() cargo cult comment mania cleanup. there is no mntflushbuf(). 2007-07-09 00:01:42 +00:00
pooka 835b0326c5 Using POOL_INIT here makes no sense, since file systems always have
an init method.  So get rid of it and #ifdef _LKM and just always
init in the init method.  Give malloc types the same treatment.
Makes file systems nicer to work with in linksetless environments
and fixes a few LKM discrepancies.
2007-06-30 09:37:53 +00:00
ad 59d979c5f1 Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.
2007-03-12 18:18:22 +00:00
christos 53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
thorpej 712239e366 Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
2007-02-21 22:59:35 +00:00
ad adbb9ec2fa Call genfs_node_destroy() where appropriate. 2007-02-20 16:21:03 +00:00