Commit Graph

110 Commits

Author SHA1 Message Date
pooka
0f85a5be83 Remove my development ifdefs. (hi simon!) 2008-08-20 14:06:35 +00:00
simonb
36d65f1138 Merge the simonb-wapbl branch. From the original branch commit:
Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
   journaling code.  Originally written by Darrin B. Jewell while
   at Wasabi and updated to -current by Antti Kantee, Andy Doran,
   Greg Oster and Simon Burge.

OK'd by core@, releng@.
2008-07-31 05:38:04 +00:00
ad
42d0626726 PR kern/38141 lookup/vfs_busy acquire rwlock recursively
Simplify the mount locking. Remove all the crud to deal with recursion on
the mount lock, and crud to deal with unmount as another weirdo lock.

Hopefully this will once and for all fix the deadlocks with this. With this
commit there are two locks on each mount:

- krwlock_t mnt_unmounting. This is used to prevent unmount across critical
  sections like getnewvnode(). It's only ever read locked with rw_tryenter(),
  and is only ever write locked in dounmount(). A write hold can't be taken
  on this lock if the current LWP could hold a vnode lock.

- kmutex_t mnt_updating. This is taken by threads updating the mount, for
  example when going r/o -> r/w, and is only present to serialize updates.
  In order to take this lock, a read hold must first be taken on
  mnt_unmounting, and the two need to be held across the operation.

One effect of this change: previously if an unmount failed, we would make a
half hearted attempt to back out of it gracefully, but that was unlikely to
work in a lot of cases. Now while an unmount that will be aborted is in
progress, new file operations within the mount will fail instead of being
delayed. That is unlikely to be a problem though, because if the admin
requests unmount of a file system then s(he) has made a decision to deny
access to the resource.
2008-05-06 18:43:44 +00:00
ad
d63f54a366 lookup: Do a vfs_trybusy(). If the file system is being unmounted, then
just fail the operation.
2008-05-06 15:04:00 +00:00
ad
928a6b2096 PR kern/38135 vfs_busy/vfs_trybusy confusion
The previous fix worked, but it opened a window where mounts could have
disappeared from mountlist while the caller was traversing it using
vfs_trybusy(). Fix that.
2008-04-30 12:49:16 +00:00
ad
e3610f1886 kern/38135 vfs_busy/vfs_trybusy confusion
The symptom was that sometimes file systems would occasionally not appear
in output from 'df' or 'mount' if the system was busy. Resolution:

- Make mount locks work somewhat like vm_map locks.
- vfs_trybusy() now only fails if the mount is gone, or if someone is
  unmounting the file system. Simple contention on mnt_lock doesn't
  cause it to fail.
- vfs_busy() will wait even if the file system is being unmounted.
2008-04-29 23:51:04 +00:00
ad
25153c3ec9 PR kern/37706 (forced unmount of file systems is unsafe):
- Do reference counting for 'struct mount'. Each vnode associated with a
  mount takes a reference, and in turn the mount takes a reference to the
  vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount
  locking inherited from 4.4BSD with a recursable rwlock.
2008-01-30 11:46:59 +00:00
ad
2ecdf58c2c Remove systrace. Ok core@. 2007-12-31 15:31:24 +00:00
pooka
db06a930e6 Remove cn_lwp from struct componentname. curlwp should be used
from on.  The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
2007-12-08 19:29:36 +00:00
mjf
d4a648c345 Implement a new magic string for magic symlinks, @ruid, which exapnds to the
real user id of the process and use this magic string for per-user tmp.
This should fix PR/35687

Kernel parts reviewed by wrstuden@
2007-12-04 22:09:01 +00:00
pooka
61e8303e9d Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start.  In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
2007-11-26 19:01:26 +00:00
ad
d18c6ca4de Merge from vmlocking:
- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
2007-11-07 00:23:13 +00:00
ad
7dad9f7391 Merge from vmlocking:
- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
2007-10-10 20:42:20 +00:00
ad
63c4506184 Changes to make ktrace LKM friendly and reduce ifdef KTRACE. Proposed
on tech-kern.
2007-08-15 12:07:23 +00:00
pooka
e7c5957392 Revert code part of rev 1.95, yamt pointed out it changes NFS semantics. 2007-08-12 23:40:40 +00:00
pooka
67c57c75e0 CREATE is a write operation in my book, so check for that also when
checking for a readonly lookup.  This shouldn't make a difference
now, though, as the only RDONLY lookup is done by getcwd(), and
that a) doesn't create files b) calls LOOKUP directly anyway.

Also, fix comment I managed to miss in the previous commit (I didn't
expect the same comment to be there twice).
2007-08-12 19:42:09 +00:00
pooka
5450d0d87b cn_flags RDONLY brilliantly has nothing to do with the file system
itself being r/o, so fix a couple of misguided comments.
2007-08-12 19:31:12 +00:00
ad
88ab7da936 Merge some of the less invasive changes from the vmlocking branch:
- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
2007-07-09 20:51:58 +00:00
christos
09a50be501 - remove pathname_ interface.
- use macros to deal with pathnames in userspace, when veriexec is used.
- reorder the veriexec_ call arguments for consistency.
With help from elad@ finding the last bug.
2007-05-19 22:11:22 +00:00
dsl
e9a2689558 Since ktrace/systrace can sleep, move the VREF(dp) to before them. 2007-04-26 21:21:44 +00:00
dsl
41bef1b523 Be a little less over-zelous about converting ".." at the emulation root
to the real root.  Rather that do the check inside lookup() - where it
applies to to every ".." in a pathname, explicitly check the start of
the caller-supplied buffers and any absolute symbolic links.
Note that in the latter case the re-search from the real root is supressed.
Should fix PR kern/36225
2007-04-26 20:58:37 +00:00
dsl
9f6d43522e Pass the emulation root string into namei() from emul_find_interp() so that
the ktrace entries for lookups done during exec can have the full filename.
This is rather a hack :-)
2007-04-26 20:06:55 +00:00
dsl
7a81c4d42e Move the ktrace (and systrace) in namei() inside the retry loop for
emulation lookups.
If doing a lookup relative to the emulation root, prepend the emulation root
to the traced filename.
While here pass the filename length through to the ktrace code since namei()
knows the length and ktr_namei() would have to call strlen().
Note: that if namei() is being called during execve processing, the emulation
root name isn't available and "/emul/???" is used.  Also namei() has to use
strlen() to get the lenght on the emulatoon root - even though it is a
compile-time constant string.
2007-04-26 16:27:32 +00:00
dsl
47799dd2af Move the place where we convert the return value of emulation lookups that
would return the emulation-root to the real root to the main exit path.
Means that lookups of both "/" and "/." get converted from "/emul/xxx" to "/".
2007-04-25 20:41:42 +00:00
dsl
0182ef09da When we return the real root instead of the emulated root, we may
not have the parent vnode for the emulated root - so dont vput() it.
May fix PR kern/36197.
2007-04-23 07:04:30 +00:00
dsl
b8fbaf8c4b Change the way that emulations locate files within the emulation root to
avoid having to allocate space in the 'stackgap'
  - which is very LWP unfriendly.
The additional code for non-emulation namei() is trivial, the reduction for
  the emulations is massive.
The vnode for a processes emulation root is saved in the cwdi structure
  during process exec.
If the emulation root the TRYEMULROOT flag are set, namei() will do an initial
  search for absolute pathnames in the emulation root, if that fails it will
  retry from the normal root.
".." at the emulation root will always go to the real root, even in the middle
  of paths and when expanding symlinks.
Absolute symlinks found using absolute paths in the emulation root will be
  relative to the emulation root (so /usr/lib/xxx.so -> /lib/xxx.so links
  inside the emulation root don't need changing).
If the root of the emulation would be returned (for an emulation lookup), then
  the real root is returned instead (matching the behaviour of emul_lookup,
  but being a cheap comparison here) so that programs that scan "../.."
  looking for the root dircetory don't loop forever.
The target for symbolic links is no longer mangled (it used to get the
  CHECK_ALT_xxx() treatment, so could get /emul/xxx prepended).
CHECK_ALT_xxx() are no more. Most of the change is deleting them, and adding
  TRYEMULROOT to the flags to NDINIT().
A lot of the emulation system call stubs could now be deleted.
2007-04-22 08:29:55 +00:00
thorpej
4f3d5a9cc0 TRUE -> true, FALSE -> false 2007-02-22 06:34:42 +00:00
thorpej
712239e366 Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
2007-02-21 22:59:35 +00:00
pavel
934634a18c Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.
2007-02-17 22:31:36 +00:00
chs
0507747213 more fixes for the new vnode locking scheme:
- don't use SAVESTART in calls to relookup() from unionfs,
   just vref() the desired vnode when we need to.
 - fix locking and refcounting in the unionfs EEXIST error cases.
 - release any vnode locks before calling VFS_ROOT(), vfs_busy() is enough.
   this allows us to simplify union_root() and fix PR 3006.
 - union_lock() doesn't handle shared lock requests correctly,
   so convert them to exclusive instead.  fixes PR 34775.
 - in relookup(), avoid reusing "dp" for different purposes,
   the error handling wasn't right.  (actually just get rid of dp.)
   also, change relookup() to ignore LOCKLEAF and always return the
   vnode locked since the callers already expect this.
2007-02-04 15:03:20 +00:00
elad
8b125f4fa5 PR/35524: Brian de Alwis: panic from free in pathname_get
Patch applied, thanks for the report!
2007-01-31 08:29:20 +00:00
pooka
38544312f7 update some comments for vnode locking smoergasbord change
amazing -- the description of VOP_LOOKUP is suddenly human-readable
2007-01-07 21:33:24 +00:00
pooka
fdac24081e Restore name caching behaviour accidentally removed in rev 1.73, using
variation suggested by yamt on tech-kern.

XXX: The exception is that this doesn't any longer prevent caching
of RENAME, which was implied in a weird weird way previously.  But
that's handled by the callers currently.
2007-01-07 20:43:59 +00:00
chs
7645d04974 fix two more problems in the recent changes to lookup():
- don't hold the parent directory vnode locked while traversing mount points.
   the fs that's mounted might be an NFS served by a userland process
   like the automounter, which might need to traverse the parent directory
   in order to complete the lookup.
 - in the ENAMETOOLONG case fixed in rev. 1.75, set ni_dvp to dp
   since we've logically moved on to using "dp" as the parent.
   the caller will then handle vput()ing it as normal.
   this fixes PR 35279.
2006-12-27 23:21:02 +00:00
elad
1124b0b8bc PR/35278: YAMAMOTO Takashi: veriexec sometimes feeds user va to log(9)
Introduce the (intentionally undocumented) pathname_get(), pathname_path(),
and pathname_put(), to deal with allocating and copying of pathnames from
either kernel- or user-space.
2006-12-24 08:54:55 +00:00
yamt
24d0248e5f lookup: add more missing vput(). 2006-12-13 13:36:19 +00:00
chs
e53d6bd175 in lookup(), vput() the starting vnode in the case where
we return with both ni_dvp and ni_vp being NULL.
2006-12-13 06:36:35 +00:00
chs
c398ae9734 a smorgasbord of improvements to vnode locking and path lookup:
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
   these now always return the parent vnode locked.  namei() works as before.
   lookup() and various other paths no longer acquire vnode locks in the
   wrong order via vrele().  fixes PR 32535.
   as a nice side effect, path lookup is also up to 25% faster.
 - the above allows us to get rid of PDIRUNLOCK.
 - also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
 - remove an assumption in layer_node_find() that all file systems implement
   a recursive VOP_LOCK() (unionfs doesn't).
 - require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
   fill in eopnotsupp() for file systems that don't support being exported
   and remove the checks for NULL.  (layerfs calls these without checking.)
 - in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
   adjust which vnode is locked.  fixes PR 33374.
 - apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().
2006-12-09 16:11:50 +00:00
elad
9477ac30bc Add "@uid" keyword translation, to translate effective user-id of the
process.
2006-11-04 10:14:00 +00:00
ad
f474dceb13 Use the LWP cached credentials where sane. 2006-07-23 22:06:03 +00:00
elad
215bd95ba4 integrate kauth. 2006-05-14 21:15:11 +00:00
rumble
f76a328cb8 Update namei(9) comments and man page to indicate that we operate on
vnodes, not inodes.
2006-03-03 16:15:11 +00:00
yamt
ec5a93183a merge yamt-uio_vmspace branch.
- use vmspace rather than proc or lwp where appropriate.
  the latter is more natural to specify an address space.
  (and less likely to be abused for random purposes.)
- fix a swdmover race.
2006-03-01 12:38:10 +00:00
chs
899d1b31b2 convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.
2006-02-12 01:32:06 +00:00
yamt
5a3e361753 for some random places, use PNBUF_GET/PUT rather than
- on-stack buffer
	- malloc(MAXPATHLEN)
2006-02-04 12:09:50 +00:00
chs
89a8f7b8c9 change errors returned for various operations on "/" to conform to SUSv3.
as discussed on tech-kern some time back.
2005-12-27 17:24:07 +00:00
christos
95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
thorpej
d697722880 A few tweaks to magic symlinks:
- Add a @{var} syntax in addition to @var.  This allows for patterns like
  @{ostype}-@{osrelease}-@{machine_arch}.
- Add a @emul variable that expands to the process's emulation name
  (e.g. "netbsd", "netbsd32", "linux", etc.)
2005-07-06 18:53:00 +00:00
thorpej
e871a0392f Remove the last references to M_NAMEI; everything should be using PNBUF_*()
now (for a long time now).  Remove M_NAMEI, and bump the kernel version to
3.99.7 to reflect its removal.
2005-06-23 17:00:30 +00:00
thorpej
65412a2710 Implement expansion of special "magic" strings in symlinks into
system-specific values.  Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slighly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag.  It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

	@machine	value of MACHINE for the system
	@machine_arch	value of MACHINE_ARCH for the system
	@hostname	the system host name, as set with sethostname()
	@domainname	the system domain name, as set with setdomainname()
	@kernel_ident	the kernel config file name
	@osrelease	the releaes number of the OS
	@ostype		the name of the OS (always "NetBSD" for NetBSD)

Example usage:

	mkdir /arch/i386/bin
	mkdir /arch/sparc/bin
	ln -s /arch/@machine_arch/bin /bin
2005-06-23 00:30:28 +00:00