Commit Graph

348 Commits

Author SHA1 Message Date
dholland
d868f7242f Yet another rename workaround - this time check for . and .. early because
relookup() objects to being asked to handle them.
2008-03-28 05:02:08 +00:00
ad
8910b668ba mount_domount: hold an additional reference to the mountpoint across the
call to VFS_START. The file system can be unmounted before VFS_START
returns. Partially addresses PR kern/38291.
2008-03-25 22:13:32 +00:00
ad
a9ca7a3734 Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
2008-03-21 21:54:58 +00:00
ad
25153c3ec9 PR kern/37706 (forced unmount of file systems is unsafe):
- Do reference counting for 'struct mount'. Each vnode associated with a
  mount takes a reference, and in turn the mount takes a reference to the
  vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount
  locking inherited from 4.4BSD with a recursable rwlock.
2008-01-30 11:46:59 +00:00
dholland
717e1785a5 Fix some race conditions in rename.
Introduce a per-FS rename lock and new vfsops to manipulate it.
Get this lock while renaming. Also add another relookup() in do_sys_rename,
which is a hack to kludge around some of the worst deficiencies of
ufs_rename.
reviewed-by: pooka (and an earlier rev by ad)
posted on tech-kern with no objections.
2008-01-28 14:31:15 +00:00
ad
1997a1e1f4 Remove VOP_LEASE. Discussed on tech-kern. 2008-01-25 14:32:11 +00:00
ad
703069c0e9 specfs changes for PR kern/37717 (raidclose() is no longer called on
shutdown). There are still problems with device access and a PR will be
filed.

- Kill checkalias(). Allow multiple vnodes to reference a single device.

- Don't play dangerous tricks with block vnodes to ensure that only one
  vnode can describe a block device. Instead, prohibit concurrent opens of
  block devices. As a bonus remove the unreliable code that prevents
  multiple file system mounts on the same device. It's no longer needed.

- Track opens by vnode and by device. Issue cdev_close() when the last open
  goes away, instead of abusing vnode::v_usecount to tell if the device is
  open.
2008-01-24 17:32:52 +00:00
ad
a38086a9b8 Remove hack that's no longer needed. 2008-01-10 19:04:23 +00:00
elad
4b3ae01c7d Refactor part of the sys_revoke() code so that it can be used in the
compat code. Allows for the removal of two redundant kauth(9) calls.

okay christos@.
2008-01-09 08:18:12 +00:00
dsl
8a62c0f2a5 Use FILE_LOCK() and FILE_UNLOCK() 2008-01-05 19:08:48 +00:00
ad
4a780c9ae2 Merge vmlocking2 to head. 2008-01-02 11:48:20 +00:00
ad
ea3f10f7e0 Merge more changes from vmlocking2, mainly:
- Locking improvements.
- Use pool_cache for more items.
2007-12-26 16:01:34 +00:00
ad
1af608286b Export do_sys_unlink, do_sys_rename to the rest of the kernel. 2007-12-24 15:04:19 +00:00
dsl
7e2790cf6f Convert all the system call entry points from:
int foo(struct lwp *l, void *v, register_t *retval)
to:
    int foo(struct lwp *l, const struct foo_args *uap, register_t *retval)
Fixup compat code to not write into 'uap' and (in some cases) to actually
pass a correctly formatted 'uap' structure with the right name to the
next routine.
A few 'compat' routines that just call standard ones have been deleted.
All the 'compat' code compiles (along with the kernels required to test
build it).
98% done by automated scripts.
2007-12-20 23:02:38 +00:00
pooka
db06a930e6 Remove cn_lwp from struct componentname. curlwp should be used
from on.  The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
2007-12-08 19:29:36 +00:00
yamt
8ec86368e9 - reduce the number of VOP_ACCESS calls for O_RDWR. for nfs, it reduces
the number of rpcs.
- reduce code duplication.
2007-11-30 16:52:19 +00:00
pooka
61e8303e9d Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start.  In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
2007-11-26 19:01:26 +00:00
pooka
86beec80f0 80col & whitespace police. no functional change. 2007-10-24 15:28:55 +00:00
pooka
85f641f339 Don't take a reference to the vfsops structure in mount_domount().
It is now taken when the vfs structure is received instead of having
to randomly add references in random places.  Fixes at least vfs
lkm unload.
2007-10-23 16:16:26 +00:00
ad
7dad9f7391 Merge from vmlocking:
- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
2007-10-10 20:42:20 +00:00
ad
451aacda90 Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.
2007-10-08 15:12:05 +00:00
pooka
3f3cac88a3 Make bioops a pointer and point it to the softdeps struct in softdep
init.  Decouples "options SOFTDEP" from the main kernel and ffs code.
2007-09-01 23:40:21 +00:00
pooka
6ae9cab127 In quotactl, move vrele() to after the VFS call: protects the
mountpoint from being wiped under us better.

from David Holland
2007-08-28 09:28:10 +00:00
ad
63c4506184 Changes to make ktrace LKM friendly and reduce ifdef KTRACE. Proposed
on tech-kern.
2007-08-15 12:07:23 +00:00
pooka
8d1f899239 * nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
  use VFS_PROTOS() instead of manually prototyping the methods
2007-07-31 21:14:15 +00:00
pooka
05ce20f4a0 Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden
2007-07-22 19:16:04 +00:00
christos
3f6a396697 get rid of MFSNAMELEN 2007-07-17 21:15:41 +00:00
dsl
1490d3327f Version mount(2) so that the length of the 'data' buffer is passed into
the kernel.
2007-07-14 15:41:30 +00:00
dsl
2721ab6c7b Change the VFS_MOUNT() interface so that the 'data' buffer passed to the
fs code is a kernel buffer, pass though the length of the buffer as well.
Since the length of the userspace buffer isn'it (yet) passed through the mount
system call, add a field to the vfsops structure containing the default length.
Split sys_mount() for calls from compat code.
Ride one of the recent kernel version changes - old fs LKMs will load, but
sys_mount() will reject any attempt to use them.
2007-07-12 19:35:32 +00:00
dsl
0699e1c8bf Move the point at which sys_readv and sys_preadv (and writev) get merged
so that the same common code can be used with a kernel-resident 'iov'
array from the 32-bit compat code (which currently has its own copy
of these routines.
2007-06-16 20:48:03 +00:00
hannken
6087f7cc14 Dounmount(): rearrange mountlist_slock. vfs_allocate_syncvnode() may sleep
getting a new vnode so it must not be called with this simple_lock taken.

Fixes PR #36395
2007-06-07 10:03:12 +00:00
tnn
6380d93405 When renaming, copy the new name into the designated memory area.
Tested by martti@
2007-05-22 10:39:10 +00:00
dsl
b113bdbde9 Fix logic inversion - probably PR kern/36284 2007-05-21 18:30:35 +00:00
christos
09a50be501 - remove pathname_ interface.
- use macros to deal with pathnames in userspace, when veriexec is used.
- reorder the veriexec_ call arguments for consistency.
With help from elad@ finding the last bug.
2007-05-19 22:11:22 +00:00
christos
50ab9d6934 - since mknod now can create regular files, make sure veriexec allows it.
Done in a way to minimize ifdefs. Per discussions with elad.
2007-05-17 00:46:30 +00:00
elad
6700cfccd6 Some Veriexec stuff that's been rotting in my tree for months.
Bug fixes:
  - Fix crash reported by Scott Ellis on current-users@.

  - Fix race conditions in enforcing the Veriexec rename and remove
    policies. These are NOT security issues.

  - Fix memory leak in rename handling when overwriting a monitored
    file.

  - Fix table deletion logic.

  - Don't prevent query requests if not in learning mode.


KPI updates:
  - fileassoc_table_run() now takes a cookie to pass to the callback.

  - veriexec_table_add() was removed, it is now done internally. As a
    result, there's no longer a need for VERIEXEC_TABLESIZE.

  - veriexec_report() was removed, it is now internal.

  - Perform sanity checks on the entry type, and enforce default type
    in veriexec_file_add() rather than in veriexecctl.

  - Add veriexec_flush(), used to delete all Veriexec tables, and
    veriexec_dump(), used to fill an array with all Veriexec entries.


New features:
  - Add a '-k' flag to veriexecctl, to keep the filenames in the kernel
    database. This allows Veriexec to produce slightly more accurate
    logs under certain circumstances. In the future, this can be either
    replaced by vnode->pathname translation, or combined with it.

  - Add a VERIEXEC_DUMP ioctl, to dump the entire Veriexec database.
    This can be used to recover a database if the file was lost.
    Example usage:

        # veriexecctl dump > /etc/signatures

    Note that only entries with the filename kept (that is, were loaded
    with the '-k' flag) will be dumped.

    Idea from Brett Lymn.

  - Add a VERIEXEC_FLUSH ioctl, to delete all Veriexec entries. Sample
    usage:

        # veriexecctl flush

  - Add a 'veriexec_flags' rc(8) variable, and make its default have
    the '-k' flag. On systems using the default signatures file
    (generaetd from running 'veriexecgen' with no arguments), this will
    use additional 32kb of kernel memory on average.

  - Add a '-e' flag to veriexecctl, to evaluate the fingerprint during
    load. This is done automatically for files marked as 'untrusted'.


Misc. stuff:
  - The code for veriexecctl was massively simplified as a result of
    eliminating the need for VERIEXEC_TABLESIZE, and now uses a single
    pass of the signatures file, making the loading somewhat faster.

  - Lots of minor fixes found using the (still under development)
    Veriexec regression testsuite.

  - Some of the messages Veriexec prints were improved.

  - Various documentation fixes.


All relevant man-pages were updated to reflect the above changes.

Binary compatibility with existing veriexecctl binaries is maintained.
2007-05-15 19:47:43 +00:00
dsl
c83f8a10ad Change the compat sys_[fl]utime code to not use the stackgap. 2007-05-12 17:28:19 +00:00
dsl
0df00dcf55 Split the statvfs functions so that the 'work' is done to a kernel buffer
which can either be copied directly to userspace, or converted then copied.
Saves replicating a lot of code in the compat functions (esp. for
getvfsstat) at a cast of an extra function call in the non-emulated case -
which is unlikely to be measurable given the other costs of the actions
involved (even on vax).
Remove dofhstat() and dofhstatvfs() (and the last caller).
Remove some redundant stackgap_init() calls.
2007-04-30 08:32:14 +00:00
dsl
b8fbaf8c4b Change the way that emulations locate files within the emulation root to
avoid having to allocate space in the 'stackgap'
  - which is very LWP unfriendly.
The additional code for non-emulation namei() is trivial, the reduction for
  the emulations is massive.
The vnode for a processes emulation root is saved in the cwdi structure
  during process exec.
If the emulation root the TRYEMULROOT flag are set, namei() will do an initial
  search for absolute pathnames in the emulation root, if that fails it will
  retry from the normal root.
".." at the emulation root will always go to the real root, even in the middle
  of paths and when expanding symlinks.
Absolute symlinks found using absolute paths in the emulation root will be
  relative to the emulation root (so /usr/lib/xxx.so -> /lib/xxx.so links
  inside the emulation root don't need changing).
If the root of the emulation would be returned (for an emulation lookup), then
  the real root is returned instead (matching the behaviour of emul_lookup,
  but being a cheap comparison here) so that programs that scan "../.."
  looking for the root dircetory don't loop forever.
The target for symbolic links is no longer mangled (it used to get the
  CHECK_ALT_xxx() treatment, so could get /emul/xxx prepended).
CHECK_ALT_xxx() are no more. Most of the change is deleting them, and adding
  TRYEMULROOT to the flags to NDINIT().
A lot of the emulation system call stubs could now be deleted.
2007-04-22 08:29:55 +00:00
pooka
f3fbb884a5 If mount(MNT_UPDATE) is called for a non-VROOT directory, don't vput()
the "mountpoint" vnode twice due to an error branch.

thanks go to Gert Doering for reporting the problem and testing the fix
and to Juergen Hannken-Illjes for much of the analysis work leading to
the discovery of the problem cause
2007-04-09 21:11:03 +00:00
hannken
fc6776f366 Remove now obsolete vn_start_write() and vn_finished_write() and
corresponding flags.

Revert softdep_trackbufs() to its state before vn_start_write() was added.

Remove from struct mount now unneeded flags IMNT_SUSPEND* and
members mnt_writeopcountupper, mnt_writeopcountlower and mnt_leaf.

Welcome to 4.99.17
2007-04-08 11:20:42 +00:00
hannken
13daf5bc6e Remove calls to now obsolete vn_start_write() and vn_finished_write(). 2007-04-01 10:15:01 +00:00
dsl
6d1bab1af3 Split the work for sys_stat, sys_lstat, sys_fstat and sys_fhstat out into
separate functions that don't do the copyout.
This allows all the compat_xxx versions to convert the 'struct stat' to
the correct format without using the 'stackgap'.
The stackgap isn't at all LWP friendly, and needs to be removed from
any compat functions that might involve threads (inc. clone()).
The code is still binary compatible with existing LKMs.
2007-03-10 16:50:01 +00:00
ad
c147748d84 - Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.
2007-03-09 14:11:22 +00:00
pooka
f7ed04a6ff simplify previous a bit. no functional change. 2007-03-01 10:02:31 +00:00
pooka
428270cc03 avoid lock leak in error branch of sys_fchdir()
thanks to Tom Spindler and Greg Oster in helping find the cure
2007-02-28 20:39:06 +00:00
pooka
2da757310f if doing VOP_CREATE via sys_mknod, set va_rdev to VNOVAL instead of 0 2007-02-18 20:36:36 +00:00
pooka
2deb71d45f Support creating regular files with mknod(2) to match Linux/Solaris
behaviour.  This happens if mode contains S_IFREG.  mknod(2) is
still restricted to the superuser.

no objections from tech-kern
2007-02-18 19:57:29 +00:00
ad
b07ec3fc38 Merge newlock2 to head. 2007-02-09 21:55:00 +00:00
elad
9ac600139e Initialize pathname_t objects to NULL. 2007-02-04 20:33:02 +00:00