Commit Graph

322 Commits

Author SHA1 Message Date
christos 3f6a396697 get rid of MFSNAMELEN 2007-07-17 21:15:41 +00:00
dsl 1490d3327f Version mount(2) so that the length of the 'data' buffer is passed into
the kernel.
2007-07-14 15:41:30 +00:00
dsl 2721ab6c7b Change the VFS_MOUNT() interface so that the 'data' buffer passed to the
fs code is a kernel buffer, pass though the length of the buffer as well.
Since the length of the userspace buffer isn'it (yet) passed through the mount
system call, add a field to the vfsops structure containing the default length.
Split sys_mount() for calls from compat code.
Ride one of the recent kernel version changes - old fs LKMs will load, but
sys_mount() will reject any attempt to use them.
2007-07-12 19:35:32 +00:00
dsl 0699e1c8bf Move the point at which sys_readv and sys_preadv (and writev) get merged
so that the same common code can be used with a kernel-resident 'iov'
array from the 32-bit compat code (which currently has its own copy
of these routines.
2007-06-16 20:48:03 +00:00
hannken 6087f7cc14 Dounmount(): rearrange mountlist_slock. vfs_allocate_syncvnode() may sleep
getting a new vnode so it must not be called with this simple_lock taken.

Fixes PR #36395
2007-06-07 10:03:12 +00:00
tnn 6380d93405 When renaming, copy the new name into the designated memory area.
Tested by martti@
2007-05-22 10:39:10 +00:00
dsl b113bdbde9 Fix logic inversion - probably PR kern/36284 2007-05-21 18:30:35 +00:00
christos 09a50be501 - remove pathname_ interface.
- use macros to deal with pathnames in userspace, when veriexec is used.
- reorder the veriexec_ call arguments for consistency.
With help from elad@ finding the last bug.
2007-05-19 22:11:22 +00:00
christos 50ab9d6934 - since mknod now can create regular files, make sure veriexec allows it.
Done in a way to minimize ifdefs. Per discussions with elad.
2007-05-17 00:46:30 +00:00
elad 6700cfccd6 Some Veriexec stuff that's been rotting in my tree for months.
Bug fixes:
  - Fix crash reported by Scott Ellis on current-users@.

  - Fix race conditions in enforcing the Veriexec rename and remove
    policies. These are NOT security issues.

  - Fix memory leak in rename handling when overwriting a monitored
    file.

  - Fix table deletion logic.

  - Don't prevent query requests if not in learning mode.


KPI updates:
  - fileassoc_table_run() now takes a cookie to pass to the callback.

  - veriexec_table_add() was removed, it is now done internally. As a
    result, there's no longer a need for VERIEXEC_TABLESIZE.

  - veriexec_report() was removed, it is now internal.

  - Perform sanity checks on the entry type, and enforce default type
    in veriexec_file_add() rather than in veriexecctl.

  - Add veriexec_flush(), used to delete all Veriexec tables, and
    veriexec_dump(), used to fill an array with all Veriexec entries.


New features:
  - Add a '-k' flag to veriexecctl, to keep the filenames in the kernel
    database. This allows Veriexec to produce slightly more accurate
    logs under certain circumstances. In the future, this can be either
    replaced by vnode->pathname translation, or combined with it.

  - Add a VERIEXEC_DUMP ioctl, to dump the entire Veriexec database.
    This can be used to recover a database if the file was lost.
    Example usage:

        # veriexecctl dump > /etc/signatures

    Note that only entries with the filename kept (that is, were loaded
    with the '-k' flag) will be dumped.

    Idea from Brett Lymn.

  - Add a VERIEXEC_FLUSH ioctl, to delete all Veriexec entries. Sample
    usage:

        # veriexecctl flush

  - Add a 'veriexec_flags' rc(8) variable, and make its default have
    the '-k' flag. On systems using the default signatures file
    (generaetd from running 'veriexecgen' with no arguments), this will
    use additional 32kb of kernel memory on average.

  - Add a '-e' flag to veriexecctl, to evaluate the fingerprint during
    load. This is done automatically for files marked as 'untrusted'.


Misc. stuff:
  - The code for veriexecctl was massively simplified as a result of
    eliminating the need for VERIEXEC_TABLESIZE, and now uses a single
    pass of the signatures file, making the loading somewhat faster.

  - Lots of minor fixes found using the (still under development)
    Veriexec regression testsuite.

  - Some of the messages Veriexec prints were improved.

  - Various documentation fixes.


All relevant man-pages were updated to reflect the above changes.

Binary compatibility with existing veriexecctl binaries is maintained.
2007-05-15 19:47:43 +00:00
dsl c83f8a10ad Change the compat sys_[fl]utime code to not use the stackgap. 2007-05-12 17:28:19 +00:00
dsl 0df00dcf55 Split the statvfs functions so that the 'work' is done to a kernel buffer
which can either be copied directly to userspace, or converted then copied.
Saves replicating a lot of code in the compat functions (esp. for
getvfsstat) at a cast of an extra function call in the non-emulated case -
which is unlikely to be measurable given the other costs of the actions
involved (even on vax).
Remove dofhstat() and dofhstatvfs() (and the last caller).
Remove some redundant stackgap_init() calls.
2007-04-30 08:32:14 +00:00
dsl b8fbaf8c4b Change the way that emulations locate files within the emulation root to
avoid having to allocate space in the 'stackgap'
  - which is very LWP unfriendly.
The additional code for non-emulation namei() is trivial, the reduction for
  the emulations is massive.
The vnode for a processes emulation root is saved in the cwdi structure
  during process exec.
If the emulation root the TRYEMULROOT flag are set, namei() will do an initial
  search for absolute pathnames in the emulation root, if that fails it will
  retry from the normal root.
".." at the emulation root will always go to the real root, even in the middle
  of paths and when expanding symlinks.
Absolute symlinks found using absolute paths in the emulation root will be
  relative to the emulation root (so /usr/lib/xxx.so -> /lib/xxx.so links
  inside the emulation root don't need changing).
If the root of the emulation would be returned (for an emulation lookup), then
  the real root is returned instead (matching the behaviour of emul_lookup,
  but being a cheap comparison here) so that programs that scan "../.."
  looking for the root dircetory don't loop forever.
The target for symbolic links is no longer mangled (it used to get the
  CHECK_ALT_xxx() treatment, so could get /emul/xxx prepended).
CHECK_ALT_xxx() are no more. Most of the change is deleting them, and adding
  TRYEMULROOT to the flags to NDINIT().
A lot of the emulation system call stubs could now be deleted.
2007-04-22 08:29:55 +00:00
pooka f3fbb884a5 If mount(MNT_UPDATE) is called for a non-VROOT directory, don't vput()
the "mountpoint" vnode twice due to an error branch.

thanks go to Gert Doering for reporting the problem and testing the fix
and to Juergen Hannken-Illjes for much of the analysis work leading to
the discovery of the problem cause
2007-04-09 21:11:03 +00:00
hannken fc6776f366 Remove now obsolete vn_start_write() and vn_finished_write() and
corresponding flags.

Revert softdep_trackbufs() to its state before vn_start_write() was added.

Remove from struct mount now unneeded flags IMNT_SUSPEND* and
members mnt_writeopcountupper, mnt_writeopcountlower and mnt_leaf.

Welcome to 4.99.17
2007-04-08 11:20:42 +00:00
hannken 13daf5bc6e Remove calls to now obsolete vn_start_write() and vn_finished_write(). 2007-04-01 10:15:01 +00:00
dsl 6d1bab1af3 Split the work for sys_stat, sys_lstat, sys_fstat and sys_fhstat out into
separate functions that don't do the copyout.
This allows all the compat_xxx versions to convert the 'struct stat' to
the correct format without using the 'stackgap'.
The stackgap isn't at all LWP friendly, and needs to be removed from
any compat functions that might involve threads (inc. clone()).
The code is still binary compatible with existing LKMs.
2007-03-10 16:50:01 +00:00
ad c147748d84 - Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.
2007-03-09 14:11:22 +00:00
pooka f7ed04a6ff simplify previous a bit. no functional change. 2007-03-01 10:02:31 +00:00
pooka 428270cc03 avoid lock leak in error branch of sys_fchdir()
thanks to Tom Spindler and Greg Oster in helping find the cure
2007-02-28 20:39:06 +00:00
pooka 2da757310f if doing VOP_CREATE via sys_mknod, set va_rdev to VNOVAL instead of 0 2007-02-18 20:36:36 +00:00
pooka 2deb71d45f Support creating regular files with mknod(2) to match Linux/Solaris
behaviour.  This happens if mode contains S_IFREG.  mknod(2) is
still restricted to the superuser.

no objections from tech-kern
2007-02-18 19:57:29 +00:00
ad b07ec3fc38 Merge newlock2 to head. 2007-02-09 21:55:00 +00:00
elad 9ac600139e Initialize pathname_t objects to NULL. 2007-02-04 20:33:02 +00:00
chs 0507747213 more fixes for the new vnode locking scheme:
- don't use SAVESTART in calls to relookup() from unionfs,
   just vref() the desired vnode when we need to.
 - fix locking and refcounting in the unionfs EEXIST error cases.
 - release any vnode locks before calling VFS_ROOT(), vfs_busy() is enough.
   this allows us to simplify union_root() and fix PR 3006.
 - union_lock() doesn't handle shared lock requests correctly,
   so convert them to exclusive instead.  fixes PR 34775.
 - in relookup(), avoid reusing "dp" for different purposes,
   the error handling wasn't right.  (actually just get rid of dp.)
   also, change relookup() to ignore LOCKLEAF and always return the
   vnode locked since the callers already expect this.
2007-02-04 15:03:20 +00:00
hannken 1b9c6382e3 New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE.  This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).
2007-01-19 14:49:08 +00:00
pooka 88f603fea0 TAILQ_INIT a mountpoint's vnode queue and always add vnodes to the
tail instead of an explicit check to add to the head for an empty
queue.  Apparently TAILQ_INSERT_HEAD happens to work for a
non-initialized head and does implicit initialization so that
TAILQ_INSERT_TAIL works after that.
2007-01-15 19:13:30 +00:00
elad ce903562f2 Use kauth(9). 2007-01-05 13:34:17 +00:00
elad 1e70d64818 Consistent usage of KAUTH_GENERIC_ISSUSER. 2007-01-04 16:55:29 +00:00
wrstuden 264b840ad9 Fix issue noted by Ilja van Sprundel and disclosed at 23C3.
Make sure we always FILE_UNUSE the file. To make it easier, exit
via a new "out:" exit path that does so, setting error beforehand.

Fix suggested by Elad, hand-typed by me.
2007-01-03 23:20:58 +00:00
elad a13160f423 Make mount(2) and unmount(2) use kauth(9) for security policy.
Okay yamt@.
2007-01-02 10:47:28 +00:00
pooka b73e147d2c in rename_files(), match pre-1.280 locking behaviour by unlocking
fromnd's dvp only in case the dvp != vp
2007-01-01 22:00:16 +00:00
elad 0b96cfb817 Add back MNT_NOEXEC propagation on new mounts by non-root users.
Mistakenly removed in revision 1.286.
2007-01-01 20:45:51 +00:00
elad b6a8425161 Enforce exclusive MNT_GETARGS in mount_getargs(). 2006-12-31 10:05:52 +00:00
yamt 88bbf6ee26 mount_domount: revive code to enforce MNT_NOSUID and MNT_NODEV for usermount,
which was removed mistakenly by rev.1.286.  pointed by elad.
2006-12-28 14:33:41 +00:00
yamt a8552e41ca mount_domount: don't forget to handle MNT_RDONLY.
PR/35327 from Christian Biere.
2006-12-27 08:55:35 +00:00
yamt 42489b9a68 - shorten the period to modify mnt_flag temporarily.
- desupport MNT_EXPORTED without MNT_UPDATE explicitly.
- fix a comment.
- unwrap short lines.
2006-12-26 12:39:01 +00:00
elad 6be473ba20 Don't reference userspace pointers. 2006-12-25 22:03:42 +00:00
elad a44abdfff8 Properly handle flags in mount_domount(). 2006-12-25 08:11:52 +00:00
elad 97b434c554 Slash sys_mount() and add three helper functions: mount_update(),
mount_getargs(), and mount_domount() to handle three main things it can
do.

This makes the code more readable and removes the horrible goto mess
that was lurking there since forever... it also makes it easier to
implement a security policy for that code.
2006-12-24 12:43:17 +00:00
elad 1124b0b8bc PR/35278: YAMAMOTO Takashi: veriexec sometimes feeds user va to log(9)
Introduce the (intentionally undocumented) pathname_get(), pathname_path(),
and pathname_put(), to deal with allocating and copying of pathnames from
either kernel- or user-space.
2006-12-24 08:54:55 +00:00
yamt 4cfe5a1b41 - just associate fileassoc "table" to struct mount.
because the latter is always available during the lifetime of the former,
  there is little point to use another global list to keep track of them.
  it also allows to remove an #ifdef FILEASSOC.

- avoid some operations (memory allocation and VOP) in fileassoc_file_lookup,
  when fileassoc table is not used.

ok'ed by elad.
2006-12-14 09:24:54 +00:00
chs c398ae9734 a smorgasbord of improvements to vnode locking and path lookup:
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
   these now always return the parent vnode locked.  namei() works as before.
   lookup() and various other paths no longer acquire vnode locks in the
   wrong order via vrele().  fixes PR 32535.
   as a nice side effect, path lookup is also up to 25% faster.
 - the above allows us to get rid of PDIRUNLOCK.
 - also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
 - remove an assumption in layer_node_find() that all file systems implement
   a recursive VOP_LOCK() (unionfs doesn't).
 - require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
   fill in eopnotsupp() for file systems that don't support being exported
   and remove the checks for NULL.  (layerfs calls these without checking.)
 - in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
   adjust which vnode is locked.  fixes PR 33374.
 - apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().
2006-12-09 16:11:50 +00:00
elad 0c67c581a5 Massive restructuring and cleanup of Veriexec, mainly in preparation
for work on some future functionality.

  - Veriexec data-structures are no longer exposed.

  - Thanks to using proplib for data passing now, the interface
    changes further to accomodate that.

    Introduce four new functions. First, veriexec_file_add(), to add
    a new file to be monitored by Veriexec, to replace both
    veriexec_load() and veriexec_hashadd(). veriexec_table_add(), to
    replace veriexec_newtable(), will be used to optimize hash table
    size (during preload), and finally, veriexec_convert(), to convert
    an internal entry to one userland can read.

  - Introduce veriexec_unmountchk(), to enforce Veriexec unmount
    policy. This cleans up a bit of code in kern/vfs_syscalls.c.

  - Rename veriexec_tblfind() with veriexec_table_lookup(), and make
    it static. More functions that became static: veriexec_fp_cmp(),
    veriexec_fp_calc().

  - veriexec_verify() no longer returns the entry as well, but just
    sets a boolean indicating whether an entry was found or not.

  - veriexec_purge() now takes a struct vnode *.

  - veriexec_add_fp_name() was merged into veriexec_add_fp_ops(), that
    changed its name to veriexec_fpops_add(). veriexec_find_ops() was
    also renamed to veriexec_fpops_lookup().

    Also on the fp-ops front, the three function types used to initialize,
    update, and finalize a hash context were renamed to
    veriexec_fpop_init_t, veriexec_fpop_update_t, and veriexec_fpop_final_t
    respectively.

  - Introduce a new malloc(9) type, M_VERIEXEC, and use it instead of
    M_TEMP, so we can tell exactly how much memory is used by Veriexec.

  - And, most importantly, whitespace and indentation nits.

Built successfuly for amd64, i386, sparc, and sparc64. Tested on amd64.
2006-11-30 01:09:47 +00:00
elad cbe2288b0c printf() -> log() for Veriexec messages. 2006-11-21 23:52:41 +00:00
hannken e29b23b983 Add specificdata support to mount points.
Welcome to NetBSD 4.99.4

Approved by: Jason Thorpe <thorpej@netbsd.org>
2006-11-17 17:05:18 +00:00
yamt 1a7bc55dcc remove some __unused from function parameters. 2006-11-01 10:17:58 +00:00
mjf a2be0ed655 Revert the changes I introduced trying to solve tmpfs' NFS export problem.
Requested by yamt@
2006-10-31 08:12:46 +00:00
mjf 84bd46b9f9 Add support to allow a file system to not permit being exported over NFS.
Approved by elad@ and wrstuden@
2006-10-24 21:53:10 +00:00
reinoud 0ce809091d Replace the LIST structure mp->mnt_vnodelist to a TAILQ structure since all
vnodes were synced and processed backwards. This meant that the last
accessed node was processed first and the earlierst last.

An extra benefit is the removal of the ugly hack from the Berkly days on
LFS.

In the proces, i've also replaced the various variations hand written loops
by the TAILQ_FOREACH() macro's.
2006-10-20 18:58:12 +00:00