Commit Graph

313 Commits

Author SHA1 Message Date
elad
6700cfccd6 Some Veriexec stuff that's been rotting in my tree for months.
Bug fixes:
  - Fix crash reported by Scott Ellis on current-users@.

  - Fix race conditions in enforcing the Veriexec rename and remove
    policies. These are NOT security issues.

  - Fix memory leak in rename handling when overwriting a monitored
    file.

  - Fix table deletion logic.

  - Don't prevent query requests if not in learning mode.


KPI updates:
  - fileassoc_table_run() now takes a cookie to pass to the callback.

  - veriexec_table_add() was removed, it is now done internally. As a
    result, there's no longer a need for VERIEXEC_TABLESIZE.

  - veriexec_report() was removed, it is now internal.

  - Perform sanity checks on the entry type, and enforce default type
    in veriexec_file_add() rather than in veriexecctl.

  - Add veriexec_flush(), used to delete all Veriexec tables, and
    veriexec_dump(), used to fill an array with all Veriexec entries.


New features:
  - Add a '-k' flag to veriexecctl, to keep the filenames in the kernel
    database. This allows Veriexec to produce slightly more accurate
    logs under certain circumstances. In the future, this can be either
    replaced by vnode->pathname translation, or combined with it.

  - Add a VERIEXEC_DUMP ioctl, to dump the entire Veriexec database.
    This can be used to recover a database if the file was lost.
    Example usage:

        # veriexecctl dump > /etc/signatures

    Note that only entries with the filename kept (that is, were loaded
    with the '-k' flag) will be dumped.

    Idea from Brett Lymn.

  - Add a VERIEXEC_FLUSH ioctl, to delete all Veriexec entries. Sample
    usage:

        # veriexecctl flush

  - Add a 'veriexec_flags' rc(8) variable, and make its default have
    the '-k' flag. On systems using the default signatures file
    (generated from running 'veriexecgen' with no arguments), this will
    use an additional 32kb of kernel memory on average.

  - Add a '-e' flag to veriexecctl, to evaluate the fingerprint during
    load. This is done automatically for files marked as 'untrusted'.


Misc. stuff:
  - The code for veriexecctl was massively simplified as a result of
    eliminating the need for VERIEXEC_TABLESIZE, and now uses a single
    pass of the signatures file, making the loading somewhat faster.

  - Lots of minor fixes found using the (still under development)
    Veriexec regression testsuite.

  - Some of the messages Veriexec prints were improved.

  - Various documentation fixes.


All relevant man-pages were updated to reflect the above changes.

Binary compatibility with existing veriexecctl binaries is maintained.
2007-05-15 19:47:43 +00:00
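
The KPI note above says fileassoc_table_run() now takes a cookie that is passed on to the callback. A minimal sketch of that general pattern follows. Every name here (struct entry, table_run(), count_flagged()) is a hypothetical stand-in and not the real fileassoc(9) interface; the point is only that an opaque pointer lets the callback keep per-call state without globals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real fileassoc(9) types or signatures. */
struct entry {
	const char *name;
	int flagged;
};

/* Callback type: receives one entry plus the caller's opaque cookie. */
typedef void (*table_cb_t)(struct entry *, void *);

/* Walk every entry, handing the same cookie to each callback invocation. */
static void
table_run(struct entry *tbl, size_t n, table_cb_t cb, void *cookie)
{
	for (size_t i = 0; i < n; i++)
		cb(&tbl[i], cookie);
}

/* Example callback: bump the counter that the cookie points at. */
static void
count_flagged(struct entry *e, void *cookie)
{
	size_t *count = cookie;

	if (e->flagged)
		(*count)++;
}

/* Wrapper: the counter lives on the caller's stack, not in a global. */
static size_t
count_flagged_entries(struct entry *tbl, size_t n)
{
	size_t count = 0;

	assert(tbl != NULL || n == 0);
	table_run(tbl, n, count_flagged, &count);
	return count;
}
```

Without the cookie argument, a callback like count_flagged() would have to write to a global variable, which is awkward when several callers walk tables concurrently.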
dsl
c83f8a10ad Change the compat sys_[fl]utime code to not use the stackgap. 2007-05-12 17:28:19 +00:00
dsl
0df00dcf55 Split the statvfs functions so that the 'work' is done to a kernel buffer
which can either be copied directly to userspace, or converted then copied.
Saves replicating a lot of code in the compat functions (esp. for
getvfsstat) at a cost of an extra function call in the non-emulated case -
which is unlikely to be measurable given the other costs of the actions
involved (even on vax).
Remove dofhstat() and dofhstatvfs() (and the last caller).
Remove some redundant stackgap_init() calls.
2007-04-30 08:32:14 +00:00
dsl
b8fbaf8c4b Change the way that emulations locate files within the emulation root to
avoid having to allocate space in the 'stackgap'
  - which is very LWP unfriendly.
The additional code for non-emulation namei() is trivial, the reduction for
  the emulations is massive.
The vnode for a process's emulation root is saved in the cwdi structure
  during process exec.
If the emulation root and the TRYEMULROOT flag are set, namei() will do an initial
  search for absolute pathnames in the emulation root, if that fails it will
  retry from the normal root.
".." at the emulation root will always go to the real root, even in the middle
  of paths and when expanding symlinks.
Absolute symlinks found using absolute paths in the emulation root will be
  relative to the emulation root (so /usr/lib/xxx.so -> /lib/xxx.so links
  inside the emulation root don't need changing).
If the root of the emulation would be returned (for an emulation lookup), then
  the real root is returned instead (matching the behaviour of emul_lookup,
  but being a cheap comparison here) so that programs that scan "../.."
  looking for the root directory don't loop forever.
The target for symbolic links is no longer mangled (it used to get the
  CHECK_ALT_xxx() treatment, so could get /emul/xxx prepended).
CHECK_ALT_xxx() are no more. Most of the change is deleting them, and adding
  TRYEMULROOT to the flags to NDINIT().
A lot of the emulation system call stubs could now be deleted.
2007-04-22 08:29:55 +00:00
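
The lookup order described in the commit above (try the emulation root first for absolute paths, then retry from the real root) can be illustrated with a toy model. This is only a string-based analogy under stated assumptions: the real namei() works on vnodes, not on concatenated paths, and fs_paths, path_exists() and resolve() are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy "file system": a flat list of full paths. */
static const char *const fs_paths[] = {
	"/emul/linux/lib/libc.so",
	"/lib/libc.so",
	"/etc/passwd",
};

static int
path_exists(const char *path)
{
	for (size_t i = 0; i < sizeof(fs_paths) / sizeof(fs_paths[0]); i++)
		if (strcmp(fs_paths[i], path) == 0)
			return 1;
	return 0;
}

/*
 * Resolve an absolute path. If an emulation root is given and the
 * try_emulroot flag is set, look under the emulation root first and
 * fall back to the real root when that fails.
 * Returns 0 and fills buf on success, -1 otherwise.
 */
static int
resolve(const char *emulroot, int try_emulroot, const char *path,
    char *buf, size_t buflen)
{
	assert(path != NULL && path[0] == '/');

	if (emulroot != NULL && try_emulroot) {
		size_t need = strlen(emulroot) + strlen(path) + 1;

		if (need <= buflen) {
			strcpy(buf, emulroot);
			strcat(buf, path);
			if (path_exists(buf))
				return 0;
		}
	}

	/* Retry from the normal root. */
	if (strlen(path) + 1 > buflen)
		return -1;
	strcpy(buf, path);
	return path_exists(buf) ? 0 : -1;
}
```

In this model, resolving "/lib/libc.so" with the emulation root "/emul/linux" finds the copy under the emulation root, while "/etc/passwd" is missing there and falls through to the real root.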
pooka
f3fbb884a5 If mount(MNT_UPDATE) is called for a non-VROOT directory, don't vput()
the "mountpoint" vnode twice due to an error branch.

thanks go to Gert Doering for reporting the problem and testing the fix
and to Juergen Hannken-Illjes for much of the analysis work leading to
the discovery of the problem cause
2007-04-09 21:11:03 +00:00
hannken
fc6776f366 Remove now obsolete vn_start_write() and vn_finished_write() and
corresponding flags.

Revert softdep_trackbufs() to its state before vn_start_write() was added.

Remove from struct mount now unneeded flags IMNT_SUSPEND* and
members mnt_writeopcountupper, mnt_writeopcountlower and mnt_leaf.

Welcome to 4.99.17
2007-04-08 11:20:42 +00:00
hannken
13daf5bc6e Remove calls to now obsolete vn_start_write() and vn_finished_write(). 2007-04-01 10:15:01 +00:00
dsl
6d1bab1af3 Split the work for sys_stat, sys_lstat, sys_fstat and sys_fhstat out into
separate functions that don't do the copyout.
This allows all the compat_xxx versions to convert the 'struct stat' to
the correct format without using the 'stackgap'.
The stackgap isn't at all LWP friendly, and needs to be removed from
any compat functions that might involve threads (inc. clone()).
The code is still binary compatible with existing LKMs.
2007-03-10 16:50:01 +00:00
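
The split described in the commit above, where a work function fills a kernel buffer and thin wrappers either copy it out directly or convert it first, might look roughly like the sketch below. The structures, do_stat(), copy_out() and both wrappers are hypothetical stand-ins written for userland; they are not the actual NetBSD functions, and copy_out() is just memcpy() here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "native" and "compat" layouts of the same information. */
struct kstat {
	uint64_t size;
	uint32_t mode;
};

struct old_stat {
	uint32_t size;
	uint16_t mode;
};

/* Work function: fills a kernel-side buffer and does no user copy. */
static int
do_stat(const char *path, struct kstat *sb)
{
	if (path[0] == '\0')
		return ENOENT;
	sb->size = strlen(path);	/* toy value for the sketch */
	sb->mode = 0644;
	return 0;
}

/* Stand-in for copyout(9); in this userland sketch it is a memcpy. */
static int
copy_out(void *uaddr, const void *kaddr, size_t len)
{
	memcpy(uaddr, kaddr, len);
	return 0;
}

/* Native path: do the work, then copy the buffer out unchanged. */
static int
native_stat(const char *path, struct kstat *ubuf)
{
	struct kstat sb;
	int error;

	assert(ubuf != NULL);
	error = do_stat(path, &sb);
	if (error)
		return error;
	return copy_out(ubuf, &sb, sizeof(sb));
}

/* Compat path: same work function, convert on the stack, then copy out. */
static int
compat_stat(const char *path, struct old_stat *ubuf)
{
	struct kstat sb;
	struct old_stat osb;
	int error;

	assert(ubuf != NULL);
	error = do_stat(path, &sb);
	if (error)
		return error;
	osb.size = (uint32_t)sb.size;
	osb.mode = (uint16_t)sb.mode;
	return copy_out(ubuf, &osb, sizeof(osb));
}
```

Because the conversion happens in a local variable, the compat wrapper needs no scratch space in the caller's address space, which is the role the stackgap used to play.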
ad
c147748d84 - Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.
2007-03-09 14:11:22 +00:00
pooka
f7ed04a6ff simplify previous a bit. no functional change. 2007-03-01 10:02:31 +00:00
pooka
428270cc03 avoid lock leak in error branch of sys_fchdir()
thanks to Tom Spindler and Greg Oster for helping find the cure
2007-02-28 20:39:06 +00:00
pooka
2da757310f if doing VOP_CREATE via sys_mknod, set va_rdev to VNOVAL instead of 0 2007-02-18 20:36:36 +00:00
pooka
2deb71d45f Support creating regular files with mknod(2) to match Linux/Solaris
behaviour.  This happens if mode contains S_IFREG.  mknod(2) is
still restricted to the superuser.

no objections from tech-kern
2007-02-18 19:57:29 +00:00
ad
b07ec3fc38 Merge newlock2 to head. 2007-02-09 21:55:00 +00:00
elad
9ac600139e Initialize pathname_t objects to NULL. 2007-02-04 20:33:02 +00:00
chs
0507747213 more fixes for the new vnode locking scheme:
- don't use SAVESTART in calls to relookup() from unionfs,
   just vref() the desired vnode when we need to.
 - fix locking and refcounting in the unionfs EEXIST error cases.
 - release any vnode locks before calling VFS_ROOT(), vfs_busy() is enough.
   this allows us to simplify union_root() and fix PR 3006.
 - union_lock() doesn't handle shared lock requests correctly,
   so convert them to exclusive instead.  fixes PR 34775.
 - in relookup(), avoid reusing "dp" for different purposes,
   the error handling wasn't right.  (actually just get rid of dp.)
   also, change relookup() to ignore LOCKLEAF and always return the
   vnode locked since the callers already expect this.
2007-02-04 15:03:20 +00:00
hannken
1b9c6382e3 New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE.  This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).
2007-01-19 14:49:08 +00:00
pooka
88f603fea0 TAILQ_INIT a mountpoint's vnode queue and always add vnodes to the
tail instead of an explicit check to add to the head for an empty
queue.  Apparently TAILQ_INSERT_HEAD happens to work for a
non-initialized head and does implicit initialization so that
TAILQ_INSERT_TAIL works after that.
2007-01-15 19:13:30 +00:00
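
The fix in the commit above amounts to initializing the queue head explicitly and then always appending. A small userland sketch using the BSD <sys/queue.h> macros follows; struct vnode_stub, struct mount_stub and the helper functions are hypothetical names, not the kernel's own structures.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-ins for the kernel's vnode and mount structures. */
struct vnode_stub {
	int id;
	TAILQ_ENTRY(vnode_stub) entries;
};

TAILQ_HEAD(vnode_list, vnode_stub);

struct mount_stub {
	struct vnode_list vnodes;
};

/* Initialize the head explicitly instead of relying on INSERT_HEAD. */
static void
mount_init(struct mount_stub *mp)
{
	TAILQ_INIT(&mp->vnodes);
}

/* Always append; no special case for an empty queue is needed. */
static void
mount_add_vnode(struct mount_stub *mp, struct vnode_stub *vp)
{
	assert(mp != NULL && vp != NULL);
	TAILQ_INSERT_TAIL(&mp->vnodes, vp, entries);
}

/* Copy ids in queue order into out[]; returns how many were written. */
static size_t
mount_vnode_ids(struct mount_stub *mp, int *out, size_t max)
{
	struct vnode_stub *vp;
	size_t n = 0;

	TAILQ_FOREACH(vp, &mp->vnodes, entries) {
		if (n == max)
			break;
		out[n++] = vp->id;
	}
	return n;
}
```

After mount_init(), vnodes added with mount_add_vnode() are visited in insertion order.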
elad
ce903562f2 Use kauth(9). 2007-01-05 13:34:17 +00:00
elad
1e70d64818 Consistent usage of KAUTH_GENERIC_ISSUSER. 2007-01-04 16:55:29 +00:00
wrstuden
264b840ad9 Fix issue noted by Ilja van Sprundel and disclosed at 23C3.
Make sure we always FILE_UNUSE the file. To make it easier, exit
via a new "out:" exit path that does so, setting error beforehand.

Fix suggested by Elad, hand-typed by me.
2007-01-03 23:20:58 +00:00
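
The single "out:" exit path mentioned in the commit above is a common C idiom for making sure a reference is always released. A minimal sketch, assuming a hypothetical use counter in place of the kernel's FILE_USE/FILE_UNUSE macros:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a file with a use count. */
struct file_stub {
	int usecount;
};

static void
file_use(struct file_stub *fp)
{
	fp->usecount++;
}

static void
file_unuse(struct file_stub *fp)
{
	fp->usecount--;
}

/*
 * Every error branch sets 'error' and jumps to 'out', so the
 * reference taken at the top is dropped on all paths.
 */
static int
do_op(struct file_stub *fp, int arg)
{
	int error = 0;

	assert(fp != NULL);
	file_use(fp);

	if (arg < 0) {
		error = EINVAL;
		goto out;
	}
	if (arg == 0) {
		error = ENOENT;
		goto out;
	}

	/* ... the real work would go here ... */

out:
	file_unuse(fp);
	return error;
}
```

An early "return error;" in either branch would skip file_unuse() and leak the reference, which is the class of bug the single exit path is meant to prevent.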
elad
a13160f423 Make mount(2) and unmount(2) use kauth(9) for security policy.
Okay yamt@.
2007-01-02 10:47:28 +00:00
pooka
b73e147d2c in rename_files(), match pre-1.280 locking behaviour by unlocking
fromnd's dvp only in case the dvp != vp
2007-01-01 22:00:16 +00:00
elad
0b96cfb817 Add back MNT_NOEXEC propagation on new mounts by non-root users.
Mistakenly removed in revision 1.286.
2007-01-01 20:45:51 +00:00
elad
b6a8425161 Enforce exclusive MNT_GETARGS in mount_getargs(). 2006-12-31 10:05:52 +00:00
yamt
88bbf6ee26 mount_domount: revive code to enforce MNT_NOSUID and MNT_NODEV for usermount,
which was removed mistakenly by rev.1.286.  pointed out by elad.
2006-12-28 14:33:41 +00:00
yamt
a8552e41ca mount_domount: don't forget to handle MNT_RDONLY.
PR/35327 from Christian Biere.
2006-12-27 08:55:35 +00:00
yamt
42489b9a68 - shorten the period to modify mnt_flag temporarily.
- desupport MNT_EXPORTED without MNT_UPDATE explicitly.
- fix a comment.
- unwrap short lines.
2006-12-26 12:39:01 +00:00
elad
6be473ba20 Don't reference userspace pointers. 2006-12-25 22:03:42 +00:00
elad
a44abdfff8 Properly handle flags in mount_domount(). 2006-12-25 08:11:52 +00:00
elad
97b434c554 Slash sys_mount() and add three helper functions: mount_update(),
mount_getargs(), and mount_domount() to handle three main things it can
do.

This makes the code more readable and removes the horrible goto mess
that was lurking there since forever... it also makes it easier to
implement a security policy for that code.
2006-12-24 12:43:17 +00:00
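
The restructuring described in the commit above replaces one large function with a short dispatcher and three helpers. The sketch below shows only the shape of such a dispatcher. The flag values, the helper bodies and the order of the checks are assumptions made for illustration; the real MNT_* constants and the real helpers' signatures are not reproduced here.

```c
#include <assert.h>

/* Hypothetical flag values; NOT the real MNT_* constants. */
#define X_MNT_UPDATE	0x1
#define X_MNT_GETARGS	0x2

enum mount_path { PATH_UPDATE, PATH_GETARGS, PATH_DOMOUNT };

/* Stub helpers: each just reports which path was taken. */
static enum mount_path
helper_update(void)
{
	return PATH_UPDATE;
}

static enum mount_path
helper_getargs(void)
{
	return PATH_GETARGS;
}

static enum mount_path
helper_domount(void)
{
	return PATH_DOMOUNT;
}

/* Dispatcher: one test per case, no gotos (check order is assumed). */
static enum mount_path
mount_dispatch(int flags)
{
	/* Toy model: only the two flags above are known. */
	assert((flags & ~(X_MNT_UPDATE | X_MNT_GETARGS)) == 0);

	if (flags & X_MNT_GETARGS)
		return helper_getargs();
	if (flags & X_MNT_UPDATE)
		return helper_update();
	return helper_domount();
}
```

With each case isolated in its own function, a per-case security check can be added in one place, which matches the motivation given in the commit message.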
elad
1124b0b8bc PR/35278: YAMAMOTO Takashi: veriexec sometimes feeds user va to log(9)
Introduce the (intentionally undocumented) pathname_get(), pathname_path(),
and pathname_put(), to deal with allocating and copying of pathnames from
either kernel- or user-space.
2006-12-24 08:54:55 +00:00
yamt
4cfe5a1b41 - just associate fileassoc "table" to struct mount.
because the latter is always available during the lifetime of the former,
  there is little point to use another global list to keep track of them.
  it also allows removing an #ifdef FILEASSOC.

- avoid some operations (memory allocation and VOP) in fileassoc_file_lookup,
  when fileassoc table is not used.

ok'ed by elad.
2006-12-14 09:24:54 +00:00
chs
c398ae9734 a smorgasbord of improvements to vnode locking and path lookup:
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
   these now always return the parent vnode locked.  namei() works as before.
   lookup() and various other paths no longer acquire vnode locks in the
   wrong order via vrele().  fixes PR 32535.
   as a nice side effect, path lookup is also up to 25% faster.
 - the above allows us to get rid of PDIRUNLOCK.
 - also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
 - remove an assumption in layer_node_find() that all file systems implement
   a recursive VOP_LOCK() (unionfs doesn't).
 - require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
   fill in eopnotsupp() for file systems that don't support being exported
   and remove the checks for NULL.  (layerfs calls these without checking.)
 - in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
   adjust which vnode is locked.  fixes PR 33374.
 - apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().
2006-12-09 16:11:50 +00:00
elad
0c67c581a5 Massive restructuring and cleanup of Veriexec, mainly in preparation
for work on some future functionality.

  - Veriexec data-structures are no longer exposed.

  - Thanks to using proplib for data passing now, the interface
    changes further to accommodate that.

    Introduce four new functions. First, veriexec_file_add(), to add
    a new file to be monitored by Veriexec, to replace both
    veriexec_load() and veriexec_hashadd(). veriexec_table_add(), to
    replace veriexec_newtable(), will be used to optimize hash table
    size (during preload), and finally, veriexec_convert(), to convert
    an internal entry to one userland can read.

  - Introduce veriexec_unmountchk(), to enforce Veriexec unmount
    policy. This cleans up a bit of code in kern/vfs_syscalls.c.

  - Rename veriexec_tblfind() to veriexec_table_lookup(), and make
    it static. More functions that became static: veriexec_fp_cmp(),
    veriexec_fp_calc().

  - veriexec_verify() no longer returns the entry as well, but just
    sets a boolean indicating whether an entry was found or not.

  - veriexec_purge() now takes a struct vnode *.

  - veriexec_add_fp_name() was merged into veriexec_add_fp_ops(), that
    changed its name to veriexec_fpops_add(). veriexec_find_ops() was
    also renamed to veriexec_fpops_lookup().

    Also on the fp-ops front, the three function types used to initialize,
    update, and finalize a hash context were renamed to
    veriexec_fpop_init_t, veriexec_fpop_update_t, and veriexec_fpop_final_t
    respectively.

  - Introduce a new malloc(9) type, M_VERIEXEC, and use it instead of
    M_TEMP, so we can tell exactly how much memory is used by Veriexec.

  - And, most importantly, whitespace and indentation nits.

Built successfully for amd64, i386, sparc, and sparc64. Tested on amd64.
2006-11-30 01:09:47 +00:00
elad
cbe2288b0c printf() -> log() for Veriexec messages. 2006-11-21 23:52:41 +00:00
hannken
e29b23b983 Add specificdata support to mount points.
Welcome to NetBSD 4.99.4

Approved by: Jason Thorpe <thorpej@netbsd.org>
2006-11-17 17:05:18 +00:00
yamt
1a7bc55dcc remove some __unused from function parameters. 2006-11-01 10:17:58 +00:00
mjf
a2be0ed655 Revert the changes I introduced trying to solve tmpfs' NFS export problem.
Requested by yamt@
2006-10-31 08:12:46 +00:00
mjf
84bd46b9f9 Add support to allow a file system to not permit being exported over NFS.
Approved by elad@ and wrstuden@
2006-10-24 21:53:10 +00:00
reinoud
0ce809091d Replace the LIST structure mp->mnt_vnodelist with a TAILQ structure since all
vnodes were synced and processed backwards. This meant that the last
accessed node was processed first and the earliest last.

An extra benefit is the removal of the ugly hack from the Berkeley days on
LFS.

In the process, I've also replaced the various hand-written loop variations
with the TAILQ_FOREACH() macro.
2006-10-20 18:58:12 +00:00
christos
152eb5a9c3 according to the manual, the last argument of quotactl(2) is a void *,
not a caddr_t.
2006-10-17 15:06:18 +00:00
christos
4d595fd7b1 - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
2006-10-12 01:30:41 +00:00
elad
bada0c776a Don't use KAUTH_RESULT_* where it's not applicable.
Prompted by yamt@.
2006-09-13 10:07:42 +00:00
elad
0e73c20464 Oops, add forgotten 'if'.
From Geoff Wing, thanks!
2006-09-12 07:51:29 +00:00
elad
5f7169ccb1 First take at security model abstraction.
- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
  opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
  security model, called "bsd44". This is the default (and only) model we
  have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

  * There's a sample overlay model, sitting on-top of "bsd44", for
    fast experimenting with tweaking just a subset of an existing model.

    This is pretty cool because it's *really* straightforward to do stuff
    you had to use ugly hacks for until now...

  * And of course, documentation describing how to do the above for quick
    reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

	http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

  - Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
  - Checks 'securelevel' directly,
  - Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)
2006-09-08 20:58:56 +00:00
yamt
56d02ae53a vfs_copyinfh_alloc: kludge for nfsv2 file handles. 2006-08-08 13:08:08 +00:00
yamt
ac0b9042bb sys___fhstatvfs140: update a comment. 2006-08-04 17:07:32 +00:00
yamt
4977b4bbc0 some filehandle syscall related changes.
- remove the support of variable-sized filehandle from compat version of
  syscalls.  (strictly speaking, it breaks abi.  i don't think it's a problem
  because this feature is short-lived and there are no affected in-tree
  filesystems.)
- unify vfs_copyinfh_alloc and vfs_copyinfh_alloc_size.
- vfs_copyinfh_alloc_size: check fhsize strictly.
- reduce code duplication between compat and current syscalls.
2006-08-04 16:29:51 +00:00
yamt
e99f3cca81 vfs_copyinfh_alloc_size: fix indent. 2006-08-04 13:31:51 +00:00