Make sure we always FILE_UNUSE the file. To make it easier, exit
via a new "out:" exit path that does so, setting error beforehand.
Fix suggested by Elad, hand-typed by me.
mount_getargs(), and mount_domount() to handle three main things it can
do.
This makes the code more readable and removes the horrible goto mess
that was lurking there since forever... it also makes it easier to
implement a security policy for that code.
Introduce the (intentionally undocumented) pathname_get(), pathname_path(),
and pathname_put(), to deal with allocating and copying of pathnames from
either kernel- or user-space.
because the latter is always available during the lifetime of the former,
there is little point to use another global list to keep track of them.
it also allows to remove an #ifdef FILEASSOC.
- avoid some operations (memory allocation and VOP) in fileassoc_file_lookup,
when fileassoc table is not used.
ok'ed by elad.
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
these now always return the parent vnode locked. namei() works as before.
lookup() and various other paths no longer acquire vnode locks in the
wrong order via vrele(). fixes PR 32535.
as a nice side effect, path lookup is also up to 25% faster.
- the above allows us to get rid of PDIRUNLOCK.
- also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
- remove an assumption in layer_node_find() that all file systems implement
a recursive VOP_LOCK() (unionfs doesn't).
- require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
fill in eopnotsupp() for file systems that don't support being exported
and remove the checks for NULL. (layerfs calls these without checking.)
- in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
adjust which vnode is locked. fixes PR 33374.
- apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().
for work on some future functionality.
- Veriexec data-structures are no longer exposed.
- Thanks to using proplib for data passing now, the interface
changes further to accomodate that.
Introduce four new functions. First, veriexec_file_add(), to add
a new file to be monitored by Veriexec, to replace both
veriexec_load() and veriexec_hashadd(). veriexec_table_add(), to
replace veriexec_newtable(), will be used to optimize hash table
size (during preload), and finally, veriexec_convert(), to convert
an internal entry to one userland can read.
- Introduce veriexec_unmountchk(), to enforce Veriexec unmount
policy. This cleans up a bit of code in kern/vfs_syscalls.c.
- Rename veriexec_tblfind() with veriexec_table_lookup(), and make
it static. More functions that became static: veriexec_fp_cmp(),
veriexec_fp_calc().
- veriexec_verify() no longer returns the entry as well, but just
sets a boolean indicating whether an entry was found or not.
- veriexec_purge() now takes a struct vnode *.
- veriexec_add_fp_name() was merged into veriexec_add_fp_ops(), that
changed its name to veriexec_fpops_add(). veriexec_find_ops() was
also renamed to veriexec_fpops_lookup().
Also on the fp-ops front, the three function types used to initialize,
update, and finalize a hash context were renamed to
veriexec_fpop_init_t, veriexec_fpop_update_t, and veriexec_fpop_final_t
respectively.
- Introduce a new malloc(9) type, M_VERIEXEC, and use it instead of
M_TEMP, so we can tell exactly how much memory is used by Veriexec.
- And, most importantly, whitespace and indentation nits.
Built successfuly for amd64, i386, sparc, and sparc64. Tested on amd64.
vnodes were synced and processed backwards. This meant that the last
accessed node was processed first and the earlierst last.
An extra benefit is the removal of the ugly hack from the Berkly days on
LFS.
In the proces, i've also replaced the various variations hand written loops
by the TAILQ_FOREACH() macro's.
- Add a few scopes to the kernel: system, network, and machdep.
- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.
- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.
- Update all relevant documentation.
- Add some code and docs to help folks who want to actually use this stuff:
* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.
This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...
* And of course, documentation describing how to do the above for quick
reference, including code samples.
All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:
http://kauth.linbsd.org/kauthwiki
NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:
- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.
(or if you feel you have to, contact me first)
This is still work in progress; It's far from being done, but now it'll
be a lot easier.
Relevant mailing list threads:
http://mail-index.netbsd.org/tech-security/2006/01/25/0011.htmlhttp://mail-index.netbsd.org/tech-security/2006/03/24/0001.htmlhttp://mail-index.netbsd.org/tech-security/2006/04/18/0000.htmlhttp://mail-index.netbsd.org/tech-security/2006/05/15/0000.htmlhttp://mail-index.netbsd.org/tech-security/2006/08/01/0000.htmlhttp://mail-index.netbsd.org/tech-security/2006/08/25/0000.html
Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stablizing kauth(9).
Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.
Happy birthday Randi! :)
- remove the support of variable-sized filehandle from compat version of
syscalls. (strictly speaking, it breaks abi. i don't think it's a problem
because this feature is short-lived and there are no affected in-tree
filesystems.)
- unify vfs_copyinfh_alloc and vfs_copyinfh_alloc_size.
- vfs_copyinfh_alloc_size: check fhsize strictly.
- reduce code duplication between compat and current syscalls.
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.
introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.
this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.
as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.
also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.
tons of input from yamt@, wrstuden@, martin@, and christos@.
- FHANDLE_SIZE_MAX: refuse unreasonable size allocation, esp. when
it's a user-specified value.
- FHANDLE_SIZE_MIN: pad small filehandles with zero for compatibility.
XXX it might be better to push this into filesystem dependent code so that
new filesystems can choose smaller handles.
- restructure code so that it doesn't try to allocate user-specified
unbound amount of memory.
- don't ignore copyout failure in the case of E2BIG.
- rename vfs_copyinfh to vfs_copyinfh_alloc for consistency.