Bug fixes:
- Fix crash reported by Scott Ellis on current-users@.
- Fix race conditions in enforcing the Veriexec rename and remove
policies. These are NOT security issues.
- Fix memory leak in rename handling when overwriting a monitored
file.
- Fix table deletion logic.
- Don't prevent query requests if not in learning mode.
KPI updates:
- fileassoc_table_run() now takes a cookie to pass to the callback.
- veriexec_table_add() was removed, it is now done internally. As a
result, there's no longer a need for VERIEXEC_TABLESIZE.
- veriexec_report() was removed, it is now internal.
- Perform sanity checks on the entry type, and enforce default type
in veriexec_file_add() rather than in veriexecctl.
- Add veriexec_flush(), used to delete all Veriexec tables, and
veriexec_dump(), used to fill an array with all Veriexec entries.
New features:
- Add a '-k' flag to veriexecctl, to keep the filenames in the kernel
database. This allows Veriexec to produce slightly more accurate
logs under certain circumstances. In the future, this can be either
replaced by vnode->pathname translation, or combined with it.
- Add a VERIEXEC_DUMP ioctl, to dump the entire Veriexec database.
This can be used to recover a database if the file was lost.
Example usage:
# veriexecctl dump > /etc/signatures
Note that only entries with the filename kept (that is, were loaded
with the '-k' flag) will be dumped.
Idea from Brett Lymn.
- Add a VERIEXEC_FLUSH ioctl, to delete all Veriexec entries. Sample
usage:
# veriexecctl flush
- Add a 'veriexec_flags' rc(8) variable, and make its default have
the '-k' flag. On systems using the default signatures file
(generaetd from running 'veriexecgen' with no arguments), this will
use additional 32kb of kernel memory on average.
- Add a '-e' flag to veriexecctl, to evaluate the fingerprint during
load. This is done automatically for files marked as 'untrusted'.
Misc. stuff:
- The code for veriexecctl was massively simplified as a result of
eliminating the need for VERIEXEC_TABLESIZE, and now uses a single
pass of the signatures file, making the loading somewhat faster.
- Lots of minor fixes found using the (still under development)
Veriexec regression testsuite.
- Some of the messages Veriexec prints were improved.
- Various documentation fixes.
All relevant man-pages were updated to reflect the above changes.
Binary compatibility with existing veriexecctl binaries is maintained.
which can either be copied directly to userspace, or converted then copied.
Saves replicating a lot of code in the compat functions (esp. for
getvfsstat) at a cast of an extra function call in the non-emulated case -
which is unlikely to be measurable given the other costs of the actions
involved (even on vax).
Remove dofhstat() and dofhstatvfs() (and the last caller).
Remove some redundant stackgap_init() calls.
avoid having to allocate space in the 'stackgap'
- which is very LWP unfriendly.
The additional code for non-emulation namei() is trivial, the reduction for
the emulations is massive.
The vnode for a processes emulation root is saved in the cwdi structure
during process exec.
If the emulation root the TRYEMULROOT flag are set, namei() will do an initial
search for absolute pathnames in the emulation root, if that fails it will
retry from the normal root.
".." at the emulation root will always go to the real root, even in the middle
of paths and when expanding symlinks.
Absolute symlinks found using absolute paths in the emulation root will be
relative to the emulation root (so /usr/lib/xxx.so -> /lib/xxx.so links
inside the emulation root don't need changing).
If the root of the emulation would be returned (for an emulation lookup), then
the real root is returned instead (matching the behaviour of emul_lookup,
but being a cheap comparison here) so that programs that scan "../.."
looking for the root dircetory don't loop forever.
The target for symbolic links is no longer mangled (it used to get the
CHECK_ALT_xxx() treatment, so could get /emul/xxx prepended).
CHECK_ALT_xxx() are no more. Most of the change is deleting them, and adding
TRYEMULROOT to the flags to NDINIT().
A lot of the emulation system call stubs could now be deleted.
the "mountpoint" vnode twice due to an error branch.
thanks go to Gert Doering for reporting the problem and testing the fix
and to Juergen Hannken-Illjes for much of the analysis work leading to
the discovery of the problem cause
corresponding flags.
Revert softdep_trackbufs() to its state before vn_start_write() was added.
Remove from struct mount now unneeded flags IMNT_SUSPEND* and
members mnt_writeopcountupper, mnt_writeopcountlower and mnt_leaf.
Welcome to 4.99.17
separate functions that don't do the copyout.
This allows all the compat_xxx versions to convert the 'struct stat' to
the correct format without using the 'stackgap'.
The stackgap isn't at all LWP friendly, and needs to be removed from
any compat functions that might involve threads (inc. clone()).
The code is still binary compatible with existing LKMs.
- don't use SAVESTART in calls to relookup() from unionfs,
just vref() the desired vnode when we need to.
- fix locking and refcounting in the unionfs EEXIST error cases.
- release any vnode locks before calling VFS_ROOT(), vfs_busy() is enough.
this allows us to simplify union_root() and fix PR 3006.
- union_lock() doesn't handle shared lock requests correctly,
so convert them to exclusive instead. fixes PR 34775.
- in relookup(), avoid reusing "dp" for different purposes,
the error handling wasn't right. (actually just get rid of dp.)
also, change relookup() to ignore LOCKLEAF and always return the
vnode locked since the callers already expect this.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.
Implemented for file systems of type ffs.
The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.
Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.
Welcome to 4.99.9 (new vfs op vfs_suspendctl).
tail instead of an explicit check to add to the head for an empty
queue. Apparently TAILQ_INSERT_HEAD happens to work for a
non-initialized head and does implicit initialization so that
TAILQ_INSERT_TAIL works after that.
Make sure we always FILE_UNUSE the file. To make it easier, exit
via a new "out:" exit path that does so, setting error beforehand.
Fix suggested by Elad, hand-typed by me.
mount_getargs(), and mount_domount() to handle three main things it can
do.
This makes the code more readable and removes the horrible goto mess
that was lurking there since forever... it also makes it easier to
implement a security policy for that code.
Introduce the (intentionally undocumented) pathname_get(), pathname_path(),
and pathname_put(), to deal with allocating and copying of pathnames from
either kernel- or user-space.
because the latter is always available during the lifetime of the former,
there is little point to use another global list to keep track of them.
it also allows to remove an #ifdef FILEASSOC.
- avoid some operations (memory allocation and VOP) in fileassoc_file_lookup,
when fileassoc table is not used.
ok'ed by elad.
- LOCKPARENT is no longer relevant for lookup(), relookup() or VOP_LOOKUP().
these now always return the parent vnode locked. namei() works as before.
lookup() and various other paths no longer acquire vnode locks in the
wrong order via vrele(). fixes PR 32535.
as a nice side effect, path lookup is also up to 25% faster.
- the above allows us to get rid of PDIRUNLOCK.
- also get rid of WANTPARENT (just use LOCKPARENT and unlock it).
- remove an assumption in layer_node_find() that all file systems implement
a recursive VOP_LOCK() (unionfs doesn't).
- require that all file systems supply vfs_vptofh and vfs_fhtovp routines.
fill in eopnotsupp() for file systems that don't support being exported
and remove the checks for NULL. (layerfs calls these without checking.)
- in union_lookup1(), don't change refcounts in the ISDOTDOT case, just
adjust which vnode is locked. fixes PR 33374.
- apply fixes for ufs_rename() from ufs_vnops.c rev. 1.61 to ext2fs_rename().
for work on some future functionality.
- Veriexec data-structures are no longer exposed.
- Thanks to using proplib for data passing now, the interface
changes further to accomodate that.
Introduce four new functions. First, veriexec_file_add(), to add
a new file to be monitored by Veriexec, to replace both
veriexec_load() and veriexec_hashadd(). veriexec_table_add(), to
replace veriexec_newtable(), will be used to optimize hash table
size (during preload), and finally, veriexec_convert(), to convert
an internal entry to one userland can read.
- Introduce veriexec_unmountchk(), to enforce Veriexec unmount
policy. This cleans up a bit of code in kern/vfs_syscalls.c.
- Rename veriexec_tblfind() with veriexec_table_lookup(), and make
it static. More functions that became static: veriexec_fp_cmp(),
veriexec_fp_calc().
- veriexec_verify() no longer returns the entry as well, but just
sets a boolean indicating whether an entry was found or not.
- veriexec_purge() now takes a struct vnode *.
- veriexec_add_fp_name() was merged into veriexec_add_fp_ops(), that
changed its name to veriexec_fpops_add(). veriexec_find_ops() was
also renamed to veriexec_fpops_lookup().
Also on the fp-ops front, the three function types used to initialize,
update, and finalize a hash context were renamed to
veriexec_fpop_init_t, veriexec_fpop_update_t, and veriexec_fpop_final_t
respectively.
- Introduce a new malloc(9) type, M_VERIEXEC, and use it instead of
M_TEMP, so we can tell exactly how much memory is used by Veriexec.
- And, most importantly, whitespace and indentation nits.
Built successfuly for amd64, i386, sparc, and sparc64. Tested on amd64.
vnodes were synced and processed backwards. This meant that the last
accessed node was processed first and the earlierst last.
An extra benefit is the removal of the ugly hack from the Berkly days on
LFS.
In the proces, i've also replaced the various variations hand written loops
by the TAILQ_FOREACH() macro's.
- Add a few scopes to the kernel: system, network, and machdep.
- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.
- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.
- Update all relevant documentation.
- Add some code and docs to help folks who want to actually use this stuff:
* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.
This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...
* And of course, documentation describing how to do the above for quick
reference, including code samples.
All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:
http://kauth.linbsd.org/kauthwiki
NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:
- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.
(or if you feel you have to, contact me first)
This is still work in progress; It's far from being done, but now it'll
be a lot easier.
Relevant mailing list threads:
http://mail-index.netbsd.org/tech-security/2006/01/25/0011.htmlhttp://mail-index.netbsd.org/tech-security/2006/03/24/0001.htmlhttp://mail-index.netbsd.org/tech-security/2006/04/18/0000.htmlhttp://mail-index.netbsd.org/tech-security/2006/05/15/0000.htmlhttp://mail-index.netbsd.org/tech-security/2006/08/01/0000.htmlhttp://mail-index.netbsd.org/tech-security/2006/08/25/0000.html
Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stablizing kauth(9).
Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.
Happy birthday Randi! :)
- remove the support of variable-sized filehandle from compat version of
syscalls. (strictly speaking, it breaks abi. i don't think it's a problem
because this feature is short-lived and there are no affected in-tree
filesystems.)
- unify vfs_copyinfh_alloc and vfs_copyinfh_alloc_size.
- vfs_copyinfh_alloc_size: check fhsize strictly.
- reduce code duplication between compat and current syscalls.