fileassoc.diff adds a fileassoc_table_run() routine that allows you to
pass a callback to be called with every entry on a given mount.
veriexec.diff adds some raw device access policies: if raw disk is
opened at strict level 1, all fingerprints on this disk will be
invalidated as a safety measure. level 2 will not allow opening disk
for raw writing if we monitor it, and prevent raw writes to memory.
level 3 will not allow opening any disk for raw writing.
both update all relevant documentation.
veriexec concept is okay blymn@.
- remove the support of variable-sized filehandle from compat version of
syscalls. (strictly speaking, it breaks abi. i don't think it's a problem
because this feature is short-lived and there are no affected in-tree
filesystems.)
- unify vfs_copyinfh_alloc and vfs_copyinfh_alloc_size.
- vfs_copyinfh_alloc_size: check fhsize strictly.
- reduce code duplication between compat and current syscalls.
If we cannot write on the slave side, always return EWOULDBLOCK in the
non-blocking case, because we don't know that the buffer we started
writing is actually in a system call boundary.
sysctl(9) flags CTLFLAG_READONLY[12]. luckily they're not documented
so it's only half regression.
only two knobs used them; proc.curproc.corename (check added in the
existing handler; its CTLFLAG_ANYWRITE, yay) and net.inet.ip.forwsrcrt,
that got its own handler now too.
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.
- When freeing, test the value of cr_refcnt from inside the lock perimiter.
- Change some uint16_t/uint32_t types to u_int.
- KASSERT(cr_refcnt > 0) in appropriate places.
- KASSERT(cr_refcnt == 1) when changing the credential.
for two kauth_cred_t rather than kauth_cred_t and struct proc *.
advise against using it in the man-page; it should be used only in cases
where we either don't have an object-specific op or when we can't easily
use one.
struct actually keeps the start of the UTC
time scale and not the boot time. the relationship
is: utc-time = up-time + timebase.
background: when doing an ACPI sleep the uptime
freezes and on wakeup the tc_setclock() leads to
a new timebasebin - this had no relationship with
a boottime as the structure was previously called.
discussed on tech-kern@
anomalies (moving boottime, uptime describing running time)
where discovered by Arnaud Lacombe.
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot
introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.
this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.
as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.
also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.
tons of input from yamt@, wrstuden@, martin@, and christos@.
- FHANDLE_SIZE_MAX: refuse unreasonable size allocation, esp. when
it's a user-specified value.
- FHANDLE_SIZE_MIN: pad small filehandles with zero for compatibility.
XXX it might be better to push this into filesystem dependent code so that
new filesystems can choose smaller handles.
- restructure code so that it doesn't try to allocate user-specified
unbound amount of memory.
- don't ignore copyout failure in the case of E2BIG.
- rename vfs_copyinfh to vfs_copyinfh_alloc for consistency.
- don't initialize it needlessly.
- fix the poll code the same way the select code was fixed, so that it
computes the remaining time to sleep properly.
While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.
Discussed on tech-kern, with lots of help from yamt (thanks!).
select(2) timeouts. Introduced via timecounter branch from a
tvtohz() conversion.
The left over timeout was not decremented when re-starting
the sleep in select.
on alpha (which were broken for more than a year appearently and noone
noticed). (The other archs didn't suffer because their pmap_kenter_pa()
doesn't support non-executable mappings.)
With sock_loan_thresh=4096, sb_lowat==sb_hiwat, and sowritable will never
be true (even if only a single byte is pending). Some programs (like screen)
expect select() to return that a socket is writable on a socket when there
is space to write to it. XXX: What is the right thing to do here?
EPROTONOSUPPORT instead of EAFNOSUPPORT.
from pavel@ with a little bit of clean up from myself.
XXX: netbsd32 (and perhaps other emulations) should be able
XXX: to call the standard socket calls for this i think, but
XXX: revisit this at another time.
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)