proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:
- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.
- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.
- The system spends less time at IPL_SCHED, and there is less lock activity.
we have unlocked the kqueue to check its state, leave it queued and
re-check later.
- knote_dequeue: fold into knote_detach since nothing else uses it.
- Note a couple more problems.
- Redo reference counting to be sane. LWPs accessing files take a short
term reference on the local file descriptor. This is the most common
case. While a file is in a process descriptor table, a reference is
held to the file. The file reference count only changes during control
operations like open() or close(). Code that comes at files from an
unusual direction (i.e. foreign to the process) like procfs or sysctl
takes a reference on the file (f_count), and not on a descriptor.
- Remove knowledge of reference counting and locking from most code that
deals with files.
- Make the usual case of file descriptor lookup lockless.
- Make kqueue MP and MT safe. PR kern/38098, PR kern/38137.
- Fix numerous file handling bugs, and bugs in the descriptor code that
affected multithreaded processes.
- Split descriptor system calls out into sys_descrip.c.
- A few stylistic changes: KNF, remove unused casts now that caddr_t is
gone. Replace dumb gotos with loop control in a few places.
- Don't do redundant pointer passing (struct proc, lwp, filedesc *) unless
the routine is likely to be inlined. Most of the time it's about the
current process.
- Add a lot of missing selinit() and seldestroy() calls.
- Merge selwakeup() and selnotify() calls into a single selnotify().
- Add an additional 'events' argument to selnotify() call. It will
indicate which event (POLL_IN, POLL_OUT, etc) happen. If unknown,
zero may be used.
Note: please pass appropriate value of 'events' where possible.
Proposed on: <tech-kern>
- Add a KAUTH_PROCESS_SCHEDULER action, to handle scheduler related
requests, and add specific requests for set/get scheduler policy and
set/get scheduler parameters.
- Add a KAUTH_PROCESS_KEVENT_FILTER action, to handle kevent(2) related
requests.
- Add a KAUTH_DEVICE_TTY_STI action to handle requests to TIOCSTI.
- Add requests for the KAUTH_PROCESS_CANSEE action, indicating what
process information is being looked at (entry itself, args, env,
open files).
- Add requests for the KAUTH_PROCESS_RLIMIT action indicating set/get.
- Add requests for the KAUTH_PROCESS_CORENAME action indicating set/get.
- Make bsd44 secmodel code handle the newly added rqeuests appropriately.
All of the above make it possible to issue finer-grained kauth(9) calls in
many places, removing some KAUTH_GENERIC_ISSUSER requests.
- Remove the "CAN" from KAUTH_PROCESS_CAN{KTRACE,PROCFS,PTRACE,SIGNAL}.
Discussed with christos@ and yamt@.
int foo(struct lwp *l, void *v, register_t *retval)
to:
int foo(struct lwp *l, const struct foo_args *uap, register_t *retval)
Fixup compat code to not write into 'uap' and (in some cases) to actually
pass a correctly formatted 'uap' structure with the right name to the
next routine.
A few 'compat' routines that just call standard ones have been deleted.
All the 'compat' code compiles (along with the kernels required to test
build it).
98% done by automated scripts.
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.
Restores source compatibility with pre-newlock2 tools like ps or top.
Reviewed by Andrew Doran.
kfilter slots are used: no guarantee previously that last
slot had a NULL name.
- Reuse previously deregistered user kfilter slots in
kfilter_register().
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
code.
- To achieve COMPAT_NETBSD32 compatibility, introduce a parameter to
kevent1 that points to functions that do the actual copyin/copyout
operations. This is similar to what was done in FreeBSD by Paul Saab.
- Add the COMPAT_NETBSD32 definitions and hooks.
1. make fileops const
2. add 2 new negative errno's to `officially' support the cloning hack:
- EDUPFD (used to overload ENODEV)
- EMOVEFD (used to overload ENXIO)
3. Created an fdclone() function to encapsulate the operations needed for
EMOVEFD, and made all cloners use it.
4. Centralize the local noop/badop fileops functions to:
fnullop_fcntl, fnullop_poll, fnullop_kqfilter, fbadop_stat
to pool_init. Untouched pools are ones that either in arch-specific
code, or aren't initialiased during initial system startup.
Convert struct session, ucred and lockf to pools.
of using on-stack memory, so that this wouldn't eventually cause kernel
panic if the process get swapped out and another process runs kqueue_scan()
problem pointed out in kern/24220 by Stephan Uphoff
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.
Bump the kernel rev up to 1.6V
Avoids a lot of casting and removes the need for some line breaks.
Removed a load of (caddr_t) casts from calls to copyin/copyout as well.
(approved by christos - he has a plan to remove caddr_t...)
arbitrary number of timers, both oneshot and periodic.
from FreeBSD, only adapted to NetBSD kernel API - mstohz() instead
of tvtohz(), and takes advantage of callout_schedule() in filt_timerexpire()
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.