machine
Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
we no longer need to guard against access from hardware interrupt handlers.
Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:
- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.
- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.
- The system spends less time at IPL_SCHED, and there is less lock activity.
- Add a KAUTH_PROCESS_SCHEDULER action, to handle scheduler related
requests, and add specific requests for set/get scheduler policy and
set/get scheduler parameters.
- Add a KAUTH_PROCESS_KEVENT_FILTER action, to handle kevent(2) related
requests.
- Add a KAUTH_DEVICE_TTY_STI action to handle requests to TIOCSTI.
- Add requests for the KAUTH_PROCESS_CANSEE action, indicating what
process information is being looked at (entry itself, args, env,
open files).
- Add requests for the KAUTH_PROCESS_RLIMIT action indicating set/get.
- Add requests for the KAUTH_PROCESS_CORENAME action indicating set/get.
- Make bsd44 secmodel code handle the newly added rqeuests appropriately.
All of the above make it possible to issue finer-grained kauth(9) calls in
many places, removing some KAUTH_GENERIC_ISSUSER requests.
- Remove the "CAN" from KAUTH_PROCESS_CAN{KTRACE,PROCFS,PTRACE,SIGNAL}.
Discussed with christos@ and yamt@.
- Lock processes, credentials, filehead etc correctly.
- Acquire a read hold on sysctl_treelock if only doing a query.
- Don't wire down the output buffer. It doesn't work correctly and the code
regularly does long term sleeps with it held - it's not worth it.
- Don't hold locks other than sysctl_lock while doing copyout().
- Drop sysctl_lock while doing copyout / allocating memory in a few places.
- Don't take kernel_lock for sysctl.
- Fix a number of bugs spotted along the way
For regular (non PIE) executables randomization is enabled for:
1. The data segment
2. The stack
For PIE executables(*) randomization is enabled for:
1. The program itself
2. All shared libraries
3. The data segment
4. The stack
(*) To generate a PIE executable:
- compile everything with -fPIC
- link with -shared-libgcc -Wl,-pie
This feature is experimental, and might change. To use selectively add
options PAX_ASLR=0
in your kernel.
Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.
- Don't ignore return value of settime() in sysctl_kern_rtc_offset(), as
suggested by yamt@.
Note: the kauth(9) call in sysctl_kern_rtc_offset() is bogus, but this will
be addressed separately.
tech-kern:
- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.
by deleting arguments. This is a popular practice, and failing means that
ps(1) prints (programname). For example this is what XtOpenDisplay() with
-geometry. This used to work before 2.0H, and the behavior is allowed and
hinted by POSIX. Found out by Anon Ymous.
can copy directly to/from userspace.
Avoids exposing the implementation of the group list as an array to code
outside kern_auth.c.
compat code and man page need updating.
from doc/BRANCHES:
idle lwp, and some changes depending on it.
1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.
I think it existed to cache the numbers in kernel memory of a zombie when
proc->p_stats was part of the 'u' area - so got freed earlier and wouldn't
(easily) be accessible from a separate process. However since both the
p_ru and p_stats fields are freed at the same time it is no longer needed.
Ride the recent 4.99.19 version change.
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.
Restores source compatibility with pre-newlock2 tools like ps or top.
Reviewed by Andrew Doran.