the BSD/POSIX per-process timers:
- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").
- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().
- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).
- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.
Welcome to NetBSD 9.99.77.
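
A minimal userland sketch of the split described above, assuming
hypothetical field names (the ito_*, it_* and pt_* members and the demo_*
objects are illustrative, not the kernel's actual definitions). It only
shows the shape: a ptimer embeds an itimer, and the itimer points at an
ops table that names its queue, softint handle and callbacks.

```c
#include <stddef.h>

struct itimer;

/* Processing queue of fired timers (a simple singly linked list here). */
struct itimer_queue {
	struct itimer *head;
};

/* Per-kind information, as listed in the commit message. */
struct itimer_ops {
	struct itimer_queue *ito_queue;	/* processing queue */
	void *ito_sihandle;		/* softint handle (opaque, unused here) */
	void (*ito_fire)(struct itimer *);	/* called when the timer fires */
	void (*ito_realtime_changed)(struct itimer *);	/* optional, may be NULL */
};

/* Common interval timer data. */
struct itimer {
	const struct itimer_ops *it_ops;
	struct itimer *it_next;
	int it_queued;
};

/* Per-process timer data, which contains a struct itimer. */
struct ptimer {
	struct itimer pt_itimer;
	int pt_owner_pid;		/* stand-in for the per-process fields */
};

static struct itimer_queue demo_queue;

/* Fire callback: add the timer to its queue unless it is already there. */
static void
demo_fire(struct itimer *it)
{
	if (!it->it_queued) {
		it->it_next = it->it_ops->ito_queue->head;
		it->it_ops->ito_queue->head = it;
		it->it_queued = 1;
	}
}

static const struct itimer_ops demo_ops = {
	.ito_queue = &demo_queue,
	.ito_sihandle = NULL,
	.ito_fire = demo_fire,
	.ito_realtime_changed = NULL,	/* this kind ignores clock changes */
};

/* Dispatch through the ops table. */
static void
itimer_fire(struct itimer *it)
{
	it->it_ops->ito_fire(it);
}

/* Returns 1 if the optional callback existed and was called, else 0. */
static int
itimer_realtime_changed(struct itimer *it)
{
	if (it->it_ops->ito_realtime_changed == NULL)
		return 0;
	it->it_ops->ito_realtime_changed(it);
	return 1;
}
```

Keeping the callbacks in a shared const table means each timer carries
only one pointer, and code that handles a bare itimer never needs to know
whether it is embedded in a ptimer.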
own LWP ID space, LWP IDs came from the same number space as PIDs. The
lead LWP of a process gets the PID as its LID. If a multi-LWP process's
lead LWP exits, the PID persists for the process.
In addition to providing system-wide unique thread IDs, this also lets us
eliminate the per-process LWP radix tree, and some associated locks.
Remove the separate "global thread ID" map added previously; it is no longer
needed to provide this functionality.
Nudged in this direction by ad@ and chs@.
when allocating a PID.
- Per above, proc_free_pid() no longer decrements nprocs. It's now done
in proc_free() right after proc_free_pid().
- Ensure nprocs is accessed using atomics everywhere.
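
As a rough illustration of the last two items, the sketch below uses C11
<stdatomic.h> as a userland stand-in for the kernel's atomic operations.
The *_sketch function names are hypothetical; only nprocs and the ordering
"free the PID, then decrement" come from the text above.

```c
#include <stdatomic.h>

/* Stand-in for the kernel's global process count. */
static atomic_uint nprocs;

/* Releases only the PID slot; it no longer touches nprocs. */
static void
proc_free_pid_sketch(int pid)
{
	(void)pid;
}

/* The decrement now happens here, right after the PID is freed. */
static void
proc_free_sketch(int pid)
{
	proc_free_pid_sketch(pid);
	atomic_fetch_sub(&nprocs, 1);
}

/* Readers also go through an atomic load rather than a plain read. */
static unsigned
nprocs_read_sketch(void)
{
	return atomic_load(&nprocs);
}
```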
identifier uniquely identifies an LWP across the entire system, and will
be used in future improvements in user-space synchronization primitives.
(Test disabled and libc stub not included intentionally so as to avoid
multiple libc version bumps.)
single threaded case. Replace scans of p->p_lwps with lookups in the
tree. Find free LIDs for new LWPs in the tree. Replace the hashed sleep
queues for park/unpark with lookups in the tree under cover of a RW lock.
- lwp_wait(): if waiting on a specific LWP, find the LWP via tree lookup and
return EINVAL if it's detached, not ESRCH.
- Group the locks in struct proc at the end of the struct in their own cache
line.
- Add some comments.
where curcpu() is defined as curlwp->l_cpu:
- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.
- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.
- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.
- Remove some KERNEL_LOCK handling which hasn't been needed for years.
This seems to take about 3us on my Intel system. Two changes required:
- Have the caller of mi_switch() be responsible for calling spc_lock().
- Avoid using l->l_cpu in mi_switch().
While here:
- Add a couple of calls to membar_enter()
- Have the idle LWP set itself to LSIDL, to match softint_thread().
- Remove unused return value from mi_switch().
Once a thread has been stopped with ptrace(2), a userland process must not
be able to unstop it, whether deliberately or by accident.
This was a Windows-style behavior that made thread tracing fragile.
In the previous behavior, the vforking parent kept a pointer to its child
and checked whether the child had cleared PL_PPWAIT in its p_lflag bit field.
However, the child can become invalid between its exec/exit event and the
wakeup of the vforked parent; this can cause an invalid pointer read and, in
the worst case, a kernel crash.
In the new behavior, the vforked child keeps a reference to the vforked
parent's LWP and sets that LWP's l_vforkwaiting to false. The vforked child
can thus finish its work, exec/exit, and be terminated; once the parent is
woken up, it reads its own field to see whether its child is still blocking it.
Add new field in struct lwp: l_vforkwaiting protected by proc_lock.
In the future this should be refactored: all PL_PPWAIT users should be
converted to l_vforkwaiting, and l_vforkwaiting should then probably be
turned into a bit field.
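
The direction of the reference can be sketched as follows. This is a
single-threaded userland model with locking (proc_lock) and sleeping
omitted; the struct names, the p_vforklwp member and the function names
are assumptions made for illustration, and only l_vforkwaiting is taken
from the text.

```c
#include <stdbool.h>
#include <stddef.h>

struct lwp_sketch {
	bool l_vforkwaiting;		/* owned by the waiting parent LWP */
};

struct proc_sketch {
	struct lwp_sketch *p_vforklwp;	/* hypothetical: child's link to parent LWP */
};

/* vfork: the parent marks itself as waiting; the child remembers the parent. */
static void
vfork_begin(struct lwp_sketch *parent, struct proc_sketch *child)
{
	parent->l_vforkwaiting = true;
	child->p_vforklwp = parent;
}

/* exec/exit in the child: release the parent by writing the parent's field. */
static void
vfork_child_done(struct proc_sketch *child)
{
	if (child->p_vforklwp != NULL) {
		child->p_vforklwp->l_vforkwaiting = false;
		child->p_vforklwp = NULL;
	}
}

/* The parent inspects only its own LWP and never dereferences the child. */
static bool
vfork_parent_must_wait(const struct lwp_sketch *parent)
{
	return parent->l_vforkwaiting;
}
```

Because the parent reads only memory it owns, the child may be torn down
at any time after vfork_child_done() without leaving the parent holding a
stale pointer.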
This is another attempt at fixing this bug, following the one made by
<rmind> in 2012 in commit:
Author: rmind <rmind@NetBSD.org>
Date: Sun Jul 22 22:40:18 2012 +0000
fork1: fix use-after-free problems. Addresses PR/46128 from Andrew Doran.
Note: PL_PPWAIT should be fully replaced and modification of l_pflag by
other LWP is undesirable, but this is enough for netbsd-6.
The new version no longer performs an unsafe access to l_lflag to change the
LP_VFORKWAIT bit.
Verified with ATF t_vfork and t_ptrace* tests and they are no longer
causing any issues in my local setup.
Fixes PR/46128 by Andrew Doran
This is yet another psref leak detector. It can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a
leak has occurred.
Investigating psref leaks is hard because once a leak occurs, the per-CPU list
of psrefs that tracks references can be corrupted. A reference to a tracked
object is recorded in the list via an intermediate object (struct psref) that
is normally allocated on a thread's stack. Thus, on a leak the intermediate
object can be overwritten, resulting in corruption of the list.
The tracker makes a shadow entry for each intermediate object and stores some
hints in it (currently the address of the caller of psref_acquire). We can
detect a leak by checking the entries at certain points where all references
should have been released, such as the return point of syscalls and the end
of each softint handler.
The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.
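
The shadow-entry idea can be sketched in userland C as below. The table
size, the struct and the shadow_* function names are all hypothetical; the
real tracker's data structures are not shown in this message. The point is
only that the hints live outside the stack-allocated object, so they
survive even if that object is overwritten.

```c
#include <stddef.h>

#define SHADOW_SLOTS 8	/* hypothetical table size */

struct shadow_entry {
	const void *se_psref;	/* the intermediate object being shadowed */
	const void *se_caller;	/* hint: who called the acquire routine */
};

static struct shadow_entry shadow[SHADOW_SLOTS];

/* Record an acquisition; returns 0 on success, -1 if the table is full. */
static int
shadow_acquire(const void *psref, const void *caller)
{
	for (size_t i = 0; i < SHADOW_SLOTS; i++) {
		if (shadow[i].se_psref == NULL) {
			shadow[i].se_psref = psref;
			shadow[i].se_caller = caller;
			return 0;
		}
	}
	return -1;
}

/* Drop the shadow entry that matches a released reference. */
static void
shadow_release(const void *psref)
{
	for (size_t i = 0; i < SHADOW_SLOTS; i++) {
		if (shadow[i].se_psref == psref) {
			shadow[i].se_psref = NULL;
			shadow[i].se_caller = NULL;
			return;
		}
	}
}

/*
 * Checkpoint (e.g. syscall return): every entry should be gone.
 * Returns the caller hint of the first leaked entry, or NULL if none.
 */
static const void *
shadow_check(void)
{
	for (size_t i = 0; i < SHADOW_SLOTS; i++) {
		if (shadow[i].se_psref != NULL)
			return shadow[i].se_caller;
	}
	return NULL;
}
```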
Proposed on tech-kern
all.
+ Possible info leak: [len=4, leaked=4]
| #0 0xffffffff80baf397 in kleak_copyout
| #1 0xffffffff80b56d0c in sys_wait6
| #2 0xffffffff80259c42 in syscall
This change:
* Removes "options PERFCTRS", the associated includes, and the associated
ifdefs. In doing so, it removes several XXXSMPs in the MI code, which is
good.
* Removes the PMC code of ARM XSCALE.
* Removes all the pmc.h files. They were all empty, except for ARM XSCALE.
* Reorders the x86 PMC code not to rely on the legacy pmc.h file. The
definitions are put in sysarch.h.
* Removes the kern/sys_pmc.c file, and along with it, the sys_pmc_control
and sys_pmc_get_info syscalls. They are marked as OBSOL in kern,
netbsd32 and rump.
* Removes the pmc_evid_t and pmc_ctr_t types.
* Removes all the associated man pages. The sets are marked as obsolete.
This means that the full executable path is always available.
- exec_elf.c: use p->p_path to set AT_SUN_EXECNAME, and since this is
always set, do so unconditionally.
- kern_exec.c: simplify pathexec, use kmem_strfree where appropriate
and set p->p_path
- kern_exit.c: free p->p_path
- kern_fork.c: set p->p_path for the child.
- kern_proc.c: use p->p_path to return the executable pathname; the
NULL check for p->p_path should perhaps be a KASSERT?
- exec.h: gc ep_path, it is not used anymore
- param.h: bump version, 'struct proc' size change
TODO:
1. reference count the path string, to save copy at fork and free
just before exec?
2. canonicalize the pathname by changing namei() to LOCKPARENT
vnode and then using getcwd() on the parent directory?
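
The lifetime of the path string across exec, fork and exit can be modelled
with a small userland sketch. malloc/free stand in for the kernel
allocator, and the struct and sketch_* names are hypothetical; only the
p_path member and the "copy at fork, free at exit" behavior come from the
list above.

```c
#include <stdlib.h>
#include <string.h>

struct proc_sketch {
	char *p_path;	/* full executable path, owned by this process */
};

/* Allocate a private copy of a string; returns NULL on failure. */
static char *
path_dup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *copy = malloc(n);

	if (copy != NULL)
		memcpy(copy, s, n);
	return copy;
}

/* exec: replace the stored path with the new executable's path. */
static int
sketch_exec(struct proc_sketch *p, const char *path)
{
	char *newpath = path_dup(path);

	if (newpath == NULL)
		return -1;
	free(p->p_path);
	p->p_path = newpath;
	return 0;
}

/* fork: the child gets its own copy (no reference counting yet). */
static int
sketch_fork(const struct proc_sketch *parent, struct proc_sketch *child)
{
	child->p_path = path_dup(parent->p_path);
	return child->p_path != NULL ? 0 : -1;
}

/* exit: release the path. */
static void
sketch_exit(struct proc_sketch *p)
{
	free(p->p_path);
	p->p_path = NULL;
}
```

The per-fork copy in sketch_fork() is the cost that TODO item 1 proposes
to avoid by reference counting the string.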
This is a legacy interface from 4.4BSD. It was introduced to overcome
shortcomings of ptrace(2) at that time (performance), which are no longer
relevant. Today /proc/#/ctl offers a narrow subset of ptrace(2) commands
and is not suitable for modern applications beyond simplistic tracing
scenarios.
This removal will simplify kernel internals. Users will still be able to
use all the other /proc files.
This change won't affect other procfs files or the Linux compat
features within mount_procfs(8). /proc/#/ctl isn't available on Linux.
Remove:
- /proc/#/ctl from mount_procfs(8)
- P_FSTRACE note from the documentation of ps(1)
- /proc/#/ctl and filesystem tracing documentation from mount_procfs(8)
- KAUTH_REQ_PROCESS_PROCFS_CTL documentation from kauth(9)
- source code file miscfs/procfs/procfs_ctl.c
- PFSctl and procfs_doctl() from sys/miscfs/procfs/procfs.h
- KAUTH_REQ_PROCESS_PROCFS_CTL from sys/sys/kauth.h
- PSL_FSTRACE (0x00010000) from sys/sys/proc.h
- P_FSTRACE (0x00010000) from sys/sys/sysctl.h
Reduce code complexity after removal of this functionality.
Update TODO.ptrace accordingly: remove two entries about /proc tracing.
Do not keep legacy notes as comments in the headers about the removed
PSL_FSTRACE / P_FSTRACE, as this interface had very few users
(close or equal to zero).
Proposed on tech-kern@.
All filesystem tracing utility users are encouraged to switch to ptrace(2).
Sponsored by <The NetBSD Foundation>
This removes dead code introduced with the following commit:
date: 2012-07-27 22:52:49 +0200; author: christos; state: Exp; lines: +8 -2;
revert racy vfork() parent-blocking-before-child-execs-or-exits code.
ok rmind
Only change it when we are being permanently reparented to init. Since
p_ppid is only used as a cached value to retrieve the parent's process id
from userland, this change makes it correct at all times. Idea from kre@
Revert specialized logic from getpid/getppid now that it is not needed.
Revert 1.264 - that was intended to fix 51600, but didn't, it just
hid the problem, and caused 51606. This fixes 51606.
Handle waiting on a process that has been detached from its parent
because of being ptrace'd by some other process. This fixes 51600.
("Handle" here means that the wait() hangs or, with WNOHANG, returns 0;
we cannot actually wait on a process that is not currently an attached
child.)
Note: waiting on detached processes is not yet perfect (it fails to
take account of options like WALLSIG and WALTSIG) - support for those
(that is, ignoring a detached child that one of those options will
later cause to be ignored when the process is re-attached) is still missing.
For now, other than when waiting for a specific process ID, when
a process does a wait() sys call (any of them), has no applicable
children attached that can be returned, and has at least one detached
child, then we do a linear search of all processes to look for a
suitable detached child. This is likely to be slow - but very rare.
Eventually it might be better to keep a list of detached children
per process.
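
A rough model of that fallback search is sketched below. The struct, its
members and the function names are hypothetical, and "suitable" is reduced
to "originally ours, currently attached to someone else"; the real check
would also apply the wait options.

```c
#include <stddef.h>

struct dproc_sketch {
	int pid;
	int ppid_now;	/* current parent (e.g. the tracer) */
	int ppid_orig;	/* parent before the process was detached */
};

/* A detached child of `waiter` is one it created but no longer parents. */
static int
is_detached_child_of(const struct dproc_sketch *p, int waiter)
{
	return p->ppid_orig == waiter && p->ppid_now != waiter;
}

/*
 * Linear scan over all processes: O(n), acceptable only because this
 * path is expected to be rare. Returns NULL if nothing matches, in
 * which case wait() can report that there are no children.
 */
static const struct dproc_sketch *
find_detached_child(const struct dproc_sketch *all, size_t n, int waiter)
{
	for (size_t i = 0; i < n; i++) {
		if (is_detached_child_of(&all[i], waiter))
			return &all[i];
	}
	return NULL;
}
```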
on context) into:
1. p_xexit: exit code
2. p_xsig: signal number
3. p_sflag & WCOREFLAG bit to indicate that the process core-dumped.
Fix the documentation of the flag bits in <sys/proc.h>
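
To show how the three separate pieces can be recombined when a status
word is needed, here is a small sketch using the traditional wait status
layout (exit code in bits 8-15, signal in the low 7 bits, core flag at
0x80). The struct, the SK_COREFLAG constant and wait_status_sketch() are
assumptions for illustration, not the kernel's code.

```c
#define SK_COREFLAG 0x80	/* traditional "core dumped" bit, assumed here */

struct xstat_sketch {
	int p_xexit;	/* exit code */
	int p_xsig;	/* terminating signal number, 0 if none */
	int p_sflag;	/* only the core flag bit is modelled */
};

/* Build a traditional wait status word from the separate fields. */
static int
wait_status_sketch(const struct xstat_sketch *p)
{
	if (p->p_xsig != 0) {
		int core = (p->p_sflag & SK_COREFLAG) ? 0x80 : 0;
		return (p->p_xsig & 0x7f) | core;
	}
	return (p->p_xexit & 0xff) << 8;
}
```

Keeping the fields apart means no caller has to guess from context
whether a single combined value holds an exit code or a signal number.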
value, and update its parent's p_nstopchild value when marking the
process's p_stat to SSTOP. The process needed to be SACTIVE to get
here, so this transition represents an additional process for which
the parent needs to wait.
Fixes PR kern/50308
Pullups will be requested for:
NetBSD-7, -6, -6-0, -6-1, -5, -5-0, -5-1, and -5-2
of reaping the process (nor any other children), the process will get
reparented to init. Since the state of the exiting process at this point
is SDEAD, proc_reparent() will not update either the old or new parent's
p_nstopchild counters.
This change causes both old and new parents to be properly updated.
Fixes PR kern/50300
Pullups will be requested for:
NetBSD-7, -6, -6-0, -6-1, -5, -5-0, -5-1, and -5-2
can be named the same as those on other platforms.
For example, proc:::exec-success, not proc:::exec_success.
Implementation follows the same basic principle as FreeBSD's; add
another field to the SDT_PROBE_DEFINE macro which is the name
as exposed to userland.
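
The extra-field idea can be illustrated with a simplified stand-in macro.
SDT_PROBE_DEFINE_SKETCH, its three-argument form and the sp_* members are
hypothetical and much reduced from the real macro; the sketch only shows
why a separate string is needed: a C identifier cannot contain a hyphen,
so the userland-visible name has to be supplied alongside it.

```c
#include <string.h>

struct sdt_probe_sketch {
	const char *sp_provider;	/* e.g. "proc" */
	const char *sp_name;		/* C-identifier form of the name */
	const char *sp_uname;		/* name as exposed to userland */
};

/* The last argument is the added field: the userland-visible name. */
#define SDT_PROBE_DEFINE_SKETCH(prov, name, uname)			\
	static const struct sdt_probe_sketch sdt_##prov##_##name = {	\
		#prov, #name, uname					\
	}

SDT_PROBE_DEFINE_SKETCH(proc, exec_success, "exec-success");

/* Probe lookup from userland matches on the exposed name. */
static int
sdt_probe_matches(const struct sdt_probe_sketch *p, const char *uname)
{
	return strcmp(p->sp_uname, uname) == 0;
}
```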
in a case of process exit. Necessary to re-flag all LWPs for exit, as their
state might have changed or new LWPs spawned.
Should fix PR/46168 and PR/46402.