version or the 32bit compat layout in execve1. Introduce a new function
copyin_psstrings for reading it back from userland and converting it to
the native layout. Refactor procfs to share most of the code with the
kern.proc_args sysctl handler.
This material is based upon work partially supported by
The NetBSD Foundation under a contract with Joerg Sonnenberger.
Otherwise the child will incorrectly see it is not running on any
CPU. Among other things, this fixes crashes from having
LD_PRELOAD=libpthread.so set in the env.
reviewed by tech-kern
that scheduler locks are special in this regard - adaptive locks cannot
be in the path due to turnstiles. Randomly spotted/reported by uebayasi@.
- Remove unused lwp_relock() and replace lwp_lock_retry() by simplifying
lwp_lock() and sleepq_enter() a little.
- Give alllwp its own cache-line and mark lwp_cache pointer as read-mostly.
OK ad@
pathbuf object passed to namei as work space instead. (For now a pnbuf
pointer appears in struct nameidata, to support certain unclean things
that haven't been fixed yet, but it will be going away in the future.)
This removes the need for the SAVENAME and HASBUF namei flags.
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.
Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).
The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.
- update the linux syscall table for each platform.
- support new-style (NPTL) linux pthreads on all platforms.
clone() with CLONE_THREAD uses 1 process with many LWPs
instead of separate processes.
- move the contents of sys__lwp_setprivate() into a new
lwp_setprivate() and use that everywhere.
- update linux_release[] and linux32_release[] to "2.6.18".
- adjust placement of emul fork/exec/exit hooks as needed
and adjust other emul code to match.
- convert all struct emul definitions to use named initializers.
- change the pid allocator to allow multiple pids to refer to the same proc.
- remove a few fields from struct proc that are no longer needed.
- disable the non-functional "vdso" code in linux32/amd64,
glibc works fine without it.
- fix a race in the futex code where we could miss a wakeup after
a requeue operation.
- redo futex locking to be a little more efficient.
things: passing an argument to check_exec, which is better done explicitly,
and handing back the resolved pathname generated by namei, which we can
make an explicit slot for in struct exec_package instead. While here,
perform some related tidyup, and store the kernel-side copy of the path
to the executable as well as the pointer into userspace. (But the latter
should probably be removed in the future.)
years ago when the kernel was modified to not alter ABI based on
DIAGNOSTIC, and now just call the respective function interfaces
(in lowercase). Plenty of mix'n match upper/lowercase has creeped
into the tree since then. Nuke the macros and convert all callsites
to lowercase.
no functional change
address space available to processes. this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.
- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.
- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)
- patch sh, and csh to notice RLIMIT_AS. (in some cases, the alias
RLIMIT_VMEM was already present and used if availble.)
- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.
- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information. (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)
this addresses PR 7897. it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.
tested on i386 and sparc64, build tested on several other platforms.
thanks to many folks for feedback and testing but most espcially
chuq and yamt for critical suggestions that lead to this patch not
having a special ugliness i wasn't happy with anyway :-)
into modules. By and large this commit:
- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
we no longer need to guard against access from hardware interrupt handlers.
Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:
- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.
- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.
- The system spends less time at IPL_SCHED, and there is less lock activity.