Commit Graph

289 Commits

Author SHA1 Message Date
pooka
b89c189be7 Declare extern syscallnames in a header. 2009-06-02 23:21:37 +00:00
mrg
fcc023545e - add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes.  this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS.  (in some cases, the alias
RLIMIT_VMEM was already present and used if availble.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information.  (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897.  it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most espcially
chuq and yamt for critical suggestions that lead to this patch not
having a special ugliness i wasn't happy with anyway :-)
2009-03-29 01:02:48 +00:00
christos
c29e9578af use kauth instead of uid != 0 2009-03-24 21:00:05 +00:00
christos
4dbda5da97 don't enforce maxproc resource limits for root. 2009-03-07 19:23:02 +00:00
apb
0cc72e51ac Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.
2009-02-13 22:41:00 +00:00
cegger
9b87d582bd kill MALLOC and FREE macros. 2008-12-17 20:51:31 +00:00
ad
1a8ada2ed9 exec_add, exec_remove: allow zero entries in case a module provides nothing. 2008-11-28 10:55:10 +00:00
ad
92ce8c6a3d Make the emulations, exec formats, coredump, NFS, and the NFS server
into modules. By and large this commit:

- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
2008-11-19 18:35:57 +00:00
ad
0efea177e3 Remove LKMs and switch to the module framework, pass 1.
Proposed on tech-kern@.
2008-11-12 12:35:50 +00:00
matt
8f23ec634b Only define/use saemul_netbsd if KERN_SA is defined. (maybe this should be
moved to compat_sa.c)
2008-10-21 20:52:11 +00:00
wrstuden
fc7511b00e Merge wrstuden-revivesa into HEAD. 2008-10-15 06:51:17 +00:00
pooka
7e5aba5af0 Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.
2008-10-11 13:40:57 +00:00
ad
b94f79f0e8 Replce exec_map with a pool. Proposed on tech-kern@, reviewed by chs@. 2008-07-02 17:28:54 +00:00
ad
8c5bcce4b9 execve1:
- Properly terminate the fake argument list.
- Fix two double-frees.
2008-06-24 18:04:52 +00:00
ad
5adf7333fd - PPWAIT is need only be locked by proc_lock, so move it to proc::p_lflag.
- Remove a few needless lock acquires from exec/fork/exit.
- Sprinkle branch hints.

No functional change.
2008-06-16 09:51:14 +00:00
ad
d78b3bf8f2 PR kern/38881 execve(2) panic: lock error, with path argument > PATH_MAX 2008-06-06 22:21:11 +00:00
ad
65b594ff61 Make sure the PAX flags are copied/zeroed correctly. 2008-06-04 12:16:06 +00:00
ad
284c2b9aef Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
2008-04-24 18:39:20 +00:00
ad
6d70f903e6 Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
  be sent from a hardware interrupt handler. Signal activity must be
  deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
  and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
2008-04-24 15:35:27 +00:00
ad
a9ca7a3734 Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
2008-03-21 21:54:58 +00:00
christos
3aa2f8095e Check for number of processes resource violation in execve(). 2008-02-24 21:46:04 +00:00
elad
4240d37675 KTRFAC_ROOT -> KTRFAC_PERSISTENT, and update comments.
Discussed with christos@ and yamt@.
2008-02-02 20:42:18 +00:00
dsl
46e566819b Don't allocated the stackgap during exec (but do allocate 32 bytes
for 'stack grows up' systems for the _rtld interface).
2008-01-20 10:15:50 +00:00
yamt
5661ec0901 - malloc -> kmem_alloc
- kill M_EXEC.
2008-01-03 14:36:57 +00:00
yamt
22d570119a use kmem_alloc instead of malloc. 2008-01-02 19:44:36 +00:00
ad
4a780c9ae2 Merge vmlocking2 to head. 2008-01-02 11:48:20 +00:00
ad
2ecdf58c2c Remove systrace. Ok core@. 2007-12-31 15:31:24 +00:00
elad
0f25f24ed8 Provide 8 more bits of stack randomization, from the PaX author.
While here, don't make too much use of one random value, and call
arc4random() directly. Allows for the removal of 'ep_random' from the
exec_package.

Prompted by and okay christos@.
2007-12-28 17:14:50 +00:00
xtraeme
64ee7cfdb7 Fix build without debug enabled:
/usr/src/sys/kern/kern_exec.c: In function 'execve1':
/usr/src/sys/kern/kern_exec.c:505: warning: empty body in an if-statement
2007-12-26 22:49:19 +00:00
christos
65c680cad7 Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]
For regular (non PIE) executables randomization is enabled for:
    1. The data segment
    2. The stack

For PIE executables(*) randomization is enabled for:
    1. The program itself
    2. All shared libraries
    3. The data segment
    4. The stack

(*) To generate a PIE executable:
    - compile everything with -fPIC
    - link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
    options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.
2007-12-26 22:11:47 +00:00
ad
ea3f10f7e0 Merge more changes from vmlocking2, mainly:
- Locking improvements.
- Use pool_cache for more items.
2007-12-26 16:01:34 +00:00
dsl
7e2790cf6f Convert all the system call entry points from:
int foo(struct lwp *l, void *v, register_t *retval)
to:
    int foo(struct lwp *l, const struct foo_args *uap, register_t *retval)
Fixup compat code to not write into 'uap' and (in some cases) to actually
pass a correctly formatted 'uap' structure with the right name to the
next routine.
A few 'compat' routines that just call standard ones have been deleted.
All the 'compat' code compiles (along with the kernels required to test
build it).
98% done by automated scripts.
2007-12-20 23:02:38 +00:00
pooka
db06a930e6 Remove cn_lwp from struct componentname. curlwp should be used
from on.  The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
2007-12-08 19:29:36 +00:00
christos
9e2c6405b0 kill a diagnostic for now. 2007-12-03 02:20:24 +00:00
christos
33cb8be1db - add an elf aux vector entry for implementing $ORIGIN.
- the code to convert from a vnode to a path is commented out now until
  a better solution is implemented. Only absolute paths work for now
  (which is most of the cases).

requested by core
2007-12-03 02:06:57 +00:00
pooka
61e8303e9d Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start.  In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
2007-11-26 19:01:26 +00:00
ad
b668a9a05f Add _lwp_ctl() system call: provides a bidirectional, per-LWP communication
area between processes and the kernel.
2007-11-12 23:11:58 +00:00
ad
d18c6ca4de Merge from vmlocking:
- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
2007-11-07 00:23:13 +00:00
ad
bccf777b72 Make ras_lookup() lockless. 2007-10-24 14:50:38 +00:00
ad
a2a3828545 machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h 2007-10-19 11:59:34 +00:00
pooka
87319696e5 Add a comment clarifying that in the succesful case the function
returns from the middle of the loop.
2007-10-02 12:01:17 +00:00
christos
861eaa84d8 - add debugging info to the remaining failure cases in execve1.
- use size_t where appropriate.
2007-09-20 20:51:38 +00:00
ad
63c4506184 Changes to make ktrace LKM friendly and reduce ifdef KTRACE. Proposed
on tech-kern.
2007-08-15 12:07:23 +00:00
pooka
05ce20f4a0 Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden
2007-07-22 19:16:04 +00:00
yamt
f03010953f merge yamt-idlelwp branch. asked by core@. some ports still needs work.
from doc/BRANCHES:

	idle lwp, and some changes depending on it.

	1. separate context switching and thread scheduling.
	   (cf. gmcgarry_ctxsw)
	2. implement idle lwp.
	3. clean up related MD/MI interfaces.
	4. make scheduler(s) modular.
2007-05-17 14:51:11 +00:00
dsl
b8fbaf8c4b Change the way that emulations locate files within the emulation root to
avoid having to allocate space in the 'stackgap'
  - which is very LWP unfriendly.
The additional code for non-emulation namei() is trivial, the reduction for
  the emulations is massive.
The vnode for a processes emulation root is saved in the cwdi structure
  during process exec.
If the emulation root the TRYEMULROOT flag are set, namei() will do an initial
  search for absolute pathnames in the emulation root, if that fails it will
  retry from the normal root.
".." at the emulation root will always go to the real root, even in the middle
  of paths and when expanding symlinks.
Absolute symlinks found using absolute paths in the emulation root will be
  relative to the emulation root (so /usr/lib/xxx.so -> /lib/xxx.so links
  inside the emulation root don't need changing).
If the root of the emulation would be returned (for an emulation lookup), then
  the real root is returned instead (matching the behaviour of emul_lookup,
  but being a cheap comparison here) so that programs that scan "../.."
  looking for the root dircetory don't loop forever.
The target for symbolic links is no longer mangled (it used to get the
  CHECK_ALT_xxx() treatment, so could get /emul/xxx prepended).
CHECK_ALT_xxx() are no more. Most of the change is deleting them, and adding
  TRYEMULROOT to the flags to NDINIT().
A lot of the emulation system call stubs could now be deleted.
2007-04-22 08:29:55 +00:00
matt
770557b66a If STACKALIGN is defined, use it instead of ALIGN. Some arches need a
stack more aligned that ALIGN will give.
2007-03-09 22:25:56 +00:00
ad
c147748d84 - Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.
2007-03-09 14:11:22 +00:00
dogcow
66b89c08f2 die, caddr_t, die. 2007-03-05 04:59:19 +00:00
thorpej
4f3d5a9cc0 TRUE -> true, FALSE -> false 2007-02-22 06:34:42 +00:00