Commit Graph

78 Commits

Author SHA1 Message Date
christos
1e7fb326f1 de-triplicate. 2017-05-07 22:54:54 +00:00
joerg
4f77b889d0 Extend the mmap(2) interface to allow requesting protections for later
use with mprotect(2), but without enabling them immediately.

Extend the mremap(2) interface to allow duplicating mappings, i.e.
create a second range of virtual addresses references the same physical
pages. Duplicated mappings can have different effective protections.

Adjust PAX mprotect logic to disallow effective protections of W&X, but
allow one mapping W and another X protections. This obsoletes using
temporary files for purposes like JIT.

Adjust PAX logic for mmap(2) and mprotect(2) to fail if W&X is requested
and not silently drop the X protection.

Improve test cases to ensure correct operation of the changed
interfaces.
2017-05-06 21:34:51 +00:00
christos
b039ee7763 reduce #ifdef mess caused by PaX 2016-05-22 14:26:09 +00:00
christos
f305e57def - make pax aslr stack eat up to 1/8 of the max stack space insted of 1/4
and reduce the length of the randomization bits since this is unused.
- call the pax aslr stack function sooner so we don't need to re-adjust the
  stack size.
- adjust the stack max resource limit to account for the maximum space that
  can be lost by aslr
- tidy up debugging printfs
2016-05-13 17:33:43 +00:00
christos
57b625b6f2 remove more ifdefs 2016-04-07 12:06:50 +00:00
christos
03c12592a0 Add PAX_MPROTECT_DEBUG 2016-04-07 03:31:12 +00:00
maxv
9ed595918a Revamp the way processes are PaX'ed in the kernel. Sent on tech-kern@ two
months ago, but no one reviewed it - probably because it's not a trivial
change.

This change fixes the following bug: when loading a PaX'ed binary, the
kernel updates the PaX flag of the calling process before it makes sure
the new process is actually launched. If the kernel fails to launch the
new process, it does not restore the PaX flag of the calling process,
leaving it in an inconsistent state.

Actually, simply restoring it would be horrible as well, since in the
meantime another thread may have used the flag.

The solution is therefore: modify all the functions used by PaX so that
they take as argument the exec package instead of the lwp, and set the PaX
flag in the process *right before* launching the new process - it cannot
fail in the meantime.
2015-09-26 16:12:24 +00:00
maxv
687880ac6a Style 2014-03-29 09:31:11 +00:00
enami
a154f93a37 Bounds process's stack size with max_stack_size so that 32bit
binary works regardless of stack size limit setting.
2011-08-08 06:30:43 +00:00
matt
65bd0920b3 Allow PAX_ASLR to be used by itself. 2011-06-23 23:42:43 +00:00
christos
72233ad7fa PR/44673: Arna Clauson: Latest MAXSSIZ bump broke netbsd32 emulation on amd64.
- Use MAXSSIZ32 instead of MAXSSIZ for 32 bit binaries
- Default MAXXSIZ32 to a quarter of MAXSSIZ (good enough?)
- Add debugging
XXX: Note that:
	- sparc32 MAXSSIZ is larger than sparc64 MAXSSIZ
	- sparc64 MAXSSIZ32 != sparc32 MAXSSIZ
2011-03-04 04:25:58 +00:00
uebayasi
9d567f003d Include internal definitions (uvm/uvm.h) only where necessary. 2011-01-17 07:13:31 +00:00
yamt
9ca0d1bba8 new_vmcmd: assertions 2010-12-17 22:35:07 +00:00
christos
6a6858c912 Fix issues with stack allocation and pax aslr:
- since the size is unsigned, don't check just that it is > 0, but limit
  it to the MAXSSIZ
- if the stack size is reduced because of aslr, make sure we reduce the
  actual allocation by the same size so that the size does not wrap around.
NB: Must be pulled up to 5.x!
2010-08-23 20:53:08 +00:00
hannken
1423e65b26 Clean up vnode lock operations pass 2:
VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
2010-06-24 12:58:48 +00:00
matt
6a9e4e8eeb Change u_long to vaddr_t/vsize_t in exec code where appropriate (mostly
involves setregs and vmcmds).  Should result in no code differences.
2009-12-10 14:13:48 +00:00
mrg
fcc023545e - add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes.  this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS.  (in some cases, the alias
RLIMIT_VMEM was already present and used if availble.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information.  (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897.  it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most espcially
chuq and yamt for critical suggestions that lead to this patch not
having a special ugliness i wasn't happy with anyway :-)
2009-03-29 01:02:48 +00:00
ad
fb8c6fcc83 Don't needlessly acquire v_interlock. 2008-06-02 16:16:27 +00:00
ad
a77566d7d5 Authorize using the LWP cached credentials, not process credentials. 2008-01-28 20:09:06 +00:00
yamt
6184f2ada8 malloc -> kmem_alloc 2008-01-03 14:29:31 +00:00
ad
4a780c9ae2 Merge vmlocking2 to head. 2008-01-02 11:48:20 +00:00
christos
65c680cad7 Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]
For regular (non PIE) executables randomization is enabled for:
    1. The data segment
    2. The stack

For PIE executables(*) randomization is enabled for:
    1. The program itself
    2. All shared libraries
    3. The data segment
    4. The stack

(*) To generate a PIE executable:
    - compile everything with -fPIC
    - link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
    options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.
2007-12-26 22:11:47 +00:00
pooka
61e8303e9d Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start.  In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
2007-11-26 19:01:26 +00:00
ad
7dad9f7391 Merge from vmlocking:
- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
2007-10-10 20:42:20 +00:00
pooka
1ce406a846 Change unused fflags parameter in VOP_MMAP to prot and pass in
desired vm protection.
2007-07-27 08:26:38 +00:00
pooka
05ce20f4a0 Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden
2007-07-22 19:16:04 +00:00
christos
53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
thorpej
4f3d5a9cc0 TRUE -> true, FALSE -> false 2007-02-22 06:34:42 +00:00
chs
33c1fd1917 add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).
2006-10-05 14:48:32 +00:00
ad
f474dceb13 Use the LWP cached credentials where sane. 2006-07-23 22:06:03 +00:00
elad
b3e7e1b010 Better implementation of PaX MPROTECT, after looking some more into the
code and not trying to use temporary solutions.

Lots of comments and help from YAMAMOTO Takashi, also thanks to the PaX
author for being quick to recognize that something fishy's going on. :)

Hook up in mmap/vmcmd rather than (ugh!) uvm_map_protect().

Next time I suggest to commit a temporary solution just revoke my
commit bit.
2006-05-20 15:45:37 +00:00
elad
215bd95ba4 integrate kauth. 2006-05-14 21:15:11 +00:00
christos
95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
thorpej
f44b62c49d Collect vmcmd statistics. 2005-07-06 23:08:57 +00:00
christos
efb6943313 - add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.
2005-05-29 22:24:14 +00:00
perry
da8abec863 nuke trailing whitespace 2005-02-26 21:34:55 +00:00
skrll
f7155e40f6 There's no need to pass a proc value when using UIO_SYSSPACE with
vn_rdwr(9) and uiomove(9).

OK'd by Jason Thorpe
2004-09-17 14:11:20 +00:00
junyoung
d532b6f4e6 Expand NEW_VMCMD() macro to a real function new_vmcmd() for the
non-debugging case as well, rather than expanding it inline. This saves
a bunch of kernel bits, e.g. 4kB from GENERIC on i386.
2003-08-29 01:44:02 +00:00
chs
939df36e55 add support for non-executable mappings (where the hardware allows this)
and make the stack and heap non-executable by default.  the changes
fall into two basic catagories:

 - pmap and trap-handler changes.  these are all MD:
   = alpha: we already track per-page execute permission with the (software)
	PG_EXEC bit, so just have the trap handler pay attention to it.
   = i386: use a new GDT segment for %cs for processes that have no
	executable mappings above a certain threshold (currently the
	bottom of the stack).  track per-page execute permission with
	the last unused PTE bit.
   = powerpc/ibm4xx: just use the hardware exec bit.
   = powerpc/oea: we already track per-page exec bits, but the hardware only
	implements non-exec mappings at the segment level.  so track the
	number of executable mappings in each segment and turn on the no-exec
	segment bit iff the count is 0.  adjust the trap handler to deal.
   = sparc (sun4m): fix our use of the hardware protection bits.
	fix the trap handler to recognize text faults.
   = sparc64: split the existing unified TSB into data and instruction TSBs,
	and only load TTEs into the appropriate TSB(s) for the permissions.
	fix the trap handler to check for execute permission.
   = not yet implemented: amd64, hppa, sh5

 - changes in all the emulations that put a signal trampoline on the stack.
   instead, we now put the trampoline into a uvm_aobj and map that into
   the process separately.

originally from openbsd, adapted for netbsd by me.
2003-08-24 17:52:28 +00:00
yamt
a7f9b1d8e0 don't make zero-sized mappings. 2003-08-21 15:17:03 +00:00
christos
c3c2f78f98 GC: exec_foo_setup_stack; use exec_setup_stack, and provide a way for
emulations to override it.
2003-08-08 18:53:13 +00:00
fvdl
d5aece61d6 Back out the lwp/ktrace changes. They contained a lot of colateral damage,
and need to be examined and discussed more.
2003-06-29 22:28:00 +00:00
darrenr
960df3c8d1 Pass lwp pointers throughtout the kernel, as required, so that the lwpid can
be inserted into ktrace records.  The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
2003-06-28 14:20:43 +00:00
matt
9c3775aadf Make elf32 load_file work properly with TOPDOWN by mapping psections in
reverse order.  Remove TOPDOWN support from VMCMDs since elf32 does the
right stuff now.  With these changes, VAX can now use TOPDOWN.
2003-02-26 21:18:22 +00:00
atatat
df0a9badc6 Introduce "top down" memory management for mmap()ed allocations. This
means that the dynamic linker gets mapped in at the top of available
user virtual memory (typically just below the stack), shared libraries
get mapped downwards from that point, and calls to mmap() that don't
specify a preferred address will get mapped in below those.

This means that the heap and the mmap()ed allocations will grow
towards each other, allowing one or the other to grow larger than
before.  Previously, the heap was limited to MAXDSIZ by the placement
of the dynamic linker (and the process's rlimits) and the space
available to mmap was hobbled by this reservation.

This is currently only enabled via an *option* for the i386 platform
(though other platforms are expected to follow).  Add "options
USE_TOPDOWN_VM" to your kernel config file, rerun config, and rebuild
your kernel to take advantage of this.

Note that the pmap_prefer() interface has not yet been modified to
play nicely with this, so those platforms require a bit more work
(most notably the sparc) before they can use this new memory
arrangement.

This change also introduces a VM_DEFAULT_ADDRESS() macro that picks
the appropriate default address based on the size of the allocation or
the size of the process's text segment accordingly.  Several drivers
and the SYSV SHM address assignment were changed to use this instead
of each one picking their own "default".
2003-02-20 22:16:05 +00:00
atatat
7a8e4b4bc4 Two small changes to the ELF exec code:
(1) ELFNAME(load_file)() now takes a pointer to the entry point
offset, instead of taking a pointer to the entry point itself.  This
allows proper adjustment of the ultimate entry point at a higher level
if the object containing the entry point is moved before the exec is
finished.

(2) Introduce VMCMD_FIXED, which means the address at which a given
vmcmd describes a mapping is fixed (ie, should not be moved).  Don't
set this for entries pertaining to ld.so.

Also some minor comment/whitespace tweaks.
2003-01-30 20:03:46 +00:00
matt
c838a0fb9e In vmcmd_readvn, if the page is mapped executable and PMAP_NEED_PROCWR
is defined, call pmap_procwr to synchronize the icache.  This fixes the
problem of dynamic programs crashing on powerpc systems.
2003-01-12 05:24:17 +00:00
chs
993948e989 count executable image pages as executable for vm-usage purposes.
also, always do the VTEXT vs. v_writecount mutual exclusion
(which we previously skipped if the text or data segment was empty).
2002-10-05 22:34:02 +00:00
thorpej
d4a2567abe Fix a signed/unsigned comparison warning from GCC 3.3. 2002-08-25 19:13:08 +00:00
lukem
adc783d537 add RCSIDs 2001-11-12 15:25:01 +00:00