Commit Graph

132 Commits

Author SHA1 Message Date
uebayasi 55c6ab5c75 Consistently call amap / uobj layers as upper / lower, because UVM has only
those two layers by design.  Approved by Chuck Cranor some time ago.
2009-11-01 11:16:32 +00:00
yamt 9373b602f6 uvm_mmap: remove a dead conditional. 2009-08-18 02:41:31 +00:00
yamt 16babfa6fb on MADV_WILLNEED, start prefetching backing object's pages. 2009-06-10 01:55:33 +00:00
yamt 5ebfa691fe wrap long lines. 2009-05-30 04:26:16 +00:00
mrg fcc023545e - add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes.  this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS.  (in some cases, the alias
RLIMIT_VMEM was already present and used if availble.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information.  (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897.  it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most espcially
chuq and yamt for critical suggestions that lead to this patch not
having a special ugliness i wasn't happy with anyway :-)
2009-03-29 01:02:48 +00:00
dsl 82357f6d42 ANSIfy another 1261 function definitions.
The only ones left in sys are beyond by sed script!
(or in sys/dist or sys/external)
Mostly they have function pointer parameters.
2009-03-14 21:04:01 +00:00
ad da4bdc8954 uvm_mmap: don't lock the map unless we need to. 2008-06-03 21:48:27 +00:00
ad bb29be511d One more. 2008-06-02 16:17:12 +00:00
ad fb8c6fcc83 Don't needlessly acquire v_interlock. 2008-06-02 16:16:27 +00:00
ad 0d151db6c9 Don't needlessly acquire v_interlock. 2008-06-02 16:00:33 +00:00
ad a9ca7a3734 Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
2008-03-21 21:54:58 +00:00
ad 4a780c9ae2 Merge vmlocking2 to head. 2008-01-02 11:48:20 +00:00
christos 65c680cad7 Add PaX ASLR (Address Space Layout Randomization) [from elad and myself]
For regular (non PIE) executables randomization is enabled for:
    1. The data segment
    2. The stack

For PIE executables(*) randomization is enabled for:
    1. The program itself
    2. All shared libraries
    3. The data segment
    4. The stack

(*) To generate a PIE executable:
    - compile everything with -fPIC
    - link with -shared-libgcc -Wl,-pie

This feature is experimental, and might change. To use selectively add
    options PAX_ASLR=0
in your kernel.

Currently we are using 12 bits for the stack, program, and data segment and
16 or 24 bits for mmap, depending on __LP64__.
2007-12-26 22:11:47 +00:00
dsl 7e2790cf6f Convert all the system call entry points from:
int foo(struct lwp *l, void *v, register_t *retval)
to:
    int foo(struct lwp *l, const struct foo_args *uap, register_t *retval)
Fixup compat code to not write into 'uap' and (in some cases) to actually
pass a correctly formatted 'uap' structure with the right name to the
next routine.
A few 'compat' routines that just call standard ones have been deleted.
All the 'compat' code compiles (along with the kernels required to test
build it).
98% done by automated scripts.
2007-12-20 23:02:38 +00:00
pooka 61e8303e9d Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start.  In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
2007-11-26 19:01:26 +00:00
ad 7dad9f7391 Merge from vmlocking:
- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
2007-10-10 20:42:20 +00:00
ad 451aacda90 Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.
2007-10-08 15:12:05 +00:00
yamt d71c563a36 make RANGE_TEST a function. 2007-09-23 16:05:40 +00:00
pooka 1ce406a846 Change unused fflags parameter in VOP_MMAP to prot and pass in
desired vm protection.
2007-07-27 08:26:38 +00:00
pooka 05ce20f4a0 Retire uvn_attach() - it abuses VXLOCK and its functionality,
setting vnode sizes, is handled elsewhere: file system vnode creation
or spec_open() for regular files or block special files, respectively.

Add a call to VOP_MMAP() to the pagedvn exec path, since the vnode
is being memory mapped.

reviewed by tech-kern & wrstuden
2007-07-22 19:16:04 +00:00
elad 6700cfccd6 Some Veriexec stuff that's been rotting in my tree for months.
Bug fixes:
  - Fix crash reported by Scott Ellis on current-users@.

  - Fix race conditions in enforcing the Veriexec rename and remove
    policies. These are NOT security issues.

  - Fix memory leak in rename handling when overwriting a monitored
    file.

  - Fix table deletion logic.

  - Don't prevent query requests if not in learning mode.


KPI updates:
  - fileassoc_table_run() now takes a cookie to pass to the callback.

  - veriexec_table_add() was removed, it is now done internally. As a
    result, there's no longer a need for VERIEXEC_TABLESIZE.

  - veriexec_report() was removed, it is now internal.

  - Perform sanity checks on the entry type, and enforce default type
    in veriexec_file_add() rather than in veriexecctl.

  - Add veriexec_flush(), used to delete all Veriexec tables, and
    veriexec_dump(), used to fill an array with all Veriexec entries.


New features:
  - Add a '-k' flag to veriexecctl, to keep the filenames in the kernel
    database. This allows Veriexec to produce slightly more accurate
    logs under certain circumstances. In the future, this can be either
    replaced by vnode->pathname translation, or combined with it.

  - Add a VERIEXEC_DUMP ioctl, to dump the entire Veriexec database.
    This can be used to recover a database if the file was lost.
    Example usage:

        # veriexecctl dump > /etc/signatures

    Note that only entries with the filename kept (that is, were loaded
    with the '-k' flag) will be dumped.

    Idea from Brett Lymn.

  - Add a VERIEXEC_FLUSH ioctl, to delete all Veriexec entries. Sample
    usage:

        # veriexecctl flush

  - Add a 'veriexec_flags' rc(8) variable, and make its default have
    the '-k' flag. On systems using the default signatures file
    (generaetd from running 'veriexecgen' with no arguments), this will
    use additional 32kb of kernel memory on average.

  - Add a '-e' flag to veriexecctl, to evaluate the fingerprint during
    load. This is done automatically for files marked as 'untrusted'.


Misc. stuff:
  - The code for veriexecctl was massively simplified as a result of
    eliminating the need for VERIEXEC_TABLESIZE, and now uses a single
    pass of the signatures file, making the loading somewhat faster.

  - Lots of minor fixes found using the (still under development)
    Veriexec regression testsuite.

  - Some of the messages Veriexec prints were improved.

  - Various documentation fixes.


All relevant man-pages were updated to reflect the above changes.

Binary compatibility with existing veriexecctl binaries is maintained.
2007-05-15 19:47:43 +00:00
christos 67695a4835 Make us standards compliant again. Return EINVAL in all cases (except for
mmap) so we cannot tell what went wrong.
2007-05-11 21:30:23 +00:00
christos 5e0b06ff45 Improve on previous and write a RANGE_TEST macro and do it on all the
system calls instead of doing a half-assed job on some of them and none
on others.
2007-05-11 20:41:14 +00:00
christos c5a89b37a0 fix bogus wrap tests; ssize_t != int... 2007-05-11 20:05:50 +00:00
christos 53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
thorpej b3667ada6d TRUE -> true, FALSE -> false 2007-02-22 06:05:00 +00:00
thorpej 712239e366 Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
2007-02-21 22:59:35 +00:00
ad b07ec3fc38 Merge newlock2 to head. 2007-02-09 21:55:00 +00:00
elad d60f1f435f If Veriexec prevents indirect execution of the binary, in addition to just
blocking the mmap() if exec bit is requested, also strip exec bit from
maxprot for further mprotect() calls.

Okay joerg@.
2007-02-03 01:11:50 +00:00
elad fa26351488 Cosmetic nit in the 'filename' passed to veriexec_verify(). 2007-01-11 14:26:07 +00:00
yamt 1a7bc55dcc remove some __unused from function parameters. 2006-11-01 10:17:58 +00:00
christos 4d595fd7b1 - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
2006-10-12 01:30:41 +00:00
chs 33c1fd1917 add support for O_DIRECT (I/O directly to application memory,
bypassing any kernel caching for file data).
2006-10-05 14:48:32 +00:00
elad 53ca07b4d7 If Veriexec enforces access type, don't allow mmap() to use PROT_EXEC on
files that don't have the "indirect" flag. Also change the "library" alias
in veriexecctl(8) to mean "file, indirect".

okay blymn@
2006-09-30 10:56:31 +00:00
ad 3029ac48c7 - Use the LWP cached credentials where sane.
- Minor cosmetic changes.
2006-07-21 16:48:45 +00:00
elad b3e7e1b010 Better implementation of PaX MPROTECT, after looking some more into the
code and not trying to use temporary solutions.

Lots of comments and help from YAMAMOTO Takashi, also thanks to the PaX
author for being quick to recognize that something fishy's going on. :)

Hook up in mmap/vmcmd rather than (ugh!) uvm_map_protect().

Next time I suggest to commit a temporary solution just revoke my
commit bit.
2006-05-20 15:45:37 +00:00
elad fc9422c9d9 integrate kauth. 2006-05-14 21:31:52 +00:00
christos 435a7d0d03 Coverity CID 2721: Avoid bitching for impossible cases, by adding KASSERT. 2006-04-05 19:49:28 +00:00
christos 95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
chs 9307d68e6a stop converting async msync() to sync.
this hasn't been needed for years (if it ever was).
2005-10-10 15:53:28 +00:00
yamt b7bfe82866 update file timestamps for nfsd loaned-read and mmap.
PR/25279.  discussed on tech-kern@.
2005-07-23 12:18:41 +00:00
yamt 662ada8f7a allocate anons on-demand, rather than reserving static amount of
them on boot/swapon.
2005-05-11 13:02:25 +00:00
yamt 6b2d8b66a4 merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
  save some resources like pv_entry.  also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.
2005-04-01 11:59:21 +00:00
fvdl c487efe4a7 Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.
* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
  that will return the default VM map address. The default function
  is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
  macro. This gives emulations control over the default map address,
  and allows things to be mapped at the right address (in 32bit range)
  for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
  or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
  instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.
2005-03-26 05:12:34 +00:00
chs 467487d274 use vm_map_{min,max}() instead of dereferencing the vm_map pointer directly.
define and use vm_map_set{min,max}() for modifying these values.
remove the {min,max}_offset aliases for these vm_map fields to be more
namespace-friendly.  PR 26475.
2005-02-11 02:12:03 +00:00
chs b0c54c738d pmap_wired_count() is now available on all platforms,
remove the code for the case where it's not defined.
2005-01-23 15:58:13 +00:00
yamt 1207308b90 for in-kernel maps,
- allocate kva for vm_map_entry from the map itsself and
  remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
  PR/24039.
2005-01-01 21:00:06 +00:00
briggs ddcb68edd6 mlock(2) and munlock(2) are defined by our man pages (which agree with
those on opengroup.org) to return ENOMEM if trying to lock a region that
is not accessible.  So if uvm_map_pageable() returns EFAULT, make it ENOMEM.
2004-12-02 15:23:47 +00:00
hannken 8c21bc6224 Add ffs internal snapshots. Written by Marshall Kirk McKusick for FreeBSD.
- Not enabled by default. Needs kernel option FFS_SNAPSHOT.
- Change parameters of ffs_blkfree.
- Let the copy-on-write functions return an error so spec_strategy
    may fail if the copy-on-write fails.
- Change genfs_*lock*() to use vp->v_vnlock instead of &vp->v_lock.
- Add flag B_METAONLY to VOP_BALLOC to return indirect block buffer.
- Add a function ffs_checkfreefile needed for snapshot creation.
- Add special handling of snapshot files:
    Snapshots may not be opened for writing and the attributes are read-only.
    Use the mtime as the time this snapshot was taken.
    Deny mtime updates for snapshot files.
- Add function transferlockers to transfer any waiting processes from
  one lock to another.
- Add vfsop VFS_SNAPSHOT to take a snapshot and make it accessible through
  a vnode.
- Add snapshot support to ls, fsck_ffs and dump.

Welcome to 2.0F.

Approved by: Jason R. Thorpe <thorpej@netbsd.org>
2004-05-25 14:54:55 +00:00
darrenr e6416b1b3c rather than just try to get a mapping from a device as only PROT_EXEC, work
down the list of protections until either we run out or we find one that the
device is willing to work with.
2004-05-19 13:20:27 +00:00