Commit Graph

94 Commits

Author SHA1 Message Date
christos
95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
chs
9307d68e6a stop converting async msync() to sync.
this hasn't been needed for years (if it ever was).
2005-10-10 15:53:28 +00:00
yamt
b7bfe82866 update file timestamps for nfsd loaned-read and mmap.
PR/25279.  discussed on tech-kern@.
2005-07-23 12:18:41 +00:00
yamt
662ada8f7a allocate anons on-demand, rather than reserving static amount of
them on boot/swapon.
2005-05-11 13:02:25 +00:00
yamt
6b2d8b66a4 merge yamt-km branch.
- don't use managed mappings/backing objects for wired memory allocations.
  save some resources like pv_entry.  also fix (most of) PR/27030.
- simplify kernel memory management API.
- simplify pmap bootstrap of some ports.
- some related cleanups.
2005-04-01 11:59:21 +00:00
fvdl
c487efe4a7 Fix some things regarding COMPAT_NETBSD32 and limits/VM addresses.
* For sparc64 and amd64, define *SIZ32 VM constants.
* Add a new function pointer to struct emul, pointing at a function
  that will return the default VM map address. The default function
  is uvm_map_defaultaddr, which just uses the VM_DEFAULT_ADDRESS
  macro. This gives emulations control over the default map address,
  and allows things to be mapped at the right address (in 32bit range)
  for COMPAT_NETBSD32.
* Add code to adjust the data and stack limits when a COMPAT_NETBSD32
  or COMPAT_SVR4_32 binary is executed.
* Don't use USRSTACK in kern_resource.c, use p_vmspace->vm_minsaddr
  instead (emulations might have set it differently)
* Since this changes struct emul, bump kernel version to 3.99.2

Tested on amd64, compile-tested on sparc64.
2005-03-26 05:12:34 +00:00
chs
467487d274 use vm_map_{min,max}() instead of dereferencing the vm_map pointer directly.
define and use vm_map_set{min,max}() for modifying these values.
remove the {min,max}_offset aliases for these vm_map fields to be more
namespace-friendly.  PR 26475.
2005-02-11 02:12:03 +00:00
chs
b0c54c738d pmap_wired_count() is now available on all platforms,
remove the code for the case where it's not defined.
2005-01-23 15:58:13 +00:00
yamt
1207308b90 for in-kernel maps,
- allocate kva for vm_map_entry from the map itsself and
  remove the static limit, MAX_KMAPENT.
- keep merged entries for later splitting to fix allocate-to-free problem.
  PR/24039.
2005-01-01 21:00:06 +00:00
briggs
ddcb68edd6 mlock(2) and munlock(2) are defined by our man pages (which agree with
those on opengroup.org) to return ENOMEM if trying to lock a region that
is not accessible.  So if uvm_map_pageable() returns EFAULT, make it ENOMEM.
2004-12-02 15:23:47 +00:00
hannken
8c21bc6224 Add ffs internal snapshots. Written by Marshall Kirk McKusick for FreeBSD.
- Not enabled by default. Needs kernel option FFS_SNAPSHOT.
- Change parameters of ffs_blkfree.
- Let the copy-on-write functions return an error so spec_strategy
    may fail if the copy-on-write fails.
- Change genfs_*lock*() to use vp->v_vnlock instead of &vp->v_lock.
- Add flag B_METAONLY to VOP_BALLOC to return indirect block buffer.
- Add a function ffs_checkfreefile needed for snapshot creation.
- Add special handling of snapshot files:
    Snapshots may not be opened for writing and the attributes are read-only.
    Use the mtime as the time this snapshot was taken.
    Deny mtime updates for snapshot files.
- Add function transferlockers to transfer any waiting processes from
  one lock to another.
- Add vfsop VFS_SNAPSHOT to take a snapshot and make it accessible through
  a vnode.
- Add snapshot support to ls, fsck_ffs and dump.

Welcome to 2.0F.

Approved by: Jason R. Thorpe <thorpej@netbsd.org>
2004-05-25 14:54:55 +00:00
darrenr
e6416b1b3c rather than just try to get a mapping from a device as only PROT_EXEC, work
down the list of protections until either we run out or we find one that the
device is willing to work with.
2004-05-19 13:20:27 +00:00
junyoung
7e0c058612 Drop trailing spaces. 2004-03-24 07:47:32 +00:00
dsl
bcaf57b039 Fix prev. so it compiles 2004-02-14 16:40:22 +00:00
jdolecek
af46922ada add compat hook in check for zerodev; use this hook to recognize
the old ARM /dev/zero minor mapping #ifdef COMPAT_16
fixes second part of PR kern/23581 by Richard Earnshaw
2004-02-14 12:20:14 +00:00
yamt
f7d48e3571 mincore: don't treat an aobj as a device mapping. 2003-11-29 19:06:48 +00:00
thorpej
8655c7d7eb Add a MAP_WIRED flag to mmap(2), which causes the new mapping to be
wired as if by mlock(2).
2003-10-07 00:17:09 +00:00
chs
12f04351ad fix some indentation. 2003-08-24 18:12:25 +00:00
chs
4ffa07757d mprotect()'s "len" is really a size_t, and we can't do any useful
bounds-checking on it.
2003-08-24 16:32:50 +00:00
christos
b6dc1230b9 PR/22062: Dheeraj S: Don't compare an integral type with NULL. 2003-07-06 16:19:18 +00:00
fvdl
d5aece61d6 Back out the lwp/ktrace changes. They contained a lot of colateral damage,
and need to be examined and discussed more.
2003-06-29 22:28:00 +00:00
darrenr
960df3c8d1 Pass lwp pointers throughtout the kernel, as required, so that the lwpid can
be inserted into ktrace records.  The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
2003-06-28 14:20:43 +00:00
christos
40e148ef6b PR/21948: Todd Vierling: Implement MAP_TRYFIXED for linux emulation. 2003-06-23 21:32:33 +00:00
gmcgarry
9b44363ff5 Don't use overloaded term "comm". From Greg A. Woods in PR#17394. 2003-05-04 01:54:26 +00:00
matt
6074e25e08 Add support for mmap(2) to be able to return memory aligned on a 2^n
boundary.
2003-03-06 00:41:51 +00:00
pk
2931081a79 Make updating a file's reference and use count MP-safe. 2003-02-23 14:37:32 +00:00
atatat
df0a9badc6 Introduce "top down" memory management for mmap()ed allocations. This
means that the dynamic linker gets mapped in at the top of available
user virtual memory (typically just below the stack), shared libraries
get mapped downwards from that point, and calls to mmap() that don't
specify a preferred address will get mapped in below those.

This means that the heap and the mmap()ed allocations will grow
towards each other, allowing one or the other to grow larger than
before.  Previously, the heap was limited to MAXDSIZ by the placement
of the dynamic linker (and the process's rlimits) and the space
available to mmap was hobbled by this reservation.

This is currently only enabled via an *option* for the i386 platform
(though other platforms are expected to follow).  Add "options
USE_TOPDOWN_VM" to your kernel config file, rerun config, and rebuild
your kernel to take advantage of this.

Note that the pmap_prefer() interface has not yet been modified to
play nicely with this, so those platforms require a bit more work
(most notably the sparc) before they can use this new memory
arrangement.

This change also introduces a VM_DEFAULT_ADDRESS() macro that picks
the appropriate default address based on the size of the allocation or
the size of the process's text segment accordingly.  Several drivers
and the SYSV SHM address assignment were changed to use this instead
of each one picking their own "default".
2003-02-20 22:16:05 +00:00
thorpej
b78f59b443 Merge the nathanw_sa branch. 2003-01-18 08:51:40 +00:00
mycroft
3c7847ff41 #if 0 the call to uvm_map_checkprot() in sys_munmap() -- it's not documented,
and programs do not expect it.  Also fixes memory leaks in dlopen()/dlclose().
2002-09-27 19:13:29 +00:00
gehenna
77a6b82b27 Merge the gehenna-devsw branch into the trunk.
This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches is defined as a constant structure in device drivers.

- The new grammer ``device-major'' is introduced to ``files''.

	device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
  by using this grammer.

- Added the new naming convention.
  The name of the device switch must be <prefix>_[bc]devsw for auto-generation
  of device switch tables.

- The backward compatibility of loading block/character device
  switch by LKM framework is broken. This is necessary to convert
  from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
  We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
  the LKM framework will refer it to assign device major number dynamically.
2002-09-06 13:18:43 +00:00
atatat
6c03c181d2 "offest" -> "offset" in a comment 2002-05-31 16:49:50 +00:00
darrenr
256089809f Return EFBIG from mmap() if we try to map too much data and in the fixed
address allocation, return EOVERFLOW to match with the non-fixed error.
2002-03-22 11:06:33 +00:00
chs
4923ddfdda in sys_mincore(), check the return value of uvm_vslock() to determine
if the vec pointer is valid rather than using uvm_useracc().
uvm_useracc() just tells you if the permissions of a user mapping allow
the desired access, not whether faulting on that mapping will succeed.
2001-12-14 04:21:22 +00:00
chs
1b8f294146 disallow mapping negative offsets for both regular files and block devices. 2001-11-25 06:42:47 +00:00
lukem
b616d1ca1d add RCSIDs, and in some cases, slightly cleanup #include order 2001-11-10 07:36:59 +00:00
thorpej
f67e15c839 uvm_map_protect(): Don't allow VM_PROT_EXECUTE to be set on entries
(either the current protection or the max protection) that reference
vnodes associated with a file system mounted with the NOEXEC option.

uvm_mmap(): Don't allow PROT_EXEC mappings to be established of vnodes
which are associated with a file system mounted with the NOEXEC option.
2001-10-30 19:05:26 +00:00
thorpej
e8ee04475d - Add a new vnode flag VEXECMAP, which indicates that a vnode has
executable mappings.  Stop overloading VTEXT for this purpose (VTEXT
  also has another meaning).
- Rename vn_marktext() to vn_markexec(), and use it when executable
  mappings of a vnode are established.
- In places where we want to set VTEXT, set it in v_flag directly, rather
  than making a function call to do this (it no longer makes sense to
  use a function call, since we no longer overload VTEXT with VEXECMAP's
  meaning).

VEXECMAP suggested by Chuq Silvers.
2001-10-30 15:32:01 +00:00
thorpej
7285b2c290 uvm_mmap(): If a vnode mapping is established with PROT_EXEC, mark the
vnode as VTEXT.

uvm_map_protect(): When VM_PROT_EXECUTE is added to a VA range, mark
all the vnodes mapped by the range as VTEXT.
2001-10-29 23:06:03 +00:00
chs
64c6d1d2dc a whole bunch of changes to improve performance and robustness under load:
- remove special treatment of pager_map mappings in pmaps.  this is
   required now, since I've removed the globals that expose the address range.
   pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
   no longer any need to special-case it.
 - eliminate struct uvm_vnode by moving its fields into struct vnode.
 - rewrite the pageout path.  the pager is now responsible for handling the
   high-level requests instead of only getting control after a bunch of work
   has already been done on its behalf.  this will allow us to UBCify LFS,
   which needs tighter control over its pages than other filesystems do.
   writing a page to disk no longer requires making it read-only, which
   allows us to write wired pages without causing all kinds of havoc.
 - use a new PG_PAGEOUT flag to indicate that a page should be freed
   on behalf of the pagedaemon when it's unlocked.  this flag is very similar
   to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
   pageout fails due to eg. an indirect-block buffer being locked.
   this allows us to remove the "version" field from struct vm_page,
   and together with shrinking "loan_count" from 32 bits to 16,
   struct vm_page is now 4 bytes smaller.
 - no longer use PG_RELEASED for swap-backed pages.  if the page is busy
   because it's being paged out, we can't release the swap slot to be
   reallocated until that write is complete, but unlike with vnodes we
   don't keep a count of in-progress writes so there's no good way to
   know when the write is done.  instead, when we need to free a busy
   swap-backed page, just sleep until we can get it busy ourselves.
 - implement a fast-path for extending writes which allows us to avoid
   zeroing new pages.  this substantially reduces cpu usage.
 - encapsulate the data used by the genfs code in a struct genfs_node,
   which must be the first element of the filesystem-specific vnode data
   for filesystems which use genfs_{get,put}pages().
 - eliminate many of the UVM pagerops, since they aren't needed anymore
   now that the pager "put" operation is a higher-level operation.
 - enhance the genfs code to allow NFS to use the genfs_{get,put}pages
   instead of a modified copy.
 - clean up struct vnode by removing all the fields that used to be used by
   the vfs_cluster.c code (which we don't use anymore with UBC).
 - remove kmem_object and mb_object since they were useless.
   instead of allocating pages to these objects, we now just allocate
   pages with no object.  such pages are mapped in the kernel until they
   are freed, so we can use the mapping to find the page to free it.
   this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
2001-09-15 20:36:31 +00:00
chs
37f6c5155d call VOP_MMAP() before allowing mappings of vnodes to allow
filesystems which do not support memory mapped access to cause
mmap() of their vnodes to fail.
2001-08-17 05:52:46 +00:00
thorpej
80cc38a1af Fix a partial construction problem that can cause race conditions
between creation of a file descriptor and close(2) when using kernel
assisted threads.  What we do is stick descriptors in the table, but
mark them as "larval".  This causes essentially everything to treat
it as a non-existent descriptor, except for fdalloc(), which sees a
filled slot so that it won't (incorrectly) allocate it again.  When
a descriptor is fully constructed, the code that has constructed it
marks it as "mature" (which actually clears the "larval" flag), and
things continue to work as normal.

While here, gather all the code that gets a descriptor from the table
into a fd_getfile() function, and call it, rather than having the
same (sometimes incorrect) code copied all over the place.
2001-06-14 20:32:41 +00:00
chs
821ec03ed9 replace vm_map{,_entry}_t with struct vm_map{,_entry} *. 2001-06-02 18:09:08 +00:00
chs
11a9651c8f replace vm_page_t with struct vm_page *. 2001-05-26 21:27:10 +00:00
chs
3845302904 remove trailing whitespace. 2001-05-25 04:06:11 +00:00
chs
ac3bc537bd eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS			0
KERN_INVALID_ADDRESS		EFAULT
KERN_PROTECTION_FAILURE		EACCES
KERN_NO_SPACE			ENOMEM
KERN_INVALID_ARGUMENT		EINVAL
KERN_FAILURE			various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE		ENOMEM
KERN_NOT_RECEIVER		<unused>
KERN_NO_ACCESS			<unused>
KERN_PAGES_LOCKED		<unused>
2001-03-15 06:10:32 +00:00
chs
19b7b64642 clean up DIAGNOSTIC checks, use KASSERT(). 2001-02-18 21:19:08 +00:00
thorpej
4d4b2b5626 Nevermind that it's silly to include PROT_EXEC even if a vnode
doesn't have the exec bit set, we need to have PROT_EXEC set
in order for some expected mmap/mprotect behavior to work, so
do the last bit slightly differently: if udv_attach() fails, and
the protection (NOT maxprot) doens't include PROT_EXEC, then clear
PROT_EXEC from maxprot and try udv_attach() again.

Sigh, mmap really needs to be rototilled.
2001-01-08 01:35:03 +00:00
thorpej
781516b080 Only include PROT_EXEC in maxprot if the user specified PROT_EXEC
in the mmap() call.  maxprot is used to create device mappings,
and always including PROT_EXEC causes the mapping to fail on the Alpha
when mapping a non-RAM offset of /dev/mem (which may be sparse, so
instruction fetch from there is disallowed).
2001-01-07 06:16:46 +00:00
chs
aeda8d3b77 Initial integration of the Unified Buffer Cache project. 2000-11-27 08:39:39 +00:00
soren
2a6c823e89 Typo in comment. 2000-11-24 23:30:01 +00:00