NetBSD

Commit Graph

Author	SHA1	Message	Date
matt	6074e25e08	Add support for mmap(2) to be able to return memory aligned on a 2^n boundary.	2003-03-06 00:41:51 +00:00
pk	2931081a79	Make updating a file's reference and use count MP-safe.	2003-02-23 14:37:32 +00:00
atatat	df0a9badc6	Introduce "top down" memory management for mmap()ed allocations. This means that the dynamic linker gets mapped in at the top of available user virtual memory (typically just below the stack), shared libraries get mapped downwards from that point, and calls to mmap() that don't specify a preferred address will get mapped in below those. This means that the heap and the mmap()ed allocations will grow towards each other, allowing one or the other to grow larger than before. Previously, the heap was limited to MAXDSIZ by the placement of the dynamic linker (and the process's rlimits) and the space available to mmap was hobbled by this reservation. This is currently only enabled via an option for the i386 platform (though other platforms are expected to follow). Add "options USE_TOPDOWN_VM" to your kernel config file, rerun config, and rebuild your kernel to take advantage of this. Note that the pmap_prefer() interface has not yet been modified to play nicely with this, so those platforms require a bit more work (most notably the sparc) before they can use this new memory arrangement. This change also introduces a VM_DEFAULT_ADDRESS() macro that picks the appropriate default address based on the size of the allocation or the size of the process's text segment accordingly. Several drivers and the SYSV SHM address assignment were changed to use this instead of each one picking their own "default".	2003-02-20 22:16:05 +00:00
thorpej	b78f59b443	Merge the nathanw_sa branch.	2003-01-18 08:51:40 +00:00
mycroft	3c7847ff41	#if 0 the call to uvm_map_checkprot() in sys_munmap() -- it's not documented, and programs do not expect it. Also fixes memory leaks in dlopen()/dlclose().	2002-09-27 19:13:29 +00:00
gehenna	77a6b82b27	Merge the gehenna-devsw branch into the trunk. This merge changes the device switch tables from static array to dynamically generated by config(8). - All device switches are defined as a constant structure in device drivers. - The new grammar ``device-major'' is introduced to ``files''. device-major <prefix> char <num> [block <num>] [<rules>] - All device major numbers must be listed up in port dependent majors.<arch> by using this grammar. - Added the new naming convention. The name of the device switch must be <prefix>_[bc]devsw for auto-generation of device switch tables. - The backward compatibility of loading block/character device switch by LKM framework is broken. This is necessary to convert from block/character device major to device name in runtime and vice versa. - The restriction to assign device major by LKM is completely removed. We don't need to reserve LKM entries for dynamic loading of device switch. - In compile time, device major numbers list is packed into the kernel and the LKM framework will refer it to assign device major number dynamically.	2002-09-06 13:18:43 +00:00
atatat	6c03c181d2	"offest" -> "offset" in a comment	2002-05-31 16:49:50 +00:00
darrenr	256089809f	Return EFBIG from mmap() if we try to map too much data and in the fixed address allocation, return EOVERFLOW to match with the non-fixed error.	2002-03-22 11:06:33 +00:00
chs	4923ddfdda	in sys_mincore(), check the return value of uvm_vslock() to determine if the vec pointer is valid rather than using uvm_useracc(). uvm_useracc() just tells you if the permissions of a user mapping allow the desired access, not whether faulting on that mapping will succeed.	2001-12-14 04:21:22 +00:00
chs	1b8f294146	disallow mapping negative offsets for both regular files and block devices.	2001-11-25 06:42:47 +00:00
lukem	b616d1ca1d	add RCSIDs, and in some cases, slightly cleanup #include order	2001-11-10 07:36:59 +00:00
thorpej	f67e15c839	uvm_map_protect(): Don't allow VM_PROT_EXECUTE to be set on entries (either the current protection or the max protection) that reference vnodes associated with a file system mounted with the NOEXEC option. uvm_mmap(): Don't allow PROT_EXEC mappings to be established of vnodes which are associated with a file system mounted with the NOEXEC option.	2001-10-30 19:05:26 +00:00
thorpej	e8ee04475d	- Add a new vnode flag VEXECMAP, which indicates that a vnode has executable mappings. Stop overloading VTEXT for this purpose (VTEXT also has another meaning). - Rename vn_marktext() to vn_markexec(), and use it when executable mappings of a vnode are established. - In places where we want to set VTEXT, set it in v_flag directly, rather than making a function call to do this (it no longer makes sense to use a function call, since we no longer overload VTEXT with VEXECMAP's meaning). VEXECMAP suggested by Chuq Silvers.	2001-10-30 15:32:01 +00:00
thorpej	7285b2c290	uvm_mmap(): If a vnode mapping is established with PROT_EXEC, mark the vnode as VTEXT. uvm_map_protect(): When VM_PROT_EXECUTE is added to a VA range, mark all the vnodes mapped by the range as VTEXT.	2001-10-29 23:06:03 +00:00
chs	64c6d1d2dc	a whole bunch of changes to improve performance and robustness under load: - remove special treatment of pager_map mappings in pmaps. this is required now, since I've removed the globals that expose the address range. pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's no longer any need to special-case it. - eliminate struct uvm_vnode by moving its fields into struct vnode. - rewrite the pageout path. the pager is now responsible for handling the high-level requests instead of only getting control after a bunch of work has already been done on its behalf. this will allow us to UBCify LFS, which needs tighter control over its pages than other filesystems do. writing a page to disk no longer requires making it read-only, which allows us to write wired pages without causing all kinds of havoc. - use a new PG_PAGEOUT flag to indicate that a page should be freed on behalf of the pagedaemon when it's unlocked. this flag is very similar to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the pageout fails due to eg. an indirect-block buffer being locked. this allows us to remove the "version" field from struct vm_page, and together with shrinking "loan_count" from 32 bits to 16, struct vm_page is now 4 bytes smaller. - no longer use PG_RELEASED for swap-backed pages. if the page is busy because it's being paged out, we can't release the swap slot to be reallocated until that write is complete, but unlike with vnodes we don't keep a count of in-progress writes so there's no good way to know when the write is done. instead, when we need to free a busy swap-backed page, just sleep until we can get it busy ourselves. - implement a fast-path for extending writes which allows us to avoid zeroing new pages. this substantially reduces cpu usage. - encapsulate the data used by the genfs code in a struct genfs_node, which must be the first element of the filesystem-specific vnode data for filesystems which use genfs_{get,put}pages(). - eliminate many of the UVM pagerops, since they aren't needed anymore now that the pager "put" operation is a higher-level operation. - enhance the genfs code to allow NFS to use the genfs_{get,put}pages instead of a modified copy. - clean up struct vnode by removing all the fields that used to be used by the vfs_cluster.c code (which we don't use anymore with UBC). - remove kmem_object and mb_object since they were useless. instead of allocating pages to these objects, we now just allocate pages with no object. such pages are mapped in the kernel until they are freed, so we can use the mapping to find the page to free it. this allows us to remove splvm() protection in several places. The sum of all these changes improves write throughput on my decstation 5000/200 to within 1% of the rate of NetBSD 1.5 and reduces the elapsed time for "make release" of a NetBSD 1.5 source tree on my 128MB pc to 10% less than a 1.5 kernel took.	2001-09-15 20:36:31 +00:00
chs	37f6c5155d	call VOP_MMAP() before allowing mappings of vnodes to allow filesystems which do not support memory mapped access to cause mmap() of their vnodes to fail.	2001-08-17 05:52:46 +00:00
thorpej	80cc38a1af	Fix a partial construction problem that can cause race conditions between creation of a file descriptor and close(2) when using kernel assisted threads. What we do is stick descriptors in the table, but mark them as "larval". This causes essentially everything to treat it as a non-existent descriptor, except for fdalloc(), which sees a filled slot so that it won't (incorrectly) allocate it again. When a descriptor is fully constructed, the code that has constructed it marks it as "mature" (which actually clears the "larval" flag), and things continue to work as normal. While here, gather all the code that gets a descriptor from the table into a fd_getfile() function, and call it, rather than having the same (sometimes incorrect) code copied all over the place.	2001-06-14 20:32:41 +00:00
chs	821ec03ed9	replace vm_map{,_entry}_t with struct vm_map{,_entry} *.	2001-06-02 18:09:08 +00:00
chs	11a9651c8f	replace vm_page_t with struct vm_page *.	2001-05-26 21:27:10 +00:00
chs	3845302904	remove trailing whitespace.	2001-05-25 04:06:11 +00:00
chs	ac3bc537bd	eliminate the KERN_* error codes in favor of the traditional E* codes. the mapping is: KERN_SUCCESS 0 KERN_INVALID_ADDRESS EFAULT KERN_PROTECTION_FAILURE EACCES KERN_NO_SPACE ENOMEM KERN_INVALID_ARGUMENT EINVAL KERN_FAILURE various, mostly turn into KASSERTs KERN_RESOURCE_SHORTAGE ENOMEM KERN_NOT_RECEIVER <unused> KERN_NO_ACCESS <unused> KERN_PAGES_LOCKED <unused>	2001-03-15 06:10:32 +00:00
chs	19b7b64642	clean up DIAGNOSTIC checks, use KASSERT().	2001-02-18 21:19:08 +00:00
thorpej	4d4b2b5626	Nevermind that it's silly to include PROT_EXEC even if a vnode doesn't have the exec bit set, we need to have PROT_EXEC set in order for some expected mmap/mprotect behavior to work, so do the last bit slightly differently: if udv_attach() fails, and the protection (NOT maxprot) doesn't include PROT_EXEC, then clear PROT_EXEC from maxprot and try udv_attach() again. Sigh, mmap really needs to be rototilled.	2001-01-08 01:35:03 +00:00
thorpej	781516b080	Only include PROT_EXEC in maxprot if the user specified PROT_EXEC in the mmap() call. maxprot is used to create device mappings, and always including PROT_EXEC causes the mapping to fail on the Alpha when mapping a non-RAM offset of /dev/mem (which may be sparse, so instruction fetch from there is disallowed).	2001-01-07 06:16:46 +00:00
chs	aeda8d3b77	Initial integration of the Unified Buffer Cache project.	2000-11-27 08:39:39 +00:00
soren	2a6c823e89	Typo in comment.	2000-11-24 23:30:01 +00:00
thorpej	72a24b4eae	Add an align argument to uvm_map() and some callers of that routine. Works similarly to pmap_prefer(), but allows callers to specify a minimum power-of-two alignment of the region. How we ever got along without this for so long is beyond me.	2000-09-13 15:00:15 +00:00
mrg	dea44a9ec4	remove include of <vm/vm.h>	2000-06-27 17:29:17 +00:00
mrg	2f159a1bac	remove/move more mach vm header files: <vm/pglist.h> -> <uvm/uvm_pglist.h> <vm/vm_inherit.h> -> <uvm/uvm_inherit.h> <vm/vm_kern.h> -> into <uvm/uvm_extern.h> <vm/vm_object.h> -> nothing <vm/vm_pager.h> -> into <uvm/uvm_pager.h> also includes a bunch of <vm/vm_page.h> include removals (due to redundancy with <vm/vm.h>), and a scattering of other similar headers.	2000-06-26 14:20:25 +00:00
enami	332c98526a	- Move the comment, which describes that calling the function uvm_map_pageable(map, ...) implies unlocking passed map, just before the function call. - If we bail out before calling the uvm_map_pageable, unlock the map by ourself to prevent a panic ``locking against myself''. The panic is, for example, caused when cdrecord is invoked with too large fifo size.	2000-05-23 02:19:20 +00:00
augustss	641df97d12	Remove more register declarations.	2000-03-30 12:31:50 +00:00
kleink	7e35a43e67	In mmap(), bail out with EOVERFLOW when mapping a regular file and the file offset plus mapping length cannot be represented in an off_t.	2000-03-28 18:45:19 +00:00
kleink	6e5b64c8a0	Merge parts of chs-ubc2 into the trunk: Add a new type voff_t (defined as a synonym for off_t) to describe offsets into uvm objects, and update the appropriate interfaces to use it, the most visible effect being the ability to mmap() file offsets beyond the range of a vaddr_t. Originally by Chuck Silvers; blame me for problems caused by merging this into non-UBC.	2000-03-26 20:54:45 +00:00
thorpej	7287dd22c6	Remove a piece of code introduced in rev 1.36 that I didn't intend to commit.	1999-12-11 05:38:41 +00:00
thorpej	1da427a80a	Change the pmap_enter() API slightly; pmap_enter() now returns an error value (KERN_SUCCESS or KERN_RESOURCE_SHORTAGE) indicating if it succeeded or failed. Change the `wired' and `access_type' arguments to a single `flags' argument, which includes the access type, and flags: PMAP_WIRED the old `wired' boolean PMAP_CANFAIL pmap_enter() is allowed to fail If PMAP_CANFAIL is not specified, the pmap should behave as it always has in the face of a drastic resource shortage: fall over dead. Change the fault handler to deal with failure (which indicates resource shortage) by unlocking everything, waiting for the pagedaemon to free more memory, then retrying the fault.	1999-11-13 00:24:38 +00:00
thorpej	b6f435026c	Add a set of "lockflags", which can control the locking behavior of some functions. Use these flags in uvm_map_pageable() to determine if the map is locked on entry (replaces an already present boolean_t argument `islocked'), and if the function should return with the map still locked.	1999-07-17 21:35:49 +00:00
thorpej	8e06a75bcb	Fix an operator precedence error which caused msync(2) to fail to pass the PGO_CLEANIT flag to the object pagers. Fixes PR #7978, from Matthias Pfaller.	1999-07-14 21:06:30 +00:00
kleink	e79a283e47	XSH5: change function signature to `void *sbrk(intptr_t)'.	1999-07-12 21:55:19 +00:00
thorpej	c0389be5da	Make a comment reflect reality.	1999-07-10 20:40:23 +00:00
thorpej	d75fb0f6b0	Slightly better test for "object with no real pages". Test for NULL pgo_releasepg rather than if the pager is the device pager.	1999-07-10 20:29:24 +00:00
thorpej	ec74ea9486	Correct a comment.	1999-07-08 00:52:45 +00:00
thorpej	4e398a6ded	Add some more meat to madvise(2): * Implement MADV_DONTNEED: deactivate pages in the specified range, semantics similar to Solaris's MADV_DONTNEED. * Add MADV_FREE: free pages and swap resources associated with the specified range, causing the range to be reloaded from backing store (vnodes) or zero-fill (anonymous), semantics like FreeBSD's MADV_FREE and like Digital UNIX's MADV_DONTNEED (isn't it SO GREAT that madvise(2) isn't standardized!?) As part of this, move the non-map-modifying advice handling out of uvm_map_advise(), and into sys_madvise(). As another part, implement general amap cleaning in uvm_map_clean(), and change uvm_map_clean() to only push dirty pages to disk if PGO_CLEANIT is set in its flags (and update sys___msync13() accordingly). XXX Add a patchable global "amap_clean_works", defaulting to 1, which can disable the amap cleaning code, just in case problems are unearthed; this gives a developer/user a quick way to recover and send a bug report (e.g. boot into DDB and change the value). XXX Still need to implement a real uao_flush(). XXX Need to update the manual page. With these changes, rebuilding libc will automatically cause the new malloc(3) to use MADV_FREE to actually release pages and swap resources when it decides that can be done.	1999-07-07 06:02:21 +00:00
cgd	c1b7b40399	from the comment added to the code: > XXX (in)sanity check. We don't do proper datasize checking > XXX for anonymous (or private writable) mmap(). However, > XXX know that if we're trying to allocate more than the amount > XXX remaining under our current data size limit, _that_ should > XXX be disallowed. This is one link on the chain of lossage known as PR#7897. It's definitely not the right fix, but it's better than nothing.	1999-07-06 02:31:05 +00:00
thorpej	c859e43fbb	Fix tyop. From Bill Studenmund.	1999-07-01 18:40:39 +00:00
thorpej	72fcd1784e	Fix a typo.	1999-06-19 00:11:17 +00:00
thorpej	9e9f068f43	Add the guts of mlockall(MCL_FUTURE). This requires that a process's "memlock" resource limit be passed to uvm_mmap(). Update all calls accordingly.	1999-06-18 05:13:45 +00:00
thorpej	01ac9b6529	In sys_mmap(): - rather than treating MAP_COPY like MAP_PRIVATE by sheer virtue of it not being MAP_SHARED, actually convert the MAP_COPY flag into MAP_PRIVATE. - return EINVAL if MAP_SHARED and MAP_PRIVATE are both included in flags.	1999-06-17 21:05:19 +00:00
minoura	ff8fb3ef82	Remove extra ].	1999-06-16 17:25:39 +00:00
thorpej	c5a43ae10c	Several changes, developed and tested concurrently: * Provide POSIX 1003.1b mlockall(2) and munlockall(2) system calls. MCL_CURRENT is presently implemented. MCL_FUTURE is not fully implemented. Also, the same one-unlock-for-every-lock caveat currently applies here as it does to mlock(2). This will be addressed in a future commit. * Provide the mincore(2) system call, with the same semantics as Solaris. * Clean up the error recovery in uvm_map_pageable(). * Fix a bug where a process would hang if attempting to mlock a zero-fill region where none of the pages in that region are resident. [ This fix has been submitted for inclusion in 1.4.1 ]	1999-06-15 23:27:47 +00:00
mrg	f1f95c374b	implement madvise() for MADV_{NORMAL,RANDOM,SEQUENTIAL}, others are not yet done.	1999-05-23 06:27:13 +00:00
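
The KERN_* to E* mapping listed in commit ac3bc537bd can be sketched as a small translation function. This is only an illustration of the table in that commit message, not the code from the commit: the enum and its numeric values are hypothetical stand-ins, `kern_to_errno()` is an invented name, and returning -1 for KERN_FAILURE is an assumption (the message says those cases mostly became KASSERTs rather than a single errno).

```c
#include <errno.h>

/*
 * Hypothetical stand-ins for the old Mach-style return codes.
 * The numeric values are assumptions for illustration only.
 */
enum kern_code {
    KERN_SUCCESS,
    KERN_INVALID_ADDRESS,
    KERN_PROTECTION_FAILURE,
    KERN_NO_SPACE,
    KERN_INVALID_ARGUMENT,
    KERN_FAILURE,
    KERN_RESOURCE_SHORTAGE
};

/*
 * Translate a KERN_* code to the E* code named in the commit message.
 * KERN_FAILURE has no single replacement there, so this sketch
 * returns -1 for it (an assumption, not historical behavior).
 */
int kern_to_errno(enum kern_code code)
{
    switch (code) {
    case KERN_SUCCESS:            return 0;
    case KERN_INVALID_ADDRESS:    return EFAULT;
    case KERN_PROTECTION_FAILURE: return EACCES;
    case KERN_NO_SPACE:           return ENOMEM;
    case KERN_INVALID_ARGUMENT:   return EINVAL;
    case KERN_RESOURCE_SHORTAGE:  return ENOMEM;
    case KERN_FAILURE:            break;
    }
    return -1;
}
```

Note that KERN_NO_SPACE and KERN_RESOURCE_SHORTAGE both collapse to ENOMEM, so the translation is not reversible.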

70 Commits