NetBSD

Commit Graph

Author	SHA1	Message	Date
chs	64c6d1d2dc	a whole bunch of changes to improve performance and robustness under load: - remove special treatment of pager_map mappings in pmaps. this is required now, since I've removed the globals that expose the address range. pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's no longer any need to special-case it. - eliminate struct uvm_vnode by moving its fields into struct vnode. - rewrite the pageout path. the pager is now responsible for handling the high-level requests instead of only getting control after a bunch of work has already been done on its behalf. this will allow us to UBCify LFS, which needs tighter control over its pages than other filesystems do. writing a page to disk no longer requires making it read-only, which allows us to write wired pages without causing all kinds of havoc. - use a new PG_PAGEOUT flag to indicate that a page should be freed on behalf of the pagedaemon when it's unlocked. this flag is very similar to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the pageout fails due to eg. an indirect-block buffer being locked. this allows us to remove the "version" field from struct vm_page, and together with shrinking "loan_count" from 32 bits to 16, struct vm_page is now 4 bytes smaller. - no longer use PG_RELEASED for swap-backed pages. if the page is busy because it's being paged out, we can't release the swap slot to be reallocated until that write is complete, but unlike with vnodes we don't keep a count of in-progress writes so there's no good way to know when the write is done. instead, when we need to free a busy swap-backed page, just sleep until we can get it busy ourselves. - implement a fast-path for extending writes which allows us to avoid zeroing new pages. this substantially reduces cpu usage. - encapsulate the data used by the genfs code in a struct genfs_node, which must be the first element of the filesystem-specific vnode data for filesystems which use genfs_{get,put}pages(). - eliminate many of the UVM pagerops, since they aren't needed anymore now that the pager "put" operation is a higher-level operation. - enhance the genfs code to allow NFS to use the genfs_{get,put}pages instead of a modified copy. - clean up struct vnode by removing all the fields that used to be used by the vfs_cluster.c code (which we don't use anymore with UBC). - remove kmem_object and mb_object since they were useless. instead of allocating pages to these objects, we now just allocate pages with no object. such pages are mapped in the kernel until they are freed, so we can use the mapping to find the page to free it. this allows us to remove splvm() protection in several places. The sum of all these changes improves write throughput on my decstation 5000/200 to within 1% of the rate of NetBSD 1.5 and reduces the elapsed time for "make release" of a NetBSD 1.5 source tree on my 128MB pc to 10% less than a 1.5 kernel took.	2001-09-15 20:36:31 +00:00
chs	adf5d360a7	add a new VFS op, vfs_reinit, which is called when desiredvnodes is adjusted via sysctl. file systems that have hash tables which are sized based on the value of this variable now resize those hash tables using the new value. the max number of FFS softdeps is also recalculated. convert various file systems to use the <sys/queue.h> macros for their hash tables.	2001-09-15 16:12:54 +00:00
jdolecek	332bb4894a	bound check mount args more thoroughly	2001-08-03 06:00:13 +00:00
jdolecek	9bbd53c2ba	vfs_sysctl(): cosmetic: provide explicit size for vfsnames[], to catch mistakes VFS_MAXID/CTL_VFS_NAMES are updated	2001-07-08 10:32:38 +00:00
jdolecek	77f0267d21	Use array based upon CTL_VFS_NAMES to get filesystem name for non-VFS_GENERIC syscall, instead of mountcompatnames[]. Move the extern mountcompatnames[], nmountcompatnames definition to COMPAT_09 \|\| COMPAT_43 section.	2001-06-28 08:12:08 +00:00
thorpej	25f00d4c18	In getnewvnode(), allocate a vnode from the pool with NOWAIT. If that fails, just try to recycle a vnode. If we can't allocate or recycle, issue a warning, sleep a bit, and try the whole thing again. This prevents us from blocking forever if we want to use a very large number of vnodes, but don't have {memory,kva} resources from which to allocate them.	2001-06-26 22:52:03 +00:00
jdolecek	e65c47a67f	vfs_rootmountalloc: take advantage of LIST_FOREACH()	2001-06-26 19:14:25 +00:00
wrstuden	716d3ae08f	In vcount(), when getting rid of unused aliases, don't vgone one which has VXLOCK set - it's already being vgoned, most likely by one of our callers. If we call vgone, we can end up sleeping against ourself with VXLOCK set - we'll start the race for root. Pointed out by Love <lha@stacken.kth.se> on tech-kern. Analysis from Artur Grabowski <art@openbsd.org> via Love. Should resolve PR kern/13077	2001-06-26 15:51:06 +00:00
thorpej	e93d1531c2	Avoid a sleeping malloc call while holding the spechash_slock. XXX This is kinda gross, but prevents complete lossage on an XXX MP system. From Bill Sommerfeld.	2001-06-05 04:42:05 +00:00
thorpej	5b35dc8136	When unmounting a file system, acquire the syncer_lock before vfs_busy'ing just before the dounmount() call. This is to avoid sleeping with the mountlist_slock held -- but we must acquire syncer_lock before vfs_busy because the syncer itself uses syncer_lock -> vfs_busy locking order.	2001-04-16 22:41:09 +00:00
enami	c75004b245	Fix the name of some bits in struct vnode.v_flag.	2001-04-09 14:14:10 +00:00
chs	83d071a318	add UBC memory-usage balancing. we track the number of pages in use for each of the basic types (anonymous data, executable image, cached files) and prevent the pagedaemon from reusing a given page if that would reduce the count of that type of page below a sysctl-setable minimum threshold. the thresholds are controlled via three new sysctl tunables: vm.anonmin, vm.vnodemin, and vm.vtextmin. these tunables are the percentages of pageable memory reserved for each usage, and we do not allow the sum of the minimums to be more than 95% so that there's always some memory that can be reused.	2001-03-09 01:02:10 +00:00
jdolecek	522f569810	make some more constant arrays 'const'	2001-02-21 21:39:52 +00:00
chs	eef7499a6c	in vtruncbuf(), pass 0 (meaning everything at or past the start of the range) instead of the vnode's size to pgo_flush() since there can be pages past EOF. in the same call, cast "lbn" to voff_t to avoid overflow.	2001-02-06 10:58:55 +00:00
chs	6717a2ac1b	in vtruncbuf(), use a "synchronous freeing" flush to prevent a race between write i/os in a disk-based filesystem vs. the disk block being freed by a truncation, allocated to a new file, and written again with different data. if the disk driver reorders the requests and does the second i/o first, the old data will clobber the new, corrupting the new file.	2001-01-08 07:05:47 +00:00
sommerfeld	f2bdd546dd	Add a missing simple_unlock() to the LK_NOWAIT/VXLOCK error case in vget().	2000-12-31 03:13:51 +00:00
chs	aeda8d3b77	Initial integration of the Unified Buffer Cache project.	2000-11-27 08:39:39 +00:00
chs	fa19fe52db	adjust the spinlock macros in the non-MULTIPROCESSOR, non-LOCKDEBUG case so that gcc will think that static spinlock are used. this allows us to remove the ugly conditionalization of static spinlock declarations.	2000-11-24 03:59:07 +00:00
fvdl	8c28d7e864	Adapt for VOP_FSYNC parameter change.	2000-09-19 22:00:01 +00:00
enami	445cbcb8c1	Accquire vnode interlock while playing with flags to see if there is someone waiting this vnode.	2000-09-05 05:13:43 +00:00
bouyer	efc4435cb3	in vfs_shutdown(), use sched_suspend() to suspend scheduling, and use tsleep() instead of DELAY. Also, keep trying flushing buffers when the number of dirty buffers decreases (20 rounds may not be enouth for a very large buffer cache). Using tsleep instead of delay gives a chance to others kernel threads to run, which is needed for raidframe. With this change I've not been able to reproduce the 'dirty buffer not flushed' problem with raidframe.	2000-08-31 14:41:35 +00:00
enami	d707b78562	Declare this static simplelock data only when MULTIPROCESSOR or LOCKDEBUG is defined to prevent compiler warning.	2000-08-21 06:42:57 +00:00
sommerfeld	78e4a089b8	Don't bother reinitializing statically-inited locks	2000-08-21 02:16:30 +00:00
sommerfeld	8875442492	Statically initialize statically-allocated locks	2000-08-19 17:25:33 +00:00
sommerfeld	861fcc44b7	Use ltsleep(...,PNORELOCK..) instead of simple_unlock()/tsleep()	2000-08-12 16:43:00 +00:00
fvdl	57e3691758	Don't wait for B_READ buffers to finish in vfs_shutdown, it makes no sense to do so. From Ethan Solomita.	2000-07-16 21:07:24 +00:00
jdolecek	1ec07d7439	change tablefull() to accept one more parameter - optional hint use that to inform about way to raise current limit when we reach maximum number of processes, descriptors or vnodes XXX hopefully I catched all users of tablefull()	2000-07-04 15:33:28 +00:00
fvdl	975751cda2	vinsheadfree -> ungetnewvnode	2000-06-27 23:51:51 +00:00
fvdl	c39797c045	Add vinsheadfree, a small function to push vnodes that have just been allocated by getnewvnode, back onto the head of the free list. Needed in some VFS_VGET functions to deal with races.	2000-06-27 23:34:45 +00:00
mrg	32aa199ccf	remove include of <vm/vm.h>	2000-06-27 17:41:07 +00:00
sommerfeld	e964d558a7	Fix assorted bugs around shutdown/reboot/panic time. - add a new global variable, doing_shutdown, which is nonzero if vfs_shutdown() or panic() have been called. - in panic, set RB_NOSYNC if doing_shutdown is already set on entry so we don't reenter vfs_shutdown if we panic'ed there. - in vfs_shutdown, don't use proc0's process for sys_sync unless curproc is NULL. - in lockmgr, attribute successful locks to proc0 if doing_shutdown && curproc==NULL, and panic if we can't get the lock right away; avoids the spurious lockmgr DIAGNOSTIC panic from the ddb reboot command. - in subr_pool, deal with curproc==NULL in the doing_shutdown case. - in mfs_strategy, bitbucket writes if doing_shutdown, so we don't wedge waiting for the mfs process. - in ltsleep, treat ((curproc == NULL) && doing_shutdown) like the panicstr case. Appears to fix: kern/9239, kern/10187, kern/9367. May also fix kern/10122.	2000-06-10 18:44:43 +00:00
assar	6c734cd283	make vfs_getnewfsid only take one argument and fetch the name of the filesystem from the supplied mount argument. also make makefstype take a const parameter. update all the callers.	2000-06-10 18:27:01 +00:00
mycroft	4656dfd24f	Add a new function to remove extra buffers when truncating a file. This is more generic than the vinvalbuf(V_SAVEMETA) case, avoiding synchronous operations when truncating to a non-zero length.	2000-05-28 04:13:56 +00:00
chs	1c084aee4f	add ddb commands for printing vnodes and bufs.	2000-04-10 02:22:13 +00:00
augustss	c87c1861bb	Add a special option, DEBUG_HALT_BUSY, that allows you to debug when the system doesn't want to halt cleanly. The code was there before, but only with the DEBUG option.	2000-03-30 09:32:25 +00:00
augustss	264f1d27c6	Get rid of register declarations.	2000-03-30 09:27:11 +00:00
fvdl	c3167b9545	Do previous better. Use FSYNC_RECLAIM as it was before.	2000-03-17 01:25:06 +00:00
jdolecek	89015c4648	Add new VFS op routine - vfs_done and call it on filesystem detach in vfs_detach(). vfs_done may free global filesystem's resources, typically those allocated in respective filesystem's init function. Needed so those filesystems which went in via LKM have a chance to clean after themselves before unloading. This fixes random panics when LKM for filesystem using pools was loaded and unloaded several times. For each leaf filesystem, add appropriate vfs_done routine.	2000-03-16 18:08:17 +00:00
fvdl	01db605567	Do the previous slightly different: any files on MNT_SOFTDEP filesystems do not want all their metadata dependencies flushed from vinvalbuf() if there are no dirty blocks.	2000-03-15 16:28:45 +00:00
perseant	61fa9e1409	Move vinvalbuf's check for dirty blocks into ffs_fsync, to ensure that mode and ownership bits are flushed to disk before the vnode is reclaimed. The check, introduced in the softdep merge, assumes that if no blocks are dirty, no file data or metadata needs to be flushed to disk. This is true of ffs, but is not true of lfs, and may not be true of other filesystems. Tested by myself and Bill Squier <groo@cs.stevens-tech.edu>.	2000-03-11 05:00:18 +00:00
mycroft	1d915f4130	Allow my disk to actually spin down using `-o async' again. Note: This uses the same questionable logic as vfs_bio.c to check MNT_ASYNC. Something needs to be done about this.	2000-03-03 05:21:03 +00:00
fvdl	c13f6dd258	Introduce a sysctl to enable/disable if non-root users can mount filesystems. Default: off.	2000-02-16 11:57:45 +00:00
perseant	fa6a733240	In lfs_bwrite, don't mark buffers dirty if lfs is mounted read-only. (Previously buffers could be marked dirty by the cleaner, and possibly by other means.) Also check for softdep mount in vfs_shutdown before trying to bawrite buffers, since other filesystems don't need it and lfs doesn't bawrite. (This fragment reviewed by fvdl.) Partially addresses PR#8964.	1999-12-15 07:10:32 +00:00
fvdl	d901f6eae0	Be more careful to block bio interrupts for some data structures. There were at least a few missed cases where vp->v_{clean,dirty}blkhd were unprotected since the softdep/trickle sync merge.	1999-11-23 23:52:40 +00:00
enami	f6b8114fc7	Initialize the vnode_hold_list correctly.	1999-11-18 05:50:25 +00:00
fvdl	0b1963121a	Add Kirk McKusick's soft updates code to the trunk. Not enabled by default, as the copyright on the main file (ffs_softdep.c) is such that is has been put into gnusrc. options SOFTDEP will pull this in. This code also contains the trickle syncer. Bump version number to 1.4O	1999-11-15 18:49:07 +00:00
mycroft	fde519b5e2	Widen usecount and writecount to prevent overflow.	1999-10-01 22:03:17 +00:00
mycroft	5e7ae44739	Correct spelling in an #ifdef.	1999-10-01 21:57:42 +00:00
wrstuden	ba891a728d	Deal with device vnodes which aren't on the spechash tables, rather than panicing. So now we make sure vp->v_hashchain != NULL before removing the node from the chain.	1999-08-20 22:21:25 +00:00
thorpej	f2c2e160b1	Fix "print vnodes for dirty buffers" change: use vprint(); VOP_PRINT() is only meant to be used by vprint(), and vprint() provides more information about the vnode.	1999-08-19 18:09:44 +00:00

1 2 3 4

158 Commits