NetBSD

Commit Graph

Author	SHA1	Message	Date
tnn	4407197569	Add missing underscore to wchan name.	2007-05-15 14:35:29 +00:00
perseant	0549fd6148	Add/change a couple of comments about locking restrictions.	2007-04-18 00:50:06 +00:00
ad	59d979c5f1	Pass an ipl argument to pool_init/POOL_INIT to be used when initializing the pool's lock.	2007-03-12 18:18:22 +00:00
thorpej	712239e366	Replace the Mach-derived boolean_t type with the C99 bool type. A future commit will replace use of TRUE and FALSE with true and false.	2007-02-21 22:59:35 +00:00
ad	9abeea588a	Replace some uses of lockmgr() / simplelocks.	2007-02-15 15:40:50 +00:00
christos	168cd830d2	__unused removal on arguments; approved by core.	2006-11-16 01:32:37 +00:00
christos	4d595fd7b1	- sprinkle __unused on function decls. - fix a couple of unused bugs - no more -Wno-unused for i386	2006-10-12 01:30:41 +00:00
christos	b64edcaded	fix empty if	2006-10-04 15:53:24 +00:00
perseant	8c43e08b21	Don't remark a locked inode with IN_MODIFIED after writing it to disk, if we ourselves hold the lock. This prevents e.g. mknod from hanging indefinitely. Also, always use the return value from VOP_ISLOCKED to determine whether we hold the lock or someone else does, rather than looking into the lock structure ourselves.	2006-09-15 18:50:49 +00:00
perseant	437e855235	Changes to help the roll-forward agent, to wit: * Mark being-deleted files in the Ifile so we can finish deleting them at fs mount time. * Flag the Ifile with "cleaner must clean" when writers are waiting for the cleaner, rather than relying solely on the cleaner's estimation of whether it should clean or not. * Note partial segments written by a user agent (in particular, fsck_lfs) so that repeated rolls forward don't interfere with one another. * Add a new fcntl, LFCNPASS, that allows the log to wrap exactly once, for better testing of the validity of checkpoints. * Keep track of the on-disk nlink count when cleaning, so that we don't partially complete directory operations while cleaning. * Ensure that every single Ifile inode write represents a consistent view of the filesystem. In particular, the accounting for the segment we are writing the inode into must be correct, and the accounting for the segment that inode used to reside in must be correct. Rather than just rewriting the inode if we wrote it wrong, rewrite the necessary ifile blocks before writing the inode so we never write it wrong. * Don't unmark any VDIROP vnodes if we haven't written them to disk, avoiding yet another problem with the "wait for the cleaner" error return from lfs_putpages(). Also, move the last callback to an aiodone call, so we no longer do any memory management from interrupt context.	2006-09-01 19:41:28 +00:00
perseant	b99e4c8268	Don't wake up the cleaner if the filesystem is unwrappable, and fix the compatibility fcntls. Also includes one-line fixes for an MP locking bug and a zero-length FINFO problem that manifested during testing.	2006-06-29 19:28:21 +00:00
perseant	ce053245eb	Introduce another per-filesystem parameter, lfs_resvseg, to separate the notion of "how many segments are reserved for the cleaner" from that of "how many segments are not counted in lfs_bfree". The default value used for existing filesystems is the same as the previous implicit value of (lfs_minfreeseg / 2 + 1), modulo some sanity checking. Count pending dirops on a per-filesystem basis, since once we start writing them we can't stop until we're done. This seems to help stave off the "no clean segments" panic in the case of filling the filesystem with directories and small files (e.g. simultaneously unpacking more copies of pkgsrc than will fit).	2006-05-04 04:22:55 +00:00
perseant	d28248e84e	Make the segment lock aware of LWPs. Fixes a (somewhat confusing) "lockmgr: pid 3997, not exclusive lockholder 3997, unlocking" panic I encountered while running blogbench on an LFS.	2006-04-07 23:44:14 +00:00
perseant	dddf5c5171	Improvements to LFS's paging mechanism, to wit: * Acknowledge that sometimes there are more dirty pages to be written to disk than clean segments. When we reach the danger line, lfs_gop_write() now returns EAGAIN. The caller of VOP_PUTPAGES(), if it holds the segment lock, drops it and waits for the cleaner to make room before continuing. * Note and avoid a three-way deadlock in lfs_putpages (a writer holding a page busy blocks on the cleaner while the cleaner blocks on the segment lock while lfs_putpages blocks on the page).	2006-03-24 20:05:32 +00:00
yamt	03f80508d6	- unify ffs_blkatoff and lfs_blkatoff. - remove ufs_ops::uo_blkatoff. - add directory read-ahead code. (disabled for now.)	2006-01-14 17:41:16 +00:00
christos	95e1ffb156	merge ktrace-lwp.	2005-12-11 12:16:03 +00:00
yamt	a748ea88dd	merge yamt-vop branch. remove following VOPs. VOP_BLKATOFF VOP_VALLOC VOP_BALLOC VOP_REALLOCBLKS VOP_VFREE VOP_TRUNCATE VOP_UPDATE	2005-11-02 12:38:58 +00:00
christos	273df63602	- sprinkle const - avoid shadow variables.	2005-05-29 21:25:24 +00:00
perseant	94decdd25d	Use lfs_malloc() to manage the blkiov arrays that the cleaner functions use, since the cleaner is likely to operate in a low-memory condition.	2005-04-16 17:28:37 +00:00
perseant	1ebfc508b6	Protect various per-fs structures with fs->lfs_interlock simple_lock, to improve behavior in the multiprocessor case. Add debugging segment-lock assertion statements.	2005-04-01 21:59:46 +00:00
perseant	eefd94b8e2	Straighten out the maze of ifdefs. Instead, consolidate all the debugging stuff under '#ifdef DEBUG', and use sysctl knobs to turn on/off particular parts of the debugging reporting (if DEBUG is enabled). Re-enable the LFS statistics in sysctl, while I'm there. A bit of a rototill.	2005-03-08 00:18:19 +00:00
perry	bcfcddbac1	nuke trailing whitespace	2005-02-26 22:31:44 +00:00
perseant	25f49c3c91	Various minor LFS improvements: * Note when lfs_putpages(9) thinks it is not going to be writing any pages before calling genfs_putpages(9). This prevents a situation in which blocks can be queued for writing without a segment header. * Correct computation of NRESERVE(), though it is still a gross overestimate in most cases. Note that if NRESERVE() is too high, it may be impossible to create files on the filesystem. We catch this case on filesystem mount and refuse to mount r/w. * Allow filesystems to be mounted whose block size is == MAXBSIZE. * Somewhere along the line, ufs_bmaparray(9) started mangling UNWRITTEN entries in indirect blocks again, triggering a failed assertion "daddr <= LFS_MAX_DADDR". Explicitly convert to and from int32_t to correct this. * Add a high-water mark for the number of dirty pages any given LFS can hold before triggering a flush. This is settable by sysctl, but off (zero) by default. * Be more careful about the MAX_BYTES and MAX_BUFS computations so we shouldn't see "please increase to at least zero" messages. * Note that VBLK and VCHR vnodes can have nonzero values in di_db[0] even though their v_size == 0. Don't panic when we see this. * Change lfs_bfree to a signed quantity. The manner in which it is processed before being passed to the cleaner means that sometimes it may drop below zero, and the cleaner must be aware of this. * Never report bfree < 0 (or higher than lfs_dsize) through lfs_statvfs(9). This prevents df(1) from ever telling us that our full filesystems have 16TB free. * Account space allocated through lfs_balloc(9) that does not have associated buffer headers, so that the pagedaemon doesn't run us out of segments. * Return ENOSPC from lfs_balloc(9) when bfree drops to zero. * Address a deadlock in lfs_bmapv/lfs_markv when the filesystem is being unmounted. Because vfs_busy() is a shared lock, and lfs_bmapv/lfs_markv mark the filesystem vfs_busy(), the cleaner can be holding the lock that umount() is blocking on, then try to vfs_busy() again in getnewvnode().	2005-02-26 05:40:42 +00:00
yamt	81ce5e8cc3	use correct segment size. this fixes memory corruption when using lfsv1.	2004-03-09 06:43:18 +00:00
simonb	740725d725	Fix usage of fifth argument to pool_init().	2003-12-21 07:53:58 +00:00
dbj	fe7c786886	add mnt_iflag field to struct mount for internal flags mv MNT_GONE, MNT_UNMOUNT and MNT_WANTRDWR to this field additionally add mnt_writeopcountupper and mnt_writeopcountlower fields in preparation for pending write suspension support work bump kernel version to 1.6ZD	2003-10-14 14:02:56 +00:00
yamt	3ed90e8152	use LFS_DEBUG_COUNTLOCKED macro.	2003-09-07 11:44:22 +00:00
agc	aad01611e7	Move UCB-licensed code from 4-clause to 3-clause licence. Patches provided by Joel Baker in PR 22364, verified by myself.	2003-08-07 16:26:28 +00:00
yamt	3852db2096	- protect global resource counts with lfs_subsys_lock. - clean up scattered externs a little.	2003-07-12 16:17:06 +00:00
yamt	102c8a6a74	- add new functions, lfs_writer_enter/leave, and use them instead of duplicated code fragments. - add an assertion.	2003-07-02 13:40:51 +00:00
perseant	ef3c60764c	Make LFS work better (though still not "well") as an NFS-exported filesystem (and other things that needed to be fixed before the tests would complete), to wit: * Include the fs ident in the filehandle; improve stale filehandle checks. * Change definition of blksize() to use the on-dinode size instead of the inode's i_size, so that fsck_lfs will work properly again. * Use b_interlock in lfs_vtruncbuf. * Postpone dirop reclamation until after the seglock has been released, so that lfs_truncate is not called with the segment lock held. * Don't loop in lfs_fsync(), just write everything and wait. * Be more careful about the interlock/uobjlock in lfs_putpages: when we lose this lock, we have to resynchronize dirtiness of pages in each block. * Be sure to always write indirect blocks and update metadata in lfs_putpages; fixes a bug that caused blocks to be accounted to the wrong segment.	2003-04-23 07:20:37 +00:00
perseant	78dff1cfa3	KNF (space after keywords).	2003-03-21 06:26:36 +00:00
perseant	ea03a1ac09	Add simple_lock protection for lfs_seglock and lfs_subsys_pages; these will be expanded to cover other per-fs and subsystem-wide data as well. Fix a case of IN_MODIFIED being set without updating lfs_uinodes, resulting in a "lfs_uinodes < 0" panic. Fix a deadlock in lfs_putpages arising from the need to busy all pages in a block; unbusy any that had already been busied before starting over.	2003-03-15 06:58:49 +00:00
perseant	ec13062af8	- Get rid of unused #ifdefs LFS_NO_PAGEMOVE and LFS_MALLOC_SUMMARY (both always true) and accompanying dead code. - When constructing write clusters in lfs_writeseg, if the block we are about to add is itself a cluster from GOP_WRITE, don't put a cluster in a cluster, just write the GOP_WRITE cluster on its own. This seems to represent a slight performance gain on my test machine. - Charge someone's rusage for writes on LFSes. It's difficult to tell who the "right" process to charge is; just charge whoever triggered the write.	2003-03-11 02:47:39 +00:00
perseant	4b4f884b89	Add an lfs_strategy() that checks to make sure we're not trying to read where the cleaner is trying to write, instead of tying up the "live" buffers (or pages). Fix a bug in the LFS_UBC case where oversized buffers would not be checksummed correctly, causing uncleanable segments. Make sure that wakeup(fs->lfs_iocount) is done if fs->lfs_iocount is 1 as well as 0, since we wait in some places for it to drop to 1. Activate all pages that make it into lfs_gop_write without the segment lock held, since they must have been dirtied very recently, even if PG_DELWRI is not set.	2003-03-08 02:55:47 +00:00
perseant	003cfbd545	Don't add dirty blocks to the ifile in lfs_segunlock, if we're trying to unmount the filesystem. This avoids a "dirty blocks" panic.	2003-03-04 19:15:26 +00:00
perseant	3ab94fed93	Fix a buffer overflow bug in the LFS_UBC case that manifested itself either as a mysterious UVM error or as "panic: dirty bufs". Verify maximum size in lfs_malloc. Teach lfs_updatemeta and lfs_shellsort about oversized cluster blocks from lfs_gop_write. When unwiring pages in lfs_gop_write, deactivate them, under the theory that the pagedaemon wanted to free them last we knew.	2003-02-23 00:22:33 +00:00
perseant	fdf4bfe002	Tabify, and fix some comment alignment problems.	2003-02-20 04:27:23 +00:00
yamt	5f444770aa	add debug code to lfs_free.	2003-02-19 12:58:53 +00:00
perseant	b397c875ae	Add code to UBCify LFS. This is still behind "#ifdef LFS_UBC" for now (there are still some details to work out) but expect that to go away soon. To support these basic changes (creation of lfs_putpages, lfs_gop_write, mods to lfs_balloc) several other changes were made, to wit: * Create a writer daemon kernel thread whose purpose is to handle page writes for the pagedaemon, but which also takes over some of the functions of lfs_check(). This thread is started the first time an LFS is mounted. * Add a "flags" parameter to GOP_SIZE. Current values are GOP_SIZE_READ, meaning that the call should return the size of the in-core version of the file, and GOP_SIZE_WRITE, meaning that it should return the on-disk size. One of GOP_SIZE_READ or GOP_SIZE_WRITE must be specified. * Instead of using malloc(...M_WAITOK) for everything, reserve enough resources to get by and use malloc(...M_NOWAIT), using the reserves if necessary. Use the pool subsystem for structures small enough that this is feasible. This also obsoletes LFS_THROTTLE. And a few that are not strictly necessary: * Moves the LFS inode extensions off onto a separately allocated structure; getting closer to LFS as an LKM. "Welcome to 1.6O." * Unified GOP_ALLOC between FFS and LFS. * Update LFS copyright headers to correct values. * Actually cast to unsigned in lfs_shellsort, like the comment says. * Keep track of which segments were empty before the previous checkpoint; any segments that pass two checkpoints both dirty and empty can be summarily cleaned. Do this. Right now lfs_segclean still works, but this should be turned into an effectless compatibility syscall.	2003-02-17 23:48:08 +00:00
yamt	53d6eb47ee	don't use daddr_t for segment summary since it's an on-disk structure.	2003-01-29 13:14:33 +00:00
fvdl	a3ff3a3038	Bump daddr_t to 64 bits. Replace it with int32_t in all places where it was used on-disk, so that on-disk formats remain the same. Remove ufs_daddr_t and ufs_lbn_t for the time being.	2003-01-24 21:55:02 +00:00
perseant	8f30dc2c9b	Remove lying comment on SEGM_PROT seglock.	2002-07-11 21:09:00 +00:00
perseant	32ae84b188	Deal with fragment size changes better. For each fragment that can exist on an on-disk inode, we keep a record of its size in struct inode, which is updated when we write the block to disk. The cleaner routines thus have ready access to what size is the correct size for this block, on disk. Fixed a related bug: if a file with fragments is being cleaned (fragments being cleaned) at the same time it is being extended beyond NDADDR blocks, we could write a bogus FINFO record that has a frag in the middle; when it was cleaned this would give back bogus file data. Don't write the indirect blocks in this case, since there is no need. lfs_fragextend and lfs_truncate no longer require the seglock, but instead take a shared lock, which the seglock locks exclusively.	2002-07-06 01:30:11 +00:00
perseant	ddfb1dbb92	For synchronous writes, keep separate i/o counters for each write, so processes don't have to wait for one another to finish (e.g., nfsd seems to be a little happier now, though I haven't measured the difference). Synchronous checkpoints, however, must always wait for all i/o to finish. Take the contents of the callback functions and have them run in thread context instead (aiodoned thread). lfs_iocount no longer has to be protected in splbio(), and quite a bit less of the segment construction loop needs to be in splbio() as well. If lfs_markv is handed a block that is not the correct size according to the inode, refuse to process it. (Formerly it was extended to the "correct" size.) This is possibly more prone to deadlock, but less prone to corruption. lfs_segclean now outright refuses to clean segments that appear to have live bytes in them. Again this may be more prone to deadlock but avoids corruption. Replace ufsspec_close and ufsfifo_close with LFS equivalents; this means that no UFS functions need to know about LFS_ITIMES any more. Remove the reference from ufs/inode.h. Tested on i386, test-compiled on alpha.	2002-06-16 00:13:15 +00:00
perseant	d67a5bbb21	Fix a couple of instances where reassignbuf() was not done at splbio. Tested on i386.	2002-05-24 22:13:57 +00:00
perseant	43ca783b4a	Back out rev 1.174 of vfs_subr.c, because the splbio() wasn't protecting enough to be useful, and broadening it so that it did would have meant that operations possibly requiring synchronous disk activity would have to be done in splbio(). This clearly was not going to work. Worked around this in the LFS case by having lfs_cluster_callback put an extra hold on the vnode before calling biodone(), and taking the hold off without HOLDRELE's problematic list swapping. lfs_vunref() will take care of that---in thread context---on the next write if need be. Also, ensure that the list walking in lfs_{writevnodes,segunlock,gather} takes into account the possibility that the list may change underneath it (possibly because it itself deleted an element). Tested on i386, test-compiled on alpha.	2002-05-23 23:05:25 +00:00
perseant	36efaa3565	use macros from <sys/queue.h>	2002-05-17 21:42:38 +00:00
perseant	8886b0f4b2	Phase one of my three-phase plan to make LFS play nice with UBC, and bug-fixes I found while making sure there weren't any new ones. * Make the write clusters keep track of the buffers whose blocks they contain. This should make it possible to (1) write clusters using a page mapping instead of malloc, if desired, and (2) schedule blocks for rewriting (somewhere else) if a write error occurs. Code is present to use pagemove() to construct the clusters but that is untested and will go away anyway in favor of page mapping. * DEBUG now keeps a log of Ifile writes, so that any lingering instances of the "dirty bufs" problem can be properly debugged. * Keep track of whether the Ifile has been dirtied by various routines that can be called by lfs_segwrite, and loop on that until it is clean, for a checkpoint. Checkpoints need to be squeaky clean. * Warn the user (once) if the Ifile grows larger than is reasonable for their buffer cache. Both lfs_mountfs and lfs_unmount check since the Ifile can grow. * If an inode is not found in a disk block, try rereading the block, under the assumption that the block was copied to a cluster and then freed. * Protect WRITEINPROG() with splbio() to fix a hang in lfs_update.	2002-05-14 20:03:53 +00:00
chs	a106161b5a	add spaces for KNF. confirmed to produce identical objects.	2001-11-23 21:44:25 +00:00

70 Commits