NetBSD

Author	SHA1	Message	Date
yamt	a8e8f3ea02	lfs_writevnodes: in the case of "starting over", kick lfs_writeseg in order to avoid deadlock in check_dirty.	2003-03-20 14:17:21 +00:00
perseant	12a78a5a7e	Don't break out of Ifile-writing loop in lfs_segwrite until nothing is left. Note however that blocks can be added to the Ifile even when the segment block is held because of inodes' atime. Do not panic with "dirty blocks" if these blocks are present.	2003-03-20 06:51:17 +00:00
perseant	ea03a1ac09	Add simple_lock protection for lfs_seglock and lfs_subsys_pages; these will be expanded to cover other per-fs and subsystem-wide data as well. Fix a case of IN_MODIFIED being set without updating lfs_uinodes, resulting in a "lfs_uinodes < 0" panic. Fix a deadlock in lfs_putpages arising from the need to busy all pages in a block; unbusy any that had already been busied before starting over.	2003-03-15 06:58:49 +00:00
kristerw	ea98786439	ISO C requires a statement after a label.	2003-03-15 02:27:18 +00:00
perseant	ec13062af8	- Get rid of unused #ifdefs LFS_NO_PAGEMOVE and LFS_MALLOC_SUMMARY (both always true) and accompanying dead code. - When constructing write clusters in lfs_writeseg, if the block we are about to add is itself a cluster from GOP_WRITE, don't put a cluster in a cluster, just write the GOP_WRITE cluster on its own. This seems to represent a slight performance gain on my test machine. - Charge someone's rusage for writes on LFSes. It's difficult to tell who the "right" process to charge is; just charge whoever triggered the write.	2003-03-11 02:47:39 +00:00
perseant	8feb2c22f5	Take away "#ifdef LFS_UBC".	2003-03-08 21:46:04 +00:00
perseant	4b4f884b89	Add an lfs_strategy() that checks to make sure we're not trying to read where the cleaner is trying to write, instead of tying up the "live" buffers (or pages). Fix a bug in the LFS_UBC case where oversized buffers would not be checksummed correctly, causing uncleanable segments. Make sure that wakeup(fs->lfs_iocount) is done if fs->lfs_iocount is 1 as well as 0, since we wait in some places for it to drop to 1. Activate all pages that make it into lfs_gop_write without the segment lock held, since they must have been dirtied very recently, even if PG_DELWRI is not set.	2003-03-08 02:55:47 +00:00
perseant	d51fdbef63	Make sure we hold the uobjlock when checking for dirty pages, in lfs_vflush. Note that pages can become dirty without our knowing it, anyway; don't panic if that happens.	2003-03-04 19:19:43 +00:00
perseant	9192f047ac	Account SEGUSE_ACTIVE correctly so that the automatic segment cleaning actually happens. Add a new fcntl call that will write the minimum necessary to checkpoint (i.e., for on-disk directory structure to be consistent, not including updates to file data) so that the cleaner can clean segments more quickly without sacrificing three-way commit for cleaning.	2003-03-02 04:34:30 +00:00
perseant	3ab94fed93	Fix a buffer overflow bug in the LFS_UBC case that manifested itself either as a mysterious UVM error or as "panic: dirty bufs". Verify maximum size in lfs_malloc. Teach lfs_updatemeta and lfs_shellsort about oversized cluster blocks from lfs_gop_write. When unwiring pages in lfs_gop_write, deactivate them, under the theory that the pagedaemon wanted to free them last we knew.	2003-02-23 00:22:33 +00:00
perseant	fdf4bfe002	Tabify, and fix some comment alignment problems.	2003-02-20 04:27:23 +00:00
yamt	2be86f2ff8	acquire v_interlock before calling VOP_PUTPAGES.	2003-02-19 12:02:38 +00:00
perseant	b397c875ae	Add code to UBCify LFS. This is still behind "#ifdef LFS_UBC" for now (there are still some details to work out) but expect that to go away soon. To support these basic changes (creation of lfs_putpages, lfs_gop_write, mods to lfs_balloc) several other changes were made, to wit: * Create a writer daemon kernel thread whose purpose is to handle page writes for the pagedaemon, but which also takes over some of the functions of lfs_check(). This thread is started the first time an LFS is mounted. * Add a "flags" parameter to GOP_SIZE. Current values are GOP_SIZE_READ, meaning that the call should return the size of the in-core version of the file, and GOP_SIZE_WRITE, meaning that it should return the on-disk size. One of GOP_SIZE_READ or GOP_SIZE_WRITE must be specified. * Instead of using malloc(...M_WAITOK) for everything, reserve enough resources to get by and use malloc(...M_NOWAIT), using the reserves if necessary. Use the pool subsystem for structures small enough that this is feasible. This also obsoletes LFS_THROTTLE. And a few that are not strictly necessary: * Moves the LFS inode extensions off onto a separately allocated structure; getting closer to LFS as an LKM. "Welcome to 1.6O." * Unified GOP_ALLOC between FFS and LFS. * Update LFS copyright headers to correct values. * Actually cast to unsigned in lfs_shellsort, like the comment says. * Keep track of which segments were empty before the previous checkpoint; any segments that pass two checkpoints both dirty and empty can be summarily cleaned. Do this. Right now lfs_segclean still works, but this should be turned into an effectless compatibility syscall.	2003-02-17 23:48:08 +00:00
pk	338f31f581	Make the buffer cache code MP-safe.	2003-02-05 21:38:38 +00:00
thorpej	b193480908	Add extensible malloc types, adapted from FreeBSD. This turns malloc types into a structure, a pointer to which is passed around, instead of an int constant. Allow the limit to be adjusted when the malloc type is defined, or with a function call, as suggested by Jonathan Stone.	2003-02-01 06:23:35 +00:00
yamt	53d6eb47ee	don't use daddr_t for segment summary since it's an on-disk structure.	2003-01-29 13:14:33 +00:00
simonb	0adecbd12b	Remove variable that is only assigned to but not referenced.	2003-01-29 03:06:40 +00:00
yamt	e41d3a6f1c	make these compilable with lfs debug options. (follow daddr_t change) XXX maybe segment number should be 64bit.	2003-01-27 23:17:56 +00:00
kleink	865868a8b1	Further printf format fixes in the wake of daddr_t. Note that PRI?64 and long long int arguments aren't made for each other, nor are %lld and int64_t arguments.	2003-01-27 21:45:52 +00:00
kleink	4e0e5333ae	Fix further printf format warnings for DEBUG, in the wake of daddr_t having changed.	2003-01-25 23:00:09 +00:00
tron	5067836b9e	Use PRId64 instead of hard coding "%lld" to fix build problems under LP64 ports.	2003-01-25 18:12:31 +00:00
tron	63dda858c6	Fix printf() format strings problems caused by "daddr_t" change.	2003-01-25 12:50:38 +00:00
fvdl	a3ff3a3038	Bump daddr_t to 64 bits. Replace it with int32_t in all places where it was used on-disk, so that on-disk formats remain the same. Remove ufs_daddr_t and ufs_lbn_t for the time being.	2003-01-24 21:55:02 +00:00
yamt	5f254d46cc	backout wrong assertions that i added.	2003-01-08 17:16:52 +00:00
yamt	ee36fccabb	add assertions.	2003-01-08 15:40:54 +00:00
yamt	140a8e56ca	write ifile only when it has dirty buffers.	2002-12-31 14:54:32 +00:00
yamt	a999523301	no need for cleaner to hold vnode locks. cleaner and normal vnode operations are synchronized enough by seglock/fraglock and buf's B_BUSY-ness.	2002-12-17 14:37:49 +00:00
yamt	b2d5b49e2b	use ufs_daddr_t instead of int where appropriate.	2002-12-17 14:28:54 +00:00
yamt	e5ea55e4ea	in lfs_writefile, check v_type==VNON earlier. to avoid null dereference with DEBUG_LFS_VERBOSE.	2002-12-14 11:54:47 +00:00
yamt	8fe8a4ced8	save a segment write when doing checkpoint.	2002-12-13 14:40:02 +00:00
yamt	275b3a47a2	correct DIAGNOSTIC code for duplicated inodes in a segment and su_nbytes.	2002-12-12 12:28:13 +00:00
provos	0f09ed48a5	remove trailing \n in panic(). approved perry.	2002-09-27 15:35:29 +00:00
jdolecek	e305eb63e8	don't need <sys/conf.h> here	2002-09-22 19:32:54 +00:00
perseant	32ae84b188	Deal with fragment size changes better. For each fragment that can exist on an on-disk inode, we keep a record of its size in struct inode, which is updated when we write the block to disk. The cleaner routines thus have ready access to what size is the correct size for this block, on disk. Fixed a related bug: if a file with fragments is being cleaned (fragments being cleaned) at the same time it is being extended beyond NDADDR blocks, we could write a bogus FINFO record that has a frag in the middle; when it was cleaned this would give back bogus file data. Don't write the indirect blocks in this case, since there is no need. lfs_fragextend and lfs_truncate no longer require the seglock, but instead take a shared lock, which the seglock locks exclusively.	2002-07-06 01:30:11 +00:00
perseant	ddfb1dbb92	For synchronous writes, keep separate i/o counters for each write, so processes don't have to wait for one another to finish (e.g., nfsd seems to be a little happier now, though I haven't measured the difference). Synchronous checkpoints, however, must always wait for all i/o to finish. Take the contents of the callback functions and have them run in thread context instead (aiodoned thread). lfs_iocount no longer has to be protected in splbio(), and quite a bit less of the segment construction loop needs to be in splbio() as well. If lfs_markv is handed a block that is not the correct size according to the inode, refuse to process it. (Formerly it was extended to the "correct" size.) This is possibly more prone to deadlock, but less prone to corruption. lfs_segclean now outright refuses to clean segments that appear to have live bytes in them. Again this may be more prone to deadlock but avoids corruption. Replace ufsspec_close and ufsfifo_close with LFS equivalents; this means that no UFS functions need to know about LFS_ITIMES any more. Remove the reference from ufs/inode.h. Tested on i386, test-compiled on alpha.	2002-06-16 00:13:15 +00:00
perseant	d67a5bbb21	Fix a couple of instances where reassignbuf() was not done at splbio. Tested on i386.	2002-05-24 22:13:57 +00:00
perseant	43ca783b4a	Back out rev 1.174 of vfs_subr.c, because the splbio() wasn't protecting enough to be useful, and broadening it so that it did would have meant that operations possibly requiring synchronous disk activity would have to be done in splbio(). This clearly was not going to work. Worked around this in the LFS case by having lfs_cluster_callback put an extra hold on the vnode before calling biodone(), and taking the hold off without HOLDRELE's problematic list swapping. lfs_vunref() will take care of that---in thread context---on the next write if need be. Also, ensure that the list walking in lfs_{writevnodes,segunlock,gather} takes into account the possibility that the list may change underneath it (possibly because it itself deleted an element). Tested on i386, test-compiled on alpha.	2002-05-23 23:05:25 +00:00
perseant	ec0ca919be	Protect v_freelist with splbio(), since HOLDRELE can be called in interrupt context (through brelvp). (LFS may be the only subsystem affected by this problem.) Tested on i386.	2002-05-20 22:50:57 +00:00
perseant	36efaa3565	use macros from <sys/queue.h>	2002-05-17 21:42:38 +00:00
perseant	8886b0f4b2	Phase one of my three-phase plan to make LFS play nice with UBC, and bug-fixes I found while making sure there weren't any new ones. * Make the write clusters keep track of the buffers whose blocks they contain. This should make it possible to (1) write clusters using a page mapping instead of malloc, if desired, and (2) schedule blocks for rewriting (somewhere else) if a write error occurs. Code is present to use pagemove() to construct the clusters but that is untested and will go away anyway in favor of page mapping. * DEBUG now keeps a log of Ifile writes, so that any lingering instances of the "dirty bufs" problem can be properly debugged. * Keep track of whether the Ifile has been dirtied by various routines that can be called by lfs_segwrite, and loop on that until it is clean, for a checkpoint. Checkpoints need to be squeaky clean. * Warn the user (once) if the Ifile grows larger than is reasonable for their buffer cache. Both lfs_mountfs and lfs_unmount check since the Ifile can grow. * If an inode is not found in a disk block, try rereading the block, under the assumption that the block was copied to a cluster and then freed. * Protect WRITEINPROG() with splbio() to fix a hang in lfs_update.	2002-05-14 20:03:53 +00:00
chs	a106161b5a	add spaces for KNF. confirmed to produce identical objects.	2001-11-23 21:44:25 +00:00
lukem	ec6245465a	add RCSID	2001-11-08 02:39:06 +00:00
lukem	99147a7648	remove #include <ufs/ufs/quota.h> where it was just to appease <ufs/ufs/inode.h>, since the latter now includes the former. leave the former in source that obviously uses specific bits of it (for completeness.)	2001-10-26 05:56:06 +00:00
jdolecek	bd21ec5d2e	lfs_writeseg(): make el_size a size_t (cosmetic only, no functional change)	2001-07-26 20:20:15 +00:00
perseant	4e3fced95b	Merge the short-lived perseant-lfsv2 branch into the trunk. Kernels and tools understand both v1 and v2 filesystems; newfs_lfs generates v2 by default. Changes for the v2 layout include: - Segments of non-PO2 size and arbitrary block offset, so these can be matched to convenient physical characteristics of the partition (e.g., stripe or track size and offset). - Address by fragment instead of by disk sector, paving the way for non-512-byte-sector devices. In theory fragments can be as large as you like, though in reality they must be smaller than MAXBSIZE in size. - Use serial number and filesystem identifier to ensure that roll-forward doesn't get old data and think it's new. Roll-forward is enabled for v2 filesystems, though not for v1 filesystems by default. - The inode free list is now a tailq, paving the way for undelete (undelete is not yet implemented, but can be without further non-backwards-compatible changes to disk structures). - Inode atime information is kept in the Ifile, instead of on the inode; that is, the inode is never written just because atime was changed. Because of this the inodes remain near the file data on the disk, rather than wandering all over as the disk is read repeatedly. This speeds up repeated reads by a small but noticeable amount. Other changes of note include: - The ifile written by newfs_lfs can now be of arbitrary length, it is no longer restricted to a single indirect block. - Fixed an old bug where ctime was changed every time a vnode was created. I need to look more closely to make sure that the times are only updated during write(2) and friends, not after-the-fact during a segment write, and certainly not by the cleaner.	2001-07-13 20:30:18 +00:00
mrg	67afbd6270	use _KERNEL_OPT	2001-05-30 11:57:16 +00:00
joff	a6ef389457	If DIAGNOSTIC and the segment writer gets a badly sized buffer, panic() instead of silently corrupting the filesystem.	2001-01-09 05:05:35 +00:00
perseant	2a53ff5ab9	Get rid of some old unnecessary code that cleared B_NEEDCOMMIT from buffers in lfs_writeseg (possibly after they had been freed). If MALLOCLOG is defined, make lfs_newbuf and lfs_freebuf pass along the caller's file and line to _malloc and _free.	2000-12-03 05:56:27 +00:00
jdolecek	bf558e3b3e	only include opt_ddb.h for !LKM	2000-11-30 15:59:47 +00:00
chs	aeda8d3b77	Initial integration of the Unified Buffer Cache project.	2000-11-27 08:39:39 +00:00


113 Commits