NetBSD

Author	SHA1	Message	Date
perseant	f4a7694fc9	Keep per-inode, per-fs, and subsystem-wide counts of blocks allocated through lfs_balloc(), and use that to estimate the number of dirty pages belonging to LFS (subsystem or filesystem). This is almost certainly wrong for the case of a large mmap()ed region, but the accounting is tighter than what we had before, and performs much better in the typical case of pages dirtied through write().	2005-04-19 20:59:05 +00:00
perseant	f63fa194c2	Check the to-be-on-disk consistency of directories as well (correct a typo in an earlier commit).	2005-04-18 23:03:08 +00:00
perseant	b2d19f57a3	Check for the inode having been previously freed, in UNMARK_VNODE(). Avoids a panic when calling mkdir() on a full filesystem.	2005-04-18 17:36:46 +00:00
perseant	5923fa20f1	Make userland compile again.	2005-04-16 19:52:09 +00:00
perseant	ad0169af41	Remove left-over reference to "lfs_blist", for _LKM case.	2005-04-16 18:10:12 +00:00
perseant	5ed792ecb0	Use splay trees, rather than a hash table, to manage the accounting of blocks allocated through VOP_BALLOC() for pages to be written to disk. This accounting no longer takes a noticeable fraction of the system CPU.	2005-04-16 17:35:58 +00:00
perseant	94decdd25d	Use lfs_malloc() to manage the blkiov arrays that the cleaner functions use, since the cleaner is likely to operate in a low-memory condition.	2005-04-16 17:28:37 +00:00
perseant	9936b8ce7e	Tabify leading whitespace	2005-04-14 00:58:26 +00:00
perseant	f08a1ca4fa	Consolidate the hash table we use to maintain the integrity of lfs_avail into a single, system-wide table, rather than having a separate hash table per inode. Significantly reduces the "system" cpu usage of your average file write.	2005-04-14 00:44:16 +00:00
perseant	2ee78c4fa9	Keep track of the highest block held by an LFS inode, so that we can be assured that the last byte of a file is always allocated. Previously a file extension could cause the filesystem to be flushed, writing an inconsistent inode to disk. Although this condition would be corrected the next time blocks were written to disk, an intervening crash would leave the filesystem in an inconsistent state, leaving fsck_lfs to complain of an inode "partially truncated".	2005-04-14 00:02:46 +00:00
perseant	af48a6d91c	Clean up the handling of the pager_map deadlock in lfs_putpages, after realizing that it is safe to sleep the second time through the loop.	2005-04-08 00:08:42 +00:00
perseant	c9d4fa4c0d	Fix some locking issues that appeared with the simple_lock work. Address a "pager_map" deadlock in lfs_putpages().	2005-04-06 04:30:46 +00:00
perseant	1ebfc508b6	Protect various per-fs structures with fs->lfs_interlock simple_lock, to improve behavior in the multiprocessor case. Add debugging segment-lock assertion statements.	2005-04-01 21:59:46 +00:00
thorpej	e633e8b61b	- Define a VFS_ATTACH() macro that places a reference to a vfsops structure into the "vfsops" link set. - Use VFS_ATTACH() where vfsops are declared for individual file systems. - In vfsinit(), traverse the "vfsops" link set, rather than vfs_list_initial[].	2005-03-29 02:41:05 +00:00
christos	f2b82c7f8a	make this compile again :-(	2005-03-26 19:40:31 +00:00
christos	aca59c847f	Use vlog(9). Open-coding vlog here breaks lkm's because including <sys/kprintf.h> includes opt_multiprocessor.h. One could argue that the lock stuff should just move to subr_prf.c since nothing else uses it.	2005-03-26 19:39:08 +00:00
perseant	bb7bbb2d16	Don't sleep while holding the vnode interlock. Should take care of the first panic case in PR #26043.	2005-03-25 01:45:05 +00:00
chs	f31a80ccd3	avoid the need for recursive locking in lfs_flush_dirops() by unlocking the vnode around the call to this in the caller.	2005-03-24 04:00:33 +00:00
perseant	c716c3d307	Make LFS dirops get their vnode first, before incrementing the dirop count, to prevent a deadlock trying to call VOP_PUTPAGES() on a VDIROP vnode. This can happen when a stacked filesystem is mounted on top of an LFS: an LFS dirop needs to get a vnode, which is available from the upper layer. The corresponding lower layer vnode, however, is VDIROP, so the upper layer can't be cleaned out since its VOP_PUTPAGES() is passed through to the lower layer, which waits for dirops to drain before it can proceed. Deadlock. Tweak ufs_makeinode() and ufs_mkdir() to pass the a_vpp argument through to VOP_VALLOC(). Partially addresses PR # 26043, though it probably does not completely fix the problem described there.	2005-03-23 00:12:51 +00:00
perseant	8e578e185f	Be more careful about handling of flags to lfs_flush, to ensure that the lfs_writing mutex is respected.	2005-03-09 22:12:15 +00:00
simonb	52c470b886	Tab Police.	2005-03-08 04:49:35 +00:00
perseant	eefd94b8e2	Straighten out the maze of ifdefs. Instead, consolidate all the debugging stuff under '#ifdef DEBUG', and use sysctl knobs to turn on/off particular parts of the debugging reporting (if DEBUG is enabled). Re-enable the LFS statistics in sysctl, while I'm there. A bit of a rototill.	2005-03-08 00:18:19 +00:00
perseant	8de99480fa	Move "ifile is too large for your NBUFS/BUFPAGES" messages into a function. Use log(9) to warn the user instead of printf(9). Since the theory is that the Ifile is "always in cache", but the greater performance risk is when the inode entries can't be held in cache, note these two cases separately, at different log levels (notice and warning, respectively).	2005-03-04 22:19:05 +00:00
perseant	871beffabf	Put the ISSPACE() check where it belongs. This allows rewriting a file on a full filesystem while still returning ENOSPC on an attempt to allocate new blocks.	2005-03-02 21:16:09 +00:00
perry	bcfcddbac1	nuke trailing whitespace	2005-02-26 22:31:44 +00:00
perseant	25f49c3c91	Various minor LFS improvements: * Note when lfs_putpages(9) thinks it is not going to be writing any pages before calling genfs_putpages(9). This prevents a situation in which blocks can be queued for writing without a segment header. * Correct computation of NRESERVE(), though it is still a gross overestimate in most cases. Note that if NRESERVE() is too high, it may be impossible to create files on the filesystem. We catch this case on filesystem mount and refuse to mount r/w. * Allow filesystems to be mounted whose block size is == MAXBSIZE. * Somewhere along the line, ufs_bmaparray(9) started mangling UNWRITTEN entries in indirect blocks again, triggering a failed assertion "daddr <= LFS_MAX_DADDR". Explicitly convert to and from int32_t to correct this. * Add a high-water mark for the number of dirty pages any given LFS can hold before triggering a flush. This is settable by sysctl, but off (zero) by default. * Be more careful about the MAX_BYTES and MAX_BUFS computations so we shouldn't see "please increase to at least zero" messages. * Note that VBLK and VCHR vnodes can have nonzero values in di_db[0] even though their v_size == 0. Don't panic when we see this. * Change lfs_bfree to a signed quantity. The manner in which it is processed before being passed to the cleaner means that sometimes it may drop below zero, and the cleaner must be aware of this. * Never report bfree < 0 (or higher than lfs_dsize) through lfs_statvfs(9). This prevents df(1) from ever telling us that our full filesystems have 16TB free. * Account space allocated through lfs_balloc(9) that does not have associated buffer headers, so that the pagedaemon doesn't run us out of segments. * Return ENOSPC from lfs_balloc(9) when bfree drops to zero. * Address a deadlock in lfs_bmapv/lfs_markv when the filesystem is being unmounted. Because vfs_busy() is a shared lock, and lfs_bmapv/lfs_markv mark the filesystem vfs_busy(), the cleaner can be holding the lock that umount() is blocking on, then try to vfs_busy() again in getnewvnode().	2005-02-26 05:40:42 +00:00
wrstuden	e384a44e9d	Extend fsync_range(2) to support the FDISKSYNC flag, which requests that the sync be propagated out through the disk drive caches.	2005-01-25 23:55:20 +00:00
mycroft	7f1fe4e81f	Rearrange some code slightly to avoid uninitialized variable warnings.	2005-01-11 00:19:36 +00:00
mycroft	e72fc6717e	Whoops -- move the location of the VOP_OPEN()/VOP_CLOSE(), et al, from foo_mountfs() to foo_mount(), to match the new mountroot API. Also, for ext2fs and lfs, copy some restructuring from ffs to allow changing file system parameters without specifying the device name. (ntfs could use some more work.)	2005-01-09 09:27:17 +00:00
mycroft	0461b30ac3	Rework the mountroot interface so that vfs_mountroot() opens the root device and just passes it on to the file system functions. This avoids opening and closing the device several times. Mentioned on tech-kern some time ago, IIRC. I've been running this for a long time.	2005-01-09 03:11:48 +00:00
thorpej	1c95472d01	Add the system call and VFS infrastructure for file system extended attributes. From FreeBSD.	2005-01-02 16:08:28 +00:00
yamt	22399b45d0	change some members of struct buf from long to int. ride on 2.0H.	2004-09-18 16:40:11 +00:00
mycroft	2070a0c580	Make sure to set IMNT_DTYPE here...	2004-08-16 12:49:55 +00:00
mycroft	bb17450999	Don't write out the extra zero pages with PGO_SYNCIO. We start an asynchronous write anyway, and they will not be freed until that write is finished.	2004-08-15 19:01:16 +00:00
mycroft	4303882b7e	Copy the current partial-truncate logic from FFS. In the process, fix a potential overrun when truncating a fragment.	2004-08-15 17:37:07 +00:00
mycroft	f3fbefe76a	Minor simplification to some arithmetic.	2004-08-15 16:17:37 +00:00
mycroft	14f6fc2dfb	Need to set um_dirblksiz here...	2004-08-15 16:07:08 +00:00
mycroft	45a21b76f0	Fixing age old cruft: * Rather than using mnt_maxsymlinklen to indicate that a file system returns d_type fields(!), add a new internal flag, IMNT_DTYPE. Add 3 new elements to ufsmount: * um_maxsymlinklen, replaces mnt_maxsymlinklen (which never should have existed in the first place). * um_dirblksiz, which tracks the current directory block size, eliminating the FS-specific checks littered throughout the code. This may be used later to make the block size variable. * um_maxfilesize, which is the maximum file size, possibly adjusted lower due to implementation issues. Sync some bug fixes from FFS into ext2fs, particularly: * ffs_lookup.c 1.21, 1.28, 1.33, 1.48 * ffs_inode.c 1.43, 1.44, 1.45, 1.66, 1.67 * ffs_vnops.c 1.84, 1.85, 1.86 Clean up some crappy pointer frobnication.	2004-08-15 07:19:54 +00:00
mycroft	c09a793e93	Push atime/mtime updates even further -- into the reclaim path, so they happen rarely in the normal case. (Note: This happens at reboot/shutdown time because all file systems are unmounted.) Also, for IN_MODIFY, use IN_ACCESSED, not IN_MODIFIED; otherwise "ls -l" of your device node or FIFO would cause the time stamps to get written too quickly.	2004-08-14 14:32:04 +00:00
mycroft	bc25b30608	Add a new flag, IN_MODIFY. This is like IN_UPDATE|IN_CHANGE, but unlike setting those flags, it does not cause the inode to be written in the periodic sync. This is used for writes to special files (devices and named pipes) and FIFOs. Do not preemptively sync updates to access times and modification times. They are now updated in the inode only opportunistically, or when the file or device is closed. (Really, it should be delayed beyond close, but this is enough to help substantially with device nodes.) And the most amusing part: Trickle sync was broken on both FFS and ext2fs, in different ways. In FFS, the periodic call to VFS_SYNC(MNT_LAZY) was still causing all file data to be synced. In ext2fs, it was causing the metadata to not be synced. We now only call VOP_UPDATE() on the node if we're doing MNT_LAZY. I've confirmed that we do in fact trickle correctly now.	2004-08-14 01:08:02 +00:00
pk	a7c40722d8	Call inittodr() from main(). Let file system code set the recorded `last update' time (if any) through the new function setrootfstime().	2004-07-05 07:28:45 +00:00
yamt	2209153ea4	lfs_gop_write: assert that ifile never come here.	2004-05-30 20:45:44 +00:00
hannken	8c21bc6224	Add ffs internal snapshots. Written by Marshall Kirk McKusick for FreeBSD. - Not enabled by default. Needs kernel option FFS_SNAPSHOT. - Change parameters of ffs_blkfree. - Let the copy-on-write functions return an error so spec_strategy may fail if the copy-on-write fails. - Change genfs_lock() to use vp->v_vnlock instead of &vp->v_lock. - Add flag B_METAONLY to VOP_BALLOC to return indirect block buffer. - Add a function ffs_checkfreefile needed for snapshot creation. - Add special handling of snapshot files: Snapshots may not be opened for writing and the attributes are read-only. Use the mtime as the time this snapshot was taken. Deny mtime updates for snapshot files. - Add function transferlockers to transfer any waiting processes from one lock to another. - Add vfsop VFS_SNAPSHOT to take a snapshot and make it accessible through a vnode. - Add snapshot support to ls, fsck_ffs and dump. Welcome to 2.0F. Approved by: Jason R. Thorpe <thorpej@netbsd.org>	2004-05-25 14:54:55 +00:00
atatat	53c625655c	Sysctl descriptions under vfs subtree	2004-05-25 04:44:43 +00:00
atatat	10a7ba9ef6	Tweak sysctl setup functions (the macros, actually) for use in lkms, and tweak lkminit_*.c (where applicable) to call them, and to call sysctl_teardown() when being unloaded. This consists of (1) making setup functions not be static when being compiled as lkms (change to sys/sysctl.h), (2) making prototypes visible for the various setup functions in header files (changes to various header files), and (3) making simple "load" and "unload" functions in the actual lkminit stuff. linux_sysctl.c also needs its root exposed (ie, made not static) for this (when built as an lkm).	2004-05-20 06:34:24 +00:00
atatat	1d3a6a329e	Explicitly call pool_init() (and pool_destroy()) when being built as an _LKM. This adds pools to the list of things that lkms must do manually because they're set up with link sets. Not that there's anything wrong with link sets, but that we need to try harder to remember that lkms are second class citizens. Of a sort.	2004-05-20 05:39:34 +00:00
yamt	58912348a7	lfs_cluster_aiodone: turn an invariant condition into an assertion.	2004-05-19 11:29:32 +00:00
simonb	b5d0e6bf06	Initialise (most) pools from a link set instead of explicit calls to pool_init. Untouched pools are ones that are either in arch-specific code, or aren't initialised during initial system startup. Convert struct session, ucred and lockf to pools.	2004-04-25 16:42:40 +00:00
yamt	54b5826d2c	lfs_statvfs: report f_frsize correctly.	2004-04-22 10:45:56 +00:00
yamt	2b17bf3d63	check_dirty: fix another PHOLD leak. ("goto top" path)	2004-04-22 10:45:00 +00:00
christos	6bd1d6d4db	Replace the statfs() family of system calls with statvfs(). Retain binary compatibility.	2004-04-21 01:05:31 +00:00
yamt	aa514117d5	check_dirty: plug a PHOLD leak. from Greg Oster.	2004-04-20 11:52:17 +00:00
oster	87d110abfa	If we bail out due to an error, we need to 'unreserve' the space that we'd reserved earlier. Approved by: yamt	2004-03-30 14:50:46 +00:00
atatat	284a91c3ab	Manually attach malloc types when being built as an lkm.	2004-03-27 04:43:43 +00:00
atatat	19af35fd0d	Tango on sysctl_createv() and flags. The flags have all been renamed, and sysctl_createv() now uses more arguments.	2004-03-24 15:34:46 +00:00
yamt	15c9d33810	calculate data checksum inline.	2004-03-09 07:43:49 +00:00
yamt	81ce5e8cc3	use correct segment size. this fixes memory corruption when using lfsv1.	2004-03-09 06:43:18 +00:00
oster	19eeec0a9c	Add a missing: pool_destroy(&lfs_dinode_pool); to lfs_done(). Approved-by: yamt	2004-02-26 22:56:55 +00:00
yamt	f9571060ef	lfs_putpages: fix a simple_lock mismatch.	2004-02-26 22:41:36 +00:00
wiz	73e1501b98	parameter with two es. From Peter Postma.	2004-02-24 15:22:01 +00:00
yamt	a57f9a6ca5	lfs_update_single: add an assertion.	2004-01-29 12:10:07 +00:00
he	11544aaa71	Let the cast to (long long) for using the result as a printf argument apply to the whole expression, not just the first factor.	2004-01-28 20:57:15 +00:00
yamt	3e9d8d6772	use bufmem instead of bufpages to make lfs a little less broken.	2004-01-28 10:54:23 +00:00
yamt	09ec20ca66	eliminate tricky usages of VOP_STRATEGY which are (no longer?) necessary.	2004-01-28 10:53:12 +00:00
hannken	d6170777cf	Fix xxx_strategy() to use the vnode arg instead of bp->b_vp.	2004-01-26 10:39:29 +00:00
hannken	3db4e2acd8	Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern. VOP_STRATEGY(bp) is replaced by one of two new functions: - VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp. - DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp. DEV_STRATEGY(bp) is used only for block-to-block device situations.	2004-01-25 18:06:48 +00:00
yamt	7266a95907	store a i/o priority hint in struct buf for buffer queue discipline.	2004-01-10 14:39:50 +00:00
pk	70f20a1217	Replace the traditional buffer memory management -- based on fixed per buffer virtual memory reservation and a private pool of memory pages -- by a scheme based on memory pools. This allows better utilization of memory because buffers can now be allocated with a granularity finer than the system's native page size (useful for filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation of virtual to physical memory mappings (due to the former fixed virtual address reservation) resulting in better utilization of MMU resources on some platforms. Finally, the scheme is more flexible by allowing run-time decisions on the amount of memory to be used for buffers. On the other hand, the effectiveness of the LRU queue for buffer recycling may be somewhat reduced compared to the traditional method since, due to the nature of the pool based memory allocation, the actual least recently used buffer may release its memory to a pool different from the one needed by a newly allocated buffer. However, this effect will kick in only if the system is under memory pressure.	2003-12-30 12:33:13 +00:00
simonb	740725d725	Fix usage of fifth argument to pool_init().	2003-12-21 07:53:58 +00:00
yamt	009640868e	set VBWAIT when waiting v_numoutput to be drained.	2003-12-17 10:38:39 +00:00
yamt	ce11c3ce4e	remove a redundant substitution.	2003-12-17 07:14:03 +00:00
yamt	6b95193071	- reduce code duplication. - use boolean_t where appropriate.	2003-12-16 13:47:48 +00:00
yamt	98e9a8c373	g/c lfs_no_inactive.	2003-12-16 11:45:07 +00:00
atatat	13f8d2ce5f	Dynamic sysctl. Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(), vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all nodes are registered with the tree, and nodes can be added (or removed) easily, and I/O to and from the tree is handled generically. Since the nodes are registered with the tree, the mapping from name to number (and back again) can now be discovered, instead of having to be hard coded. Adding new nodes to the tree is likewise much simpler -- the new infrastructure handles almost all the work for simple types, and just about anything else can be done with a small helper function. All existing nodes are where they were before (numerically speaking), so all existing consumers of sysctl information should notice no difference. PS - I'm sorry, but there's a distinct lack of documentation at the moment. I'm working on sysctl(3/8/9) right now, and I promise to watch out for buses.	2003-12-04 19:38:21 +00:00
yamt	3ea6756a92	use b_private rather than b_saveaddr. XXX LFS_USE_B_INVAL	2003-12-04 14:57:47 +00:00
yamt	5b12f94dde	use FINFOSIZE macro.	2003-11-25 15:14:57 +00:00
yamt	cc716087b0	- tweak lfs_update_single()'s prototype so that it can be used by roll-forward code. - reduce code duplication using the above in update_meta() this also fixes fragment accounting.	2003-11-07 17:55:29 +00:00
yamt	71602f6ec9	fix spec vnode aliasing.	2003-11-07 14:52:27 +00:00
yamt	b43ed49269	- tell filesize changes to vm when roll-forwarding data blocks. - handle fragment extension better during roll-forward. - related assertions.	2003-11-07 14:50:18 +00:00
yamt	cd2445d8d3	more assertion about file truncation to zero.	2003-11-07 14:48:28 +00:00
simonb	a2facef339	Remove some assigned-to but otherwise unused variables.	2003-10-30 01:43:08 +00:00
mycroft	be505a4f82	Adjust to remove bogus initializer.	2003-10-29 01:25:04 +00:00
christos	372f57e757	Fix uninitialized variable warnings.	2003-10-25 18:26:46 +00:00
fvdl	c6019338cd	Correct preempt() calls.	2003-10-21 00:39:03 +00:00
yamt	4e9f921204	be more strict about sa->vp. (make sure the last lfs_updatemeta in lfs_putpages takes effect.)	2003-10-18 15:52:42 +00:00
simonb	c25af55e8c	Remove assigned-to but otherwise unused variable.	2003-10-18 04:03:22 +00:00
yamt	818ef92da6	add comments and tweak code a little for readability. (no behaviour changes)	2003-10-17 14:20:12 +00:00
dbj	fe7c786886	add mnt_iflag field to struct mount for internal flags; mv MNT_GONE, MNT_UNMOUNT and MNT_WANTRDWR to this field; additionally add mnt_writeopcountupper and mnt_writeopcountlower fields in preparation for pending write suspension support work; bump kernel version to 1.6ZD	2003-10-14 14:02:56 +00:00
yamt	1fb76f9bad	add a prototype of check_segsum().	2003-10-14 13:51:51 +00:00
yamt	d457c892fa	when roll-forwarding, check segment serial numbers correctly.	2003-10-14 13:46:30 +00:00
yamt	73e762ca69	add a missing fsbtodb() to read a correct block for roll-forwarding.	2003-10-14 12:52:28 +00:00
yamt	1508246f38	remove a redundant definition of LFS_MAX_ACTIVE.	2003-10-14 12:51:31 +00:00
yamt	dd4d591157	- a comment. - bcopy -> memcpy - increase 'p' only when needed.	2003-10-08 15:07:25 +00:00
yamt	4ce4892712	assertions.	2003-10-03 15:35:54 +00:00
yamt	33feb8e686	reassignbuf() when lfs_writeseg() takes away B_DELWRI.	2003-10-03 15:35:03 +00:00
yamt	656ff745cf	when inactivating segments, compare segment numbers correctly.	2003-10-03 13:02:54 +00:00
yamt	0dc0c83b61	remove redundant prototypes.	2003-09-29 15:12:08 +00:00
yamt	61d5d4362b	fix a bug of lfs. genfs_getpages() can read in more blocks than it should due to faked filesize of lfs_gop_size(). it's a security problem and it causes a gcc3 "internal error". to fix this, - in genfs_getpages(), always calculate diskeof and memeof separately so that filesystems (in this case, lfs) can use different strategies for them. - introduce GOP_SIZE_MEM flag and use it to request in-core filesize. (it was an intention of GOP_SIZE_READ, but after the above change _READ is not a straightforward name) after this, no one uses GOP_SIZE_{READ,WRITE} anymore but leave them for now.	2003-09-24 10:22:53 +00:00
yamt	67a5559821	cleanup IN_ADIROP/VDIROP handling a little.	2003-09-23 05:26:49 +00:00
yamt	e2fbe9d54d	remove unnecessary externs of lfs_do_flush.	2003-09-23 05:26:12 +00:00
yamt	17f9466183	some comments	2003-09-20 17:51:55 +00:00
yamt	f80b24474d	g/c CHECK_COPYIN.	2003-09-10 11:09:11 +00:00
yamt	753a6151b9	comments on lfs_issequential_hole.	2003-09-07 21:00:36 +00:00
yamt	d20e923a9c	- raise spl to bio in lfs_countlocked() rather than having callers to do so. - buffer cache MP locks. - assert B_CALL buffers are not on the free queue.	2003-09-07 11:53:57 +00:00
yamt	4a78faea0f	- buffer cache MP locks. - avoid changing buffer state on the free queue.	2003-09-07 11:47:07 +00:00
yamt	3ed90e8152	use LFS_DEBUG_COUNTLOCKED macro.	2003-09-07 11:44:22 +00:00
yamt	01e41ddfe3	don't call LFS_DEBUG_COUNTLOCKED after bread(). lfs_countlocked doesn't count buffers that aren't on the freelist.	2003-09-04 12:28:53 +00:00
agc	aad01611e7	Move UCB-licensed code from 4-clause to 3-clause licence. Patches provided by Joel Baker in PR 22364, verified by myself.	2003-08-07 16:26:28 +00:00
yamt	bdbaf98d1e	using normal bufcache buffer for cluster buffer head.	2003-07-30 13:36:40 +00:00
yamt	1bc98d3c14	- check EROFS earlier in lfs_markv. - remove wrong error recovery code (fake buffers are never on bufqueue) and put a comment instead.	2003-07-30 12:38:53 +00:00
yamt	3bded40734	remove an unused definition of LFS_VREF_THRESHOLD.	2003-07-30 12:34:00 +00:00
yamt	bddddad951	KNF.	2003-07-23 13:53:51 +00:00
yamt	d7aa0312b2	add parenthesis missed in rev.1.127.	2003-07-23 13:46:57 +00:00
yamt	9d61ee54c4	whitespace	2003-07-23 13:44:55 +00:00
yamt	f4fc192872	add KASSERTs in lfs_issequential_hole.	2003-07-23 13:38:18 +00:00
yamt	2ba5ae7ea6	more MP locks.	2003-07-12 16:19:00 +00:00
yamt	12ad26b293	- wrap long lines. - remove a mysterious blank line.	2003-07-12 16:17:52 +00:00
yamt	3852db2096	- protect global resource counts with lfs_subsys_lock. - clean up scattered externs a little.	2003-07-12 16:17:06 +00:00
yamt	3f84a2d3a1	a comment.	2003-07-02 14:07:16 +00:00
yamt	eb4e09d59f	use queue.h macros.	2003-07-02 13:43:02 +00:00
yamt	82659031f4	use VFSTOUFS macro.	2003-07-02 13:41:38 +00:00
yamt	102c8a6a74	- add new functions, lfs_writer_enter/leave, and use them instead of duplicated code fragments. - add an assertion.	2003-07-02 13:40:51 +00:00
yamt	f3506a9599	drain dirops before acquiring seglock. otherwise it might deadlock. PR/20676 (Karl Knutsson)	2003-07-02 13:39:03 +00:00
fvdl	d5aece61d6	Back out the lwp/ktrace changes. They contained a lot of collateral damage, and need to be examined and discussed more.	2003-06-29 22:28:00 +00:00
thorpej	a06b275edc	Undo part of the ktrace/lwp changes. In particular: * Remove the "lwp *" argument that was added to vget(). Turns out that nothing actually used it! * Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(), and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted above, didn't use it). * Remove all of the "lwp *" arguments to internal functions that were added just to appease the above.	2003-06-29 18:43:21 +00:00
bouyer	75208caf18	Adapt for struct proc* -> struct lwp* changes.	2003-06-28 22:53:35 +00:00
darrenr	960df3c8d1	Pass lwp pointers throughout the kernel, as required, so that the lwpid can be inserted into ktrace records. The general change has been to replace "struct proc *" with "struct lwp *" in various function prototypes, pass the lwp through and use l_proc to get the process pointer when needed. Bump the kernel rev up to 1.6V	2003-06-28 14:20:43 +00:00
yamt	ab2238cfad	make is_sequential a callback in order to achieve better lfs write clustering. since lfs always rewrites blocks into the new segment, the current on-disk place of the block doesn't affect write clustering. ok'ed by Konrad Schroder.	2003-05-18 12:59:05 +00:00
nakayama	a6d8c9185d	Avoid comparison is always false warning in gcc 3.3 w/ 64-bit size_t.	2003-05-17 01:44:39 +00:00
ragge	97fa6ef77b	Add a missing ifdef DDB.	2003-05-07 18:49:29 +00:00
perseant	e18585deb2	Correct arguments to check_dirty, ensuring that all pages in a block are written if any of them are dirty. Pointed out by yamt.	2003-05-02 01:47:39 +00:00
perseant	100c8f971f	Restrict the run of cluster blocks to on-disk contiguous blocks (back out part of rev 1.115), to avoid writing over holes. This is the lesser of two evils, to be replaced soon.	2003-04-29 17:45:11 +00:00
yamt	982fbc7db4	add an assertion.	2003-04-29 07:44:04 +00:00
yamt	b4d5e11ffe	fix a comment.	2003-04-27 06:47:45 +00:00
yamt	c2b802ff24	fix b_interlock lock/unlock mismatches.	2003-04-27 06:46:38 +00:00
perseant	b691ba8a71	Don't change update time on block write; lets e.g. "tar xp" work properly.	2003-04-27 04:18:29 +00:00
perseant	ef3c60764c	Make LFS work better (though still not "well") as an NFS-exported filesystem (and other things that needed to be fixed before the tests would complete), to wit: * Include the fs ident in the filehandle; improve stale filehandle checks. * Change definition of blksize() to use the on-dinode size instead of the inode's i_size, so that fsck_lfs will work properly again. * Use b_interlock in lfs_vtruncbuf. * Postpone dirop reclamation until after the seglock has been released, so that lfs_truncate is not called with the segment lock held. * Don't loop in lfs_fsync(), just write everything and wait. * Be more careful about the interlock/uobjlock in lfs_putpages: when we lose this lock, we have to resynchronize dirtiness of pages in each block. * Be sure to always write indirect blocks and update metadata in lfs_putpages; fixes a bug that caused blocks to be accounted to the wrong segment.	2003-04-23 07:20:37 +00:00
christos	80ecd573c0	PR/1796: John Kohl: statfs misbehaves under chrooted environments. - Under chroot it displays only the visible filesystems with appropriate paths. - The statfs f_mntonname gets adjusted to contain the real path from root. - While there, fixed a bug in ext2fs, locking problems with vfs_getfsstat(), and factored out some of the vfsop statfs() code to copy_statfs_info(). This fixes the problem where some filesystems forgot to set fsid. - Made coda look more like a normal fs.	2003-04-16 21:44:18 +00:00
simonb	761de7345c	'#if 0' out a variable that is currently only used in other '#if 0'd out code.	2003-04-10 04:15:38 +00:00
thorpej	24a4b8faa6	Use PAGE_SIZE rather than NBPG.	2003-04-09 00:28:28 +00:00
fvdl	42614ed3f3	Add support for UFS2. UFS2 is an enhanced FFS, adding support for 64 bit block pointers, extended attribute storage, and a few other things. This commit does not yet include the code to manipulate the extended storage (for e.g. ACLs), this will be done later. Originally written by Kirk McKusick and Network Associates Laboratories for FreeBSD.	2003-04-02 10:39:19 +00:00
yamt	0296b9ddb2	add assertions and a debug check.	2003-04-01 14:58:43 +00:00
yamt	418bd96252	lfs_strategy is used only for read.	2003-04-01 14:31:50 +00:00
fvdl	691b2fa7db	The checkpoint loop always used (multiples of) lfs_sepb as the number of segments to mark. However, this may be much more than lfs_nseg. Originally this wasn't a big problem, since only the structures in the diskblock were changed, but nowadays there's a mirror of the segflags in the in-core superblock. This problem caused the code to walk way past the end of that allocated area, causing memory corruption in other kernel structures. So, use lfs_nseg as the maximum, as it should be. While here, simplify the loop; it had become an obfuscated piece of code over time.	2003-03-28 22:39:42 +00:00
perseant	3f7016035a	Add a sleeper count, to prevent the cleaner from panicking the kernel when the filesystem is unmounted, relocking the Ifile when its lock is draining. (We can't use vfs_busy() since the process is sleeping for a good long time.) Clean up / organize lfs.h, while I'm here. In lfs_update_single, assert that disk addresses are either negative, or are still positive when converted to int32_t, to prevent recurrence of a negative/positive block problem.	2003-03-28 08:03:38 +00:00
perseant	17bba97d2e	Unlock ifile inode during streamlined VOP_INACTIVE.	2003-03-22 21:31:41 +00:00
dsl	bd99e3429d	Use 'void *' instead of 'caddr_t' in prototypes of VOP_IOCTL, VOP_FCNTL and VOP_ADVLOCK, delete casts from callers (and some to copyin/out).	2003-03-21 23:11:19 +00:00
perseant	78dff1cfa3	KNF (space after keywords).	2003-03-21 06:26:36 +00:00
perseant	a37f4cf7ec	Use VONWORKLST as a heuristic for vnode emptiness, rather than exhaustively checking the memq. Take greater care not to dirty the Ifile vnode when unmounting the filesystem. This should fix a "(vp->v_flag & VONWORKLST) == 0" assertion panic in vgonel that could occur when unmounting. Do not allow the Ifile to be mapped for writing.	2003-03-21 06:16:53 +00:00
yamt	e8d83a0b17	make this compilable with DIAGNOSTIC and without DEBUG. fix PR 20827 from FUKAUMI Naoki.	2003-03-21 06:09:08 +00:00
yamt	a8e8f3ea02	lfs_writevnodes: in the case of "starting over", kick lfs_writeseg in order to avoid deadlock in check_dirty.	2003-03-20 14:17:21 +00:00
yamt	91ce94db76	fix "more than one fragment" panics; direct and indirect block pointers are not valid in the case of shortlinks. while i'm here, move duplicated code in lfs_vget/fastvget into a new function, lfs_vinit.	2003-03-20 14:11:46 +00:00
perseant	12a78a5a7e	Don't break out of Ifile-writing loop in lfs_segwrite until nothing is left. Note however that blocks can be added to the Ifile even when the segment block is held because of inodes' atime. Do not panic with "dirty blocks" if these blocks are present.	2003-03-20 06:51:17 +00:00
perseant	c364d884f6	Hold the segment lock during truncation to prevent indirect blocks from being written by lfs_updatemeta while lfs_truncate is also writing them, a bug pointed out by YAMAMOTO Takashi <yamt@netbsd.org>.	2003-03-20 06:47:38 +00:00
perseant	20f569dad0	Remember to destroy lfs_inoext_pool when closing up the LFS subsystem.	2003-03-18 07:53:56 +00:00
perseant	ea03a1ac09	Add simple_lock protection for lfs_seglock and lfs_subsys_pages; these will be expanded to cover other per-fs and subsystem-wide data as well. Fix a case of IN_MODIFIED being set without updating lfs_uinodes, resulting in a "lfs_uinodes < 0" panic. Fix a deadlock in lfs_putpages arising from the need to busy all pages in a block; unbusy any that had already been busied before starting over.	2003-03-15 06:58:49 +00:00
kristerw	ea98786439	ISO C requires a statement after a label.	2003-03-15 02:27:18 +00:00
perseant	ec13062af8	- Get rid of unused #ifdefs LFS_NO_PAGEMOVE and LFS_MALLOC_SUMMARY (both always true) and accompanying dead code. - When constructing write clusters in lfs_writeseg, if the block we are about to add is itself a cluster from GOP_WRITE, don't put a cluster in a cluster, just write the GOP_WRITE cluster on its own. This seems to represent a slight performance gain on my test machine. - Charge someone's rusage for writes on LFSes. It's difficult to tell who the "right" process to charge is; just charge whoever triggered the write.	2003-03-11 02:47:39 +00:00
perseant	a46b9ccf95	Only #define LFS if not already defined.	2003-03-08 23:18:54 +00:00
perseant	0493be6ba8	Take away "#ifdef LFS_UBC".	2003-03-08 22:14:31 +00:00
perseant	8feb2c22f5	Take away "#ifdef LFS_UBC".	2003-03-08 21:46:04 +00:00
perseant	4b4f884b89	Add an lfs_strategy() that checks to make sure we're not trying to read where the cleaner is trying to write, instead of tying up the "live" buffers (or pages). Fix a bug in the LFS_UBC case where oversized buffers would not be checksummed correctly, causing uncleanable segments. Make sure that wakeup(fs->lfs_iocount) is done if fs->lfs_iocount is 1 as well as 0, since we wait in some places for it to drop to 1. Activate all pages that make it into lfs_gop_write without the segment lock held, since they must have been dirtied very recently, even if PG_DELWRI is not set.	2003-03-08 02:55:47 +00:00
perseant	d51fdbef63	Make sure we hold the uobjlock when checking for dirty pages, in lfs_vflush. Note that pages can become dirty without our knowing it, anyway; don't panic if that happens.	2003-03-04 19:19:43 +00:00
perseant	003cfbd545	Don't add dirty blocks to the ifile in lfs_segunlock, if we're trying to unmount the filesystem. This avoids a "dirty blocks" panic.	2003-03-04 19:15:26 +00:00
perseant	958a4c008c	Don't force all truncations to be synchronous	2003-03-04 19:10:35 +00:00
perseant	9192f047ac	Account SEGUSE_ACTIVE correctly so that the automatic segment cleaning actually happens. Add a new fcntl call that will write the minimum necessary to checkpoint (i.e., for on-disk directory structure to be consistent, not including updates to file data) so that the cleaner can clean segments more quickly without sacrificing three-way commit for cleaning.	2003-03-02 04:34:30 +00:00
yamt	32a79f1dd0	use pid_t for pid.	2003-03-01 11:20:21 +00:00
perseant	cfc73a5fa9	Be careful to always zero pages on truncation/fragment extension, in the case where the filesystem block size is larger than PAGE_SIZE.	2003-03-01 05:07:51 +00:00
perseant	daeb6c37d1	Make lfs_truncate handle file extension correctly, in the LFS_UBC case.	2003-02-28 07:37:56 +00:00
perseant	c418e0c4d6	Fix a clrbuf() on an uninitialized pointer.	2003-02-28 07:36:32 +00:00
perseant	a94f9407dc	Quell a hasty panic in lfs_truncate: on-inode disk addresses can be different between the beginning and end of the call.	2003-02-28 04:37:07 +00:00
perseant	0b114d4e21	Do roundup and offset arithmetic in 64 bits, to allow >=2G files.	2003-02-27 07:10:27 +00:00
perseant	6f5626d112	Make fs-specific fcntl macros take three arguments (approved wrstuden). Let LFS use fcntl for cleaner functions.	2003-02-25 23:12:06 +00:00
thorpej	eb14e86676	Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and use it. This fixes a few places where either b_dep or b_interlock were not properly initialized.	2003-02-25 20:35:31 +00:00
yamt	2bad134129	fix simplelocks	2003-02-25 13:47:44 +00:00
perseant	95137c8477	Add lfs_ioctl vnode op, with ioctls to take over cleaner system call functionality (not including segment clean, since that is now done automatically as checkpoints happen).	2003-02-24 08:42:49 +00:00
simonb	2a4457bd46	Remove assigned-to but not used variable.	2003-02-23 03:32:55 +00:00
perseant	3ab94fed93	Fix a buffer overflow bug in the LFS_UBC case that manifested itself either as a mysterious UVM error or as "panic: dirty bufs". Verify maximum size in lfs_malloc. Teach lfs_updatemeta and lfs_shellsort about oversized cluster blocks from lfs_gop_write. When unwiring pages in lfs_gop_write, deactivate them, under the theory that the pagedaemon wanted to free them last we knew.	2003-02-23 00:22:33 +00:00
yamt	1dd4645db4	fix simple_lock/unlock mismatches.	2003-02-22 01:52:25 +00:00
perseant	fdf4bfe002	Tabify, and fix some comment alignment problems.	2003-02-20 04:27:23 +00:00
yamt	5f444770aa	add debug code to lfs_free.	2003-02-19 12:58:53 +00:00
yamt	65fda8e404	workaround for "another flush is..." infinite loop in writerd. if we're writerd, sleep in lfs_flush until another writer goes away instead of busy-looping in writerd.	2003-02-19 12:49:10 +00:00
yamt	d9a4f81d1c	wire the pages instead of just dequeue'ing them. advised by Chuck Silvers.	2003-02-19 12:22:51 +00:00
yamt	18e00c1196	init b_interlock.	2003-02-19 12:18:59 +00:00
yamt	2be86f2ff8	acquire v_interlock before calling VOP_PUTPAGES.	2003-02-19 12:02:38 +00:00
yamt	0ad89cf93e	init b_interlock.	2003-02-19 12:01:17 +00:00
soren	3291a4522e	Make libsa compile again.	2003-02-18 14:58:31 +00:00
perseant	e61877243d	Make it compile again, grr....	2003-02-18 02:00:08 +00:00
perseant	b397c875ae	Add code to UBCify LFS. This is still behind "#ifdef LFS_UBC" for now (there are still some details to work out) but expect that to go away soon. To support these basic changes (creation of lfs_putpages, lfs_gop_write, mods to lfs_balloc) several other changes were made, to wit: * Create a writer daemon kernel thread whose purpose is to handle page writes for the pagedaemon, but which also takes over some of the functions of lfs_check(). This thread is started the first time an LFS is mounted. * Add a "flags" parameter to GOP_SIZE. Current values are GOP_SIZE_READ, meaning that the call should return the size of the in-core version of the file, and GOP_SIZE_WRITE, meaning that it should return the on-disk size. One of GOP_SIZE_READ or GOP_SIZE_WRITE must be specified. * Instead of using malloc(...M_WAITOK) for everything, reserve enough resources to get by and use malloc(...M_NOWAIT), using the reserves if necessary. Use the pool subsystem for structures small enough that this is feasible. This also obsoletes LFS_THROTTLE. And a few that are not strictly necessary: * Moves the LFS inode extensions off onto a separately allocated structure; getting closer to LFS as an LKM. "Welcome to 1.6O." * Unified GOP_ALLOC between FFS and LFS. * Update LFS copyright headers to correct values. * Actually cast to unsigned in lfs_shellsort, like the comment says. * Keep track of which segments were empty before the previous checkpoint; any segments that pass two checkpoints both dirty and empty can be summarily cleaned. Do this. Right now lfs_segclean still works, but this should be turned into an effectless compatibility syscall.	2003-02-17 23:48:08 +00:00
pk	338f31f581	Make the buffer cache code MP-safe.	2003-02-05 21:38:38 +00:00
perseant	14c17e57b4	Don't call a dirop within a dirop: if lfs_rename is actually deleting a link, call lfs_remove directly before starting dirop rather than having ufs_rename do it.	2003-02-03 00:32:35 +00:00
tron	f1eeaa9020	Only use MALLOC_DECLARE() in kernel namespace.	2003-02-01 18:34:14 +00:00
thorpej	b193480908	Add extensible malloc types, adapted from FreeBSD. This turns malloc types into a structure, a pointer to which is passed around, instead of an int constant. Allow the limit to be adjusted when the malloc type is defined, or with a function call, as suggested by Jonathan Stone.	2003-02-01 06:23:35 +00:00
yamt	84d61a1dc4	there's no need to treat VOP_WHITEOUT as dirop because it modifies only one inode.	2003-01-30 14:18:32 +00:00
yamt	53d6eb47ee	don't use daddr_t for segment summary since it's an on-disk structure.	2003-01-29 13:14:33 +00:00
simonb	0adecbd12b	Remove variable that is only assigned to but not referenced.	2003-01-29 03:06:40 +00:00
yamt	e41d3a6f1c	make these compilable with lfs debug options. (follow daddr_t change) XXX maybe segment number should be 64bit.	2003-01-27 23:17:56 +00:00
kleink	865868a8b1	Further printf format fixes in the wake of daddr_t. Note that PRI?64 and long long int arguments aren't made for each other, nor are %lld and int64_t arguments.	2003-01-27 21:45:52 +00:00
kleink	4e0e5333ae	Fix further printf format warnings for DEBUG, in the wake of daddr_t having changed.	2003-01-25 23:00:09 +00:00
tron	5067836b9e	Use PRId64 instead of hard coding "%lld" to fix build problems under LP64 ports.	2003-01-25 18:12:31 +00:00

642 Commits