NetBSD

Author	SHA1	Message	Date
yamt	560c0c565c	don't use g_glock directly.	2006-10-14 09:17:26 +00:00
yamt	b7cedb8e34	handle_workitem_freefrag/handle_workitem_freeblocks: don't fake up inode/vnode pair.	2006-10-14 07:26:29 +00:00
hannken	3e0dbf3bc5	Add __unused to unused function arguments.	2006-10-13 10:21:21 +00:00
thorpej	25433eb1b5	ufs_quotactl(): consume the arguments even if QUOTAS is not defined.	2006-10-12 04:24:40 +00:00
christos	4d595fd7b1	- sprinkle __unused on function decls. - fix a couple of unused bugs - no more -Wno-unused for i386	2006-10-12 01:30:41 +00:00
chs	33c1fd1917	add support for O_DIRECT (I/O directly to application memory, bypassing any kernel caching for file data).	2006-10-05 14:48:32 +00:00
christos	b64edcaded	fix empty if	2006-10-04 15:53:24 +00:00
christos	45234b0cee	Coverity CID 3690: Add KASSERT to check for reverse INULL.	2006-10-03 19:04:25 +00:00
christos	df9ed85b34	redo previous: It is better to add a KASSERT, since this code is the same as in ufs.	2006-10-03 19:01:29 +00:00
christos	b6bf786e1c	Coverity CID 3690: Reverse INULL: Add KASSERT.	2006-10-03 18:59:22 +00:00
christos	e11b3365c9	Coverity CID 3689: dp cannot be NULL at this point, so don't check for it.	2006-10-03 18:54:08 +00:00
christos	2b372f902d	Coverity CID 3156: async = TRUE when LFS_READWRITE is defined, leading to dead code. Ifdef the dead code appropriately (from Arnaud Lacombe)	2006-10-03 18:24:48 +00:00
christos	f1a4e9cae0	Coverity CID 2949: comment out dead code (from Arnaud Lacombe)	2006-09-29 19:37:11 +00:00
perseant	2ac2813b6e	Use lockstatus instead of a homebrewed locking system to control LFCNWRAPSTOP and LFCNWRAPGO. Be less verbose about the various looping checks: use log() rather than printf(), and only log anything if we are really looping ("count = 2" is not an error condition). Allow dirops sleeping on available space to be interruptible.	2006-09-28 23:08:23 +00:00
jld	1b78265f0e	Change ffs_mount, in MNT_UPDATE case, to check dev_t's for equality instead of just vnode pointers. Fixes erroneous "does not match mounted device" errors from mount(8) in the presence of MFS /dev, init.root, &c. No objections on tech-kern.	2006-09-21 00:11:30 +00:00
perseant	8c43e08b21	Don't remark a locked inode with IN_MODIFIED after writing it to disk, if we ourselves hold the lock. This prevents e.g. mknod from hanging indefinitely. Also, always use the return value from VOP_ISLOCKED to determine whether we hold the lock or someone else does, rather than looking into the lock structure ourselves.	2006-09-15 18:50:49 +00:00
yamt	9d3e3eab23	merge yamt-pdpolicy branch. - separate page replacement policy from the rest of kernel - implement an alternative replacement policy	2006-09-15 15:51:12 +00:00
christos	7465b73617	add missing initializers	2006-09-02 07:04:01 +00:00
christos	d74781a938	- add missing initializers - comment out impossible code	2006-09-02 06:48:00 +00:00
christos	0dc26f6dcb	remove impossible test	2006-09-02 06:46:04 +00:00
perseant	437e855235	Changes to help the roll-forward agent, to wit: * Mark being-deleted files in the Ifile so we can finish deleting them at fs mount time. * Flag the Ifile with "cleaner must clean" when writers are waiting for the cleaner, rather than relying solely on the cleaner's estimation of whether it should clean or not. * Note partial segments written by a user agent (in particular, fsck_lfs) so that repeated rolls forward don't interfere with one another. * Add a new fcntl, LFCNPASS, that allows the log to wrap exactly once, for better testing of the validity of checkpoints. * Keep track of the on-disk nlink count when cleaning, so that we don't partially complete directory operations while cleaning. * Ensure that every single Ifile inode write represents a consistent view of the filesystem. In particular, the accounting for the segment we are writing the inode into must be correct, and the accounting for the segment that inode used to reside in must be correct. Rather than just rewriting the inode if we wrote it wrong, rewrite the necessary ifile blocks before writing the inode so we never write it wrong. * Don't unmark any VDIROP vnodes if we haven't written them to disk, avoiding yet another problem with the "wait for the cleaner" error return from lfs_putpages(). Also, move the last callback to an aiodone call, so we no longer do any memory management from interrupt context.	2006-09-01 19:41:28 +00:00
christos	676e77765a	fix missing initializers	2006-08-30 01:28:53 +00:00
christos	57b45699b2	fix incomplete initializer.	2006-08-30 01:26:47 +00:00
martin	12cf319c62	Fix size confusion with lfs_fhandle - and as it now turns out to be the same as the lfs compat_30_fhandle, g/c the latter. Add an alias for the LFCNIFILEFH fcntl, so that binaries compiled in the meantime (with too large lfs_fhandle) continue to work. This makes vfs_cleanerd work again after the kernel checks filehandle size more strictly (problem reported by Kurt Schreiner on current-users).	2006-08-06 12:34:12 +00:00
martin	b4cb63a646	Make filehandles opaque to userland	2006-07-31 16:34:42 +00:00
ad	f474dceb13	Use the LWP cached credentials where sane.	2006-07-23 22:06:03 +00:00
perseant	1e9b73d972	Oops, commit the correct version of lfs_rfw.c. The roll-forward functionality is known not to work in this version (as it did not previously) but it should at least compile.	2006-07-20 23:56:27 +00:00
perseant	83771be892	Separate the (non-working) LFS kernel roll-forward code into its own file, lfs_rfw.c.	2006-07-20 23:49:07 +00:00
perseant	20227e112e	Note partial segments that are written by the cleaner, to help out the roll-forward agent.	2006-07-20 23:16:50 +00:00
perseant	186ffd50ab	Loop on the check for lfs_nowrap, so we don't allow a process to squeeze by.	2006-07-20 23:15:39 +00:00
perseant	5fdcd70349	Move the kauth checks up front, so that all new LFS fcntl calls are subject to the check for superuser privilege.	2006-07-20 23:14:09 +00:00
perseant	8c161d1081	Don't try to write all the vnodes, when the cleaner needs a vnode to be recycled.	2006-07-20 23:12:26 +00:00
martin	74709a8860	Apply _KERNEL_OPT	2006-07-13 22:08:00 +00:00
martin	3fb505e6b2	Version the lfs_cleanerd internal fcntl() for filehandles too, so old cleaners should work with newer kernels.	2006-07-13 22:05:52 +00:00
martin	a3b5baed42	Fix alignment problems for fhandle_t, exposed by gcc4.1. While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ, version the getfh(2) syscall and explicitly pass the size available in the filehandle from userland. Discussed on tech-kern, with lots of help from yamt (thanks!).	2006-07-13 12:00:24 +00:00
perseant	a2aa7212a8	Protect lfs_order_freelist() with the segment lock.	2006-07-06 22:27:19 +00:00
perseant	b8ec630ade	Fix a typo that caused a "multiple free" panic on unmounting a resized lfs.	2006-07-06 22:14:18 +00:00
perseant	b99e4c8268	Don't wake up the cleaner if the filesystem is unwrappable, and fix the compatibility fcntls. Also includes one-line fixes for an MP locking bug and a zero-length FINFO problem that manifested during testing.	2006-06-29 19:28:21 +00:00
perseant	1c57171fe3	Change LFCNWRAP{STOP,GO} to make them more suitable for snapshotting; in particular, the caller can now choose whether to wait for the condition to be met, and if the caller of LFCNWRAPSTOP dies or otherwise closes the descriptor, the filesystem is started again. Updated the ckckp regression test to use the new semantics. dump_lfs(8) now uses the fcntls to implement LFS-style snapshotting through the -X flag, addressing PR#33457 albeit not using fss(4). Fixed a couple other problems with dump_lfs that manifested themselves during testing.	2006-06-24 05:28:54 +00:00
yamt	e408053d1b	fix a simonb-timecounters regression. the precision of getnanotime() is not suitable for file timestamps. esp. when it's nfs-exported. - introduce vfs_timestamp(). (the name is from freebsd. currently merely a wrapper of nanotime()) - for ufs-like filesystems, use it rather than getnanotime(). XXX check other filesystems.	2006-06-23 14:13:02 +00:00
hannken	442bf57d1c	softdep_sync_metadata: If vp is a block device it may have new I/O requests posted for it even if the vnode is locked. This will deadlock with wmesg "softgetdbuf" if it gets a BMSAFEMAP dependency as here we have "bp == nbp" and try to get a buffer we already own. Approved by: Frank van der Linden <fvdl@netbsd.org>	2006-06-12 16:37:00 +00:00
kardel	1276c3051e	PR 33697: complete timecounter conversion	2006-06-11 09:26:04 +00:00
kardel	de4337ab21	merge FreeBSD timecounters from branch simonb-timecounters - struct timeval time is gone time.tv_sec -> time_second - struct timeval mono_time is gone mono_time.tv_sec -> time_uptime - access to time via {get,}{micro,nano,bin}time() get* versions are fast but less precise - support NTP nanokernel implementation (NTP API 4) - further reading: Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html	2006-06-07 22:33:33 +00:00
perseant	402f3abc7a	Read the inode version number from a more reliable source, quelling a diagnostic assertion panic.	2006-05-24 21:08:00 +00:00
cube	d897e3cfdb	Include <sys/kauth.h> because it's needed.	2006-05-21 22:51:27 +00:00
perseant	0e0bb04d7a	Fix a bug in which FINFOs were written with a version number of zero. Add assertions and add this to the DEBUG fip test in lfs_writeseg.	2006-05-20 01:10:18 +00:00
perseant	6e53d31f5c	Break out the finfo array manipulation code into two new functions, lfs_acquire_finfo() and lfs_release_finfo(). Add a debugging check for zero-length finfo arrays in the segment summary to avoid future regressions.	2006-05-18 23:15:09 +00:00
perseant	758cf626b4	Don't duplicate the LFS_STARVED_FOR_SEGS check (an oversight that came in with rev 1.210).	2006-05-18 00:57:13 +00:00
perseant	48e300c97f	Don't be quite so eager to error out from lfs_putpages() when pages are busy; if we've sensed a possible 3-way deadlock and are not the pagedaemon, relock and try again.	2006-05-17 19:47:09 +00:00
christos	f1e7ec5164	we need <sys/kauth.h> for the kernel.	2006-05-15 03:01:50 +00:00
christos	2536b870ce	Don't include <sys/kauth.h>; breaks userland (newfs_lfs)	2006-05-15 00:45:57 +00:00
elad	fc9422c9d9	integrate kauth.	2006-05-14 21:31:52 +00:00
christos	12b7ab5f0b	Correct a bogus expression gcc4 found.	2006-05-14 05:27:59 +00:00
perseant	285f68c114	Fixes to address the "vinvalbuf: dirty blocks" panic that can occur when many inodes are cleaned at once. Make sure that we write all the pages on vnodes that are being flushed, even if we don't think there's room; drain v_numoutput before lfs_vflush() completes. Also, don't allow a vnode that is in the process of being cleaned to be chosen by getnewvnode(); this avoids a segment accounting panic in the case that a large number of inodes are fed to lfs_markv() all at once.	2006-05-12 23:36:11 +00:00
mrg	084c052803	quell GCC 4.1 uninitialised variable warnings. XXX: we should audit the tree for which old ones are no longer needed after getting the older compilers out of the tree..	2006-05-10 21:53:14 +00:00
perseant	935530188d	Change VOP_FCNTL to take an unlocked vnode. Approved by wrstuden@.	2006-05-04 16:48:16 +00:00
perseant	ce053245eb	Introduce another per-filesystem parameter, lfs_resvseg, to separate the notion of "how many segments are reserved for the cleaner" from that of "how many segments are not counted in lfs_bfree". The default value used for existing filesystems is the same as the previous implicit value of (lfs_minfreeseg / 2 + 1), modulo some sanity checking. Count pending dirops on a per-filesystem basis, since once we start writing them we can't stop until we're done. This seems to help stave off the "no clean segments" panic in the case of filling the filesystem with directories and small files (e.g. simultaneously unpacking more copies of pkgsrc than will fit).	2006-05-04 04:22:55 +00:00
perseant	e807d08027	Fix a "locking against myself": lfs_flush_dirops() doesn't need to lock the vnodes to write their blocks, since it holds the segment lock.	2006-05-02 00:52:26 +00:00
perseant	8696fd25e2	Don't ever partially write dirops, even if we need the cleaner to run. This increases the chances of the "no clean segments" panic slightly, but allows us to run the ckckp regression test successfully to completion.	2006-05-01 19:47:29 +00:00
perseant	8fc4e510a9	Add an explicit list initialization that was missing from my last commit.	2006-04-30 21:59:58 +00:00
perseant	481da54fc1	Postpone the segment accounting changes coming from truncation until the inode that makes those changes valid is either written to disk by lfs_writeinode() or discarded by lfs_vfree(). A couple of locking fixes are also included.	2006-04-30 21:19:42 +00:00
yamt	1d3a67174f	remove unused FFS_NAMES and LFS_NAMES.	2006-04-23 14:15:12 +00:00
perseant	7119533fb9	Fix a fencepost error in the bitmap handling in extend_ifile(), and another in lfs_freelist_prev().	2006-04-22 00:12:45 +00:00
perseant	7cd0266a27	Regression test improvements: Move the stop for LFCNWRAPSTOP to the point at which writing at segment 0 is really about to commence, since this is what the test expects (and incidentally what a snapshotting utility wants as well). More correctly reconstruct the on-disk state at every checkpoint, rather than relying on the entire state at the point of wrapping to be accurate (that is only true the first time we wrap). Add a "make abort" target to make rerunning the test more convenient when it has failed and we're done analyzing the failure.	2006-04-22 00:10:54 +00:00
perseant	5f627fe958	Avoid a possible sign overflow condition in lfs_truncate, which would result in a buffer overflow (underflow). Coverity CID 1521.	2006-04-19 00:22:15 +00:00
perseant	80a505b9f7	Don't roll forward if we aren't given a process context. Coverity CID 1076.	2006-04-18 23:40:47 +00:00
perseant	e52cd940c0	Get rid of the LFS_FORCE_WRITE case. We never really used it, and it could panic the kernel if cleaner daemon passed the right combination of arguments. Coverity CID 2741.	2006-04-18 22:42:33 +00:00
perseant	f58c67b02f	Yet another MP locking issue.	2006-04-18 21:41:20 +00:00
christos	53ae068fc6	Coverity CID 746: Remove dead code. lbn >= NDADDR is mutually exclusive to snapshot_locked == 0.	2006-04-18 21:39:03 +00:00
perseant	0268059112	Introduce two fcntl calls that freeze the filesystem right at the point where segment 0 is being considered for writing. This allows for automated checkpoint validity scanning, and could be used (in conjunction with the existing LFCNREWIND) for e.g. snapshot dumps as well. Include a regression test that does such scanning. When writing the Ifile, loop through the dirty block list three times to make sure that the checkpoint is always consistent (the first and second times the Ifile blocks can cross a segment boundary; not so the third time unless the segments are very small). Discovered by using the aforementioned regression test.	2006-04-17 20:02:34 +00:00
christos	0bc8039fc6	Coverity CID 1166: Add KASSERT before deref.	2006-04-15 05:32:29 +00:00
christos	3d772305a8	Coverity CID 1169: Add KASSERT before deref.	2006-04-15 05:31:18 +00:00
christos	e14b3e8165	Coverity CID 2858: Avoid NULL deref.	2006-04-15 05:29:10 +00:00
christos	17ed031f90	Coverity CID 2499: Fix uninitialized variable use.	2006-04-15 05:19:08 +00:00
christos	6555ff0ad3	From my posting of April 3 to tech-kern: My understanding is that the CLRSIG() is supposed to clear the signal that was sent to the syncer process to prevent it from being delivered to the syncer process in case unmounting fails, so that the syncer process does not die while the filesystem is still mounted. The typical scenario is, the syncer process is tsleep()ing in the kernel, and waking up when it needs to do work. If someone sends a signal to it, eg. kill -TERM the mfs process, then the kernel will try to unmount the mfs filesystem before delivering the signal to the process. If that unmount fails, then we should not really kill the process because that will hang the mount. So we call CLRSIG() to stop the signal from being delivered. So the first call to issignal() will return the signal number that was sent to the syncer process (unless someone malicious was able to send a lower numbered signal between the time tsleep() returned and we called issignal()... something that is not really easy to do). But you are right, we should not be calling it many times as a side effect of this macro. Rewrite CLRSIG() to clear all the signals and call issignal() the correct number of times.	2006-04-15 01:16:40 +00:00
perseant	81ded5df65	Make lfs_vref/lfs_vunref not need to know about VXLOCK and VFREEING explicitly (especially since we didn't know about VFREEING at all before), but notice the EBUSY return from vget() instead. Fix some more MP locking protocol issues, most of which were pointed out by Christian Ehrhardt this morning on tech-kern.	2006-04-13 23:46:28 +00:00
perseant	575f22cf94	Another MP locking fix.	2006-04-11 22:08:00 +00:00
perseant	74b70f471b	Remove mostly useless BUFPAGES warning message from lfs_{un,}mount.	2006-04-10 23:51:50 +00:00
bouyer	eb7f9aba74	Revert previous; I mixed bpp and *bpp when reading ffs_balloc_ufs1(). ffs_balloc() will always allocate a new buffer or leave it as NULL, so coverity is wrong here, we're not using a freed argument.	2006-04-10 22:01:06 +00:00
bouyer	a4181a9049	If we brelse ibp, set ibp to NULL, to avoid reusing it later in balloc() or in our code at the next iteration. Coverity ID 2706	2006-04-10 21:50:18 +00:00
perseant	07ebfab840	Optimize the free list search a little more; in particular use words instead of bytes for the index, and never search below fs->lfs_freehd. Fix a bug in the previous version of the search (an erroneous assumption that ino_t was signed). Free the bitmap when we unmount the filesystem.	2006-04-10 21:20:19 +00:00
perseant	017f856cba	Don't leak vnode references if we fail to lock a vnode in lfs_flush_pchain(). Also fix another (probably only academic) simple_lock protocol error.	2006-04-10 21:17:21 +00:00
perseant	fbf75b2bf7	Correct a locking bug in the recent pager optimization.	2006-04-10 18:42:48 +00:00
yamt	539544d937	ffs_gop_size: revert a problematic part of 1.78. problems reported by Kouichirou Hiratsuka and Jukka Salmi on current-users@.	2006-04-09 21:59:35 +00:00
perseant	39ce23c169	Implement a somewhat finer-grained mechanism for paging LFS-backed pages. The writer daemon, if it does not need to flush the whole filesystem, now only writes the vnodes for which the pagedaemon has requested pageouts (although it does not pay attention to the page ranges the pagedaemon supplies).	2006-04-08 00:26:34 +00:00
perseant	ff84dd347a	Keep the free list ordered. This solves a problem first pointed out to me by Michel Oey, in which an aged LFS writes up to an extra Ifile block for every file created; and paves the way for the truncation of the Ifile when many files are deleted.	2006-04-08 00:16:56 +00:00
perseant	7c22dcc8a6	Several minor bug fixes: * Correct (weak) segment lock assertions in lfs_fragextend and lfs_putpages. * Keep IN_MODIFIED set if we run out of avail in lfs_putpages. * Don't try to (re)write buffers on a VBLK vnode; fixes a panic I found while running with an LFS root. * Raise priority of LFCNSEGWAIT to PVFS; PUSER is way too low for something the pagedaemon is relying on.	2006-04-07 23:59:28 +00:00
perseant	d28248e84e	Make the segment lock aware of LWPs. Fixes a (somewhat confusing) "lockmgr: pid 3997, not exclusive lockholder 3997, unlocking" panic I encountered while running blogbench on an LFS.	2006-04-07 23:44:14 +00:00
uwe	7494d34448	Tell config to generate fs_ffs.h as vfs_bio.c checks for defined(FFS). Include that header in vfs_bio.c so that bioops are not redefined.	2006-04-05 00:52:16 +00:00
pavel	929734802b	Correct typo in a panic message.	2006-04-04 17:12:57 +00:00
perseant	51afd83ada	Make sure we unlock to zero when avoiding 3-way deadlock; otherwise we simply have a different form of deadlock.	2006-04-01 00:13:01 +00:00
perseant	418bf18f53	Handle the "filesystem is clean" flag correctly when upgrading from read-only to read-write mount. This makes "root on lfs" work for me, although it looks like a different traceback from PR#32667.	2006-03-31 02:31:37 +00:00
yamt	c5fcdd1719	some cleanups after the introduction of GOP_SIZE_MEM flag. - remove GOP_SIZE_READ/GOP_SIZE_WRITE flags. they have not been used since the change. - ufs_balloc_range: remove code which has been no-op since the change. thanks Konrad Schroder for explaining the original intention of the code. - ffs_gop_size: don't extend past eof, in the case of GOP_SIZE_MEM. otherwise genfs_getpages ends up allocating pages past eof unnecessarily.	2006-03-30 12:40:06 +00:00
perseant	0a4e8d80c1	Double-checkpoint on unmount. This ensures that vnodes belonging to removed files are really freed, preventing occasional spurious EBUSY returns from vflush().	2006-03-28 23:57:41 +00:00
perseant	afc725a1c7	Don't let the pagedaemon wait for pages, since that is just asking for a deadlock.	2006-03-28 01:29:55 +00:00
perseant	dddf5c5171	Improvements to LFS's paging mechanism, to wit: * Acknowledge that sometimes there are more dirty pages to be written to disk than clean segments. When we reach the danger line, lfs_gop_write() now returns EAGAIN. The caller of VOP_PUTPAGES(), if it holds the segment lock, drops it and waits for the cleaner to make room before continuing. * Note and avoid a three-way deadlock in lfs_putpages (a writer holding a page busy blocks on the cleaner while the cleaner blocks on the segment lock while lfs_putpages blocks on the page).	2006-03-24 20:05:32 +00:00
hannken	cd28767efa	ffs_balloc*(): Add an assertion for "bpp != NULL" if B_METAONLY is set. From Coverity CIDs 1170..1173	2006-03-23 11:16:47 +00:00
matt	0486735479	More MALLOC -> malloc changes.	2006-03-19 17:50:42 +00:00
rtr	aa6b2db95f	init struct vnode *vp = NULL coverity 2724 / run 6 XXX in future runs coverity may complain about deref NULL now but comment on line 382 indicates this should not be possible	2006-03-19 04:10:02 +00:00
rtr	7818c9e2d0	don't bother checking if ts == NULL before assigning since we know that it is. solves coverity 2725 / run 6	2006-03-19 03:58:34 +00:00

1369 Commits