Commit Graph

164 Commits

Author SHA1 Message Date
christos
273df63602 - sprinkle const
- avoid shadow variables.
2005-05-29 21:25:24 +00:00
perseant
2f695b5476 Provide a resize_lfs(8), including kernel and cleaner support. The current
implementation requires the fs to be mounted while resizing.  Tested in both
directions, and everything appears to work happily, but ymmv.
2005-04-23 19:47:51 +00:00
perseant
f4a7694fc9 Keep per-inode, per-fs, and subsystem-wide counts of blocks allocated through
lfs_balloc(), and use that to estimate the number of dirty pages belonging
to LFS (subsystem or filesystem).  This is almost certainly wrong for
the case of a large mmap()ed region, but the accounting is tighter than
what we had before, and performs much better in the typical case of pages
dirtied through write().
2005-04-19 20:59:05 +00:00
perseant
f63fa194c2 Check the to-be-on-disk consistency of directories as well (correct a typo
in an earlier commit).
2005-04-18 23:03:08 +00:00
perseant
2ee78c4fa9 Keep track of the highest block held by an LFS inode, so that we can
be assured that the last byte of a file is always allocated.  Previously
a file extension could cause the filesystem to be flushed, writing an
inconsistent inode to disk.  Although this condition would be corrected
the next time blocks were written to disk, an intervening crash would leave
the filesystem in an inconsistent state, leaving fsck_lfs to complain
of an inode "partially truncated".
2005-04-14 00:02:46 +00:00
perseant
1ebfc508b6 Protect various per-fs structures with fs->lfs_interlock simple_lock, to
improve behavior in the multiprocessor case.  Add debugging segment-lock
assertion statements.
2005-04-01 21:59:46 +00:00
perseant
eefd94b8e2 Straighten out the maze of ifdefs. Instead, consolidate all the debugging
stuff under '#ifdef DEBUG', and use sysctl knobs to turn on/off particular
parts of the debugging reporting (if DEBUG is enabled).  Re-enable the LFS
statistics in sysctl, while I'm there.  A bit of a rototill.
2005-03-08 00:18:19 +00:00
perry
bcfcddbac1 nuke trailing whitespace 2005-02-26 22:31:44 +00:00
perseant
25f49c3c91 Various minor LFS improvements:
* Note when lfs_putpages(9) thinks it is not going to be writing any
  pages before calling genfs_putpages(9).  This prevents a situation in
  which blocks can be queued for writing without a segment header.
* Correct computation of NRESERVE(), though it is still a gross
  overestimate in most cases.  Note that if NRESERVE() is too high, it
  may be impossible to create files on the filesystem.  We catch this
  case on filesystem mount and refuse to mount r/w.
* Allow filesystems to be mounted whose block size is == MAXBSIZE.
* Somewhere along the line, ufs_bmaparray(9) started mangling UNWRITTEN
  entries in indirect blocks again, triggering a failed assertion "daddr
  <= LFS_MAX_DADDR".  Explicitly convert to and from int32_t to correct
  this.
* Add a high-water mark for the number of dirty pages any given LFS can
  hold before triggering a flush.  This is settable by sysctl, but off
  (zero) by default.
* Be more careful about the MAX_BYTES and MAX_BUFS computations so we
  shouldn't see "please increase to at least zero" messages.
* Note that VBLK and VCHR vnodes can have nonzero values in di_db[0]
  even though their v_size == 0.  Don't panic when we see this.
* Change lfs_bfree to a signed quantity.  The manner in which it is
  processed before being passed to the cleaner means that sometimes it
  may drop below zero, and the cleaner must be aware of this.
* Never report bfree < 0 (or higher than lfs_dsize) through
  lfs_statvfs(9).  This prevents df(1) from ever telling us that our full
  filesystems have 16TB free.
* Account space allocated through lfs_balloc(9) that does not have
  associated buffer headers, so that the pagedaemon doesn't run us out
  of segments.
* Return ENOSPC from lfs_balloc(9) when bfree drops to zero.
* Address a deadlock in lfs_bmapv/lfs_markv when the filesystem is being
  unmounted.  Because vfs_busy() is a shared lock, and
  lfs_bmapv/lfs_markv mark the filesystem vfs_busy(), the cleaner can be
  holding the lock that umount() is blocking on, then try to vfs_busy()
  again in getnewvnode().
2005-02-26 05:40:42 +00:00
yamt
22399b45d0 change some members of struct buf from long to int.
ride on 2.0H.
2004-09-18 16:40:11 +00:00
mycroft
bc25b30608 Add a new flag, IN_MODIFY. This is like IN_UPDATE|IN_CHANGE, but unlike
setting those flags, it does not cause the inode to be written in the periodic
sync.  This is used for writes to special files (devices and named pipes) and
FIFOs.

Do not preemptively sync updates to access times and modification times.  They
are now updated in the inode only opportunistically, or when the file or device
is closed.  (Really, it should be delayed beyond close, but this is enough to
help substantially with device nodes.)

And the most amusing part:
Trickle sync was broken on both FFS and ext2fs, in different ways.  In FFS, the
periodic call to VFS_SYNC(MNT_LAZY) was still causing all file data to be
synced.  In ext2fs, it was causing the metadata to *not* be synced.  We now
only call VOP_UPDATE() on the node if we're doing MNT_LAZY.  I've confirmed
that we do in fact trickle correctly now.
2004-08-14 01:08:02 +00:00
yamt
58912348a7 lfs_cluster_aiodone: turn an invariant condition into an assertion. 2004-05-19 11:29:32 +00:00
yamt
15c9d33810 calculate data checksum inline. 2004-03-09 07:43:49 +00:00
yamt
81ce5e8cc3 use correct segment size. this fixes memory corruption when using lfsv1. 2004-03-09 06:43:18 +00:00
yamt
a57f9a6ca5 lfs_update_single: add an assertion. 2004-01-29 12:10:07 +00:00
yamt
09ec20ca66 eliminate tricky usages of VOP_STRATEGY which are (no longer?) necessary. 2004-01-28 10:53:12 +00:00
hannken
3db4e2acd8 Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.
VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp)  Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp)      Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.
2004-01-25 18:06:48 +00:00
yamt
7266a95907 store a i/o priority hint in struct buf for buffer queue discipline. 2004-01-10 14:39:50 +00:00
yamt
009640868e set VBWAIT when waiting v_numoutput to be drained. 2003-12-17 10:38:39 +00:00
yamt
ce11c3ce4e remove a redundant substitution. 2003-12-17 07:14:03 +00:00
yamt
3ea6756a92 use b_private rather than b_saveaddr.
XXX LFS_USE_B_INVAL
2003-12-04 14:57:47 +00:00
yamt
cc716087b0 - tweak lfs_update_single()'s prototype so that it can be used by
roll-forward code.
- reduce code duplication using the above in update_meta()
  this also fixes fragment accounting.
2003-11-07 17:55:29 +00:00
christos
372f57e757 Fix uninitialized variable warnings. 2003-10-25 18:26:46 +00:00
yamt
4e9f921204 be more strict about sa->vp.
(make sure the last lfs_updatemata in lfs_putpages takes effect.)
2003-10-18 15:52:42 +00:00
simonb
c25af55e8c Remove assigned-to but otherwise unused variable. 2003-10-18 04:03:22 +00:00
yamt
818ef92da6 add comments and tweak code a little for readability.
(no behaviour changes)
2003-10-17 14:20:12 +00:00
yamt
1508246f38 remove a redundant definition of LFS_MAX_ACTIVE. 2003-10-14 12:51:31 +00:00
yamt
dd4d591157 - a comment.
- bcopy -> memcpy
- increase 'p' only when needed.
2003-10-08 15:07:25 +00:00
yamt
4ce4892712 assertions. 2003-10-03 15:35:54 +00:00
yamt
33feb8e686 reassignbuf() when lfs_writeseg() takes away B_DELWRI. 2003-10-03 15:35:03 +00:00
yamt
656ff745cf when inactivating segments, compare segment numbers correctly. 2003-10-03 13:02:54 +00:00
yamt
0dc0c83b61 remove redundant prototypes. 2003-09-29 15:12:08 +00:00
yamt
4a78faea0f - buffer cache MP locks.
- avoid changing buffer state on the free queue.
2003-09-07 11:47:07 +00:00
agc
aad01611e7 Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22364, verified by myself.
2003-08-07 16:26:28 +00:00
yamt
bdbaf98d1e using normal bufcache buffer for cluster buffer head. 2003-07-30 13:36:40 +00:00
yamt
bddddad951 KNF. 2003-07-23 13:53:51 +00:00
yamt
12ad26b293 - wrap long lines.
- remove a mysterious blank line.
2003-07-12 16:17:52 +00:00
yamt
3852db2096 - protect global resource counts with lfs_subsys_lock.
- clean up scattered externs a little.
2003-07-12 16:17:06 +00:00
yamt
eb4e09d59f use queue.h macros. 2003-07-02 13:43:02 +00:00
yamt
102c8a6a74 - add a new functions, lfs_writer_enter/leave, and use them instead of
duplicated code fragments.
- add an assertion.
2003-07-02 13:40:51 +00:00
fvdl
d5aece61d6 Back out the lwp/ktrace changes. They contained a lot of colateral damage,
and need to be examined and discussed more.
2003-06-29 22:28:00 +00:00
thorpej
a06b275edc Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget().  Turns out
  that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
  and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
  above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
  just to appease the above.
2003-06-29 18:43:21 +00:00
darrenr
960df3c8d1 Pass lwp pointers throughtout the kernel, as required, so that the lwpid can
be inserted into ktrace records.  The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
2003-06-28 14:20:43 +00:00
yamt
ab2238cfad make is_sequential a callback in order to achieve better lfs write clustering.
since lfs always rewrite blocks into the new segment,
current on-disk place of the block doesn't affect to write clustering.

ok'ed by Konrad Schroder.
2003-05-18 12:59:05 +00:00
perseant
ef3c60764c Make LFS work better (though still not "well") as an NFS-exported
filesystem (and other things that needed to be fixed before the tests
would complete), to wit:

* Include the fs ident in the filehandle; improve stale filehandle checks.

* Change definition of blksize() to use the on-dinode size instead of
  the inode's i_size, so that fsck_lfs will work properly again.

* Use b_interlock in lfs_vtruncbuf.

* Postpone dirop reclamation until after the seglock has been released,
  so that lfs_truncate is not called with the segment lock held.

* Don't loop in lfs_fsync(), just write everything and wait.

* Be more careful about the interlock/uobjlock in lfs_putpages: when we
  lose this lock, we have to resynchronize dirtiness of pages in each
  block.

* Be sure to always write indirect blocks and update metadata in
  lfs_putpages; fixes a bug that caused blocks to be accounted to the
  wrong segment.
2003-04-23 07:20:37 +00:00
fvdl
42614ed3f3 Add support for UFS2. UFS2 is an enhanced FFS, adding support for
64 bit block pointers, extended attribute storage, and a few
other things.

This commit does not yet include the code to manipulate the extended
storage (for e.g. ACLs), this will be done later.

Originally written by Kirk McKusick and Network Associates Laboratories for
FreeBSD.
2003-04-02 10:39:19 +00:00
yamt
0296b9ddb2 add assertions and a debug check. 2003-04-01 14:58:43 +00:00
fvdl
691b2fa7db The checkpoint loop always used (multiples of) lfs_sepb as the number
of segments to mark. However, this may be much more than lfs_nseg.

Originally this wasn't a big problem, since only the structures in the
diskblock were changed, but nowadays there's a mirror of the segflags
in the in-core superblock. This problem caused the code to walk
way past the end of that allocated area, causing memory corruption
in other kernel structures. So, use lfs_nseg as the maximum, as it should be.

While here, simplify the loop; it had become an obfuscated piece of
code overtime.
2003-03-28 22:39:42 +00:00
perseant
3f7016035a Add a sleeper count, to prevent the cleaner from panicing the kernel
when the filesystem is unmounted, relocking the Ifile when its lock is
draining.  (We can't use vfs_busy() since the process is sleeping for a
good long time.)  Clean up / organize lfs.h, while I'm here.

In lfs_update_single, assert that disk addresses are either negative, or
are still positive when converted to int32_t, to prevent recurrence of a
negative/positive block problem.
2003-03-28 08:03:38 +00:00
perseant
78dff1cfa3 KNF (space after keywords). 2003-03-21 06:26:36 +00:00