Commit Graph

165 Commits

Author SHA1 Message Date
perseant
017f856cba Don't leak vnode references if we fail to lock a vnode in lfs_flush_pchain().
Also fix another (probably only academic) simple_lock protocol error.
2006-04-10 21:17:21 +00:00
perseant
39ce23c169 Implement a somewhat finer-grained mechanism for paging LFS-backed pages.
The writer daemon, if it does not need to flush the whole filesystem,
now only writes the vnodes for which the pagedaemon has requested pageouts
(although it does not pay attention to the page ranges the pagedaemon
supplies).
2006-04-08 00:26:34 +00:00
perseant
7c22dcc8a6 Several minor bug fixes:
* Correct (weak) segment lock assertions in lfs_fragextend and lfs_putpages.
* Keep IN_MODIFIED set if we run out of avail in lfs_putpages.
* Don't try to (re)write buffers on a VBLK vnode; fixes a panic I found
  while running with an LFS root.
* Raise priority of LFCNSEGWAIT to PVFS; PUSER is way too low for
  something the pagedaemon is relying on.
2006-04-07 23:59:28 +00:00
perseant
51afd83ada Make sure we unlock to zero when avoiding 3-way deadlock; otherwise we
simply have a different form of deadlock.
2006-04-01 00:13:01 +00:00
perseant
418bf18f53 Handle the "filesystem is clean" flag correctly when upgrading from
read-only to read-write mount.  This makes "root on lfs" work for me,
although it looks like a different traceback from PR#32667.
2006-03-31 02:31:37 +00:00
yamt
c5fcdd1719 some cleanups after the introduction of GOP_SIZE_MEM flag.
- remove GOP_SIZE_READ/GOP_SIZE_WRITE flags.
  they have not been used since the change.
- ufs_balloc_range: remove code which has been no-op since the change.
  thanks Konrad Schroder for explaining the original intention of the code.
- ffs_gop_size: don't extend past eof, in the case of GOP_SIZE_MEM.
  otherwise genfs_getpages end up to allocate pages past eof unnecessarily.
2006-03-30 12:40:06 +00:00
perseant
afc725a1c7 Don't let the pagedaemon wait for pages, since that is just asking for
a deadlock.
2006-03-28 01:29:55 +00:00
perseant
dddf5c5171 Improvements to LFS's paging mechanism, to wit:
* Acknowledge that sometimes there are more dirty pages to be written to
  disk than clean segments.  When we reach the danger line,
  lfs_gop_write() now returns EAGAIN.  The caller of VOP_PUTPAGES(), if
  it holds the segment lock, drops it and waits for the cleaner to make
  room before continuing.

* Note and avoid a three-way deadlock in lfs_putpages (a writer holding
  a page busy blocks on the cleaner while the cleaner blocks on the
  segment lock while lfs_putpages blocks on the page).
2006-03-24 20:05:32 +00:00
christos
95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
yamt
a748ea88dd merge yamt-vop branch. remove following VOPs.
VOP_BLKATOFF
	VOP_VALLOC
	VOP_BALLOC
	VOP_REALLOCBLKS
	VOP_VFREE
	VOP_TRUNCATE
	VOP_UPDATE
2005-11-02 12:38:58 +00:00
christos
3544d898ac split out lfs_itimes(). It is used in fsck_lfs. 2005-09-13 04:13:25 +00:00
christos
a12024da06 Use nanotime() to update the time fields in filesystems. Convert the code
from macros to real functions. Original patch and review from chuq.
Note: ext2fs only keeps seconds in the on-disk inode, and msdosfs does not
have enough precision for all fields, so this is not very useful for those
two.
2005-09-12 16:24:41 +00:00
christos
50f8955b6e 64 bit inode changes. 2005-08-19 02:04:03 +00:00
christos
273df63602 - sprinkle const
- avoid shadow variables.
2005-05-29 21:25:24 +00:00
perseant
f8677583c3 VOP_LOCK drops the interlock; pick it up again to avoid an "already unlocked"
panic in lfs_putpages.
2005-05-20 19:09:25 +00:00
perseant
5ed293c5d5 Recognize that we hold the v_interlock when relocking after a flush in
lfs_putpages.
2005-04-27 20:35:10 +00:00
skrll
d1c90589d8 Use the right arg structure for lfs_setattr, i.e. s/getattr/setattr/. 2005-04-25 06:28:51 +00:00
perseant
2f695b5476 Provide a resize_lfs(8), including kernel and cleaner support. The current
implementation requires the fs to be mounted while resizing.  Tested in both
directions, and everything appears to work happily, but ymmv.
2005-04-23 19:47:51 +00:00
perseant
f4a7694fc9 Keep per-inode, per-fs, and subsystem-wide counts of blocks allocated through
lfs_balloc(), and use that to estimate the number of dirty pages belonging
to LFS (subsystem or filesystem).  This is almost certainly wrong for
the case of a large mmap()ed region, but the accounting is tighter than
what we had before, and performs much better in the typical case of pages
dirtied through write().
2005-04-19 20:59:05 +00:00
perseant
b2d19f57a3 Check for the inode having been previously freed, in UNMARK_VNODE().
Avoids a panic when calling mkdir() on a full filesystem.
2005-04-18 17:36:46 +00:00
perseant
5ed792ecb0 Use splay trees, rather than a hash table, to manage the accounting of
blocks allocated through VOP_BALLOC() for pages to be written to disk.
This accounting no longer takes a noticeable fraction of the system CPU.
2005-04-16 17:35:58 +00:00
perseant
94decdd25d Use lfs_malloc() to manage the blkiov arrays that the cleaner functions use,
since the cleaner is likely to operate in a low-memory condition.
2005-04-16 17:28:37 +00:00
perseant
9936b8ce7e Tabify leading whitespace 2005-04-14 00:58:26 +00:00
perseant
f08a1ca4fa Consolidate the hash table we use to maintain the integrity of lfs_avail
into a single, system-wide table, rather than having a separate hash table
per inode.  Significantly reduces the "system" cpu usage of your average
file write.
2005-04-14 00:44:16 +00:00
perseant
1ebfc508b6 Protect various per-fs structures with fs->lfs_interlock simple_lock, to
improve behavior in the multiprocessor case.  Add debugging segment-lock
assertion statements.
2005-04-01 21:59:46 +00:00
perseant
bb7bbb2d16 Don't sleep while holding the vnode interlock. Should take care of the
first panic case in PR #26043.
2005-03-25 01:45:05 +00:00
chs
f31a80ccd3 avoid the need for recursive locking lfs_flush_dirops() by unlocking
the vnode around the call to this in the caller.
2005-03-24 04:00:33 +00:00
perseant
c716c3d307 Make LFS dirops get their vnode first, before incrementing the dirop count,
to prevent a deadlock trying to call VOP_PUTPAGES() on a VDIROP vnode.
This can happen when a stacked filesystem is mounted on top of an LFS: an
LFS dirop needs to get a vnode, which is available from the upper layer.
The corresponding lower layer vnode, however, is VDIROP, so the upper layer
can't be cleaned out since its VOP_PUTPAGES() is passed through to the lower
layer, which waits for dirops to drain before it can proceed.  Deadlock.

Tweak ufs_makeinode() and ufs_mkdir() to pass the a_vpp argument through
to VOP_VALLOC().

Partially addresses PR # 26043, though it probably does not completely fix
the problem described there.
2005-03-23 00:12:51 +00:00
simonb
52c470b886 Tab Police. 2005-03-08 04:49:35 +00:00
perseant
eefd94b8e2 Straighten out the maze of ifdefs. Instead, consolidate all the debugging
stuff under '#ifdef DEBUG', and use sysctl knobs to turn on/off particular
parts of the debugging reporting (if DEBUG is enabled).  Re-enable the LFS
statistics in sysctl, while I'm there.  A bit of a rototill.
2005-03-08 00:18:19 +00:00
perry
bcfcddbac1 nuke trailing whitespace 2005-02-26 22:31:44 +00:00
perseant
25f49c3c91 Various minor LFS improvements:
* Note when lfs_putpages(9) thinks it is not going to be writing any
  pages before calling genfs_putpages(9).  This prevents a situation in
  which blocks can be queued for writing without a segment header.
* Correct computation of NRESERVE(), though it is still a gross
  overestimate in most cases.  Note that if NRESERVE() is too high, it
  may be impossible to create files on the filesystem.  We catch this
  case on filesystem mount and refuse to mount r/w.
* Allow filesystems to be mounted whose block size is == MAXBSIZE.
* Somewhere along the line, ufs_bmaparray(9) started mangling UNWRITTEN
  entries in indirect blocks again, triggering a failed assertion "daddr
  <= LFS_MAX_DADDR".  Explicitly convert to and from int32_t to correct
  this.
* Add a high-water mark for the number of dirty pages any given LFS can
  hold before triggering a flush.  This is settable by sysctl, but off
  (zero) by default.
* Be more careful about the MAX_BYTES and MAX_BUFS computations so we
  shouldn't see "please increase to at least zero" messages.
* Note that VBLK and VCHR vnodes can have nonzero values in di_db[0]
  even though their v_size == 0.  Don't panic when we see this.
* Change lfs_bfree to a signed quantity.  The manner in which it is
  processed before being passed to the cleaner means that sometimes it
  may drop below zero, and the cleaner must be aware of this.
* Never report bfree < 0 (or higher than lfs_dsize) through
  lfs_statvfs(9).  This prevents df(1) from ever telling us that our full
  filesystems have 16TB free.
* Account space allocated through lfs_balloc(9) that does not have
  associated buffer headers, so that the pagedaemon doesn't run us out
  of segments.
* Return ENOSPC from lfs_balloc(9) when bfree drops to zero.
* Address a deadlock in lfs_bmapv/lfs_markv when the filesystem is being
  unmounted.  Because vfs_busy() is a shared lock, and
  lfs_bmapv/lfs_markv mark the filesystem vfs_busy(), the cleaner can be
  holding the lock that umount() is blocking on, then try to vfs_busy()
  again in getnewvnode().
2005-02-26 05:40:42 +00:00
wrstuden
e384a44e9d Extend fsync_range(2) to support the FDISKSYNC flag, which requests
that the sync be propogated out through the disk drive caches.
2005-01-25 23:55:20 +00:00
yamt
2b17bf3d63 check_dirty: fix another PHOLD leak. ("goto top" path) 2004-04-22 10:45:00 +00:00
christos
6bd1d6d4db Replace the statfs() family of system calls with statvfs().
Retain binary compatibility.
2004-04-21 01:05:31 +00:00
yamt
aa514117d5 check_dirty: plug a PHOLD leak. from Greg Oster. 2004-04-20 11:52:17 +00:00
yamt
f9571060ef lfs_putpages: fix a simple_lock mismatch. 2004-02-26 22:41:36 +00:00
hannken
d6170777cf Fix xxx_strategy() to use the vnode arg instead of bp->b_vp. 2004-01-26 10:39:29 +00:00
hannken
3db4e2acd8 Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.
VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp)  Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp)      Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.
2004-01-25 18:06:48 +00:00
yamt
6b95193071 - reduce code duplication.
- use boolean_t where appropriate.
2003-12-16 13:47:48 +00:00
yamt
98e9a8c373 g/c lfs_no_inactive. 2003-12-16 11:45:07 +00:00
yamt
5b12f94dde use FINFOSIZE macro. 2003-11-25 15:14:57 +00:00
simonb
a2facef339 Remove some assigned-to but otherwise unused variables. 2003-10-30 01:43:08 +00:00
christos
372f57e757 Fix uninitialized variable warnings. 2003-10-25 18:26:46 +00:00
fvdl
c6019338cd Correct preempt() calls. 2003-10-21 00:39:03 +00:00
yamt
4e9f921204 be more strict about sa->vp.
(make sure the last lfs_updatemata in lfs_putpages takes effect.)
2003-10-18 15:52:42 +00:00
dbj
fe7c786886 add mnt_iflag field to struct mount for internal flags
mv MNT_GONE, MNT_UNMOUNT and MNT_WANTRDWR to this field
additonally add mnt_writeopcountupper and mnt_writeopcountlower fields
in preparation for pending write suspension support work
bump kernel version to 1.6ZD
2003-10-14 14:02:56 +00:00
yamt
61d5d4362b fix a bug of lfs.
genfs_getpages() can read in more blocks than it should due to faked filesize
of lfs_gop_size().  it's a security problem and it makes gcc3 "internal error"

to fix this,
- in genfs_getpages(), always calculate diskeof and memeof separately
  so that filesystems (in this case, lfs) can use different strategies
  for them.
- introduce GOP_SIZE_MEM flag and use it to request in-core filesize.
  (it was an intention of GOP_SIZE_READ,
  but after the above change _READ is not a straightforward name)

after this, no one uses GOP_SIZE_{READ,WRITE} anymore but leave them for now.
2003-09-24 10:22:53 +00:00
yamt
67a5559821 cleanup IN_ADIROP/VDIROP handling a little. 2003-09-23 05:26:49 +00:00
yamt
e2fbe9d54d remove unnecessary externs of lfs_do_flush. 2003-09-23 05:26:12 +00:00