Commit Graph

416 Commits

Author SHA1 Message Date
darrenr 960df3c8d1 Pass lwp pointers throughtout the kernel, as required, so that the lwpid can
be inserted into ktrace records.  The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
2003-06-28 14:20:43 +00:00
yamt ab2238cfad make is_sequential a callback in order to achieve better lfs write clustering.
since lfs always rewrite blocks into the new segment,
current on-disk place of the block doesn't affect to write clustering.

ok'ed by Konrad Schroder.
2003-05-18 12:59:05 +00:00
nakayama a6d8c9185d Avoid comparison is always false warning in gcc 3.3 w/ 64-bit size_t. 2003-05-17 01:44:39 +00:00
ragge 97fa6ef77b Add a missing ifdef DDB. 2003-05-07 18:49:29 +00:00
perseant e18585deb2 Correct arguments to check_dirty, ensuring that all pages in a block are
written if any of them are dirty.  Pointed out by yamt.
2003-05-02 01:47:39 +00:00
perseant 100c8f971f Restrict the run of cluster blocks to on-disk contiguous blocks (back out
part of rev 1.115), to avoid writing over holes.  This is the lesser of two
evils, to be replaced soon.
2003-04-29 17:45:11 +00:00
yamt 982fbc7db4 add an assertion. 2003-04-29 07:44:04 +00:00
yamt b4d5e11ffe fix a comment. 2003-04-27 06:47:45 +00:00
yamt c2b802ff24 fix b_interlock lock/unlock mismatches. 2003-04-27 06:46:38 +00:00
perseant b691ba8a71 Don't change update time on block write; lets e.g. "tar xp" work properly. 2003-04-27 04:18:29 +00:00
perseant ef3c60764c Make LFS work better (though still not "well") as an NFS-exported
filesystem (and other things that needed to be fixed before the tests
would complete), to wit:

* Include the fs ident in the filehandle; improve stale filehandle checks.

* Change definition of blksize() to use the on-dinode size instead of
  the inode's i_size, so that fsck_lfs will work properly again.

* Use b_interlock in lfs_vtruncbuf.

* Postpone dirop reclamation until after the seglock has been released,
  so that lfs_truncate is not called with the segment lock held.

* Don't loop in lfs_fsync(), just write everything and wait.

* Be more careful about the interlock/uobjlock in lfs_putpages: when we
  lose this lock, we have to resynchronize dirtiness of pages in each
  block.

* Be sure to always write indirect blocks and update metadata in
  lfs_putpages; fixes a bug that caused blocks to be accounted to the
  wrong segment.
2003-04-23 07:20:37 +00:00
christos 80ecd573c0 PR/1796: John Kohl: statfs misbehaves under chrooted environments.
- Under chroot it displays only the visible filesystems with appropriate paths.
- The statfs f_mntonname gets adjusted to contain the real path from root.
- While was there, fixed a bug in ext2fs, locking problems with vfs_getfsstat(),
  and factored out some of the vfsop statfs() code to copy_statfs_info(). This
  fixes the problem where some filesystems forgot to set fsid.
- Made coda look more like a normal fs.
2003-04-16 21:44:18 +00:00
simonb 761de7345c '#if 0' out a variable that is currently only used in other '#if 0'd out
code.
2003-04-10 04:15:38 +00:00
thorpej 24a4b8faa6 Use PAGE_SIZE rather than NBPG. 2003-04-09 00:28:28 +00:00
fvdl 42614ed3f3 Add support for UFS2. UFS2 is an enhanced FFS, adding support for
64 bit block pointers, extended attribute storage, and a few
other things.

This commit does not yet include the code to manipulate the extended
storage (for e.g. ACLs), this will be done later.

Originally written by Kirk McKusick and Network Associates Laboratories for
FreeBSD.
2003-04-02 10:39:19 +00:00
yamt 0296b9ddb2 add assertions and a debug check. 2003-04-01 14:58:43 +00:00
yamt 418bd96252 lfs_strategy is used only for read. 2003-04-01 14:31:50 +00:00
fvdl 691b2fa7db The checkpoint loop always used (multiples of) lfs_sepb as the number
of segments to mark. However, this may be much more than lfs_nseg.

Originally this wasn't a big problem, since only the structures in the
diskblock were changed, but nowadays there's a mirror of the segflags
in the in-core superblock. This problem caused the code to walk
way past the end of that allocated area, causing memory corruption
in other kernel structures. So, use lfs_nseg as the maximum, as it should be.

While here, simplify the loop; it had become an obfuscated piece of
code overtime.
2003-03-28 22:39:42 +00:00
perseant 3f7016035a Add a sleeper count, to prevent the cleaner from panicing the kernel
when the filesystem is unmounted, relocking the Ifile when its lock is
draining.  (We can't use vfs_busy() since the process is sleeping for a
good long time.)  Clean up / organize lfs.h, while I'm here.

In lfs_update_single, assert that disk addresses are either negative, or
are still positive when converted to int32_t, to prevent recurrence of a
negative/positive block problem.
2003-03-28 08:03:38 +00:00
perseant 17bba97d2e Unlock ifile inode during streamlined VOP_INACTIVE. 2003-03-22 21:31:41 +00:00
dsl bd99e3429d Use 'void *' instead of 'caddr_t' in prototypes of VOP_IOCTL, VOP_FCNTL
and VOP_ADVLOCK, delete casts from callers (and some to copyin/out).
2003-03-21 23:11:19 +00:00
perseant 78dff1cfa3 KNF (space after keywords). 2003-03-21 06:26:36 +00:00
perseant a37f4cf7ec Use VONWORKLST as a heuristic for vnode emptiness, rather than exhaustively
checking the memq.

Take greater care not to dirty the Ifile vnode when unmounting the filesystem.
This should fix a "(vp->v_flag & VONWORKLST) == 0" assertion panic in vgonel
that could occur when unmounting.

Do not allow the Ifile to be mapped for writing.
2003-03-21 06:16:53 +00:00
yamt e8d83a0b17 make this compilable with DIAGNOSTIC and without DEBUG.
fix PR 20827 from FUKAUMI Naoki.
2003-03-21 06:09:08 +00:00
yamt a8e8f3ea02 lfs_writevnodes:
in the case of "starting over", kick lfs_writeseg
in order to avoid deadlock in check_dirty.
2003-03-20 14:17:21 +00:00
yamt 91ce94db76 fix "more than one fragment" panics;
direct and indirect block pointers are not valid in the case of shortlinks.
while i'm here, move duplicated code in lfs_vget/fastvget into a new
function, lfs_vinit.
2003-03-20 14:11:46 +00:00
perseant 12a78a5a7e Don't break out of Ifile-writing loop in lfs_segwrite until nothing is left.
Note however that blocks can be added to the Ifile even when the segment
block is held because of inodes' atime.  Do not panic with "dirty blocks"
if these blocks are present.
2003-03-20 06:51:17 +00:00
perseant c364d884f6 Hold the segment lock during truncation to prevent indirect blocks from
being written by lfs_updatemeta while lfs_truncate is also writing them,
a bug pointed out by YAMAMOTO Takashi <yamt@netbsd.org>.
2003-03-20 06:47:38 +00:00
perseant 20f569dad0 Remember to destroy lfs_inoext_pool when closing up the LFS subsystem. 2003-03-18 07:53:56 +00:00
perseant ea03a1ac09 Add simple_lock protection for lfs_seglock and lfs_subsys_pages; these will
be expanded to cover other per-fs and subsystem-wide data as well.

Fix a case of IN_MODIFIED being set without updating lfs_uinodes, resulting
in a "lfs_uinodes < 0" panic.

Fix a deadlock in lfs_putpages arising from the need to busy all pages in a
block; unbusy any that had already been busied before starting over.
2003-03-15 06:58:49 +00:00
kristerw ea98786439 SO C requires a statement after a label. 2003-03-15 02:27:18 +00:00
perseant ec13062af8 - Get rid of unused #ifdefs LFS_NO_PAGEMOVE and LFS_MALLOC_SUMMARY (both
always true) and accompanying dead code.

- When constructing write clusters in lfs_writeseg, if the block we are
  about to add is itself a cluster from GOP_WRITE, don't put a cluster
  in a cluster, just write the GOP_WRITE cluster on its own.  This seems
  to represent a slight performance gain on my test machine.

- Charge someone's rusage for writes on LFSes.  It's difficult to tell
  who the "right" process to charge is; just charge whoever triggered
  the write.
2003-03-11 02:47:39 +00:00
perseant a46b9ccf95 Only #define LFS if not already defined. 2003-03-08 23:18:54 +00:00
perseant 0493be6ba8 Take away "#ifdef LFS_UBC". 2003-03-08 22:14:31 +00:00
perseant 8feb2c22f5 Take away "#ifdef LFS_UBC". 2003-03-08 21:46:04 +00:00
perseant 4b4f884b89 Add an lfs_strategy() that checks to make sure we're not trying to read
where the cleaner is trying to write, instead of tying up the "live"
buffers (or pages).

Fix a bug in the LFS_UBC case where oversized buffers would not be
checksummed correctly, causing uncleanable segments.

Make sure that wakeup(fs->lfs_iocount) is done if fs->lfs_iocount is 1
as well as 0, since we wait in some places for it to drop to 1.

Activate all pages that make it into lfs_gop_write without the segment
lock held, since they must have been dirtied very recently, even if
PG_DELWRI is not set.
2003-03-08 02:55:47 +00:00
perseant d51fdbef63 Make sure we hold the uobjlock when checking for dirty pages, in lfs_vflush.
Note that pages can become dirty without our knowing it, anyway; don't
panic if that happens.
2003-03-04 19:19:43 +00:00
perseant 003cfbd545 Don't add dirty blocks to the ifile in lfs_segunlock, if we're trying to
unmount the filesystem.  This avoids a "dirty blocks" panic.
2003-03-04 19:15:26 +00:00
perseant 958a4c008c Don't force all truncations to be synchronous 2003-03-04 19:10:35 +00:00
perseant 9192f047ac Account SEGUSE_ACTIVE correctly so that the automatic segment cleaning
actually happens.

Add a new fcntl call that will write the minimum necessary to checkpoint
(i.e., for on-disk directory structure to be consistent, not including
updates to file data) so that the cleaner can clean segments more quickly
without sacrificing three-way commit for cleaning.
2003-03-02 04:34:30 +00:00
yamt 32a79f1dd0 use pid_t for pid. 2003-03-01 11:20:21 +00:00
perseant cfc73a5fa9 Be careful to always zero pages on truncation/fragment extension,
in the case where the filesystem block size is larger than PAGE_SIZE.
2003-03-01 05:07:51 +00:00
perseant daeb6c37d1 Make lfs_truncate handle file extension correctly, in the LFS_UBC case. 2003-02-28 07:37:56 +00:00
perseant c418e0c4d6 Fix a clrbuf() on an uninitialized pointer. 2003-02-28 07:36:32 +00:00
perseant a94f9407dc Quell a hasty panic in lfs_truncate: on-inode disk addresses can be
different between the beginning and end of the call.
2003-02-28 04:37:07 +00:00
perseant 0b114d4e21 Do roundup and offset arithmetic in 64 bits, to allow >=2G files. 2003-02-27 07:10:27 +00:00
perseant 6f5626d112 Make fs-specific fcntl macros take three arguments (approved wrstuden).
Let LFS use fcntl for cleaner functions.
2003-02-25 23:12:06 +00:00
thorpej eb14e86676 Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it.  This fixes a few places where either b_dep or b_interlock were
not properly initialized.
2003-02-25 20:35:31 +00:00
yamt 2bad134129 fix simplelocks 2003-02-25 13:47:44 +00:00
perseant 95137c8477 Add lfs_ioctl vnode op, with ioctls to take over cleaner system call
functionality (not including segment clean, since that is now done
automatically as checkpoints happen).
2003-02-24 08:42:49 +00:00
simonb 2a4457bd46 Remove assigned-to but not used variable. 2003-02-23 03:32:55 +00:00
perseant 3ab94fed93 Fix a buffer overflow bug in the LFS_UBC case that manifested itself
either as a mysterious UVM error or as "panic: dirty bufs".  Verify
maximum size in lfs_malloc.

Teach lfs_updatemeta and lfs_shellsort about oversized cluster blocks from
lfs_gop_write.

When unwiring pages in lfs_gop_write, deactivate them, under the theory
that the pagedaemon wanted to free them last we knew.
2003-02-23 00:22:33 +00:00
yamt 1dd4645db4 fix simple_lock/unlock mismatches. 2003-02-22 01:52:25 +00:00
perseant fdf4bfe002 Tabify, and fix some comment alignment problems. 2003-02-20 04:27:23 +00:00
yamt 5f444770aa add debug code to lfs_free. 2003-02-19 12:58:53 +00:00
yamt 65fda8e404 workaround for "another flush is..." infinity loop in writerd.
if we're writerd, sleep in lfs_flush until another writer goes away
instead of busy loop in writed.
2003-02-19 12:49:10 +00:00
yamt d9a4f81d1c wire the pages instead of just dequeue'ing them.
advised by Chuck Silvers.
2003-02-19 12:22:51 +00:00
yamt 18e00c1196 init b_interlock. 2003-02-19 12:18:59 +00:00
yamt 2be86f2ff8 acquire v_interlock before calling VOP_PUTPAGES. 2003-02-19 12:02:38 +00:00
yamt 0ad89cf93e init b_interlock. 2003-02-19 12:01:17 +00:00
soren 3291a4522e Make libsa compile again. 2003-02-18 14:58:31 +00:00
perseant e61877243d Make it compile again, grr.... 2003-02-18 02:00:08 +00:00
perseant b397c875ae Add code to UBCify LFS. This is still behind "#ifdef LFS_UBC" for now
(there are still some details to work out) but expect that to go
away soon.  To support these basic changes (creation of lfs_putpages,
lfs_gop_write, mods to lfs_balloc) several other changes were made, to
wit:

* Create a writer daemon kernel thread whose purpose is to handle page
  writes for the pagedaemon, but which also takes over some of the
  functions of lfs_check().  This thread is started the first time an
  LFS is mounted.

* Add a "flags" parameter to GOP_SIZE.  Current values are
  GOP_SIZE_READ, meaning that the call should return the size of the
  in-core version of the file, and GOP_SIZE_WRITE, meaning that it
  should return the on-disk size.  One of GOP_SIZE_READ or
  GOP_SIZE_WRITE must be specified.

* Instead of using malloc(...M_WAITOK) for everything, reserve enough
  resources to get by and use malloc(...M_NOWAIT), using the reserves if
  necessary.  Use the pool subsystem for structures small enough that
  this is feasible.  This also obsoletes LFS_THROTTLE.

And a few that are not strictly necessary:

* Moves the LFS inode extensions off onto a separately allocated
  structure; getting closer to LFS as an LKM.  "Welcome to 1.6O."

* Unified GOP_ALLOC between FFS and LFS.

* Update LFS copyright headers to correct values.

* Actually cast to unsigned in lfs_shellsort, like the comment says.

* Keep track of which segments were empty before the previous
  checkpoint; any segments that pass two checkpoints both dirty and
  empty can be summarily cleaned.  Do this.  Right now lfs_segclean
  still works, but this should be turned into an effectless
  compatibility syscall.
2003-02-17 23:48:08 +00:00
pk 338f31f581 Make the buffer cache code MP-safe. 2003-02-05 21:38:38 +00:00
perseant 14c17e57b4 Don't call a dirop within a dirop: if lfs_rename is actually deleting
a link, call lfs_remove directly before starting dirop rather than
having ufs_rename do it.
2003-02-03 00:32:35 +00:00
tron f1eeaa9020 Only use MALLOC_DECLARE() in kernel namespace. 2003-02-01 18:34:14 +00:00
thorpej b193480908 Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant.  Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.
2003-02-01 06:23:35 +00:00
yamt 84d61a1dc4 there's no need to treat VOP_WHITEOUT as dirop
because it modifies only one inode.
2003-01-30 14:18:32 +00:00
yamt 53d6eb47ee don't use daddr_t for segment summary since it's an on-disk structure. 2003-01-29 13:14:33 +00:00
simonb 0adecbd12b Remove variable that is only assigned to but not referenced. 2003-01-29 03:06:40 +00:00
yamt e41d3a6f1c make these compilable with lfs debug options.
(follow daddr_t change)

XXX maybe segment number should be 64bit.
2003-01-27 23:17:56 +00:00
kleink 865868a8b1 Further printf format fixes in the wake of daddr_t.
Note that PRI?64 and long long int arguments aren't made for each other,
nor are %lld and int64_t arguments.
2003-01-27 21:45:52 +00:00
kleink 4e0e5333ae Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.
2003-01-25 23:00:09 +00:00
tron 5067836b9e Use PRId64 instead of hard coding "%lld" to fix build problems under
LP64 ports.
2003-01-25 18:12:31 +00:00
fvdl a138610cac The oldblks and newblks arrays are used to store direct copies of
on-disk block pointers, so they should be int32_t. Error found
by Izumi Tsutsui.
2003-01-25 16:40:28 +00:00
tron 63dda858c6 Fix printf() format strings problems caused by "daddr_t" change. 2003-01-25 12:50:38 +00:00
fvdl a3ff3a3038 Bump daddr_t to 64 bits. Replace it with int32_t in all places where
it was used on-disk, so that on-disk formats remain the same.
Remove ufs_daddr_t and ufs_lbn_t for the time being.
2003-01-24 21:55:02 +00:00
thorpej b78f59b443 Merge the nathanw_sa branch. 2003-01-18 08:51:40 +00:00
yamt 03e1a46833 - zerofill struct lfs when allocating it.
- use M_ZERO instead of memset after malloc.
2003-01-12 13:04:52 +00:00
yamt 5f254d46cc backout wrong assertions that i added. 2003-01-08 17:16:52 +00:00
yamt 99d625b53c for lfs_remove/lfs_rmdir, keep removed vnodes marked VDIROP.
(backout parts of rev.1.40)
otherwise, directory structures can be corrupted because checkpoints can
occur via eg. lfs_vflush before parent directory is written.
2003-01-08 17:14:58 +00:00
yamt 69f1c0cb29 in set_dirop/endop, use normal vref/vrele instead of lfs versions
so that we don't miss lfs_inactivate.
2003-01-08 15:43:29 +00:00
yamt ee36fccabb add assertions. 2003-01-08 15:40:54 +00:00
yamt 49d2b56b43 use lfs_unmark_vnode instead of duplicated code fragments. 2003-01-08 15:40:04 +00:00
yamt 140a8e56ca write ifile only when it has dirty buffers. 2002-12-31 14:54:32 +00:00
yamt cb9613feef comment and assertions 2002-12-30 05:34:17 +00:00
yamt 6fc496c67a move check of lfs_unlockvp from lfs_reserveavail to lfs_reserve
because lfs_reservebuf needs same check as well.
2002-12-30 05:31:53 +00:00
yamt a05fbf74c0 fix vref/vunref mismatch. 2002-12-29 14:08:12 +00:00
yamt 88ae33f9e0 backout assertions in lfs_inactive.
they can be false when unmounting forcibly.
2002-12-29 07:05:55 +00:00
christos ae2bf40b7e fix compile problem. 2002-12-28 20:08:36 +00:00
yamt d840722863 avoid warnings without DIAGNOSTIC.
pointed by Andreas Wrede.
2002-12-28 17:22:47 +00:00
yamt a428d8a5af dirop inode can't be passed to lfs_inactivate. 2002-12-28 15:12:26 +00:00
yamt 59be5399b7 - in lfs_reserve, vref vnodes that we're locking so that cleaner doesn't
try to reclaim them.
  (workaround for deadlock noted in the comment in lfs_reserveavail)
- in lfs_rename, mark vnodes which are being moved as well as directry vnodes.
2002-12-28 14:39:08 +00:00
yamt 4b9c604ba7 - in lfs_reserve, reserve locked buffer count as well.
- don't wait for locking buf in lfs_bwrite_ext to avoid deadlocks.
- skip lfs_reserve when we're doing dirop.
  reserve more (for lfs_truncate) in set_dirop instead.

this mostly solves PR 18972. (and hopefully PR 19196)
2002-12-26 13:37:18 +00:00
yamt e9bd1836a5 don't try to write all blocks passed to lfs_markv at once
since it likely causes buf starvation.
2002-12-26 13:04:39 +00:00
yamt 362c57a2d2 add a XXX comment. (description of possible deadlock) 2002-12-22 17:31:52 +00:00
yamt 4370be165c add a XXX comment 2002-12-21 05:35:54 +00:00
yamt 0d95cc5d66 correct/add assertion. 2002-12-18 14:05:50 +00:00
yamt beff0dd387 #if 0 out vnode unlock/lock in lfs_reserve for now and add a comment about it.
deadlock is better than corruption (or panic), IMO.
2002-12-17 15:23:37 +00:00
yamt a999523301 no need for cleaner to hold vnode locks.
cleaner and normal vnode operations are synchronized enough by
seglock/fraglock and buf's B_BUSY-ness.
2002-12-17 14:37:49 +00:00