Commit Graph

642 Commits

Author SHA1 Message Date
christos
6bd1d6d4db Replace the statfs() family of system calls with statvfs().
Retain binary compatibility.
2004-04-21 01:05:31 +00:00
yamt
aa514117d5 check_dirty: plug a PHOLD leak. from Greg Oster. 2004-04-20 11:52:17 +00:00
oster
87d110abfa If we bail out due to an error, we need 'unreserve' the space that
we'd reserved earlier.

Approved by: yamt
2004-03-30 14:50:46 +00:00
atatat
284a91c3ab Manually attach malloc types when being built as an lkm. 2004-03-27 04:43:43 +00:00
atatat
19af35fd0d Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
2004-03-24 15:34:46 +00:00
yamt
15c9d33810 calculate data checksum inline. 2004-03-09 07:43:49 +00:00
yamt
81ce5e8cc3 use correct segment size. this fixes memory corruption when using lfsv1. 2004-03-09 06:43:18 +00:00
oster
19eeec0a9c Add a missing:
pool_destroy(&lfs_dinode_pool);

to lfs_done().

Approved-by: yamt
2004-02-26 22:56:55 +00:00
yamt
f9571060ef lfs_putpages: fix a simple_lock mismatch. 2004-02-26 22:41:36 +00:00
wiz
73e1501b98 parameter with two es. From Peter Postma. 2004-02-24 15:22:01 +00:00
yamt
a57f9a6ca5 lfs_update_single: add an assertion. 2004-01-29 12:10:07 +00:00
he
11544aaa71 Let the cast to (long long) for using the result as a printf argument
apply to the whole expression, not just the first factor.
2004-01-28 20:57:15 +00:00
yamt
3e9d8d6772 use bufmem instead of bufpages to make lfs a little less broken. 2004-01-28 10:54:23 +00:00
yamt
09ec20ca66 eliminate tricky usages of VOP_STRATEGY which are (no longer?) necessary. 2004-01-28 10:53:12 +00:00
hannken
d6170777cf Fix xxx_strategy() to use the vnode arg instead of bp->b_vp. 2004-01-26 10:39:29 +00:00
hannken
3db4e2acd8 Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.
VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp)  Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp)      Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.
2004-01-25 18:06:48 +00:00
yamt
7266a95907 store a i/o priority hint in struct buf for buffer queue discipline. 2004-01-10 14:39:50 +00:00
pk
70f20a1217 Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes).  It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms.  Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.
2003-12-30 12:33:13 +00:00
simonb
740725d725 Fix usage of fifth argument to pool_init(). 2003-12-21 07:53:58 +00:00
yamt
009640868e set VBWAIT when waiting v_numoutput to be drained. 2003-12-17 10:38:39 +00:00
yamt
ce11c3ce4e remove a redundant substitution. 2003-12-17 07:14:03 +00:00
yamt
6b95193071 - reduce code duplication.
- use boolean_t where appropriate.
2003-12-16 13:47:48 +00:00
yamt
98e9a8c373 g/c lfs_no_inactive. 2003-12-16 11:45:07 +00:00
atatat
13f8d2ce5f Dynamic sysctl.
Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al.  Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.

Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded.  Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.

All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.

PS - I'm sorry, but there's a distinct lack of documentation at the
moment.  I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
2003-12-04 19:38:21 +00:00
yamt
3ea6756a92 use b_private rather than b_saveaddr.
XXX LFS_USE_B_INVAL
2003-12-04 14:57:47 +00:00
yamt
5b12f94dde use FINFOSIZE macro. 2003-11-25 15:14:57 +00:00
yamt
cc716087b0 - tweak lfs_update_single()'s prototype so that it can be used by
roll-forward code.
- reduce code duplication using the above in update_meta()
  this also fixes fragment accounting.
2003-11-07 17:55:29 +00:00
yamt
71602f6ec9 fix spec vnode aliasing. 2003-11-07 14:52:27 +00:00
yamt
b43ed49269 - tell filesize changes to vm when roll-forwarding data blocks.
- handle fragment extension better during roll-forward.
- related assertions.
2003-11-07 14:50:18 +00:00
yamt
cd2445d8d3 more assertion about file truncation to zero. 2003-11-07 14:48:28 +00:00
simonb
a2facef339 Remove some assigned-to but otherwise unused variables. 2003-10-30 01:43:08 +00:00
mycroft
be505a4f82 Adjust to remove bogus initializer. 2003-10-29 01:25:04 +00:00
christos
372f57e757 Fix uninitialized variable warnings. 2003-10-25 18:26:46 +00:00
fvdl
c6019338cd Correct preempt() calls. 2003-10-21 00:39:03 +00:00
yamt
4e9f921204 be more strict about sa->vp.
(make sure the last lfs_updatemata in lfs_putpages takes effect.)
2003-10-18 15:52:42 +00:00
simonb
c25af55e8c Remove assigned-to but otherwise unused variable. 2003-10-18 04:03:22 +00:00
yamt
818ef92da6 add comments and tweak code a little for readability.
(no behaviour changes)
2003-10-17 14:20:12 +00:00
dbj
fe7c786886 add mnt_iflag field to struct mount for internal flags
mv MNT_GONE, MNT_UNMOUNT and MNT_WANTRDWR to this field
additonally add mnt_writeopcountupper and mnt_writeopcountlower fields
in preparation for pending write suspension support work
bump kernel version to 1.6ZD
2003-10-14 14:02:56 +00:00
yamt
1fb76f9bad add a prototype of check_segsum(). 2003-10-14 13:51:51 +00:00
yamt
d457c892fa when roll-forwarding, check segment serial numbers correctly. 2003-10-14 13:46:30 +00:00
yamt
73e762ca69 add a missing fsbtodb() to read a correct block for roll-forwarding. 2003-10-14 12:52:28 +00:00
yamt
1508246f38 remove a redundant definition of LFS_MAX_ACTIVE. 2003-10-14 12:51:31 +00:00
yamt
dd4d591157 - a comment.
- bcopy -> memcpy
- increase 'p' only when needed.
2003-10-08 15:07:25 +00:00
yamt
4ce4892712 assertions. 2003-10-03 15:35:54 +00:00
yamt
33feb8e686 reassignbuf() when lfs_writeseg() takes away B_DELWRI. 2003-10-03 15:35:03 +00:00
yamt
656ff745cf when inactivating segments, compare segment numbers correctly. 2003-10-03 13:02:54 +00:00
yamt
0dc0c83b61 remove redundant prototypes. 2003-09-29 15:12:08 +00:00
yamt
61d5d4362b fix a bug of lfs.
genfs_getpages() can read in more blocks than it should due to faked filesize
of lfs_gop_size().  it's a security problem and it makes gcc3 "internal error"

to fix this,
- in genfs_getpages(), always calculate diskeof and memeof separately
  so that filesystems (in this case, lfs) can use different strategies
  for them.
- introduce GOP_SIZE_MEM flag and use it to request in-core filesize.
  (it was an intention of GOP_SIZE_READ,
  but after the above change _READ is not a straightforward name)

after this, no one uses GOP_SIZE_{READ,WRITE} anymore but leave them for now.
2003-09-24 10:22:53 +00:00
yamt
67a5559821 cleanup IN_ADIROP/VDIROP handling a little. 2003-09-23 05:26:49 +00:00
yamt
e2fbe9d54d remove unnecessary externs of lfs_do_flush. 2003-09-23 05:26:12 +00:00
yamt
17f9466183 some comments 2003-09-20 17:51:55 +00:00
yamt
f80b24474d g/c CHECK_COPYIN. 2003-09-10 11:09:11 +00:00
yamt
753a6151b9 comments on lfs_issequential_hole. 2003-09-07 21:00:36 +00:00
yamt
d20e923a9c - raise spl to bio in lfs_countlocked() rather than having callers to do so.
- buffer cache MP locks.
- assert B_CALL buffers are not on the free queue.
2003-09-07 11:53:57 +00:00
yamt
4a78faea0f - buffer cache MP locks.
- avoid changing buffer state on the free queue.
2003-09-07 11:47:07 +00:00
yamt
3ed90e8152 use LFS_DEBUG_COUNTLOCKED macro. 2003-09-07 11:44:22 +00:00
yamt
01e41ddfe3 don't call LFS_DEBUG_COUNTLOCKED after bread().
lfs_countlocked doesn't count buffers that isn't on the freelist.
2003-09-04 12:28:53 +00:00
agc
aad01611e7 Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22364, verified by myself.
2003-08-07 16:26:28 +00:00
yamt
bdbaf98d1e using normal bufcache buffer for cluster buffer head. 2003-07-30 13:36:40 +00:00
yamt
1bc98d3c14 - check EROFS earlier in lfs_markv.
- remove wrong error recovery code (fake buffers are never on bufqueue)
  and put a comment instead.
2003-07-30 12:38:53 +00:00
yamt
3bded40734 remove an unused definition of LFS_VREF_THRESHOLD. 2003-07-30 12:34:00 +00:00
yamt
bddddad951 KNF. 2003-07-23 13:53:51 +00:00
yamt
d7aa0312b2 add parenthesis missed in rev.1.127. 2003-07-23 13:46:57 +00:00
yamt
9d61ee54c4 whitespace 2003-07-23 13:44:55 +00:00
yamt
f4fc192872 add KASSERTs in lfs_issequential_hole. 2003-07-23 13:38:18 +00:00
yamt
2ba5ae7ea6 more MP locks. 2003-07-12 16:19:00 +00:00
yamt
12ad26b293 - wrap long lines.
- remove a mysterious blank line.
2003-07-12 16:17:52 +00:00
yamt
3852db2096 - protect global resource counts with lfs_subsys_lock.
- clean up scattered externs a little.
2003-07-12 16:17:06 +00:00
yamt
3f84a2d3a1 a comment. 2003-07-02 14:07:16 +00:00
yamt
eb4e09d59f use queue.h macros. 2003-07-02 13:43:02 +00:00
yamt
82659031f4 use VFSTOUFS macro. 2003-07-02 13:41:38 +00:00
yamt
102c8a6a74 - add a new functions, lfs_writer_enter/leave, and use them instead of
duplicated code fragments.
- add an assertion.
2003-07-02 13:40:51 +00:00
yamt
f3506a9599 drain dirops before aqcuiring seglock. otherwise it might deadlocks.
PR/20676 (Karl Knutsson)
2003-07-02 13:39:03 +00:00
fvdl
d5aece61d6 Back out the lwp/ktrace changes. They contained a lot of colateral damage,
and need to be examined and discussed more.
2003-06-29 22:28:00 +00:00
thorpej
a06b275edc Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget().  Turns out
  that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
  and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
  above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
  just to appease the above.
2003-06-29 18:43:21 +00:00
bouyer
75208caf18 Adapt for struct proc* -> struct lwp* changes. 2003-06-28 22:53:35 +00:00
darrenr
960df3c8d1 Pass lwp pointers throughtout the kernel, as required, so that the lwpid can
be inserted into ktrace records.  The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
2003-06-28 14:20:43 +00:00
yamt
ab2238cfad make is_sequential a callback in order to achieve better lfs write clustering.
since lfs always rewrite blocks into the new segment,
current on-disk place of the block doesn't affect to write clustering.

ok'ed by Konrad Schroder.
2003-05-18 12:59:05 +00:00
nakayama
a6d8c9185d Avoid comparison is always false warning in gcc 3.3 w/ 64-bit size_t. 2003-05-17 01:44:39 +00:00
ragge
97fa6ef77b Add a missing ifdef DDB. 2003-05-07 18:49:29 +00:00
perseant
e18585deb2 Correct arguments to check_dirty, ensuring that all pages in a block are
written if any of them are dirty.  Pointed out by yamt.
2003-05-02 01:47:39 +00:00
perseant
100c8f971f Restrict the run of cluster blocks to on-disk contiguous blocks (back out
part of rev 1.115), to avoid writing over holes.  This is the lesser of two
evils, to be replaced soon.
2003-04-29 17:45:11 +00:00
yamt
982fbc7db4 add an assertion. 2003-04-29 07:44:04 +00:00
yamt
b4d5e11ffe fix a comment. 2003-04-27 06:47:45 +00:00
yamt
c2b802ff24 fix b_interlock lock/unlock mismatches. 2003-04-27 06:46:38 +00:00
perseant
b691ba8a71 Don't change update time on block write; lets e.g. "tar xp" work properly. 2003-04-27 04:18:29 +00:00
perseant
ef3c60764c Make LFS work better (though still not "well") as an NFS-exported
filesystem (and other things that needed to be fixed before the tests
would complete), to wit:

* Include the fs ident in the filehandle; improve stale filehandle checks.

* Change definition of blksize() to use the on-dinode size instead of
  the inode's i_size, so that fsck_lfs will work properly again.

* Use b_interlock in lfs_vtruncbuf.

* Postpone dirop reclamation until after the seglock has been released,
  so that lfs_truncate is not called with the segment lock held.

* Don't loop in lfs_fsync(), just write everything and wait.

* Be more careful about the interlock/uobjlock in lfs_putpages: when we
  lose this lock, we have to resynchronize dirtiness of pages in each
  block.

* Be sure to always write indirect blocks and update metadata in
  lfs_putpages; fixes a bug that caused blocks to be accounted to the
  wrong segment.
2003-04-23 07:20:37 +00:00
christos
80ecd573c0 PR/1796: John Kohl: statfs misbehaves under chrooted environments.
- Under chroot it displays only the visible filesystems with appropriate paths.
- The statfs f_mntonname gets adjusted to contain the real path from root.
- While was there, fixed a bug in ext2fs, locking problems with vfs_getfsstat(),
  and factored out some of the vfsop statfs() code to copy_statfs_info(). This
  fixes the problem where some filesystems forgot to set fsid.
- Made coda look more like a normal fs.
2003-04-16 21:44:18 +00:00
simonb
761de7345c '#if 0' out a variable that is currently only used in other '#if 0'd out
code.
2003-04-10 04:15:38 +00:00
thorpej
24a4b8faa6 Use PAGE_SIZE rather than NBPG. 2003-04-09 00:28:28 +00:00
fvdl
42614ed3f3 Add support for UFS2. UFS2 is an enhanced FFS, adding support for
64 bit block pointers, extended attribute storage, and a few
other things.

This commit does not yet include the code to manipulate the extended
storage (for e.g. ACLs), this will be done later.

Originally written by Kirk McKusick and Network Associates Laboratories for
FreeBSD.
2003-04-02 10:39:19 +00:00
yamt
0296b9ddb2 add assertions and a debug check. 2003-04-01 14:58:43 +00:00
yamt
418bd96252 lfs_strategy is used only for read. 2003-04-01 14:31:50 +00:00
fvdl
691b2fa7db The checkpoint loop always used (multiples of) lfs_sepb as the number
of segments to mark. However, this may be much more than lfs_nseg.

Originally this wasn't a big problem, since only the structures in the
diskblock were changed, but nowadays there's a mirror of the segflags
in the in-core superblock. This problem caused the code to walk
way past the end of that allocated area, causing memory corruption
in other kernel structures. So, use lfs_nseg as the maximum, as it should be.

While here, simplify the loop; it had become an obfuscated piece of
code overtime.
2003-03-28 22:39:42 +00:00
perseant
3f7016035a Add a sleeper count, to prevent the cleaner from panicing the kernel
when the filesystem is unmounted, relocking the Ifile when its lock is
draining.  (We can't use vfs_busy() since the process is sleeping for a
good long time.)  Clean up / organize lfs.h, while I'm here.

In lfs_update_single, assert that disk addresses are either negative, or
are still positive when converted to int32_t, to prevent recurrence of a
negative/positive block problem.
2003-03-28 08:03:38 +00:00
perseant
17bba97d2e Unlock ifile inode during streamlined VOP_INACTIVE. 2003-03-22 21:31:41 +00:00
dsl
bd99e3429d Use 'void *' instead of 'caddr_t' in prototypes of VOP_IOCTL, VOP_FCNTL
and VOP_ADVLOCK, delete casts from callers (and some to copyin/out).
2003-03-21 23:11:19 +00:00
perseant
78dff1cfa3 KNF (space after keywords). 2003-03-21 06:26:36 +00:00
perseant
a37f4cf7ec Use VONWORKLST as a heuristic for vnode emptiness, rather than exhaustively
checking the memq.

Take greater care not to dirty the Ifile vnode when unmounting the filesystem.
This should fix a "(vp->v_flag & VONWORKLST) == 0" assertion panic in vgonel
that could occur when unmounting.

Do not allow the Ifile to be mapped for writing.
2003-03-21 06:16:53 +00:00
yamt
e8d83a0b17 make this compilable with DIAGNOSTIC and without DEBUG.
fix PR 20827 from FUKAUMI Naoki.
2003-03-21 06:09:08 +00:00
yamt
a8e8f3ea02 lfs_writevnodes:
in the case of "starting over", kick lfs_writeseg
in order to avoid deadlock in check_dirty.
2003-03-20 14:17:21 +00:00
yamt
91ce94db76 fix "more than one fragment" panics;
direct and indirect block pointers are not valid in the case of shortlinks.
while i'm here, move duplicated code in lfs_vget/fastvget into a new
function, lfs_vinit.
2003-03-20 14:11:46 +00:00
perseant
12a78a5a7e Don't break out of Ifile-writing loop in lfs_segwrite until nothing is left.
Note however that blocks can be added to the Ifile even when the segment
block is held because of inodes' atime.  Do not panic with "dirty blocks"
if these blocks are present.
2003-03-20 06:51:17 +00:00
perseant
c364d884f6 Hold the segment lock during truncation to prevent indirect blocks from
being written by lfs_updatemeta while lfs_truncate is also writing them,
a bug pointed out by YAMAMOTO Takashi <yamt@netbsd.org>.
2003-03-20 06:47:38 +00:00
perseant
20f569dad0 Remember to destroy lfs_inoext_pool when closing up the LFS subsystem. 2003-03-18 07:53:56 +00:00
perseant
ea03a1ac09 Add simple_lock protection for lfs_seglock and lfs_subsys_pages; these will
be expanded to cover other per-fs and subsystem-wide data as well.

Fix a case of IN_MODIFIED being set without updating lfs_uinodes, resulting
in a "lfs_uinodes < 0" panic.

Fix a deadlock in lfs_putpages arising from the need to busy all pages in a
block; unbusy any that had already been busied before starting over.
2003-03-15 06:58:49 +00:00
kristerw
ea98786439 SO C requires a statement after a label. 2003-03-15 02:27:18 +00:00
perseant
ec13062af8 - Get rid of unused #ifdefs LFS_NO_PAGEMOVE and LFS_MALLOC_SUMMARY (both
always true) and accompanying dead code.

- When constructing write clusters in lfs_writeseg, if the block we are
  about to add is itself a cluster from GOP_WRITE, don't put a cluster
  in a cluster, just write the GOP_WRITE cluster on its own.  This seems
  to represent a slight performance gain on my test machine.

- Charge someone's rusage for writes on LFSes.  It's difficult to tell
  who the "right" process to charge is; just charge whoever triggered
  the write.
2003-03-11 02:47:39 +00:00
perseant
a46b9ccf95 Only #define LFS if not already defined. 2003-03-08 23:18:54 +00:00
perseant
0493be6ba8 Take away "#ifdef LFS_UBC". 2003-03-08 22:14:31 +00:00
perseant
8feb2c22f5 Take away "#ifdef LFS_UBC". 2003-03-08 21:46:04 +00:00
perseant
4b4f884b89 Add an lfs_strategy() that checks to make sure we're not trying to read
where the cleaner is trying to write, instead of tying up the "live"
buffers (or pages).

Fix a bug in the LFS_UBC case where oversized buffers would not be
checksummed correctly, causing uncleanable segments.

Make sure that wakeup(fs->lfs_iocount) is done if fs->lfs_iocount is 1
as well as 0, since we wait in some places for it to drop to 1.

Activate all pages that make it into lfs_gop_write without the segment
lock held, since they must have been dirtied very recently, even if
PG_DELWRI is not set.
2003-03-08 02:55:47 +00:00
perseant
d51fdbef63 Make sure we hold the uobjlock when checking for dirty pages, in lfs_vflush.
Note that pages can become dirty without our knowing it, anyway; don't
panic if that happens.
2003-03-04 19:19:43 +00:00
perseant
003cfbd545 Don't add dirty blocks to the ifile in lfs_segunlock, if we're trying to
unmount the filesystem.  This avoids a "dirty blocks" panic.
2003-03-04 19:15:26 +00:00
perseant
958a4c008c Don't force all truncations to be synchronous 2003-03-04 19:10:35 +00:00
perseant
9192f047ac Account SEGUSE_ACTIVE correctly so that the automatic segment cleaning
actually happens.

Add a new fcntl call that will write the minimum necessary to checkpoint
(i.e., for on-disk directory structure to be consistent, not including
updates to file data) so that the cleaner can clean segments more quickly
without sacrificing three-way commit for cleaning.
2003-03-02 04:34:30 +00:00
yamt
32a79f1dd0 use pid_t for pid. 2003-03-01 11:20:21 +00:00
perseant
cfc73a5fa9 Be careful to always zero pages on truncation/fragment extension,
in the case where the filesystem block size is larger than PAGE_SIZE.
2003-03-01 05:07:51 +00:00
perseant
daeb6c37d1 Make lfs_truncate handle file extension correctly, in the LFS_UBC case. 2003-02-28 07:37:56 +00:00
perseant
c418e0c4d6 Fix a clrbuf() on an uninitialized pointer. 2003-02-28 07:36:32 +00:00
perseant
a94f9407dc Quell a hasty panic in lfs_truncate: on-inode disk addresses can be
different between the beginning and end of the call.
2003-02-28 04:37:07 +00:00
perseant
0b114d4e21 Do roundup and offset arithmetic in 64 bits, to allow >=2G files. 2003-02-27 07:10:27 +00:00
perseant
6f5626d112 Make fs-specific fcntl macros take three arguments (approved wrstuden).
Let LFS use fcntl for cleaner functions.
2003-02-25 23:12:06 +00:00
thorpej
eb14e86676 Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it.  This fixes a few places where either b_dep or b_interlock were
not properly initialized.
2003-02-25 20:35:31 +00:00
yamt
2bad134129 fix simplelocks 2003-02-25 13:47:44 +00:00
perseant
95137c8477 Add lfs_ioctl vnode op, with ioctls to take over cleaner system call
functionality (not including segment clean, since that is now done
automatically as checkpoints happen).
2003-02-24 08:42:49 +00:00
simonb
2a4457bd46 Remove assigned-to but not used variable. 2003-02-23 03:32:55 +00:00
perseant
3ab94fed93 Fix a buffer overflow bug in the LFS_UBC case that manifested itself
either as a mysterious UVM error or as "panic: dirty bufs".  Verify
maximum size in lfs_malloc.

Teach lfs_updatemeta and lfs_shellsort about oversized cluster blocks from
lfs_gop_write.

When unwiring pages in lfs_gop_write, deactivate them, under the theory
that the pagedaemon wanted to free them last we knew.
2003-02-23 00:22:33 +00:00
yamt
1dd4645db4 fix simple_lock/unlock mismatches. 2003-02-22 01:52:25 +00:00
perseant
fdf4bfe002 Tabify, and fix some comment alignment problems. 2003-02-20 04:27:23 +00:00
yamt
5f444770aa add debug code to lfs_free. 2003-02-19 12:58:53 +00:00
yamt
65fda8e404 workaround for "another flush is..." infinity loop in writerd.
if we're writerd, sleep in lfs_flush until another writer goes away
instead of busy loop in writed.
2003-02-19 12:49:10 +00:00
yamt
d9a4f81d1c wire the pages instead of just dequeue'ing them.
advised by Chuck Silvers.
2003-02-19 12:22:51 +00:00
yamt
18e00c1196 init b_interlock. 2003-02-19 12:18:59 +00:00
yamt
2be86f2ff8 acquire v_interlock before calling VOP_PUTPAGES. 2003-02-19 12:02:38 +00:00
yamt
0ad89cf93e init b_interlock. 2003-02-19 12:01:17 +00:00
soren
3291a4522e Make libsa compile again. 2003-02-18 14:58:31 +00:00
perseant
e61877243d Make it compile again, grr.... 2003-02-18 02:00:08 +00:00
perseant
b397c875ae Add code to UBCify LFS. This is still behind "#ifdef LFS_UBC" for now
(there are still some details to work out) but expect that to go
away soon.  To support these basic changes (creation of lfs_putpages,
lfs_gop_write, mods to lfs_balloc) several other changes were made, to
wit:

* Create a writer daemon kernel thread whose purpose is to handle page
  writes for the pagedaemon, but which also takes over some of the
  functions of lfs_check().  This thread is started the first time an
  LFS is mounted.

* Add a "flags" parameter to GOP_SIZE.  Current values are
  GOP_SIZE_READ, meaning that the call should return the size of the
  in-core version of the file, and GOP_SIZE_WRITE, meaning that it
  should return the on-disk size.  One of GOP_SIZE_READ or
  GOP_SIZE_WRITE must be specified.

* Instead of using malloc(...M_WAITOK) for everything, reserve enough
  resources to get by and use malloc(...M_NOWAIT), using the reserves if
  necessary.  Use the pool subsystem for structures small enough that
  this is feasible.  This also obsoletes LFS_THROTTLE.

And a few that are not strictly necessary:

* Moves the LFS inode extensions off onto a separately allocated
  structure; getting closer to LFS as an LKM.  "Welcome to 1.6O."

* Unified GOP_ALLOC between FFS and LFS.

* Update LFS copyright headers to correct values.

* Actually cast to unsigned in lfs_shellsort, like the comment says.

* Keep track of which segments were empty before the previous
  checkpoint; any segments that pass two checkpoints both dirty and
  empty can be summarily cleaned.  Do this.  Right now lfs_segclean
  still works, but this should be turned into an effectless
  compatibility syscall.
2003-02-17 23:48:08 +00:00
pk
338f31f581 Make the buffer cache code MP-safe. 2003-02-05 21:38:38 +00:00
perseant
14c17e57b4 Don't call a dirop within a dirop: if lfs_rename is actually deleting
a link, call lfs_remove directly before starting dirop rather than
having ufs_rename do it.
2003-02-03 00:32:35 +00:00
tron
f1eeaa9020 Only use MALLOC_DECLARE() in kernel namespace. 2003-02-01 18:34:14 +00:00
thorpej
b193480908 Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant.  Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.
2003-02-01 06:23:35 +00:00
yamt
84d61a1dc4 there's no need to treat VOP_WHITEOUT as dirop
because it modifies only one inode.
2003-01-30 14:18:32 +00:00
yamt
53d6eb47ee don't use daddr_t for segment summary since it's an on-disk structure. 2003-01-29 13:14:33 +00:00
simonb
0adecbd12b Remove variable that is only assigned to but not referenced. 2003-01-29 03:06:40 +00:00
yamt
e41d3a6f1c make these compilable with lfs debug options.
(follow daddr_t change)

XXX maybe segment number should be 64bit.
2003-01-27 23:17:56 +00:00
kleink
865868a8b1 Further printf format fixes in the wake of daddr_t.
Note that PRI?64 and long long int arguments aren't made for each other,
nor are %lld and int64_t arguments.
2003-01-27 21:45:52 +00:00
kleink
4e0e5333ae Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.
2003-01-25 23:00:09 +00:00
tron
5067836b9e Use PRId64 instead of hard coding "%lld" to fix build problems under
LP64 ports.
2003-01-25 18:12:31 +00:00
fvdl
a138610cac The oldblks and newblks arrays are used to store direct copies of
on-disk block pointers, so they should be int32_t. Error found
by Izumi Tsutsui.
2003-01-25 16:40:28 +00:00
tron
63dda858c6 Fix printf() format strings problems caused by "daddr_t" change. 2003-01-25 12:50:38 +00:00
fvdl
a3ff3a3038 Bump daddr_t to 64 bits. Replace it with int32_t in all places where
it was used on-disk, so that on-disk formats remain the same.
Remove ufs_daddr_t and ufs_lbn_t for the time being.
2003-01-24 21:55:02 +00:00
thorpej
b78f59b443 Merge the nathanw_sa branch. 2003-01-18 08:51:40 +00:00
yamt
03e1a46833 - zerofill struct lfs when allocating it.
- use M_ZERO instead of memset after malloc.
2003-01-12 13:04:52 +00:00
yamt
5f254d46cc backout wrong assertions that i added. 2003-01-08 17:16:52 +00:00
yamt
99d625b53c for lfs_remove/lfs_rmdir, keep removed vnodes marked VDIROP.
(backout parts of rev.1.40)
otherwise, directory structures can be corrupted because checkpoints can
occur via eg. lfs_vflush before parent directory is written.
2003-01-08 17:14:58 +00:00
yamt
69f1c0cb29 in set_dirop/endop, use normal vref/vrele instead of lfs versions
so that we don't miss lfs_inactivate.
2003-01-08 15:43:29 +00:00
yamt
ee36fccabb add assertions. 2003-01-08 15:40:54 +00:00
yamt
49d2b56b43 use lfs_unmark_vnode instead of duplicated code fragments. 2003-01-08 15:40:04 +00:00
yamt
140a8e56ca write ifile only when it has dirty buffers. 2002-12-31 14:54:32 +00:00
yamt
cb9613feef comment and assertions 2002-12-30 05:34:17 +00:00
yamt
6fc496c67a move check of lfs_unlockvp from lfs_reserveavail to lfs_reserve
because lfs_reservebuf needs same check as well.
2002-12-30 05:31:53 +00:00
yamt
a05fbf74c0 fix vref/vunref mismatch. 2002-12-29 14:08:12 +00:00
yamt
88ae33f9e0 backout assertions in lfs_inactive.
they can be false when unmounting forcibly.
2002-12-29 07:05:55 +00:00
christos
ae2bf40b7e fix compile problem. 2002-12-28 20:08:36 +00:00
yamt
d840722863 avoid warnings without DIAGNOSTIC.
pointed by Andreas Wrede.
2002-12-28 17:22:47 +00:00
yamt
a428d8a5af dirop inode can't be passed to lfs_inactivate. 2002-12-28 15:12:26 +00:00
yamt
59be5399b7 - in lfs_reserve, vref vnodes that we're locking so that cleaner doesn't
try to reclaim them.
  (workaround for deadlock noted in the comment in lfs_reserveavail)
- in lfs_rename, mark vnodes which are being moved as well as directry vnodes.
2002-12-28 14:39:08 +00:00
yamt
4b9c604ba7 - in lfs_reserve, reserve locked buffer count as well.
- don't wait for locking buf in lfs_bwrite_ext to avoid deadlocks.
- skip lfs_reserve when we're doing dirop.
  reserve more (for lfs_truncate) in set_dirop instead.

this mostly solves PR 18972. (and hopefully PR 19196)
2002-12-26 13:37:18 +00:00
yamt
e9bd1836a5 don't try to write all blocks passed to lfs_markv at once
since it likely causes buf starvation.
2002-12-26 13:04:39 +00:00
yamt
362c57a2d2 add a XXX comment. (description of possible deadlock) 2002-12-22 17:31:52 +00:00
yamt
4370be165c add a XXX comment 2002-12-21 05:35:54 +00:00
yamt
0d95cc5d66 correct/add assertion. 2002-12-18 14:05:50 +00:00
yamt
beff0dd387 #if 0 out vnode unlock/lock in lfs_reserve for now and add a comment about it.
deadlock is better than corruption (or panic), IMO.
2002-12-17 15:23:37 +00:00
yamt
a999523301 no need for cleaner to hold vnode locks.
cleaner and normal vnode operations are synchronized enough by
seglock/fraglock and buf's B_BUSY-ness.
2002-12-17 14:37:49 +00:00
yamt
b2d5b49e2b use ufs_daddr_t instead of int where appropriate. 2002-12-17 14:28:54 +00:00
yamt
a79cb6db43 - in lfs_bwrite_ext, if we're cleaner,
mark inode IN_CLEANING rather then IN_MODIFIED.
  otherwise cleaned (indirect) blocks belongs to the inode isn't written
  until next sync.
- add assertions.
2002-12-14 13:41:25 +00:00
yamt
e5ea55e4ea in lfs_writefile, check v_type==VNON earlier.
to avoid null dereference with DEBUG_LFS_VERBOSE.
2002-12-14 11:54:47 +00:00
yamt
8fe8a4ced8 save a segment write when doing checkpoint. 2002-12-13 14:40:02 +00:00
yamt
275b3a47a2 correct DIAGNOSTIC code for duplicated inodes in a segment and su_nbytes. 2002-12-12 12:28:13 +00:00
yamt
9097ffce96 take care of B_CLRBUF in lfs_balloc.
otherwise you'll see uninitialized blocks.
2002-12-11 13:34:14 +00:00
matt
60db16d1ff Add multiple inclusion protection for headers. Fix mismatched
variable declarations (missing const's) as needed.
2002-12-01 00:12:06 +00:00
yamt
2331faab98 more XXX comment. 2002-11-27 11:36:40 +00:00
yamt
3af39ea015 in lfs_fakebuf, make corresponding buffer busy to avoid
reading blocks that isn't written yet.
it's needed because we'll update metadatas in lfs_updatemeta
before data pointed by them is actually written to disk.

XXX should be solved with fake inode/indirect blocks instead?
2002-11-24 16:39:13 +00:00
yamt
290fa35864 add a XXX comment to lfs_reserve.
* it isn't safe to unlock vp here
 * because we're passing data using inode from namei.
 * (eg. i_offset)
2002-11-24 16:09:50 +00:00
yamt
eca07565c3 make sure i_lfs_fragsize is initialized.
fix panic "lfs_writefile: more than one fragment!"
PR 18974.
2002-11-24 08:43:26 +00:00
yamt
16a26d41e8 lfs_sync should wait at lfs_writer, not lfs_dirops.
PR 18973.
2002-11-24 08:37:43 +00:00
yamt
feacf34c09 lfs_reserve shouldn't block for lfs_unlockvp.
otherwise cleaner deadlocks.
PR 19134.
2002-11-24 08:32:22 +00:00
yamt
37b4f42285 blksize() macro shouldn't used for indirect blocks.
this fixes "getblk: block size invariant failed" panic.
PR 18977.
2002-11-24 08:27:00 +00:00
yamt
7d0ba73802 correct locking for lfs_rmdir. PR 18976. 2002-11-24 08:23:41 +00:00
jdolecek
e0cc03a09b merge kqueue branch into -current
kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
2002-10-23 09:10:23 +00:00
provos
0f09ed48a5 remove trailing \n in panic(). approved perry. 2002-09-27 15:35:29 +00:00
jdolecek
e305eb63e8 don't need <sys/conf.h> here 2002-09-22 19:32:54 +00:00
christos
6f3945a88d MNT_GETARGS support 2002-09-21 18:10:34 +00:00
gehenna
77a6b82b27 Merge the gehenna-devsw branch into the trunk.
This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches is defined as a constant structure in device drivers.

- The new grammer ``device-major'' is introduced to ``files''.

	device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
  by using this grammer.

- Added the new naming convention.
  The name of the device switch must be <prefix>_[bc]devsw for auto-generation
  of device switch tables.

- The backward compatibility of loading block/character device
  switch by LKM framework is broken. This is necessary to convert
  from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
  We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
  the LKM framework will refer it to assign device major number dynamically.
2002-09-06 13:18:43 +00:00
itojun
8dd04cdcd7 correct range check, have overflow check, fix type mismatches,
for cmap args and some other calls.  from openbsd
2002-08-03 00:12:48 +00:00
soren
178d83d503 Die, qaddr_t, die! - mnt_data in struct mount is already effectively
a void *, so stop pretending otherwise.
2002-07-30 07:40:07 +00:00
perseant
8f30dc2c9b Remove lying comment on SEGM_PROT seglock. 2002-07-11 21:09:00 +00:00
briggs
77f5558791 Fix a printf format warning. 2002-07-07 14:29:06 +00:00
perseant
32ae84b188 Deal with fragment size changes better. For each fragment that can
exist on an on-disk inode, we keep a record of its size in struct inode,
which is updated when we write the block to disk.  The cleaner routines
thus have ready access to what size is the correct size for this block,
on disk.

Fixed a related bug: if a file with fragments is being cleaned
(fragments being cleaned) at the same time it is being extended beyond
NDADDR blocks, we could write a bogus FINFO record that has a frag in the
middle; when it was cleaned this would give back bogus file data.  Don't
write the indirect blocks in this case, since there is no need.

lfs_fragextend and lfs_truncate no longer require the seglock, but instead
take a shared lock, which the seglock locks exclusively.
2002-07-06 01:30:11 +00:00
yamt
d566a58b5e fix printf format for DEBUG_LFS. 2002-07-02 19:07:03 +00:00
perseant
0418a2c352 Fix miscalculation in lfs_fits found by Trevin Beattie <trevin@xmission.com>.
Change some of the variable names from "nb", "db" to "fsb" to reflect their
calling conventions.
2002-06-20 22:10:24 +00:00
perseant
ae37d9d186 Don't bomb out of lfs_bmapv if the caller is requesting blocks that
live in the current segment.  There's nothing wrong with this, and
it is necessary for the correct operation of the coaleascer.
2002-06-20 20:43:17 +00:00
perseant
ddfb1dbb92 For synchronous writes, keep separate i/o counters for each write, so
processes don't have to wait for one another to finish (e.g., nfsd seems
to be a little happier now, though I haven't measured the difference).
Synchronous checkpoints, however, must always wait for all i/o to finish.

Take the contents of the callback functions and have them run in thread
context instead (aiodoned thread).  lfs_iocount no longer has to be
protected in splbio(), and quite a bit less of the segment construction
loop needs to be in splbio() as well.

If lfs_markv is handed a block that is not the correct size according to
the inode, refuse to process it.  (Formerly it was extended to the "correct"
size.)  This is possibly more prone to deadlock, but less prone to corruption.

lfs_segclean now outright refuses to clean segments that appear to have live
bytes in them.  Again this may be more prone to deadlock but avoids
corruption.

Replace ufsspec_close and ufsfifo_close with LFS equivalents; this means
that no UFS functions need to know about LFS_ITIMES any more.  Remove
the reference from ufs/inode.h.

Tested on i386, test-compiled on alpha.
2002-06-16 00:13:15 +00:00
perseant
c13ae45a2a Let lfs_bmapv fill in the bi_size member of the BLOCK_INFO structure,
as well as bi_daddr.  This lets the cleaner have an idea of what the size
of this block was at the time it was written without having to refer to
a segment header (e.g., in the file coalescing case).

Tested on i386.
2002-06-06 00:46:24 +00:00
perseant
d67a5bbb21 Fix a couple of instances where reassignbuf() was not done at splbio.
Tested on i386.
2002-05-24 22:13:57 +00:00
perseant
43ca783b4a Back out rev 1.174 of vfs_subr.c, because the splbio() wasn't protecting
enough to be useful, and broadening it so that it did would have meant
that operations possibly requiring synchronous disk activity would have
to be done in splbio().  This clearly was not going to work.

Worked around this in the LFS case by having lfs_cluster_callback put an
extra hold on the vnode before calling biodone(), and taking the hold
off without HOLDRELE's problematic list swapping.  lfs_vunref() will take
care of that---in thread context---on the next write if need be.

Also, ensure that the list walking in lfs_{writevnodes,segunlock,gather}
takes into account the possibility that the list may change
underneath it (possibly because it itself deleted an element).

Tested on i386, test-compiled on alpha.
2002-05-23 23:05:25 +00:00
perseant
ec0ca919be Protect v_freelist with splbio(), since HOLDRELE can be called in
interrupt context (through brelvp).  (LFS may be the only subsystem
affected by this problem.)

Tested on i386.
2002-05-20 22:50:57 +00:00
perseant
36efaa3565 use macros from <sys/queue.h> 2002-05-17 21:42:38 +00:00
thorpej
6c1654256e Fix LP64 printf format warning. 2002-05-16 02:23:55 +00:00
perseant
8886b0f4b2 Phase one of my three-phase plan to make LFS play nice with UBC, and bug-fixes
I found while making sure there weren't any new ones.

* Make the write clusters keep track of the buffers whose blocks they contain.
  This should make it possible to (1) write clusters using a page mapping
  instead of malloc, if desired, and (2) schedule blocks for rewriting
  (somewhere else) if a write error occurs.  Code is present to use
  pagemove() to construct the clusters but that is untested and will go away
  anyway in favor of page mapping.
* DEBUG now keeps a log of Ifile writes, so that any lingering instances of
  the "dirty bufs" problem can be properly debugged.
* Keep track of whether the Ifile has been dirtied by various routines that
  can be called by lfs_segwrite, and loop on that until it is clean, for
  a checkpoint.  Checkpoints need to be squeaky clean.
* Warn the user (once) if the Ifile grows larger than is reasonable for their
  buffer cache.  Both lfs_mountfs and lfs_unmount check since the Ifile can
  grow.
* If an inode is not found in a disk block, try rereading the block, under
  the assumption that the block was copied to a cluster and then freed.
* Protect WRITEINPROG() with splbio() to fix a hang in lfs_update.
2002-05-14 20:03:53 +00:00
matt
0cb85bc7b9 Eliminate commons. 2002-05-12 23:06:27 +00:00
perseant
76d2795556 Make exported LFSes not panic on the first file create. 2002-04-27 01:00:46 +00:00
thorpej
a180cee23b Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map).  Try to deal with this:

* Group all information about the backend allocator for a pool in a
  separate structure.  The pool references this structure, rather than
  the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
  to become available, but will still fail if it cannot callocate KVA
  space for the pages.  If this happens, carefully drain all pools using
  the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
  some pages, and use that information to make draining easier and more
  efficient.
* Get rid of PR_URGENT.  There was only one use of it, and it could be
  dealt with by the caller.

From art@openbsd.org.
2002-03-08 20:48:27 +00:00
perseant
f41358613c Include the space taken by inodes in the count made by lfs_check();
make VOP_SETATTR call lfs_check.  This prevents large numbers of inode
changes (say, at the end of tar(1)) from filling the buffer cache.
2002-02-11 02:47:29 +00:00
perseant
8ded9a2c7d Correct free list tail pointer, when adding blocks of new inodes to v2
filesystems.  Should fix PR #14408.
2002-02-04 03:32:16 +00:00
chs
0d70d731c2 use the new compatibility routines to allow mmap() to work
(in the same non-coherent fashion that it worked pre-UBC)
until someone has time to do it the right way.
2001-12-18 07:51:16 +00:00
chs
a106161b5a add spaces for KNF. confirmed to produce identical objects. 2001-11-23 21:44:25 +00:00
lukem
2565646230 don't need <sys/types.h> when including <sys/param.h> 2001-11-15 09:47:59 +00:00
lukem
ec6245465a add RCSID 2001-11-08 02:39:06 +00:00
simonb
c56d879335 Remove some variables that are set but never used. 2001-11-06 07:11:29 +00:00
lukem
99147a7648 remove #include <ufs/ufs/quota.h> where it was just to appease
<ufs/ufs/inode.h>, since the latter now includes the former.  leave the former
in source that obviously uses specific bits of it (for completeness.)
2001-10-26 05:56:06 +00:00
chs
a2e3e57398 initialize the vnode's copy of the size in lfs_ialloc(). 2001-10-14 19:06:16 +00:00
chs
80373b7e54 don't depend on other headers to include sys/proc.h for us. 2001-09-28 11:59:51 +00:00
sommerfeld
181c4513dc Add fifo_putpages() placebo so that the vnode's uobj is unlocked. 2001-09-22 22:35:18 +00:00
chs
64c6d1d2dc a whole bunch of changes to improve performance and robustness under load:
- remove special treatment of pager_map mappings in pmaps.  this is
   required now, since I've removed the globals that expose the address range.
   pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
   no longer any need to special-case it.
 - eliminate struct uvm_vnode by moving its fields into struct vnode.
 - rewrite the pageout path.  the pager is now responsible for handling the
   high-level requests instead of only getting control after a bunch of work
   has already been done on its behalf.  this will allow us to UBCify LFS,
   which needs tighter control over its pages than other filesystems do.
   writing a page to disk no longer requires making it read-only, which
   allows us to write wired pages without causing all kinds of havoc.
 - use a new PG_PAGEOUT flag to indicate that a page should be freed
   on behalf of the pagedaemon when it's unlocked.  this flag is very similar
   to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
   pageout fails due to eg. an indirect-block buffer being locked.
   this allows us to remove the "version" field from struct vm_page,
   and together with shrinking "loan_count" from 32 bits to 16,
   struct vm_page is now 4 bytes smaller.
 - no longer use PG_RELEASED for swap-backed pages.  if the page is busy
   because it's being paged out, we can't release the swap slot to be
   reallocated until that write is complete, but unlike with vnodes we
   don't keep a count of in-progress writes so there's no good way to
   know when the write is done.  instead, when we need to free a busy
   swap-backed page, just sleep until we can get it busy ourselves.
 - implement a fast-path for extending writes which allows us to avoid
   zeroing new pages.  this substantially reduces cpu usage.
 - encapsulate the data used by the genfs code in a struct genfs_node,
   which must be the first element of the filesystem-specific vnode data
   for filesystems which use genfs_{get,put}pages().
 - eliminate many of the UVM pagerops, since they aren't needed anymore
   now that the pager "put" operation is a higher-level operation.
 - enhance the genfs code to allow NFS to use the genfs_{get,put}pages
   instead of a modified copy.
 - clean up struct vnode by removing all the fields that used to be used by
   the vfs_cluster.c code (which we don't use anymore with UBC).
 - remove kmem_object and mb_object since they were useless.
   instead of allocating pages to these objects, we now just allocate
   pages with no object.  such pages are mapped in the kernel until they
   are freed, so we can use the mapping to find the page to free it.
   this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
2001-09-15 20:36:31 +00:00
chs
adf5d360a7 add a new VFS op, vfs_reinit, which is called when desiredvnodes is
adjusted via sysctl.  file systems that have hash tables which are
sized based on the value of this variable now resize those hash tables
using the new value.  the max number of FFS softdeps is also recalculated.

convert various file systems to use the <sys/queue.h> macros for
their hash tables.
2001-09-15 16:12:54 +00:00
chs
cb3b720183 disable mmap() for LFS until it is fixed. 2001-08-24 06:42:46 +00:00
chs
f0af9f581b add getpages/putpages entries for spec vnodes. 2001-08-17 05:54:36 +00:00
jdolecek
58ed62e500 Constraint 'blkcnt' of lfs_markv() syscall by 64KB. Reviewed by
Konrad Schroder <perseant@NetBSD.org>.
2001-08-03 06:02:42 +00:00
jdolecek
bd21ec5d2e lfs_writeseg(): make el_size a size_t (cosmetic only, no functional change) 2001-07-26 20:20:15 +00:00
assar
bec71dc090 change vop_symlink and vop_mknod to return vpp (the created node)
refed, so that the caller can actually use it.  update callers and
file systems that implement these vnode operations
2001-07-24 15:39:30 +00:00
perseant
4e3fced95b Merge the short-lived perseant-lfsv2 branch into the trunk.
Kernels and tools understand both v1 and v2 filesystems; newfs_lfs
generates v2 by default.  Changes for the v2 layout include:

- Segments of non-PO2 size and arbitrary block offset, so these can be
  matched to convenient physical characteristics of the partition (e.g.,
  stripe or track size and offset).

- Address by fragment instead of by disk sector, paving the way for
  non-512-byte-sector devices.  In theory fragments can be as large
  as you like, though in reality they must be smaller than MAXBSIZE in size.

- Use serial number and filesystem identifier to ensure that roll-forward
  doesn't get old data and think it's new.  Roll-forward is enabled for
  v2 filesystems, though not for v1 filesystems by default.

- The inode free list is now a tailq, paving the way for undelete (undelete
  is not yet implemented, but can be without further non-backwards-compatible
  changes to disk structures).

- Inode atime information is kept in the Ifile, instead of on the inode;
  that is, the inode is never written *just* because atime was changed.
  Because of this the inodes remain near the file data on the disk, rather
  than wandering all over as the disk is read repeatedly.  This speeds up
  repeated reads by a small but noticeable amount.

Other changes of note include:

- The ifile written by newfs_lfs can now be of arbitrary length, it is no
  longer restricted to a single indirect block.

- Fixed an old bug where ctime was changed every time a vnode was created.
  I need to look more closely to make sure that the times are only updated
  during write(2) and friends, not after-the-fact during a segment write,
  and certainly not by the cleaner.
2001-07-13 20:30:18 +00:00
toshii
4866f1a22b Fix typo. s/extention/extension/ 2001-07-05 08:38:24 +00:00
mrg
67afbd6270 use _KERNEL_OPT 2001-05-30 11:57:16 +00:00
christos
bf4fd5e39c don't include lfs_extern.h; ufs/inode.h does too. 2001-02-04 21:51:19 +00:00
itohy
7c338ddc48 Call inittodr() from lfs_mountroot() so that the system time is set properly
when booted from LFS.
2001-01-26 07:59:23 +00:00
jdolecek
d9466585b7 make filesystem vnodeop, specop, fifoop and vnodeopv_* arrays const 2001-01-22 12:17:35 +00:00
joff
a6ef389457 If DIAGNOSTIC and the segment writer gets a badly sized buffer, panic()
instead of silently corrupting the filesystem.
2001-01-09 05:05:35 +00:00
cgd
1a1dca038e replace \<space(s)><newline> (wrong!) with \<newline> 2000-12-20 00:24:23 +00:00
perseant
32d11b86a5 Call uvm_vmp_setsize() in lfs_{fast,}vget to set initial vnode size. 2000-12-03 07:34:49 +00:00
perseant
72633be8c6 Fix typo in 'malloc' for non-MALLOCLOG case 2000-12-03 06:43:36 +00:00
perseant
2a53ff5ab9 Get rid of some old unnecessary code that cleared B_NEEDCOMMIT from buffers in
lfs_writeseg (possibly after they had been freed).

If MALLOCLOG is defined, make lfs_newbuf and lfs_freebuf pass along the
caller's file and line to _malloc and _free.
2000-12-03 05:56:27 +00:00
chs
65a9d68fda don't forget to set um_lognindir (now required by ufs_bmaparray()). 2000-12-03 05:27:51 +00:00
jdolecek
bf558e3b3e only include opt_ddb.h for !LKM 2000-11-30 15:59:47 +00:00
jdolecek
734f246738 no need to include fs_lfs.h, define LFS directly 2000-11-30 15:57:35 +00:00
chs
aeda8d3b77 Initial integration of the Unified Buffer Cache project. 2000-11-27 08:39:39 +00:00
perseant
0055236dda If LFS_DO_ROLLFORWARD is defined, roll forward from the older checkpoint
on mount, through the newer checkpoint and on through any newer
partial-segments that may have been written but not checkpointed because
of an intervening crash.

LFS_DO_ROLLFORWARD is not defined by default.
2000-11-27 03:33:57 +00:00
perseant
77b518b85d Use u_int32_t instead of u_long to compute LFS checksums, since the
checksum is stored in a u_int32_t.
2000-11-25 02:39:34 +00:00