Commit Graph

106 Commits

Author SHA1 Message Date
yamt
560c0c565c don't use g_glock directly. 2006-10-14 09:17:26 +00:00
elad
fc9422c9d9 integrate kauth. 2006-05-14 21:31:52 +00:00
christos
12b7ab5f0b Correct a bogus expression gcc4 found. 2006-05-14 05:27:59 +00:00
perseant
481da54fc1 Postpone the segment accounting changes coming from truncation until the
inode that makes those changes valid is either written to disk by
lfs_writeinode() or discarded by lfs_vfree().

A couple of locking fixes are also included as well.
2006-04-30 21:19:42 +00:00
perseant
5f627fe958 Avoid a possible sign overflow condition in lfs_truncate, which would result
in a buffer overflow (underflow).  Coverity CID 1521.
2006-04-19 00:22:15 +00:00
perseant
39ce23c169 Implement a somewhat finer-grained mechanism for paging LFS-backed pages.
The writer daemon, if it does not need to flush the whole filesystem,
now only writes the vnodes for which the pagedaemon has requested pageouts
(although it does not pay attention to the page ranges the pagedaemon
supplies).
2006-04-08 00:26:34 +00:00
christos
95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
yamt
6a17dd42f4 - ignore truncation for VCHR/VBLK/VFIFO as it used to be
before yamt-vop merge.  PR/32049 from Atsushi Onoe.
- reject setattr which attempts to change size of VLNK/VSOCK.
2005-11-11 15:50:57 +00:00
yamt
a748ea88dd merge yamt-vop branch. remove following VOPs.
VOP_BLKATOFF
	VOP_VALLOC
	VOP_BALLOC
	VOP_REALLOCBLKS
	VOP_VFREE
	VOP_TRUNCATE
	VOP_UPDATE
2005-11-02 12:38:58 +00:00
christos
a12024da06 Use nanotime() to update the time fields in filesystems. Convert the code
from macros to real functions. Original patch and review from chuq.
Note: ext2fs only keeps seconds in the on-disk inode, and msdosfs does not
have enough precision for all fields, so this is not very useful for those
two.
2005-09-12 16:24:41 +00:00
christos
273df63602 - sprinkle const
- avoid shadow variables.
2005-05-29 21:25:24 +00:00
perseant
2f695b5476 Provide a resize_lfs(8), including kernel and cleaner support. The current
implementation requires the fs to be mounted while resizing.  Tested in both
directions, and everything appears to work happily, but ymmv.
2005-04-23 19:47:51 +00:00
perseant
5ed792ecb0 Use splay trees, rather than a hash table, to manage the accounting of
blocks allocated through VOP_BALLOC() for pages to be written to disk.
This accounting no longer takes a noticeable fraction of the system CPU.
2005-04-16 17:35:58 +00:00
perseant
94decdd25d Use lfs_malloc() to manage the blkiov arrays that the cleaner functions use,
since the cleaner is likely to operate in a low-memory condition.
2005-04-16 17:28:37 +00:00
perseant
2ee78c4fa9 Keep track of the highest block held by an LFS inode, so that we can
be assured that the last byte of a file is always allocated.  Previously
a file extension could cause the filesystem to be flushed, writing an
inconsistent inode to disk.  Although this condition would be corrected
the next time blocks were written to disk, an intervening crash would leave
the filesystem in an inconsistent state, leaving fsck_lfs to complain
of an inode "partially truncated".
2005-04-14 00:02:46 +00:00
perseant
1ebfc508b6 Protect various per-fs structures with fs->lfs_interlock simple_lock, to
improve behavior in the multiprocessor case.  Add debugging segment-lock
assertion statements.
2005-04-01 21:59:46 +00:00
perseant
eefd94b8e2 Straighten out the maze of ifdefs. Instead, consolidate all the debugging
stuff under '#ifdef DEBUG', and use sysctl knobs to turn on/off particular
parts of the debugging reporting (if DEBUG is enabled).  Re-enable the LFS
statistics in sysctl, while I'm there.  A bit of a rototill.
2005-03-08 00:18:19 +00:00
perry
bcfcddbac1 nuke trailing whitespace 2005-02-26 22:31:44 +00:00
mycroft
bb17450999 Don't write out the extra zero pages with PGO_SYNCIO. We start an asynchronous
write anyway, and they will not be freed until that write is finished.
2004-08-15 19:01:16 +00:00
mycroft
4303882b7e Copy the current partial-truncate logic from FFS. In the process, fix a
potential overrun when truncating a fragment.
2004-08-15 17:37:07 +00:00
mycroft
f3fbefe76a Minor simplification to some arithmetic. 2004-08-15 16:17:37 +00:00
mycroft
45a21b76f0 Fixing age old cruft:
* Rather than using mnt_maxsymlinklen to indicate that a file systems returns
  d_type fields(!), add a new internal flag, IMNT_DTYPE.

Add 3 new elements to ufsmount:
* um_maxsymlinklen, replaces mnt_maxsymlinklen (which never should have existed
  in the first place).
* um_dirblksiz, which tracks the current directory block size, eliminating the
  FS-specific checks littered throughout the code.  This may be used later to
  make the block size variable.
* um_maxfilesize, which is the maximum file size, possibly adjusted lower due
  to implementation issues.

Sync some bug fixes from FFS into ext2fs, particularly:
* ffs_lookup.c 1.21, 1.28, 1.33, 1.48
* ffs_inode.c 1.43, 1.44, 1.45, 1.66, 1.67
* ffs_vnops.c 1.84, 1.85, 1.86

Clean up some crappy pointer frobnication.
2004-08-15 07:19:54 +00:00
mycroft
bc25b30608 Add a new flag, IN_MODIFY. This is like IN_UPDATE|IN_CHANGE, but unlike
setting those flags, it does not cause the inode to be written in the periodic
sync.  This is used for writes to special files (devices and named pipes) and
FIFOs.

Do not preemptively sync updates to access times and modification times.  They
are now updated in the inode only opportunistically, or when the file or device
is closed.  (Really, it should be delayed beyond close, but this is enough to
help substantially with device nodes.)

And the most amusing part:
Trickle sync was broken on both FFS and ext2fs, in different ways.  In FFS, the
periodic call to VFS_SYNC(MNT_LAZY) was still causing all file data to be
synced.  In ext2fs, it was causing the metadata to *not* be synced.  We now
only call VOP_UPDATE() on the node if we're doing MNT_LAZY.  I've confirmed
that we do in fact trickle correctly now.
2004-08-14 01:08:02 +00:00
oster
87d110abfa If we bail out due to an error, we need 'unreserve' the space that
we'd reserved earlier.

Approved by: yamt
2004-03-30 14:50:46 +00:00
hannken
3db4e2acd8 Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.
VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp)  Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp)      Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.
2004-01-25 18:06:48 +00:00
pk
70f20a1217 Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes).  It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms.  Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.
2003-12-30 12:33:13 +00:00
yamt
cd2445d8d3 more assertion about file truncation to zero. 2003-11-07 14:48:28 +00:00
agc
aad01611e7 Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22364, verified by myself.
2003-08-07 16:26:28 +00:00
yamt
3852db2096 - protect global resource counts with lfs_subsys_lock.
- clean up scattered externs a little.
2003-07-12 16:17:06 +00:00
fvdl
d5aece61d6 Back out the lwp/ktrace changes. They contained a lot of colateral damage,
and need to be examined and discussed more.
2003-06-29 22:28:00 +00:00
darrenr
960df3c8d1 Pass lwp pointers throughtout the kernel, as required, so that the lwpid can
be inserted into ktrace records.  The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
2003-06-28 14:20:43 +00:00
yamt
c2b802ff24 fix b_interlock lock/unlock mismatches. 2003-04-27 06:46:38 +00:00
perseant
ef3c60764c Make LFS work better (though still not "well") as an NFS-exported
filesystem (and other things that needed to be fixed before the tests
would complete), to wit:

* Include the fs ident in the filehandle; improve stale filehandle checks.

* Change definition of blksize() to use the on-dinode size instead of
  the inode's i_size, so that fsck_lfs will work properly again.

* Use b_interlock in lfs_vtruncbuf.

* Postpone dirop reclamation until after the seglock has been released,
  so that lfs_truncate is not called with the segment lock held.

* Don't loop in lfs_fsync(), just write everything and wait.

* Be more careful about the interlock/uobjlock in lfs_putpages: when we
  lose this lock, we have to resynchronize dirtiness of pages in each
  block.

* Be sure to always write indirect blocks and update metadata in
  lfs_putpages; fixes a bug that caused blocks to be accounted to the
  wrong segment.
2003-04-23 07:20:37 +00:00
simonb
761de7345c '#if 0' out a variable that is currently only used in other '#if 0'd out
code.
2003-04-10 04:15:38 +00:00
fvdl
42614ed3f3 Add support for UFS2. UFS2 is an enhanced FFS, adding support for
64 bit block pointers, extended attribute storage, and a few
other things.

This commit does not yet include the code to manipulate the extended
storage (for e.g. ACLs), this will be done later.

Originally written by Kirk McKusick and Network Associates Laboratories for
FreeBSD.
2003-04-02 10:39:19 +00:00
perseant
c364d884f6 Hold the segment lock during truncation to prevent indirect blocks from
being written by lfs_updatemeta while lfs_truncate is also writing them,
a bug pointed out by YAMAMOTO Takashi <yamt@netbsd.org>.
2003-03-20 06:47:38 +00:00
perseant
8feb2c22f5 Take away "#ifdef LFS_UBC". 2003-03-08 21:46:04 +00:00
perseant
958a4c008c Don't force all truncations to be synchronous 2003-03-04 19:10:35 +00:00
perseant
cfc73a5fa9 Be careful to always zero pages on truncation/fragment extension,
in the case where the filesystem block size is larger than PAGE_SIZE.
2003-03-01 05:07:51 +00:00
perseant
daeb6c37d1 Make lfs_truncate handle file extension correctly, in the LFS_UBC case. 2003-02-28 07:37:56 +00:00
perseant
a94f9407dc Quell a hasty panic in lfs_truncate: on-inode disk addresses can be
different between the beginning and end of the call.
2003-02-28 04:37:07 +00:00
perseant
fdf4bfe002 Tabify, and fix some comment alignment problems. 2003-02-20 04:27:23 +00:00
perseant
b397c875ae Add code to UBCify LFS. This is still behind "#ifdef LFS_UBC" for now
(there are still some details to work out) but expect that to go
away soon.  To support these basic changes (creation of lfs_putpages,
lfs_gop_write, mods to lfs_balloc) several other changes were made, to
wit:

* Create a writer daemon kernel thread whose purpose is to handle page
  writes for the pagedaemon, but which also takes over some of the
  functions of lfs_check().  This thread is started the first time an
  LFS is mounted.

* Add a "flags" parameter to GOP_SIZE.  Current values are
  GOP_SIZE_READ, meaning that the call should return the size of the
  in-core version of the file, and GOP_SIZE_WRITE, meaning that it
  should return the on-disk size.  One of GOP_SIZE_READ or
  GOP_SIZE_WRITE must be specified.

* Instead of using malloc(...M_WAITOK) for everything, reserve enough
  resources to get by and use malloc(...M_NOWAIT), using the reserves if
  necessary.  Use the pool subsystem for structures small enough that
  this is feasible.  This also obsoletes LFS_THROTTLE.

And a few that are not strictly necessary:

* Moves the LFS inode extensions off onto a separately allocated
  structure; getting closer to LFS as an LKM.  "Welcome to 1.6O."

* Unified GOP_ALLOC between FFS and LFS.

* Update LFS copyright headers to correct values.

* Actually cast to unsigned in lfs_shellsort, like the comment says.

* Keep track of which segments were empty before the previous
  checkpoint; any segments that pass two checkpoints both dirty and
  empty can be summarily cleaned.  Do this.  Right now lfs_segclean
  still works, but this should be turned into an effectless
  compatibility syscall.
2003-02-17 23:48:08 +00:00
fvdl
a138610cac The oldblks and newblks arrays are used to store direct copies of
on-disk block pointers, so they should be int32_t. Error found
by Izumi Tsutsui.
2003-01-25 16:40:28 +00:00
fvdl
a3ff3a3038 Bump daddr_t to 64 bits. Replace it with int32_t in all places where
it was used on-disk, so that on-disk formats remain the same.
Remove ufs_daddr_t and ufs_lbn_t for the time being.
2003-01-24 21:55:02 +00:00
yamt
59be5399b7 - in lfs_reserve, vref vnodes that we're locking so that cleaner doesn't
try to reclaim them.
  (workaround for deadlock noted in the comment in lfs_reserveavail)
- in lfs_rename, mark vnodes which are being moved as well as directry vnodes.
2002-12-28 14:39:08 +00:00
provos
0f09ed48a5 remove trailing \n in panic(). approved perry. 2002-09-27 15:35:29 +00:00
perseant
32ae84b188 Deal with fragment size changes better. For each fragment that can
exist on an on-disk inode, we keep a record of its size in struct inode,
which is updated when we write the block to disk.  The cleaner routines
thus have ready access to what size is the correct size for this block,
on disk.

Fixed a related bug: if a file with fragments is being cleaned
(fragments being cleaned) at the same time it is being extended beyond
NDADDR blocks, we could write a bogus FINFO record that has a frag in the
middle; when it was cleaned this would give back bogus file data.  Don't
write the indirect blocks in this case, since there is no need.

lfs_fragextend and lfs_truncate no longer require the seglock, but instead
take a shared lock, which the seglock locks exclusively.
2002-07-06 01:30:11 +00:00
yamt
d566a58b5e fix printf format for DEBUG_LFS. 2002-07-02 19:07:03 +00:00
perseant
8886b0f4b2 Phase one of my three-phase plan to make LFS play nice with UBC, and bug-fixes
I found while making sure there weren't any new ones.

* Make the write clusters keep track of the buffers whose blocks they contain.
  This should make it possible to (1) write clusters using a page mapping
  instead of malloc, if desired, and (2) schedule blocks for rewriting
  (somewhere else) if a write error occurs.  Code is present to use
  pagemove() to construct the clusters but that is untested and will go away
  anyway in favor of page mapping.
* DEBUG now keeps a log of Ifile writes, so that any lingering instances of
  the "dirty bufs" problem can be properly debugged.
* Keep track of whether the Ifile has been dirtied by various routines that
  can be called by lfs_segwrite, and loop on that until it is clean, for
  a checkpoint.  Checkpoints need to be squeaky clean.
* Warn the user (once) if the Ifile grows larger than is reasonable for their
  buffer cache.  Both lfs_mountfs and lfs_unmount check since the Ifile can
  grow.
* If an inode is not found in a disk block, try rereading the block, under
  the assumption that the block was copied to a cluster and then freed.
* Protect WRITEINPROG() with splbio() to fix a hang in lfs_update.
2002-05-14 20:03:53 +00:00