Commit Graph

1199 Commits

Author SHA1 Message Date
drochner
e32ba1775e fix crash in mount error handling: don't free storage which was not
malloc'd
2005-07-25 11:42:38 +00:00
yamt
b7bfe82866 update file timestamps for nfsd loaned-read and mmap.
PR/25279.  discussed on tech-kern@.
2005-07-23 12:18:41 +00:00
yamt
6afb995fea ffs_full_fsync: because VBLK/VCHR can be mmap'ed,
do VOP_PUTPAGES for them as well.
2005-07-21 22:00:08 +00:00
yamt
2a6dc9d02d - introduce PGO_NOBLOCKALLOC and use it for ubc mapping
to prevent unnecessary block allocations in the case that
  page size > block size.

- ufs_balloc_range: use VM_PROT_WRITE+PGO_NOBLOCKALLOC rather than
  VM_PROT_READ.
2005-07-17 09:13:35 +00:00
thorpej
29af9583d2 Use ANSI function decls. 2005-07-15 05:01:16 +00:00
thorpej
4457fd076f Defflag UFS_DIRHASH. 2005-07-10 01:08:52 +00:00
thorpej
175c3312a8 - Use ANSI function decls.
- Sprinkle some static.
2005-07-10 00:18:52 +00:00
kml
dab4c6d721 Ensure that we change the size of the vnode at the same time as
we change the size of the inode, and use ext2fs_size uniformly.
This fixes a crash that occurs when I create a directory, then
move it, all on an ext2 filesystem.
2005-06-28 16:53:14 +00:00
yamt
44d128fa8e - constify genfs_ops.
- use member designators.
2005-06-28 09:30:37 +00:00
atatat
df13e3579e Change the rest of the sysctl subsystem to use const consistently.
The __UNCONST macro is now used only where necessary and the RW macros
are gone.  Most of the changes here are consumers of the
sysctl_createv(9) interface that now takes a pair of const pointers
which used not to be.
2005-06-20 02:49:18 +00:00
atatat
420d91208b Properly fix the constipated lossage wrt -Wcast-qual and the sysctl
code.  I know it's not the prettiest code, but it seems to work rather
well in spite of itself.
2005-06-09 02:19:59 +00:00
dbj
7753d41b8e remove (long) cast on bpref, which is daddr_t 2005-06-06 17:10:25 +00:00
dbj
331e001f0c the cluster summary must be swapped even for ufs2 2005-06-03 01:14:07 +00:00
is
4daeda666d fix copy/paste/don'tupdate bug (fix from PR 22232 by Robert Elz). 2005-06-02 10:08:36 +00:00
christos
c76e17575e s/buf/sbuf. 2005-05-31 02:37:50 +00:00
christos
07d1f24ff5 rename delay because it is a function on sparc. 2005-05-30 22:13:22 +00:00
christos
273df63602 - sprinkle const
- avoid shadow variables.
2005-05-29 21:25:24 +00:00
hannken
a69fbd6a18 - Use an empty snap block list to set the initial file size. Snapshot is
now valid from the beginning.  No need to copy the last fs block two times.
- No need to allocate the cylinder group blocks twice.
- cgbuf -> sbbuf
2005-05-25 11:07:13 +00:00
perseant
96f8f74d91 Don't update lfs_stats.segs_reclaimed if we're not keeping statistics.
Patch from Juan RP.
2005-05-25 01:50:01 +00:00
hannken
ffa83f8f0d ffs/ffs_alloc.c:
- Add a missing ACTIVECG_CLR().

ffs/ffs_snapshot.c:
- Use async/delayed writes for snapshot creation and sync/uncache these buffers
  on end. Reduces the time the file system must be suspended.
- Remove um_snaplistsize. Was a duplicate of um_snapblklist[0].
- Byte swap the list of preallocated blocks on read/write instead of access.
- Always keep this list on ip->i_snapblklist so it may be rolled back when the
  newest snapshot gets removed. Fixes a rare snapshot corruption when using
  more than one snapshot on a file system.

ufs/ufsmount.h:
  - Make TAILQ_LAST() possible on member um_snapshots.
  - Remove um_snaplistsize. Was a duplicate of um_snapblklist[0].
2005-05-22 08:35:28 +00:00
perseant
2ecd1730c0 Keep track of the number of segments reclaimed, since the cleaner doesn't
do this anymore (it hasn't for quite some time).  Add a couple of conditional
debugging messages to indicate why segments are not cleaned, in the event
that lfs_segclean is used.

Make the LFCNSEGWAITALL fcntl work again.
2005-05-20 19:48:25 +00:00
perseant
f8677583c3 VOP_LOCK drops the interlock; pick it up again to avoid an "already unlocked"
panic in lfs_putpages.
2005-05-20 19:09:25 +00:00
perseant
5760c21b8b Fill in the lfs_fsmnt field in the superblock when we mount the filesystem,
so fsck(8) can tell where it was last mounted.
2005-05-20 19:03:11 +00:00
hannken
a71c653aca flush_inodedep_deps(): If softdep_lookupvp() returns NULL it means the
inode has been reclaimed.  Skip the VOP_PUTPAGES() in this case.

Reviewed by: Chuck Silvers <chs@netbsd.org>
2005-05-07 14:24:14 +00:00
perseant
0d41dd0d46 Don't let the pager_map deadlock avoidance code in lfs_putpages() write
segments containing zero-block FINFO records.  These records cause segments
to become uncleanable, which would eventually result in a "no clean segments"
panic.
2005-05-04 04:58:22 +00:00
hannken
cad9d39281 Fix last commit. The last block of the file system may have changed
even if the last cylinder group is not modified.
2005-05-03 09:43:23 +00:00
perseant
5ed293c5d5 Recognize that we hold the v_interlock when relocking after a flush in
lfs_putpages.
2005-04-27 20:35:10 +00:00
skrll
d1c90589d8 Use the right arg structure for lfs_setattr, i.e. s/getattr/setattr/. 2005-04-25 06:28:51 +00:00
hannken
dc13562a0c Fix an inconsistency where the last block of the snapshot contains old data.
The last block of the file system is written to the snapshot before the
file system is suspended.  If the last cylinder group is modified after
the file system is suspended the last block of the snapshot may contain
old data.  So update this block again.
2005-04-24 15:49:37 +00:00
perseant
2f695b5476 Provide a resize_lfs(8), including kernel and cleaner support. The current
implementation requires the fs to be mounted while resizing.  Tested in both
directions, and everything appears to work happily, but ymmv.
2005-04-23 19:47:51 +00:00
yamt
5241cb4bbc don't assign to non-lvalue. found by gcc4. 2005-04-21 14:02:02 +00:00
perseant
f4a7694fc9 Keep per-inode, per-fs, and subsystem-wide counts of blocks allocated through
lfs_balloc(), and use that to estimate the number of dirty pages belonging
to LFS (subsystem or filesystem).  This is almost certainly wrong for
the case of a large mmap()ed region, but the accounting is tighter than
what we had before, and performs much better in the typical case of pages
dirtied through write().
2005-04-19 20:59:05 +00:00
perseant
f63fa194c2 Check the to-be-on-disk consistency of directories as well (correct a typo
in an earlier commit).
2005-04-18 23:03:08 +00:00
perseant
b2d19f57a3 Check for the inode having been previously freed, in UNMARK_VNODE().
Avoids a panic when calling mkdir() on a full filesystem.
2005-04-18 17:36:46 +00:00
perseant
5923fa20f1 Make userland compile again. 2005-04-16 19:52:09 +00:00
perseant
ad0169af41 Remove left-over reference to "lfs_blist", for _LKM case. 2005-04-16 18:10:12 +00:00
perseant
5ed792ecb0 Use splay trees, rather than a hash table, to manage the accounting of
blocks allocated through VOP_BALLOC() for pages to be written to disk.
This accounting no longer takes a noticeable fraction of the system CPU.
2005-04-16 17:35:58 +00:00
perseant
94decdd25d Use lfs_malloc() to manage the blkiov arrays that the cleaner functions use,
since the cleaner is likely to operate in a low-memory condition.
2005-04-16 17:28:37 +00:00
perseant
9936b8ce7e Tabify leading whitespace 2005-04-14 00:58:26 +00:00
perseant
f08a1ca4fa Consolidate the hash table we use to maintain the integrity of lfs_avail
into a single, system-wide table, rather than having a separate hash table
per inode.  Significantly reduces the "system" cpu usage of your average
file write.
2005-04-14 00:44:16 +00:00
perseant
2ee78c4fa9 Keep track of the highest block held by an LFS inode, so that we can
be assured that the last byte of a file is always allocated.  Previously
a file extension could cause the filesystem to be flushed, writing an
inconsistent inode to disk.  Although this condition would be corrected
the next time blocks were written to disk, an intervening crash would leave
the filesystem in an inconsistent state, leaving fsck_lfs to complain
of an inode "partially truncated".
2005-04-14 00:02:46 +00:00
perseant
af48a6d91c Clean up the handling of the pager_map deadlock in lfs_putpages, after
realizing that it is safe to sleep the second time through the loop.
2005-04-08 00:08:42 +00:00
perseant
c9d4fa4c0d Fix some locking issues that appeared with the simple_lock work.
Address a "pager_map" deadlock in lfs_putpages().
2005-04-06 04:30:46 +00:00
perseant
1ebfc508b6 Protect various per-fs structures with fs->lfs_interlock simple_lock, to
improve behavior in the multiprocessor case.  Add debugging segment-lock
assertion statements.
2005-04-01 21:59:46 +00:00
thorpej
e633e8b61b - Define a VFS_ATTACH() macro that places a reference to a vfsops structure
into the "vfsops" link set.
- Use VFS_ATTACH() where vfsops are declared for individual file systems.
- In vfsinit(), traverse the "vfsops" link set, rather than vfs_list_initial[].
2005-03-29 02:41:05 +00:00
christos
f2b82c7f8a make this compile again :-( 2005-03-26 19:40:31 +00:00
christos
aca59c847f Use vlog(9). Open-coding vlog here breaks lkm's because including
<sys/kprintf.h> includes opt_multiprocessor.h. One could argue
that the lock stuff should just move to subr_prf.c since nothing
else uses it.
2005-03-26 19:39:08 +00:00
perseant
bb7bbb2d16 Don't sleep while holding the vnode interlock. Should take care of the
first panic case in PR #26043.
2005-03-25 01:45:05 +00:00
bouyer
303cafe4e5 getblk() can return NULL if we are the pagedaemon. Check for this. 2005-03-24 20:13:17 +00:00
chs
f31a80ccd3 avoid the need for recursive locking lfs_flush_dirops() by unlocking
the vnode around the call to this in the caller.
2005-03-24 04:00:33 +00:00
perseant
c716c3d307 Make LFS dirops get their vnode first, before incrementing the dirop count,
to prevent a deadlock trying to call VOP_PUTPAGES() on a VDIROP vnode.
This can happen when a stacked filesystem is mounted on top of an LFS: an
LFS dirop needs to get a vnode, which is available from the upper layer.
The corresponding lower layer vnode, however, is VDIROP, so the upper layer
can't be cleaned out since its VOP_PUTPAGES() is passed through to the lower
layer, which waits for dirops to drain before it can proceed.  Deadlock.

Tweak ufs_makeinode() and ufs_mkdir() to pass the a_vpp argument through
to VOP_VALLOC().

Partially addresses PR # 26043, though it probably does not completely fix
the problem described there.
2005-03-23 00:12:51 +00:00
perseant
8e578e185f Be more careful about handling of flags to lfs_flush, to ensure that
the lfs_writing mutex is respected.
2005-03-09 22:12:15 +00:00
simonb
52c470b886 Tab Police. 2005-03-08 04:49:35 +00:00
perseant
eefd94b8e2 Straighten out the maze of ifdefs. Instead, consolidate all the debugging
stuff under '#ifdef DEBUG', and use sysctl knobs to turn on/off particular
parts of the debugging reporting (if DEBUG is enabled).  Re-enable the LFS
statistics in sysctl, while I'm there.  A bit of a rototill.
2005-03-08 00:18:19 +00:00
perseant
8de99480fa Move "ifile is too large for your NBUFS/BUFPAGES" messages into a function.
Use log(9) to warn the user instead of printf(9).  Since the theory is that
the Ifile is "always in cache", but the greater performance risk is
when the inode entries can't be held in cache, note these two cases
separately, at different log levels (notice and warning, respectively).
2005-03-04 22:19:05 +00:00
christos
cac7cf0758 PR/26823: Michael L. Hitch: Endianness flag were not preserved in the compat
superblock read routine.
2005-03-04 21:45:29 +00:00
perseant
871beffabf Put the ISSPACE() check where it belongs. This allows rewriting a file
on a full filesystem while still returning ENOSPC on an attempt to allocate
new blocks.
2005-03-02 21:16:09 +00:00
perry
bcfcddbac1 nuke trailing whitespace 2005-02-26 22:31:44 +00:00
perseant
25f49c3c91 Various minor LFS improvements:
* Note when lfs_putpages(9) thinks it is not going to be writing any
  pages before calling genfs_putpages(9).  This prevents a situation in
  which blocks can be queued for writing without a segment header.
* Correct computation of NRESERVE(), though it is still a gross
  overestimate in most cases.  Note that if NRESERVE() is too high, it
  may be impossible to create files on the filesystem.  We catch this
  case on filesystem mount and refuse to mount r/w.
* Allow filesystems to be mounted whose block size is == MAXBSIZE.
* Somewhere along the line, ufs_bmaparray(9) started mangling UNWRITTEN
  entries in indirect blocks again, triggering a failed assertion "daddr
  <= LFS_MAX_DADDR".  Explicitly convert to and from int32_t to correct
  this.
* Add a high-water mark for the number of dirty pages any given LFS can
  hold before triggering a flush.  This is settable by sysctl, but off
  (zero) by default.
* Be more careful about the MAX_BYTES and MAX_BUFS computations so we
  shouldn't see "please increase to at least zero" messages.
* Note that VBLK and VCHR vnodes can have nonzero values in di_db[0]
  even though their v_size == 0.  Don't panic when we see this.
* Change lfs_bfree to a signed quantity.  The manner in which it is
  processed before being passed to the cleaner means that sometimes it
  may drop below zero, and the cleaner must be aware of this.
* Never report bfree < 0 (or higher than lfs_dsize) through
  lfs_statvfs(9).  This prevents df(1) from ever telling us that our full
  filesystems have 16TB free.
* Account space allocated through lfs_balloc(9) that does not have
  associated buffer headers, so that the pagedaemon doesn't run us out
  of segments.
* Return ENOSPC from lfs_balloc(9) when bfree drops to zero.
* Address a deadlock in lfs_bmapv/lfs_markv when the filesystem is being
  unmounted.  Because vfs_busy() is a shared lock, and
  lfs_bmapv/lfs_markv mark the filesystem vfs_busy(), the cleaner can be
  holding the lock that umount() is blocking on, then try to vfs_busy()
  again in getnewvnode().
2005-02-26 05:40:42 +00:00
hannken
1d85e05ec4 Make `options FFS_NO_SNAPSHOT' only disable snapshot creation
while not trashing existing snapshots.

Approved by: core@
2005-02-21 17:52:11 +00:00
dsl
bd99144a6b change ffs_snapshot to !ffs_no_snapshot 2005-02-18 21:15:38 +00:00
chs
9cc4bd69b2 fix typoe in previous. 2005-02-14 02:22:48 +00:00
dsl
25579adfb4 Make ffs snapshots be enabled by 'option FFS_SNAPSHOT' 2005-02-10 22:23:19 +00:00
dsl
2f6f4269bd Add a stub file so that snapshot support can be compiled out.
Will allow INSTALL_TINY to fit back in its designated space.
Since the calling code doesn't allow a snapshot mount to fail, this code
will output a warning and delete any snapshots it finds.
This only happend on rw mounts - snapshots don't seem to be created
when mounting ro.
The whole way the snapshots gets mounted is a PITA anyway, the superblock
'last mounted' time should be used to validate that the fs hasn't been
mounted elsewhere.
2005-02-10 22:22:32 +00:00
ws
5387f7217c Add support for large files (>2GB).
Like Linux, automagically convert old filesystem to use this,
if they are already at revision 1.
For revision 0, just punt (unlike Linux; makes me a bit too nervous.)

There should be an option to fsck_ext2fs to upgrade revision 0 to revision 1.

Reviewd by Manuel (bouyer@).
2005-02-09 23:02:10 +00:00
hannken
8bb5af4d2e Fss device only checks read access to snapshot vode. On snapshot creation
check we are either super-user or owner of the snapshot vnode.
2005-02-09 16:05:29 +00:00
hannken
c13136f43f No longer needed. Ffs snapshots are enabled by default. 2005-01-31 22:21:17 +00:00
hannken
d5fbb6936f Add file system snapshots to kernel configs.
- Ffs internal snapshots get compiled in unconditionally.

- File system snapshot device fss(4) added to all kernel configs that
  have a disk.  Device is commented out on all non-GENERIC kernels.

Reviewed by: Jason Thorpe <thorpej@netbsd.org>
2005-01-31 16:54:32 +00:00
wrstuden
442d792d00 Fix pasto in previous. We only perform the DIOCCACHESYNC call if
FSYNC_CACHE is set, not if FSYNC_WAIT is set.
2005-01-27 02:16:42 +00:00
wrstuden
e384a44e9d Extend fsync_range(2) to support the FDISKSYNC flag, which requests
that the sync be propogated out through the disk drive caches.
2005-01-25 23:55:20 +00:00
dbj
d681cb1ea9 check _KERNEL_OPT instead of !_LKM to conditionalize opt includes 2005-01-24 21:34:48 +00:00
rumble
468646676a Remove dirhash.h. 2005-01-24 01:32:22 +00:00
rumble
32386a4e99 Bring in Ian Dowse's Dirhash from FreeBSD. Hash tables of
directories are created on the fly and used to increase
performance by circumventing ufs_lookup's linear search.

Dirhash is enabled by the UFS_DIRHASH option, but not
by default.
2005-01-23 19:37:05 +00:00
hannken
5b0fdd5c72 Protect calls to ffs_*_swap' with #ifdef FFS_EI'. 2005-01-18 10:40:21 +00:00
mycroft
7f1fe4e81f Rearrange some code slightly to avoid uninitialized variable warnings. 2005-01-11 00:19:36 +00:00
chs
8975a0856f adjust the UBC mapping code to support non-vnode uvm_objects.
this means we can no longer look at the vnode size to determine how many
pages to request in a fault, which is good since for NFS the size can change
out from under us on the server anyway.  there's also a new flag UBC_UNMAP
for ubc_release(), so that the file system code can make the decision about
whether to cache mappings for files being used as executables.
2005-01-09 16:42:43 +00:00
mycroft
e72fc6717e Whoops -- move the location of the VOP_OPEN()/VOP_CLOSE(), et al, from
foo_mountfs() to foo_mount(), to match the new mountroot API.
Also, for ext2fs and lfs, copy some restructuring from ffs to allow changing
file system parameters without specifying the device name.
(ntfs could use some more work.)
2005-01-09 09:27:17 +00:00
mycroft
0461b30ac3 Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions.  This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC.  I've been running this for a
long time.
2005-01-09 03:11:48 +00:00
thorpej
1c95472d01 Add the system call and VFS infrastructure for file system extended
attributes.

From FreeBSD.
2005-01-02 16:08:28 +00:00
dbj
0c5a27af69 remove opt_compat_netbsd.h, afaict it is no longer needed.
i think it was previously used to pull in COMPAT_09 for ffs_statfs
2004-12-26 17:34:39 +00:00
dbj
9b0bad335a use #if defined(_KERNEL_OPT) around opt includes
fix arg to pool_init() when _LKM is defined
2004-12-20 03:12:20 +00:00
mycroft
5ac91d4849 Remove some unnecessary (int32_t) casts that would cause us to screw up the
top bit in block addresses.

Also, change some daddr_t->int32_t casts (mostly as arguments to ufs_rw32(),
where they would get promoted anyway) to u_int32_t.
2004-12-15 07:11:51 +00:00
jdolecek
3524fe97e8 allow changes of the sysctl values 2004-11-21 19:21:51 +00:00
christos
67197f4e02 Remove erroneous KASSERT; i_size is one of the fields mentioned in
<ufs/inode.h> as unused by ext2fs.
2004-11-14 19:42:13 +00:00
christos
e98accb116 Put the correct fragment size in struct statvfs. From Kevin Lahey. 2004-11-11 01:32:12 +00:00
yamt
3bc2a57904 - hide bufq_state in mfsnode from userland.
- move bufq.h into obsolete set.

tested to compile pkgsrc/sysutils/lsof.
2004-11-09 08:46:08 +00:00
yamt
05f25dcc2a move buffer queue related stuffs from buf.h to their own header, bufq.h. 2004-10-28 07:07:35 +00:00
dbj
2bd0f63fd0 print absolute inode number in debug output when freeing free inode occurs.
previously, the number was relative to the cylinder group, which was confusing.
prefix debug message with "ifree:" so this can be differentiated in bug reports.
2004-10-11 17:15:36 +00:00
dbj
c45b4268dc remove diagnostic check for modified inactive inodes in ufs_reclaim
this condition can occur if ufs_inactive experiences failure attempting
to write the inode out.  Instead, have ufs_reclaim always call VOP_UPDATE
which will only write out the inode if there are unflushed changes
2004-10-08 18:43:50 +00:00
thorpej
11afd11faa Add a new VNODE_LOCKDEBUG option, which enables checks in the VOP_*()
calls to ensure that the vnode lock state is as expected when the VOP
call is made.  Modify vnode_if.src to set the expected state according
to the documenting lock table for each VOP.  Modify vnode_if.sh to emit
the checks.

Notes:
- The checks are only performed if the vnode has the VLOCKSWORK bit
  set.  Some file systems (e.g. specfs) don't even bother with vnode
  locks, so of course the checks will fail.
- We can't actually run with VNODE_LOCKDEBUG because there are so many
  vnode locking problems, not the least of which is the "use SHARED for
  VOP_READ()" issue, which screws things up for the entire call chain.

Inspired by similar changes in OpenBSD, but implemented differently.
2004-09-21 03:10:35 +00:00
yamt
cc047d3821 um_maxfilesize should be set after
ffs_oldfscompat_read adjusted fs_maxfilesize.
2004-09-19 11:58:29 +00:00
yamt
22399b45d0 change some members of struct buf from long to int.
ride on 2.0H.
2004-09-18 16:40:11 +00:00
skrll
f7155e40f6 There's no need to pass a proc value when using UIO_SYSSPACE with
vn_rdwr(9) and uiomove(9).

OK'd by Jason Thorpe
2004-09-17 14:11:20 +00:00
yamt
6f3db818ea ufs_getlbns:
- fix an integer overflow when calculating lbns of indirect blocks.
- remove a redundant calculation of blockcnt.
2004-09-15 09:52:49 +00:00
yamt
77641b98f3 g/c no longer used definition of fs_maxfilesize. 2004-09-10 09:36:05 +00:00
hannken
5816dad45e While creating a snapshot inodes must be freed from the
snapshot, not from the file system.
ffs_freefile() needs explicit "fs" and "devvp" arguments.
2004-08-29 10:13:48 +00:00
mycroft
2070a0c580 Make sure to set IMNT_DTYPE here... 2004-08-16 12:49:55 +00:00
mycroft
84b11bd3be Another piece of FFS_EI flotsam. 2004-08-15 21:53:03 +00:00
mycroft
0d7e3746eb Repair some FFS_EI code for ufsmount changes. 2004-08-15 21:34:14 +00:00
mycroft
bb17450999 Don't write out the extra zero pages with PGO_SYNCIO. We start an asynchronous
write anyway, and they will not be freed until that write is finished.
2004-08-15 19:01:16 +00:00