Commit Graph

762 Commits

Author SHA1 Message Date
ad
c35a9dfad1 Put a TNF copyright on it. 2008-05-31 21:39:13 +00:00
ad
3592ae4882 XXX softdep:
If the number of deletes in progress is getting too high, newdirrem()
requests the syncer to flush faster, and in some cases will block to
prevent deletes accumulating faster than the disk can service them.

The syncer will try to lock vnodes that the remover holds locked, leading
to the syncer and remover proceeding in lockstep and making very little
overall forward progress.

Put a hook into ufs_rmdir() and ufs_remove() so that the softdep code
can pace itself without holding vnode locks if the number of deletes is
running out of control.
2008-05-31 21:37:08 +00:00
hannken
336f2a69f4 ffs_copyonwrite(): stop abusing ffs_balloc() to get a block address.
Use ufs_getlbns()/bread() instead.
Saves some reads and removes deep recursion with possible deadlock
when ffs_balloc() runs copy-on-write on the buffer returned.
2008-05-29 10:00:50 +00:00
hannken
5d2bff060a Make sure all cached buffers with valid, not yet written data have been
run through copy-on-write.  Call fscow_run() with valid data where possible.

The LP_UFSCOW hack is no longer needed to protect ffs_copyonwrite() against
endless recursion.

- Add a flag B_MODIFY to bread(), breada() and breadn().  If set the caller
  intends to modify the buffer returned.

- Always run copy-on-write on buffers returned from ffs_balloc().

- Add new function ffs_getblk() that gets a buffer, assigns a new blkno,
  may clear the buffer and runs copy-on-write.  Process possible errors
  from getblk() or fscow_run().  Part of PR kern/38664.

Welcome to 4.99.63

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
2008-05-16 09:21:59 +00:00
rumble
a1221b6d4a Convert file systems to dynamically attach with the new module interface.
Make VFS hooks dynamic while we're here and say farewell to VFS_ATTACH and
VFS_HOOKS_ATTACH linksets.

As a consequence, most of the file systems can now be loaded as new style
modules.

Quick sanity check by ad@.
2008-05-10 02:26:09 +00:00
ad
42d0626726 PR kern/38141 lookup/vfs_busy acquire rwlock recursively
Simplify the mount locking. Remove all the crud to deal with recursion on
the mount lock, and crud to deal with unmount as another weirdo lock.

Hopefully this will once and for all fix the deadlocks with this. With this
commit there are two locks on each mount:

- krwlock_t mnt_unmounting. This is used to prevent unmount across critical
  sections like getnewvnode(). It's only ever read locked with rw_tryenter(),
  and is only ever write locked in dounmount(). A write hold can't be taken
  on this lock if the current LWP could hold a vnode lock.

- kmutex_t mnt_updating. This is taken by threads updating the mount, for
  example when going r/o -> r/w, and is only present to serialize updates.
  In order to take this lock, a read hold must first be taken on
  mnt_unmounting, and the two need to be held across the operation.

One effect of this change: previously if an unmount failed, we would make a
half hearted attempt to back out of it gracefully, but that was unlikely to
work in a lot of cases. Now while an unmount that will be aborted is in
progress, new file operations within the mount will fail instead of being
delayed. That is unlikely to be a problem though, because if the admin
requests unmount of a file system then s(he) has made a decision to deny
access to the resource.
2008-05-06 18:43:44 +00:00
ad
e071d39c84 - Convert hashinit() to use kmem_alloc(). The hash tables can be large
and it's better to not have them in kmem_map.
- Convert a couple of minor items along the way to kmem_alloc().
- Fix some memory leaks.
2008-05-05 17:11:16 +00:00
ad
928a6b2096 PR kern/38135 vfs_busy/vfs_trybusy confusion
The previous fix worked, but it opened a window where mounts could have
disappeared from mountlist while the caller was traversing it using
vfs_trybusy(). Fix that.
2008-04-30 12:49:16 +00:00
ad
baa3395f8f PR kern/38057 ffs makes assuptions about devvp file system
PR kern/33406 softdeps get stuck in endless loop

Introduce VFS_FSYNC() and call it when syncing a block device, if it
has a mounted file system.
2008-04-29 18:18:08 +00:00
hannken
2fd9e21242 Replace get/setspecific with a void pointer in struct ufsmount. Use explicit
initialization/finalization of snapshot private data on creation/deletion
of struct ufsmount.
Snapshot mounts no longer may fail silently because kmem_alloc() fails.

Welcome to 4.99.60

Ok: Andrew Doran <ad@netbsd.org>
2008-04-17 09:52:47 +00:00
ad
0701eb1ec7 newdirrem: if the number of deletes in progress is getting too high, start
pushing the syncer before considering rate limiting the deletes. We hold
vnodes locked and it's likely that the syncer will try to lock them while
flushing, leading to the syncer and remover proceeding in lockstep and
making very little forward progress. XXX this is not a solution.
2008-04-11 16:25:38 +00:00
ad
be04ac4896 Make rusage collection per-LWP and collate in the appropriate places.
cloned threads need a little bit more work but the locking needs to
be fixed first.
2008-03-27 19:06:51 +00:00
matt
e2ca3f7504 Merge all the *different* definitions of bufqueues into one common one. 2008-02-20 17:13:29 +00:00
ad
fb00b83874 Give bbusy() an interlock argument. If the we need to wait for the buffer,
the interlock is dropped and reacquired when awoken. This allows for
busying buffers attached to a list that is not locked by bufcache_lock.
2008-02-15 13:46:04 +00:00
hannken
a524d758da Make it work after lockmgr -> vlockmgr conversion:
- Initialize si_vnlock in si_mount_init().
  - Also initialize vl_recursecnt to zero.
- Destroy it only in si_mount_dtor().
- Simplify the v_lock <-> si_vnlock exchange.
- Don't abuse the overall error variable for LK_NOWAIT errors.
- ffs_snapremove: release the vnode one instead of three times.
2008-01-30 17:20:04 +00:00
ad
7356aff6af Replace use of LK_SLEEPFAIL. 2008-01-30 14:50:28 +00:00
ad
25153c3ec9 PR kern/37706 (forced unmount of file systems is unsafe):
- Do reference counting for 'struct mount'. Each vnode associated with a
  mount takes a reference, and in turn the mount takes a reference to the
  vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount
  locking inherited from 4.4BSD with a recursable rwlock.
2008-01-30 11:46:59 +00:00
ad
3490efcc63 Replace struct lock on vnodes with a simpler lock object built on
krwlock_t. This is a step towards removing lockmgr and simplifying
vnode locking. Discussed on tech-kern.
2008-01-30 09:50:19 +00:00
hannken
89cea1c2c4 - Always destroy si_vnlock after use.
- Take care of vnodes without file system data.
2008-01-28 17:49:06 +00:00
dholland
717e1785a5 Fix some race conditions in rename.
Introduce a per-FS rename lock and new vfsops to manipulate it.
Get this lock while renaming. Also add another relookup() in do_sys_rename,
which is a hack to kludge around some of the worst deficiencies of
ufs_rename.
reviewed-by: pooka (and an earlier rev by ad)
posted on tech-kern with no objections.
2008-01-28 14:31:15 +00:00
ad
1997a1e1f4 Remove VOP_LEASE. Discussed on tech-kern. 2008-01-25 14:32:11 +00:00
pooka
1bff00ecec Destroy extattr lock when destroying extattrs associated with the
mountpoint.  Make stopping extattrs always succesful to facilitate
always being able to free resources.
2008-01-25 10:49:32 +00:00
ad
703069c0e9 specfs changes for PR kern/37717 (raidclose() is no longer called on
shutdown). There are still problems with device access and a PR will be
filed.

- Kill checkalias(). Allow multiple vnodes to reference a single device.

- Don't play dangerous tricks with block vnodes to ensure that only one
  vnode can describe a block device. Instead, prohibit concurrent opens of
  block devices. As a bonus remove the unreliable code that prevents
  multiple file system mounts on the same device. It's no longer needed.

- Track opens by vnode and by device. Issue cdev_close() when the last open
  goes away, instead of abusing vnode::v_usecount to tell if the device is
  open.
2008-01-24 17:32:52 +00:00
hannken
0409981d90 si_mount_dtor(): destroy si_vnlock before free. 2008-01-24 17:29:56 +00:00
hannken
ca2bbf9908 Fix a typo from the vmlocking2 merge: vmark() the right vnode. 2008-01-24 16:26:44 +00:00
pooka
c40d447080 Sprinkle comments about um_lock status on function entry and exit.
No functional change.
2008-01-21 23:36:26 +00:00
ad
08c770b434 Initialize caches at IPL_SOFTBIO (not IPL_NONE) so that we are allocating
from kmem_map.
2008-01-12 00:34:10 +00:00
ad
dd68496f43 Fix hangs on 'biolock' when creating a directory under / with softdep. 2008-01-09 18:20:54 +00:00
ad
0b52913dee Go back to freeing on disk inodes in the inactive routine. It would be
better not to do this, but it rules out potential side effects with softdep.
2008-01-09 16:15:22 +00:00
ad
52603a7d6d Fix 'panic: softdep_update_inodeblock: update failed'. 2008-01-07 16:56:27 +00:00
tnn
51f964a289 softdep_freefile: don't acquire ufsmount lock twice. 2008-01-07 05:20:25 +00:00
ad
e01dd1a1f8 Use pool_cache. 2008-01-03 19:28:48 +00:00
pooka
d53e261066 valloc -> vnalloc, vfree -> vnfree
Avoids collision with userland valloc(3).

no functional change
ad ok
2008-01-03 01:26:28 +00:00
ad
4a780c9ae2 Merge vmlocking2 to head. 2008-01-02 11:48:20 +00:00
perry
b6a2ef7569 Convert many of the uses of __attribute__ to equivalent
__packed, __unused and __dead macros from cdefs.h
2007-12-25 18:33:32 +00:00
dyoung
c9b4d7e0b1 Call genfs_node_init a little earlier to avoid a vput()ing an
uninitialized node, later, which leads to a kernel panic.  Patch
by Antti Kantee.
2007-12-20 16:18:57 +00:00
pooka
db06a930e6 Remove cn_lwp from struct componentname. curlwp should be used
from on.  The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
2007-12-08 19:29:36 +00:00
ad
5870896633 Grab ump->um_lock in another spot. 2007-12-08 15:23:32 +00:00
ad
c27abd2a99 Add some comments. 2007-12-08 15:21:19 +00:00
hannken
d556dc98b0 Fscow_run(): add a flag "bool data_valid" to note still valid data.
Buffers run through copy-on-write are marked B_COWDONE.  This condition
is valid until the buffer has run through bwrite() and gets cleared from
biodone().

Welcome to 4.99.39.

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
2007-12-02 13:56:15 +00:00
pooka
61e8303e9d Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start.  In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
2007-11-26 19:01:26 +00:00
dholland
9c31b45f5e Change the fs_clean member of the ffs superblock to be unsigned
(uint8_t instead of int8_t) - this prevents an ugly sign-extension
printing bug as well as formally undefined behavior when you mount an
unclean fs enough times.

From (my own) PR kern/28134; I've been carrying this patch for three
years, long enough to forget about it, and it's had no ill effects in
that time.

reviewed: pooka
2007-11-23 18:00:46 +00:00
ad
d18c6ca4de Merge from vmlocking:
- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
2007-11-07 00:23:13 +00:00
hannken
75bdfc0dac Avoid doing bawrite to initialize inode block while holding cylinder
group block buffer busy.  If filesystem has any active snapshots, bawrite
can come back trying to allocate new snapshot data block from the same
cylinder group and cause deadlock.

From FreeBSD Rev. 1.117
2007-11-01 06:31:59 +00:00
hannken
df70316e3f Ffs_blkfree() and ffs_freefile() take a devvp that may be a regular file whencalled from snapshot creation. Be sure to use the right mount.
Ok: Andrew Doran <ad@netbsd.org>
2007-10-18 17:39:04 +00:00
ad
7dad9f7391 Merge from vmlocking:
- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
2007-10-10 20:42:20 +00:00
ad
5c3b2b3f2d Merge ffs locking & brelse changes from the vmlocking branch. 2007-10-08 18:01:27 +00:00
hannken
3856acafe2 Update the file system copy-on-write handler.
- Instead of hooking the handler on the specdev of a mounted file system
  hook directly on the `struct mount'.

- Rename from `vn_cow_*' to `fscow_*' and move to `kern/vfs_trans.c'.  Use
  `mount_*specific' instead of clobbering `struct mount' or `struct specinfo'.

- Replace the hand-made reader/writer lock with a krwlock.

- Keep `vn_cow_*' functions and mark as obsolete.

- Welcome to NetBSD 4.99.32 - `struct specinfo' changed size.

Reviewed by: Jason Thorpe <thorpej@netbsd.org>
2007-10-07 13:38:53 +00:00
pooka
094e51ae36 Fix comment inaccurate from prehistoric times: default MINFREE is 5, not 10 2007-09-24 16:20:50 +00:00
pooka
3f3cac88a3 Make bioops a pointer and point it to the softdeps struct in softdep
init.  Decouples "options SOFTDEP" from the main kernel and ffs code.
2007-09-01 23:40:21 +00:00
hannken
5657bcacb9 Modify ffs_lock() to take care for changed v_vnlock. Snapshots do not need
transferlockers() anymore.

From FreeBSD ffs_vnops.c Rev. 1.159

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
2007-08-21 09:27:33 +00:00
hannken
f43278a84f - Use a mutex to protect snapinfo.
- Move the snapshot lock to snapinfo.
- ffs_snapblkfree(),ffs_copyonwrite(): replace lockmgr() with VOP_LOCK().
2007-08-18 10:11:01 +00:00
hannken
7d3c830b7c Expunge traces of unlinked snapshot files when making a new snapshot.
From FreeBSD Rev. 1.123
2007-08-18 09:48:33 +00:00
hannken
d56e8d1456 Move the fstrans-aware lock vnops from ufs to ffs. Other ufs file systems
do not need them.

Ride on 4.99.28
2007-08-09 09:22:34 +00:00
hannken
70cc392f1e Move snapshot per-mount data from struct ufsmount to mount specific data.
No functional changes.

Welcome to 4.99.28  (struct ufsmount changed size)
2007-08-09 07:34:27 +00:00
pooka
8d1f899239 * nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
  use VFS_PROTOS() instead of manually prototyping the methods
2007-07-31 21:14:15 +00:00
ad
a0d1fd8d0c It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
2007-07-29 13:31:07 +00:00
pooka
9137aeda4b In sync, skip over vnodes based on if they are clean rather than
if they have pages.
2007-07-20 16:46:43 +00:00
pooka
e24b0872a4 Make set_statvfs_info() take a parameter for the vfs name instead
of always retrieving it from mp->mnt_op->vfs_name

christos ok
2007-07-17 11:19:31 +00:00
pooka
395899ddd0 When allocating blocks, check minfree before asking kauth about
suser.  The latter has unknown cost and rarely needs to be called.
2007-07-16 14:26:08 +00:00
dsl
2721ab6c7b Change the VFS_MOUNT() interface so that the 'data' buffer passed to the
fs code is a kernel buffer, pass though the length of the buffer as well.
Since the length of the userspace buffer isn'it (yet) passed through the mount
system call, add a field to the vfsops structure containing the default length.
Split sys_mount() for calls from compat code.
Ride one of the recent kernel version changes - old fs LKMs will load, but
sys_mount() will reject any attempt to use them.
2007-07-12 19:35:32 +00:00
hannken
a8c44dfbf5 ffs_snapshot_mount: No persistent snapshots on an Apple UFS file system.
From Thor Lancelot Simon <tls@netbsd.org>
2007-07-12 09:30:04 +00:00
hannken
af689d2468 Restore the special lkt_held handling for softdep_disk_write_complete().
No more panics 'worklist_remove: lock not held' on DEBUG kernels.

Ok Andrew Doran <ad@netbsd.org>
2007-07-10 10:47:07 +00:00
hannken
149abfc10f Move `struct dquot' and its supporting functions from quota.h to ufs_quota.c.
- Make quota-internal functions static.
- Clean up declarations in quota.h and ufs_extern.h.  quota.h now has the
  description of quota criterions, on-disk structure, user-kernel interface and
  declaration of init/done functions.  All ufs quota related function
  prototypes go to ufs_extern.h.
- New functions ufsquota_init() and ufsquota_free() create or destroy the
  quota fields of `struct inode'.
- chkdq() and chkiq() always update the quota fields of `struct inode' first.
- Only ufs_access() explicitely calls getinoquota().

No objections on tech-kern@
2007-07-10 09:50:07 +00:00
ad
e8f24929ec Fix build with DEBUG. 2007-07-09 22:52:14 +00:00
ad
c63e04d3bd We got LWPs years ago.. 2007-07-09 22:02:00 +00:00
ad
88ab7da936 Merge some of the less invasive changes from the vmlocking branch:
- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
2007-07-09 20:51:58 +00:00
pooka
835b0326c5 Using POOL_INIT here makes no sense, since file systems always have
an init method.  So get rid of it and #ifdef _LKM and just always
init in the init method.  Give malloc types the same treatment.
Makes file systems nicer to work with in linksetless environments
and fixes a few LKM discrepancies.
2007-06-30 09:37:53 +00:00
pooka
e01e324216 remove redundant KASSERTs 2007-06-29 15:34:59 +00:00
yamt
7225d589de remove a duplicated definition of FFS_ITIMES. 2007-06-07 05:34:48 +00:00
yamt
da51d139a4 improve post-ubc file overwrite performance in common cases.
ie. when it's safe, actually overwrite blocks rather than doing
read-modify-write.

also fixes PR/33152 and PR/36303.
2007-06-05 12:31:30 +00:00
tsutsui
cd07663368 Fix inconsistent changes in rev 1.153 and 1.154:
Adjust fs->fs_maxfilesize instead of ump->um_maxfilesize
in ffs_oldfscompat_read() because the latter is overrided
by the former after ffs_oldfscompat_read() returned.

Fixes EFBIG errors on read(2) and "exec /sbin/init: error 8"
problem on mac68k after mountroot() on old 4.3BSD UFS created
by the Mkfs tool for MacOS (reported and confirmed on port-mac68k).
2007-05-29 11:30:17 +00:00
ad
1ae6657a7b Fix lock order inversion between vnode locks and ufs_hashlock. Addresses
kern/36331 (MP deadlock between ufs_ihashget() and VOP_LOOKUP()) for ffs,
other file systems to follow. Reported by perseant@, debugged by Sverre
Froyen, patch posted/tested by Blair Sadewitz.
2007-05-28 23:42:56 +00:00
hannken
64b7e5637e Fstrans_start() always returns zero, so change its type to void. 2007-05-17 07:26:21 +00:00
yamt
d4bb61958b flush_inodedep_deps: fix access after free. PR/29724. 2007-05-07 11:13:01 +00:00
hannken
fc6776f366 Remove now obsolete vn_start_write() and vn_finished_write() and
corresponding flags.

Revert softdep_trackbufs() to its state before vn_start_write() was added.

Remove from struct mount now unneeded flags IMNT_SUSPEND* and
members mnt_writeopcountupper, mnt_writeopcountlower and mnt_leaf.

Welcome to 4.99.17
2007-04-08 11:20:42 +00:00
hannken
11601689e7 Remove calls to now obsolete vn_start_write() and vn_finished_write(). 2007-04-07 14:21:52 +00:00
ad
59d979c5f1 Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.
2007-03-12 18:18:22 +00:00
christos
53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
thorpej
b3667ada6d TRUE -> true, FALSE -> false 2007-02-22 06:05:00 +00:00
thorpej
712239e366 Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
2007-02-21 22:59:35 +00:00
ad
adbb9ec2fa Call genfs_node_destroy() where appropriate. 2007-02-20 16:21:03 +00:00
pavel
934634a18c Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.
2007-02-17 22:31:36 +00:00
hannken
198beb0314 Make fstrans(9) the default helper for file system suspension.
Replaces the now obsolete vn_start_write()/vn_finished_write().
2007-02-16 17:23:53 +00:00
ad
9abeea588a Replace some uses of lockmgr() / simplelocks. 2007-02-15 15:40:50 +00:00
ad
b07ec3fc38 Merge newlock2 to head. 2007-02-09 21:55:00 +00:00
hannken
4d607243ba Change fstrans enum types to upper case.
No functional change.

From Antti Kantee <pooka@netbsd.org>
2007-01-29 15:42:50 +00:00
hubertf
eda05c6413 Remove more duplicate headers.
Patch by Slava Semushin <slava.semushin@gmail.com>

Again, this was tested by comparing obj files from a pristine and a patched
source tree against an i386/ALL kernel, and also for src/sbin/fsck_ffs,
src/sbin/fsdb and src/usr.sbin/makefs. Only changes in assert() line numbers
were detected in 'objdump -d' output.
2007-01-29 01:52:43 +00:00
hannken
1b9c6382e3 New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE.  This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).
2007-01-19 14:49:08 +00:00
isaki
be241a0e51 Correct indent. 2007-01-07 09:33:18 +00:00
elad
1e70d64818 Consistent usage of KAUTH_GENERIC_ISSUSER. 2007-01-04 16:55:29 +00:00
hannken
818b049a35 On snapshot creation be sure the snapshot vnode has valid quota information.
Fixes PR kern/35121
2006-12-02 17:21:11 +00:00
christos
5d48d92007 ifdef out an unused function if !FFS_NO_SNAPSHOT 2006-11-16 21:21:34 +00:00
christos
168cd830d2 __unused removal on arguments; approved by core. 2006-11-16 01:32:37 +00:00
reinoud
dc5b5420b9 Revisit mnt_vnodelist TAILQ patch. Remove all suspicious TAILQ_FOREACH()
loops where vnodes can get removed or added during the loops. This could
lead to panic's on unmount since nodes are skipped or otherwise
TAILQ_NEXT(0xdeadbeef, ...) was dereferenced.
2006-10-25 22:01:54 +00:00
drochner
bffbce1754 import a fix from FreeBSD (rev.1.185):
After a rmdir()ed directory has been truncated, force an update of
the directory's inode after queuing the dirrem that will decrement
the parent directory's link count.  This will force the update of
the parent directory's actual link to actually be scheduled.  Without
this change the parent directory's actual link count would not be
updated until ufs_inactive() cleared the inode of the newly removed
directory, which might be deferred indefinitely.  ufs_inactive()
will not be called as long as any process holds a reference to the
removed directory, and ufs_inactive() will not clear the inode if
the link count is non-zero, which could be the result of an earlier
system crash.
[plus description about problems woth background fsck solved
by this; irrelevant to NetBSD]

For me, the good effect is at least that I'm getting less filesystem
inconsistencies after a crash.

Approved by christos quite a while ago.
2006-10-24 19:36:26 +00:00
reinoud
0ce809091d Replace the LIST structure mp->mnt_vnodelist to a TAILQ structure since all
vnodes were synced and processed backwards. This meant that the last
accessed node was processed first and the earlierst last.

An extra benefit is the removal of the ugly hack from the Berkly days on
LFS.

In the proces, i've also replaced the various variations hand written loops
by the TAILQ_FOREACH() macro's.
2006-10-20 18:58:12 +00:00
yamt
65a8f1c211 ffs_truncate: don't forget to zero the past eof in the case of
blocksize < pagesize.  PR/33777 from Simon Burge.
XXX check other filesystems, esp. lfs.
2006-10-17 11:39:18 +00:00
yamt
72fc92b729 ffs_alloc: remove an assertion which is no longer true. 2006-10-15 12:23:56 +00:00
yamt
560c0c565c don't use g_glock directly. 2006-10-14 09:17:26 +00:00
yamt
b7cedb8e34 handle_workitem_freefrag/handle_workitem_freeblocks:
don't fake up inode/vnode pair.
2006-10-14 07:26:29 +00:00
hannken
3e0dbf3bc5 Add __unused to unused function arguments. 2006-10-13 10:21:21 +00:00
christos
4d595fd7b1 - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
2006-10-12 01:30:41 +00:00
christos
b6bf786e1c Coverity CID 3690: Reverse INULL: Add KASSERT. 2006-10-03 18:59:22 +00:00
christos
f1a4e9cae0 Coverity CID 2949: comment out dead code (from Arnaud Lacombe) 2006-09-29 19:37:11 +00:00
jld
1b78265f0e Change ffs_mount, in MNT_UPDATE case, to check dev_t's for equality
instead of just vnode pointers.  Fixes erroneous "does not match mounted
device" errors from mount(8) in the presence of MFS /dev, init.root, &c.

No objections on tech-kern.
2006-09-21 00:11:30 +00:00
christos
676e77765a fix missing initializers 2006-08-30 01:28:53 +00:00
ad
f474dceb13 Use the LWP cached credentials where sane. 2006-07-23 22:06:03 +00:00
martin
a3b5baed42 Fix alignement problems for fhandle_t, exposed by gcc4.1.
While touching all vptofh/fhtovp functions, get rid of VFS_MAXFIDSIZ,
version the getfh(2) syscall and explicitly pass the size available in
the filehandle from userland.

Discussed on tech-kern, with lots of help from yamt (thanks!).
2006-07-13 12:00:24 +00:00
yamt
e408053d1b fix a simonb-timecounters regression.
the precision of getnanotime() is not suitable for file timestamps.
esp. when it's nfs-exported.

- introduce vfs_timestamp().
  (the name is from freebsd.  currently merely a wrapper of nanotime())
- for ufs-like filesystems, use it rather than getnanotime().

XXX check other filesystems.
2006-06-23 14:13:02 +00:00
hannken
442bf57d1c softdep_sync_metadata: If vp is a block device it may have new I/O requests
posted for it even if the vnode is locked. This will deadlock with wmesg
"softgetdbuf" if it gets a BMSAFEMAP dependency as here we have "bp == nbp"
and try to get a buffer we already own.

Approved by: Frank van der Linden <fvdl@netbsd.org>
2006-06-12 16:37:00 +00:00
kardel
1276c3051e PR 33697: complete timecounter conversion 2006-06-11 09:26:04 +00:00
kardel
de4337ab21 merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
  time.tv_sec -> time_second
- struct timeval mono_time is gone
  mono_time.tv_sec -> time_uptime
- access to time via
	{get,}{micro,nano,bin}time()
	get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
  Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
  NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
2006-06-07 22:33:33 +00:00
elad
fc9422c9d9 integrate kauth. 2006-05-14 21:31:52 +00:00
yamt
1d3a67174f remove unused FFS_NAMES and LFS_NAMES. 2006-04-23 14:15:12 +00:00
christos
53ae068fc6 Coverity CID 746: Remove dead code. lbn >= NDADDR is mutually exclusive to
snapshot_locked == 0.
2006-04-18 21:39:03 +00:00
christos
e14b3e8165 Coverity CID 2858: Avoid NULL deref. 2006-04-15 05:29:10 +00:00
bouyer
eb7f9aba74 Revert previous; I mixed bpp and *bpp when reading ffs_balloc_ufs1().
ffs_balloc() will always allocate a new buffer or leave it as NULL,
so coverity is wrong here, we're not using a freed argument.
2006-04-10 22:01:06 +00:00
bouyer
a4181a9049 If we brelse ibp, set ibp to NULL, to avoid reusing it later in balloc()
or in our code at the next iteration.
Coverity ID 2706
2006-04-10 21:50:18 +00:00
yamt
539544d937 ffs_gop_size: revert a problematic part of 1.78.
problems reported by Kouichirou Hiratsuka and Jukka Salmi on current-users@.
2006-04-09 21:59:35 +00:00
yamt
c5fcdd1719 some cleanups after the introduction of GOP_SIZE_MEM flag.
- remove GOP_SIZE_READ/GOP_SIZE_WRITE flags.
  they have not been used since the change.
- ufs_balloc_range: remove code which has been no-op since the change.
  thanks Konrad Schroder for explaining the original intention of the code.
- ffs_gop_size: don't extend past eof, in the case of GOP_SIZE_MEM.
  otherwise genfs_getpages end up to allocate pages past eof unnecessarily.
2006-03-30 12:40:06 +00:00
hannken
cd28767efa ffs_balloc*(): Add an assertion for "bpp != NULL" if B_METAONLY is set.
From Coverity CIDs 1170..1173
2006-03-23 11:16:47 +00:00
bouyer
9d8928a40d Fix dead error condition, coverity ID 747. 2006-03-18 13:56:51 +00:00
christos
5a57baa413 don't use MALLOC with a non-constant size; use malloc instead. 2006-03-17 23:29:07 +00:00
thorpej
58853410ae Use device_class() instead of accessing dv_class directly. 2006-02-21 04:32:38 +00:00
yamt
03f80508d6 - unify ffs_blkatoff and lfs_blkatoff.
- remove ufs_ops::uo_blkatoff.
- add directory read-ahead code.  (disabled for now.)
2006-01-14 17:41:16 +00:00
yamt
690d424f28 - add simple functions to allocate/free a buffer for i/o.
- make bufpool static.
2006-01-04 10:13:05 +00:00
chs
0545b6e0cb changes for making DIAGNOSTIC not change the kernel ABI:
- for structure fields that are conditionally present,
   make those fields always present.
 - for functions which are conditionally inline, make them never inline.
 - remove some other functions which are conditionally defined but
   don't actually do anything anymore.
 - make a lock-debugging function conditional on only LOCKDEBUG.

as discussed on tech-kern some time back.
2005-12-27 04:06:45 +00:00
perry
0f0296d88a Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete. 2005-12-24 20:45:08 +00:00
rpaulo
fc2fb45bf0 Convert UFS_EXTATTR to struct lwp. 2005-12-23 23:20:00 +00:00
yamt
523e856cba prevent in-core vnode being freed from getting new references.
otherwise, once the corresponding bit in the inode bitmap is cleared,
an unrelated inode with the same inode number can be allocated and
ufs_ihashget() picks a stale in-core vnode for it.

PR/32301 by Matthias Scheler.
2005-12-23 15:31:40 +00:00
christos
95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
dsl
d59e7ef247 Force some multiplies to give a 64 bit result to avoid dirsize being zero
and causing a divide by zero trap later.
Fixes a panic noted in netbsd-help.
2005-11-27 11:45:56 +00:00
yamt
6a17dd42f4 - ignore truncation for VCHR/VBLK/VFIFO as it used to be
before yamt-vop merge.  PR/32049 from Atsushi Onoe.
- reject setattr which attempts to change size of VLNK/VSOCK.
2005-11-11 15:50:57 +00:00
gdt
2de7c6cd0d Adjust signature of softdep_freefile (dummy stub which always panics
if called) to match ffs_extern.h so that kernels w/o softdep can compile.
2005-11-02 22:10:41 +00:00
yamt
a748ea88dd merge yamt-vop branch. remove following VOPs.
VOP_BLKATOFF
	VOP_VALLOC
	VOP_BALLOC
	VOP_REALLOCBLKS
	VOP_VFREE
	VOP_TRUNCATE
	VOP_UPDATE
2005-11-02 12:38:58 +00:00
yamt
baee927713 introduce "ufs_ops" and use it for ITIMES. 2005-09-27 06:48:55 +00:00
yamt
d3a07546a6 revert ffs_snapshot.c 1.20 because it's bogus. pointed by Simon Burge. 2005-09-26 14:10:32 +00:00
yamt
6138b82a56 always use nanotime rather than time.
it's bad to mix nanotime and time because it sometimes
make timestamps go backwards.
2005-09-26 13:52:20 +00:00
jmmv
2a3e5eeb7c Apply the NFS exports list rototill patch:
- Remove all NFS related stuff from file system specific code.
- Drop the vfs_checkexp hook and generalize it in the new nfs_check_export
  function, thus removing redundancy from all file systems.
- Move all NFS export-related stuff from kern/vfs_subr.c to the new
  file sys/nfs/nfs_export.c.  The former was becoming large and its code
  is always compiled, regardless of the build options.  Using the latter,
  the code is only compiled in when NFSSERVER is enabled.  While doing this,
  also make some functions in nfs_subs.c conditional to NFSSERVER.
- Add a new command in nfssvc(2), called NFSSVC_SETEXPORTSLIST, that takes a
  path and a set of export entries.  At the moment it can only clear the
  exports list or append entries, one by one, but it is done in a way that
  allows setting the whole set of entries atomically in the future (see the
  comment in mountd_set_exports_list or in doc/TODO).
- Change mountd(8) to use the nfssvc(2) system call instead of mount(2) so
  that it becomes file system agnostic.  In fact, all this whole thing was
  done to remove a 'XXX' block from this utility!
- Change the mount*, newfs and fsck* userland utilities to not deal with NFS
  exports initialization; done internally by the kernel when initializing
  the NFS support for each file system.
- Implement an interface for VFS (called VFS hooks) so that several kernel
  subsystems can run arbitrary code upon receipt of specific VFS events.
  At the moment, this only provides support for unmount and is used to
  destroy NFS exports lists from the file systems being unmounted, though it
  has room for extension.

Thanks go to yamt@, chs@, thorpej@, wrstuden@ and others for their comments
and advice in the development of this patch.
2005-09-23 12:10:31 +00:00
rpaulo
3c4f143c6e Fix bogus if-clause introduced in previous revision. 2005-09-22 14:04:29 +00:00
rpaulo
a12bed5a16 In ffs_unmount(), detect EOPNOTSUPP errno returned from
ufs_extattr_stop().

From FreeBSD.
2005-09-22 13:50:55 +00:00
christos
49840169c0 Add another KASSERT. 2005-09-12 20:26:44 +00:00
christos
c93a283e5f - access the ffs and ext2fs itimes functions through a pointer, so that
if the filesystem is not compiled in the kernel still links. Probably
  a better solution is to use weak symbols.
- move the filesystem-specific itime macros to the filesystem header files.
2005-09-12 20:23:03 +00:00
drochner
9cde940a73 move the new ffs_itimes() to a berr place -- ffs_subr.c is shared with
userland
2005-09-12 20:09:59 +00:00
christos
a12024da06 Use nanotime() to update the time fields in filesystems. Convert the code
from macros to real functions. Original patch and review from chuq.
Note: ext2fs only keeps seconds in the on-disk inode, and msdosfs does not
have enough precision for all fields, so this is not very useful for those
two.
2005-09-12 16:24:41 +00:00
yamt
d8798fec66 - for pagecache dependency, track which page in the block
has been written or not individually by (ab)using b_resid
  in pcbp as a bitmap.
- add a comment to explain why it's needed.

PR/15364.  reviewed by Chuck Silvers.
2005-09-09 15:04:07 +00:00
yamt
5b4c989faf revert the code to expand putpage requests to block boundary.
because:
	- it was incomplete in some cases.
	- it can confuse pagedaemon.
see PR/15364 for details.
2005-09-09 15:00:39 +00:00
xtraeme
23ebf62d26 * Remove __P()
* Use ANSI function declarations on ext2fs and mfs
2005-08-30 22:01:12 +00:00
thorpej
e1afed9c2d Experimental support for extended attributes on UFS1 file systems, using a
backing file per attribute type indexed by inode number to hold the extended
attributes.

This is working pretty well on my test systems, except for the "autostart"
feature.  I need someone with a better handle on the VFS locking protocol
to go over that.

This is a work-in-progress.  There are parts of this that could be re-factored
allowing this approach to be used on other types of file systems.

Adapted from FreeBSD.
2005-08-28 19:37:58 +00:00
yamt
4c32aa5945 PRId64 -> ld in UVMHIST_LOG format strings. 2005-08-24 10:19:43 +00:00
christos
0b0eb1328b Don't overload MAXNAMLEN, use a separate constant for each filesystem type. 2005-08-23 08:05:13 +00:00
christos
50f8955b6e 64 bit inode changes. 2005-08-19 02:04:03 +00:00
yamt
946832fd33 revert VCHR part of ffs_vnops.c 1.71.
as VCHR uses the device pager, no point to call VOP_PUTPAGES here.
pointed by Chuck Silvers.
2005-07-26 12:14:46 +00:00
drochner
e32ba1775e fix crash in mount error handling: don't free storage which was not
malloc'd
2005-07-25 11:42:38 +00:00
yamt
b7bfe82866 update file timestamps for nfsd loaned-read and mmap.
PR/25279.  discussed on tech-kern@.
2005-07-23 12:18:41 +00:00
yamt
6afb995fea ffs_full_fsync: because VBLK/VCHR can be mmap'ed,
do VOP_PUTPAGES for them as well.
2005-07-21 22:00:08 +00:00
thorpej
29af9583d2 Use ANSI function decls. 2005-07-15 05:01:16 +00:00
yamt
44d128fa8e - constify genfs_ops.
- use member designators.
2005-06-28 09:30:37 +00:00
dbj
7753d41b8e remove (long) cast on bpref, which is daddr_t 2005-06-06 17:10:25 +00:00
dbj
331e001f0c the cluster summary must be swapped even for ufs2 2005-06-03 01:14:07 +00:00
is
4daeda666d fix copy/paste/don'tupdate bug (fix from PR 22232 by Robert Elz). 2005-06-02 10:08:36 +00:00
christos
07d1f24ff5 rename delay because it is a function on sparc. 2005-05-30 22:13:22 +00:00
christos
273df63602 - sprinkle const
- avoid shadow variables.
2005-05-29 21:25:24 +00:00
hannken
a69fbd6a18 - Use an empty snap block list to set the initial file size. Snapshot is
now valid from the beginning.  No need to copy the last fs block two times.
- No need to allocate the cylinder group blocks twice.
- cgbuf -> sbbuf
2005-05-25 11:07:13 +00:00
hannken
ffa83f8f0d ffs/ffs_alloc.c:
- Add a missing ACTIVECG_CLR().

ffs/ffs_snapshot.c:
- Use async/delayed writes for snapshot creation and sync/uncache these buffers
  on end. Reduces the time the file system must be suspended.
- Remove um_snaplistsize. Was a duplicate of um_snapblklist[0].
- Byte swap the list of preallocated blocks on read/write instead of access.
- Always keep this list on ip->i_snapblklist so it may be rolled back when the
  newest snapshot gets removed. Fixes a rare snapshot corruption when using
  more than one snapshot on a file system.

ufs/ufsmount.h:
  - Make TAILQ_LAST() possible on member um_snapshots.
  - Remove um_snaplistsize. Was a duplicate of um_snapblklist[0].
2005-05-22 08:35:28 +00:00
hannken
a71c653aca flush_inodedep_deps(): If softdep_lookupvp() returns NULL it means the
inode has been reclaimed.  Skip the VOP_PUTPAGES() in this case.

Reviewed by: Chuck Silvers <chs@netbsd.org>
2005-05-07 14:24:14 +00:00
hannken
cad9d39281 Fix last commit. The last block of the file system may have changed
even if the last cylinder group is not modified.
2005-05-03 09:43:23 +00:00
hannken
dc13562a0c Fix an inconsistency where the last block of the snapshot contains old data.
The last block of the file system is written to the snapshot before the
file system is suspended.  If the last cylinder group is modified after
the file system is suspended the last block of the snapshot may contain
old data.  So update this block again.
2005-04-24 15:49:37 +00:00
yamt
5241cb4bbc don't assign to non-lvalue. found by gcc4. 2005-04-21 14:02:02 +00:00
thorpej
e633e8b61b - Define a VFS_ATTACH() macro that places a reference to a vfsops structure
into the "vfsops" link set.
- Use VFS_ATTACH() where vfsops are declared for individual file systems.
- In vfsinit(), traverse the "vfsops" link set, rather than vfs_list_initial[].
2005-03-29 02:41:05 +00:00
christos
cac7cf0758 PR/26823: Michael L. Hitch: Endianness flag were not preserved in the compat
superblock read routine.
2005-03-04 21:45:29 +00:00
perry
bcfcddbac1 nuke trailing whitespace 2005-02-26 22:31:44 +00:00
hannken
1d85e05ec4 Make `options FFS_NO_SNAPSHOT' only disable snapshot creation
while not trashing existing snapshots.

Approved by: core@
2005-02-21 17:52:11 +00:00
dsl
2f6f4269bd Add a stub file so that snapshot support can be compiled out.
Will allow INSTALL_TINY to fit back in its designated space.
Since the calling code doesn't allow a snapshot mount to fail, this code
will output a warning and delete any snapshots it finds.
This only happend on rw mounts - snapshots don't seem to be created
when mounting ro.
The whole way the snapshots gets mounted is a PITA anyway, the superblock
'last mounted' time should be used to validate that the fs hasn't been
mounted elsewhere.
2005-02-10 22:22:32 +00:00
hannken
8bb5af4d2e Fss device only checks read access to snapshot vode. On snapshot creation
check we are either super-user or owner of the snapshot vnode.
2005-02-09 16:05:29 +00:00
hannken
c13136f43f No longer needed. Ffs snapshots are enabled by default. 2005-01-31 22:21:17 +00:00
wrstuden
442d792d00 Fix pasto in previous. We only perform the DIOCCACHESYNC call if
FSYNC_CACHE is set, not if FSYNC_WAIT is set.
2005-01-27 02:16:42 +00:00
wrstuden
e384a44e9d Extend fsync_range(2) to support the FDISKSYNC flag, which requests
that the sync be propogated out through the disk drive caches.
2005-01-25 23:55:20 +00:00
hannken
5b0fdd5c72 Protect calls to ffs_*_swap' with #ifdef FFS_EI'. 2005-01-18 10:40:21 +00:00
mycroft
7f1fe4e81f Rearrange some code slightly to avoid uninitialized variable warnings. 2005-01-11 00:19:36 +00:00
mycroft
0461b30ac3 Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions.  This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC.  I've been running this for a
long time.
2005-01-09 03:11:48 +00:00
thorpej
1c95472d01 Add the system call and VFS infrastructure for file system extended
attributes.

From FreeBSD.
2005-01-02 16:08:28 +00:00
dbj
0c5a27af69 remove opt_compat_netbsd.h, afaict it is no longer needed.
i think it was previously used to pull in COMPAT_09 for ffs_statfs
2004-12-26 17:34:39 +00:00
mycroft
5ac91d4849 Remove some unnecessary (int32_t) casts that would cause us to screw up the
top bit in block addresses.

Also, change some daddr_t->int32_t casts (mostly as arguments to ufs_rw32(),
where they would get promoted anyway) to u_int32_t.
2004-12-15 07:11:51 +00:00
jdolecek
3524fe97e8 allow changes of the sysctl values 2004-11-21 19:21:51 +00:00
dbj
2bd0f63fd0 print absolute inode number in debug output when freeing free inode occurs.
previously, the number was relative to the cylinder group, which was confusing.
prefix debug message with "ifree:" so this can be differentiated in bug reports.
2004-10-11 17:15:36 +00:00
thorpej
11afd11faa Add a new VNODE_LOCKDEBUG option, which enables checks in the VOP_*()
calls to ensure that the vnode lock state is as expected when the VOP
call is made.  Modify vnode_if.src to set the expected state according
to the documenting lock table for each VOP.  Modify vnode_if.sh to emit
the checks.

Notes:
- The checks are only performed if the vnode has the VLOCKSWORK bit
  set.  Some file systems (e.g. specfs) don't even bother with vnode
  locks, so of course the checks will fail.
- We can't actually run with VNODE_LOCKDEBUG because there are so many
  vnode locking problems, not the least of which is the "use SHARED for
  VOP_READ()" issue, which screws things up for the entire call chain.

Inspired by similar changes in OpenBSD, but implemented differently.
2004-09-21 03:10:35 +00:00
yamt
cc047d3821 um_maxfilesize should be set after
ffs_oldfscompat_read adjusted fs_maxfilesize.
2004-09-19 11:58:29 +00:00
skrll
f7155e40f6 There's no need to pass a proc value when using UIO_SYSSPACE with
vn_rdwr(9) and uiomove(9).

OK'd by Jason Thorpe
2004-09-17 14:11:20 +00:00
hannken
5816dad45e While creating a snapshot inodes must be freed from the
snapshot, not from the file system.
ffs_freefile() needs explicit "fs" and "devvp" arguments.
2004-08-29 10:13:48 +00:00
mycroft
bb17450999 Don't write out the extra zero pages with PGO_SYNCIO. We start an asynchronous
write anyway, and they will not be freed until that write is finished.
2004-08-15 19:01:16 +00:00
mycroft
a97f7bfcbd Correct the fix for the partial-truncate inefficiency. We still need to zero,
but we only need to sync those pages that are being lopped off, if any.
2004-08-15 17:36:00 +00:00
mycroft
f3fbefe76a Minor simplification to some arithmetic. 2004-08-15 16:17:37 +00:00
mycroft
45a21b76f0 Fixing age old cruft:
* Rather than using mnt_maxsymlinklen to indicate that a file systems returns
  d_type fields(!), add a new internal flag, IMNT_DTYPE.

Add 3 new elements to ufsmount:
* um_maxsymlinklen, replaces mnt_maxsymlinklen (which never should have existed
  in the first place).
* um_dirblksiz, which tracks the current directory block size, eliminating the
  FS-specific checks littered throughout the code.  This may be used later to
  make the block size variable.
* um_maxfilesize, which is the maximum file size, possibly adjusted lower due
  to implementation issues.

Sync some bug fixes from FFS into ext2fs, particularly:
* ffs_lookup.c 1.21, 1.28, 1.33, 1.48
* ffs_inode.c 1.43, 1.44, 1.45, 1.66, 1.67
* ffs_vnops.c 1.84, 1.85, 1.86

Clean up some crappy pointer frobnication.
2004-08-15 07:19:54 +00:00
mycroft
5d31d187f3 Partially fix a performance problem in the partial-truncate case. We were
doing synchronous writes unnecessarily in a couple of places.  Now it's 1
write per truncate in my test case rather than 3.  :-P
2004-08-14 02:26:57 +00:00
mycroft
14ae52970e There is no need to do a synchronous write when truncating a short symlink. 2004-08-14 01:32:02 +00:00
mycroft
9f8efecd46 In the indirect block unwind case, we only need to do the synchronous writes
of the inode in the softdep case.  XXX This is really a deficiency in softdep.
2004-08-14 01:30:56 +00:00
mycroft
bc25b30608 Add a new flag, IN_MODIFY. This is like IN_UPDATE|IN_CHANGE, but unlike
setting those flags, it does not cause the inode to be written in the periodic
sync.  This is used for writes to special files (devices and named pipes) and
FIFOs.

Do not preemptively sync updates to access times and modification times.  They
are now updated in the inode only opportunistically, or when the file or device
is closed.  (Really, it should be delayed beyond close, but this is enough to
help substantially with device nodes.)

And the most amusing part:
Trickle sync was broken on both FFS and ext2fs, in different ways.  In FFS, the
periodic call to VFS_SYNC(MNT_LAZY) was still causing all file data to be
synced.  In ext2fs, it was causing the metadata to *not* be synced.  We now
only call VOP_UPDATE() on the node if we're doing MNT_LAZY.  I've confirmed
that we do in fact trickle correctly now.
2004-08-14 01:08:02 +00:00
pk
a7c40722d8 Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().
2004-07-05 07:28:45 +00:00
hannken
8b477ad955 When we expunge an unreferenced file from a snapshot its size may be zero. 2004-06-30 18:42:17 +00:00
hannken
7a5be5a9ff - Add flag L_COWINPROGRESS to struct lwp to avoid recursion when
doing copy-on-write.

- Change VFS_SNAPSHOT() to return the snapshot vnode locked.

- Make the IO path for copy-on-write and snapshot-read more lightweight.
  Avoids deadlocks where vn_rdwr(...READ...) has a shared lock and needs
  to copy-on-write.
  Avoids deadlocks/panics where to clean pages the copy-on-write needs
  to allocate pages for its VOP_PUTPAGES().

L_COWINPROGRESS part approved by: Jason R. Thorpe <thorpej@netbsd.org>
2004-06-20 18:55:58 +00:00
hannken
78a89b12b6 Use one daddr_t XXXblks[NDADDR + NIADDR] instead of two.
No functional changes. Reduces kernel stack usage by 120 bytes.

Approved by: Jason R. Thorpe <thorpej@netbsd.org>
2004-06-20 18:23:30 +00:00
he
94319b09bc Need to forward-declare "struct timespec" because the new ffs_snapshot()
function declaration refers to it.  Fixes build problem of sbin/badsect
for the vax target, which still uses gcc 2.95.3.
2004-06-04 07:43:56 +00:00
hannken
7b03a4b40c Once all block address modifications are done invalidate and
free all pages from the snapshot vnode.
2004-05-31 13:28:53 +00:00
hannken
55db9c1eb7 Fixup last commit. fs->fs_active must be initialized. 2004-05-27 17:04:52 +00:00
hannken
712c432a06 Don't use VTOI(vp)->i_flags to test for snapshot devices. Will not work
for non-UFS file systems. Test for VBLK vnode instead.
2004-05-26 20:33:10 +00:00
hannken
d63a5c8e01 Make it compile without option FFS_EI. 2004-05-26 11:17:39 +00:00
hannken
8c21bc6224 Add ffs internal snapshots. Written by Marshall Kirk McKusick for FreeBSD.
- Not enabled by default. Needs kernel option FFS_SNAPSHOT.
- Change parameters of ffs_blkfree.
- Let the copy-on-write functions return an error so spec_strategy
    may fail if the copy-on-write fails.
- Change genfs_*lock*() to use vp->v_vnlock instead of &vp->v_lock.
- Add flag B_METAONLY to VOP_BALLOC to return indirect block buffer.
- Add a function ffs_checkfreefile needed for snapshot creation.
- Add special handling of snapshot files:
    Snapshots may not be opened for writing and the attributes are read-only.
    Use the mtime as the time this snapshot was taken.
    Deny mtime updates for snapshot files.
- Add function transferlockers to transfer any waiting processes from
  one lock to another.
- Add vfsop VFS_SNAPSHOT to take a snapshot and make it accessible through
  a vnode.
- Add snapshot support to ls, fsck_ffs and dump.

Welcome to 2.0F.

Approved by: Jason R. Thorpe <thorpej@netbsd.org>
2004-05-25 14:54:55 +00:00
atatat
53c625655c Sysctl descriptions under vfs subtree 2004-05-25 04:44:43 +00:00
atatat
10a7ba9ef6 Tweak sysctl setup functions (the macros, actually) for use in lkms,
and tweak lkminit_*.c (where applicable) to call them, and to call
sysctl_teardown() when being unloaded.

This consists of (1) making setup functions not be static when being
compiled as lkms (change to sys/sysctl.h), (2) making prototypes
visible for the various setup functions in header files (changes to
various header files), and (3) making simple "load" and "unload"
functions in the actual lkminit stuff.

linux_sysctl.c also needs its root exposed (ie, made not static) for
this (when built as an lkm).
2004-05-20 06:34:24 +00:00
atatat
1d3a6a329e Explicitly call pool_init() (and pool_destroy()) when being built as
an _LKM.

This adds pools to the list of things that lkms must do manually
because they're set up with link sets.  Not that there's anything
wrong with link sets, but that we need to try harder to remember that
lkms are second class citizens.  Of a sort.
2004-05-20 05:39:34 +00:00
simonb
b09560304e Unwrap a not-too-long line. 2004-04-26 01:40:40 +00:00
dbj
9a0dfd8c28 remove botched superblock upgrade warnings.
there are now alternate non-kernel checks and fixes for this problem.
relevent prs include:
bin/17910 kern/21283 kern/21404 port-macppc/23925 port-macppc/23926
install/25138
2004-04-25 21:02:26 +00:00
simonb
b5d0e6bf06 Initialise (most) pools from a link set instead of explicit calls
to pool_init.  Untouched pools are ones that either in arch-specific
code, or aren't initialiased during initial system startup.

 Convert struct session, ucred and lockf to pools.
2004-04-25 16:42:40 +00:00
christos
6bd1d6d4db Replace the statfs() family of system calls with statvfs().
Retain binary compatibility.
2004-04-21 01:05:31 +00:00
dbj
dec025329f remove code that attempts to correct superblock location. this
enforces an unnecessary restriction that the superblock be in the
particular expected locations.  Also, the compatibility case is
handled in ffs_oldfscompat_read.
2004-04-18 03:35:16 +00:00
dbj
d7c33aeb1e when enabling ffs compatibility in ffs_reload, use
sblockloc that superblock was read from
also note XXX that ffs_reload doesn't handle superblock moving
2004-04-18 03:30:23 +00:00
dsl
cd6c744984 Rework previous so that FS_FLAGS_UPDATED is only looked at for ffsv1 2004-03-27 12:40:46 +00:00
atatat
19af35fd0d Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
2004-03-24 15:34:46 +00:00
dsl
221ec38d03 Rework superblock validation logic to make adding validity tests easier.
Ensure that we don't use the first alternate superblock of a ffsv1
filesystem with 64k blocks (it is in the same place as an ffsv2 sb).
Fixes part of PR kern/24809
2004-03-21 18:48:24 +00:00
dsl
57f6eb7ead Change comments - one I wrote earlier wasn't right.
Add a couple of notes about areas of the superblock being reassigned
when ffsv2 was imported.
2004-03-20 17:26:16 +00:00
dsl
11f75824d0 Add a large comment about the balls-up caused by the ffsv2 superblock
not being at 8k - causes all sorts of problems, in particular with
ffsv1 filessytems with 64k blocks, and disks that are reformatted from
ffsv1 to ffsv2 (and v.v.).  see also PR kern/24809
2004-03-20 15:37:12 +00:00
yamt
3e796c5be5 reserve a MAXBSIZE-sized buffer for inodedeps for pagedaemon.
PR/24443.
2004-03-11 11:50:43 +00:00
yamt
bfe5a94adc as we always replace whole buf in the case of indirdep,
simply changing b_data is enough.  eliminate M_INDIRDEP.

PR/24443.
2004-03-11 11:48:16 +00:00
dbj
8ad71c85f1 quiet tls. change botched superblock warning to use -b 16 2004-03-11 07:14:12 +00:00
keihan
b220964103 s/netbsd.org/NetBSD.org/g 2004-03-10 09:56:59 +00:00
jdolecek
d06e4401fa make sblock_try[] const 2004-02-22 08:58:03 +00:00
hannken
3db4e2acd8 Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.
VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp)  Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp)      Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.
2004-01-25 18:06:48 +00:00
soren
3e41a33af7 With large average filesizes, it was possible to overflow dirsize to zero,
causing a division by zero in ffs_dirpref().

From Barry Bouwsma of Tiengen.
2004-01-13 13:38:18 +00:00
dbj
51134cc5dd change the updating note to say you may need fsck_ffs -b 32 -c 4' 2004-01-12 16:19:19 +00:00
dbj
6202d43b3b add checks for a couple of botched superblock upgrade cases
and report a warning with repair references.
2004-01-12 05:49:03 +00:00
hannken
ed68c4e34c Allow vfs_write_suspend() to wait if the file system is already
suspending.

Move vfs_write_suspend() and vfs_write_resume() from kern/vfs_vnops.c
to kern/vfs_subr.c.

Change vnode write gating in ufs/ffs/ffs_softdep.c (from FreeBSD).

When vnodes are throttled in softdep_trackbufs() check for
file system suspension every 10 msecs to avoid a deadlock.
2004-01-10 17:16:38 +00:00
hannken
8308a4868d Split out softdep_flushworklist() from softdep_flushfiles() so that
it can be used to clear the work queue.

Cleanup ffs_sync() which did not synchronously wait when MNT_WAIT
was specified. Clear the work queue when MNT_WAIT is specified.

Result is a clean on-disk file system after ffs_sync(.., MNT_WAIT, ..)

From FreeBSD.
2004-01-10 16:23:36 +00:00
yamt
7266a95907 store a i/o priority hint in struct buf for buffer queue discipline. 2004-01-10 14:39:50 +00:00
dbj
01637061e3 never upgrade the superblock or set FS_FLAGS_UPDATED in fs_old_flags
add compatibility for filesystems created before FFSv2 integration
these patches are from pr port-macppc/23926 and should also fix
problems discussed in pr kern/21404 and pr kern/21283
2004-01-09 19:10:22 +00:00
dbj
d7fdf500a8 reintroduce compatbility defines for
fs_headswitch, fs_trkseek, fs_csmask, fs_csshift
fs_postbl, fs_rotbl, cg_blktot, cg_blks, cbtocylno, cbtorpos
2004-01-03 19:18:17 +00:00
dbj
ca5ad5d61f explicitly pad struct appleufslabel and use __attribute__((__packed__))
since apple put the 64 bit uuid field on a 4 byte boundary
2004-01-02 06:57:46 +00:00
dbj
23d4eb34b2 add uuid field to apple ufs volume label 2004-01-02 05:08:57 +00:00
dbj
65a136e22d remove incorrect XXX comments I introduced a couple of days ago 2003-12-31 19:33:13 +00:00
dbj
ba5b25c952 remove unused cs_numclusters field from struct csum_total
this avoids a potential future bug if it is ever used.
before this fix, fsck_ffs would check and fix this field to be zero
2003-12-31 19:19:39 +00:00
dbj
c0000df464 update explanatory comment about NOCSPTRS to reflect that fs_active
is now within that region.
no functional change
2003-12-31 18:53:45 +00:00
dbj
82a1a92247 reorder ffs_sb_swap to reflect actual order in superblock
add comments regarding historical field overlap
no functional change
2003-12-31 18:40:23 +00:00
dbj
4bdc4574c7 add fs_flags to ffs_sb_swap 2003-12-31 18:32:47 +00:00
pk
70f20a1217 Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes).  It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms.  Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.
2003-12-30 12:33:13 +00:00
dbj
dbba662bc8 fix bugs in ffs_cg_swap for FS_42POSTBLFMT 2003-12-30 03:30:43 +00:00
atatat
13f8d2ce5f Dynamic sysctl.
Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al.  Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.

Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded.  Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.

All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.

PS - I'm sorry, but there's a distinct lack of documentation at the
moment.  I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
2003-12-04 19:38:21 +00:00
dbj
d8b416001c clarify comments, especially since ffs_isfreeblock is non-intuitive:
ffs_isblock:
    check if a block is available
      returns true if all the correponding bits in the free map are 1
     returns false if any corresponding bit in the free map is 0
  ffs_isfreeblock:
    check if a block is completely allocated
      returns true if all the corresponding bits in the free map are 0
      returns false if any corresponding bit in the free map is 1
2003-12-02 04:40:43 +00:00
dbj
f31a623a56 in ffs_unmount, ignore error returned by VOP_CLOSE(devvp)
this fixes a problem where device close error would cause
unmount to fail but structures to be left partially deallocated
2003-12-01 18:57:07 +00:00
mycroft
c714b3b115 Remove part of previous -- there is NO reason for directory allocation to use
arc4random().
2003-11-27 04:52:55 +00:00