Commit Graph

1741 Commits

Author SHA1 Message Date
ad
83f7350f6d PR kern/40210 5.0 BETA WAPBL related crash 2008-12-21 10:44:32 +00:00
pgoyette
9c68331911 Store config(1)'s root filesystem type as a text string rather than
embedding the address of its xxx_mountroot() in swapnetbsd.c.  This
permits booting of kernels with hard-wired filesystem type even if the
filesystem is in a loadable module (ie, not linked into the kernel
image).

Discussed on current-users.  Tested on amd64 and i386 with both hard-
wired and '?' filesystem times, and on both modular and monolithic
kernels.

Thanks to pooka@ for code review and suggestions.

Addresses my PR kern/40167
2008-12-19 17:11:57 +00:00
hannken
e1e7ee242d Restore a line removed by mistake with the last commit.
Should fix PR 40225 panic: indiracct: missing indir.
2008-12-19 11:36:10 +00:00
cegger
9b87d582bd kill MALLOC and FREE macros. 2008-12-17 20:51:31 +00:00
dholland
2adcd1a3c2 Don't deadlock on rename("foo/foo", "foo") in the case where foo/foo is a
directory. This doesn't affect non-wapbl renames; it affects wapbl because
one of the lock acquisitions was moved up past where this case otherwise
fails.

PR 40163 from Lloyd Parkes.
2008-12-13 04:45:28 +00:00
pooka
d34f27b620 Decode write access advice and pass to uvm (not that it's handled
there, but ...).
2008-12-08 11:48:03 +00:00
pooka
5297d2febf Don't even try to pretend WAPBL_DEBUG_INODES works here, just #error. 2008-12-08 11:37:37 +00:00
pooka
836c2144d0 Remove no longer valid comment (which probably didn't even say what
it wanted to say in the first place).
2008-12-08 11:34:30 +00:00
hannken
59f928fb25 ffs_copyonwrite(): Only use si_snapblklist if it is already allocated.
ffs_snapshot_read(): Use IO_ALTSEMANTICS to allow reading a snapshot vnode
                     beyond file system size.  Needed to read the snapblklist
                     on mount.

Persistent snapshots work again.

Should fix PR kern/37425: fss_snapshot_mount panic during fsck.
2008-12-07 19:51:07 +00:00
hannken
8e313cc27b Revert previous -- ALL reads are from kernel space.
Still open: PR kern/37425: fss_snapshot_mount panic during fsck.
2008-12-07 18:55:58 +00:00
hannken
7dbaf06e71 ffs_copyonwrite(): Only use si_snapblklist if it is already allocated.
ffs_snapshot_read(): Allow the kernel to read beyond file system size.

Persistent snapshots work again.

Should fix PR kern/37425: fss_snapshot_mount panic during fsck.
2008-12-07 10:01:09 +00:00
joerg
9a364d2ed3 Split ffs_freefile into a frontend for normal cylinder group and for
snapshot use. Adjust ffs_blkfree_common to get the fs instance passed
in, the original commit didn't account blocks in the snapshots
correctly. Assert that ffs_blkfree is used with the primary fs instance
and that ffs_checkfreefile is only used for snapshots. Move the bdwrite
from ffs_blkfree_common into the caller for symmetry. This creates a
redundant write of unmodified data for ffs_blkfree_snap if a double free
of a block happens.

Reviewed and tested by hannken@.
2008-12-06 20:05:55 +00:00
joerg
6cdfaeec55 Revert last. Conditionalize variables on FFS_EI. 2008-12-01 13:45:51 +00:00
cegger
9c45aac9d8 build fix: remove unused variables 2008-12-01 13:33:39 +00:00
joerg
740a2c079c ffs_blkfree is used in two different ways. The normal usage is to free a
block in the cylinder groups of the filesystem. The other user is the
snapshot code, which wants to modify the copied cylinder groups. Use
different frontends to distinguish the cases in preparation for fine
grained locking for cylinder groups.
2008-12-01 13:22:06 +00:00
joerg
dfd7714b8f Split ffs_blkalloc into a frontend that does inode based consistency
checks and a backend that just asserts them. Use the backend in
ffs_wapbl_abort_sync_metadata instead of faking an inode.
2008-11-30 16:20:44 +00:00
pooka
b4099c3e1d Rototill all remaining file systems to use ubc_uiomove() instead
of the ubc_alloc() - uiomove() - ubc_release() dance.
2008-11-26 20:17:33 +00:00
tsutsui
24ec9f1685 Remove an extra semicolon. 2008-11-24 17:11:43 +00:00
mrg
ef028d2043 add support for 32 bit uid/gid fields in ext2, but only do so for
when the revision is > REV0.
2008-11-23 10:09:25 +00:00
ad
2f839a2253 These depend on ffs. 2008-11-13 11:10:41 +00:00
ad
bed0008a9a Remove #ifdef LFS from the ufs code. 2008-11-13 11:09:45 +00:00
ad
d7360a07e4 _KERNEL_OPT 2008-11-13 10:48:52 +00:00
ad
669263e5c2 _KERNEL_OPT 2008-11-13 10:14:55 +00:00
joerg
e09eb39f96 wapbl_replay_free needs the reply to have been stopped, so make sure
that the changes happen in the right order. Reported by veego@
2008-11-11 21:02:54 +00:00
joerg
b9400f6fd4 Move WAPL replay handling from bread() into ufs_strategy.
This changes the order of hook processing as the copy-on-write handlers
are called after the journal processing. This makes more sense as the
journal overwrite is logically part of the disk IO.
2008-11-11 08:29:58 +00:00
joerg
3fbdfc8af9 Reduce internals of WAPBL exposed to the rest of the system. 2008-11-10 20:12:13 +00:00
joerg
ecbfc2933c Remove XXXUBC code for ffs_reallocblks, that has been conditionalized in
2002 and #if 0'ed in 2005. It would need a considerable amount of work
to bring back and obscures the more important block allocation.
2008-11-06 22:31:08 +00:00
joerg
564d6ccca2 Fix indentation. 2008-10-30 17:03:09 +00:00
hannken
06529f4f6d Correct previous.
- Count frags, not blocks to get the file system size.
- Cannot use blksize() here, it depends on vnode size.
- Correctly update xfersize on short reads.
2008-10-23 17:16:24 +00:00
hannken
02630b7919 When computing the requests hard limit in ffs_snapshot_read()
use the file system size, not the size of the snapshot vnode.
2008-10-23 14:25:21 +00:00
hannken
ac6b16172a Make genfs_directio() IO_JOURNALLOCKED aware. DirectIO no longer triggers
"locking against myself" panic in wapbl_begin().

Observed and tested by: Frank Kardel <kardel@netbsd.org>
2008-10-19 18:17:13 +00:00
hannken
44f3404f57 Break a deadlock where one thread has a wapbl transaction, calls VOP_GETPAGES
and wants to busy a page  while  another thread calls VOP_PUTPAGES on the same
vnode, takes pages busy and wants to start a wapbl transaction.

Reviewed by: Jason Thorpe <thorpej@netbsd.org>
2008-10-10 09:21:58 +00:00
pooka
fb58d5d6ec #error if WABPL_DEBUG_INODES is defined. That code has bitrotted
more than casu marzu cheese.
2008-10-08 22:58:56 +00:00
pooka
ed8826a34e Remove some of my debugging code which was not meant to be committed
in the wapbl merge.
2008-09-23 15:27:59 +00:00
christos
8d896b1fc3 fix reversed comment, from anon ymous 2008-09-23 12:37:05 +00:00
freza
4ebea2e5f3 Revert previous, pooka@ points out it's wrong. 2008-09-21 23:22:00 +00:00
freza
262577121d WAPBL: in '%s: replaying log to disk' message use the path we're
trying to mount on instead of the misleading last-mounted-on
    path. Reported by jmcneill.
2008-09-21 21:08:22 +00:00
hannken
72538db7ba Adjust some WAPBL transactions:
- Put transaction inside cgaccount() to simplify caller.
- No vget() / vrele() inside a transaction.
2008-09-08 14:22:31 +00:00
joerg
fe12f360b8 Move successful removal of unreferenced inodes under WAPBL_DEBUG to not
spam the console.

OK simon@
2008-09-08 03:16:43 +00:00
hannken
15d57e4f81 Ffs_snapshot() has become a huge monster over the time. Break it into
helper functions to enhance readability.  Adjust comments to reality
and test the main error paths.

While here, expand and remove the last FreeBSD->NetBSD conversion macros.

No functional change intended.
2008-09-02 08:51:46 +00:00
hannken
ced96ac82f ffs_truncate() always runs with journal locked. Propagate this information
to VOP_PUTPAGES().

Report from Lars Nordlund on current-users@
2008-08-30 08:25:53 +00:00
hannken
86492b8668 Sync the just created snapshot to disk.
Invalidate short ( < fs_bsize ) buffers.  We will always read full
size buffers later.

Should fix PR #39402
2008-08-25 13:34:53 +00:00
hannken
24b2cb27c8 Add missing vput() for logvp.
Fixes PR #39400
2008-08-24 15:33:37 +00:00
hannken
83fe3aa0dd Merge the _ufs1 and _ufs2 variants of the expunge and accounting functions.
Remove some unneeded UFS_FSNEEDSWAP().

Saves ~250 lines of redundant code.
2008-08-24 09:51:47 +00:00
hannken
88400c4373 Add snapshot support for logging ffs file systems.
- Add UFS_WAPBL_BEGIN() / UFS_WAPBL_END() where needed.

- Expunge WAPBL log inodes from snapshots.

- Ffs_copyonwrite() and ffs_snapblkfree() must run inside a WAPBL transaction.

- Add ffs_gop_write() as a wrapper around genfs_gop_write() that makes sure
  genfs_gop_write() gets always called inside a WAPBL transaction.

- Add VOP_PUTPAGES() flag PGO_JOURNALLOCKED to tag calls to VOP_PUTPAGES()
  inside a WAPBL transaction.

Reviewed by: Simon Burge <simonb@netbsd.org>,  Greg Oster <oster@netbsd.org>

PGO_JOURNALLOCKED / ffs_gop_write() part presented on tech-kern@.
2008-08-22 10:48:22 +00:00
hannken
9d026d04ef ffs_suspendctl: make sure everything is on disk and the on disk log is empty. 2008-08-15 17:32:32 +00:00
matt
8165c33c80 Implement following constants and add support their to the UFS family of file
systems:
	_PC_2_SYMLINKS
	_PC_SYMLINK_MAX

From andy dot shevchenko at gmail dot com.
2008-08-14 16:19:25 +00:00
hannken
93d3ff0e45 Deny read/write access to snapshot vnodes. We use fss(4) to read from
snapshots.  With this policy in place:

- Separate the snapshot vnode lock from the snapshot common lock.
  Snapshots no longer need recursive vnode locks.

- Use a mutex (si_snaplock) to serialize creation, deletion, reading and
  writing of snapshots.

- Move ffs_read() for snapshots into ffs_snapshot.c.

Reviewed by: Jason Thorpe <thorpej@netbsd.org>

While here change ffs_copyonwrite() to fail requests from pagedaemon that need
to copy-on-write.
2008-08-12 10:14:37 +00:00
oster
328f49787a Define UFS_WAPBL_UNREGISTER_INODE() and UFS_WAPBL_REGISTER_INODE()
to something that pacifies the compiler in the non-WAPBL case.

Fix suggested by Martin Husemann.  Fixes PR#39302.
2008-08-06 21:18:59 +00:00
hannken
601ab263e0 Do not call UFS_WAPBL_*() when ffs_freefile() is acting on a snapshot.
While here replace the test for VBLK with a convenience variable.
2008-08-06 12:54:26 +00:00
pooka
b43847b66c zu, not zd, to print size_t 2008-08-05 13:39:29 +00:00
simonb
ba0675032b Only allow WAPBL to operate with UFS2 style superblocks.
Problem reported by Takeshi Nakayama.
2008-08-04 15:55:11 +00:00
simonb
53aeec60eb When checking if there's enough space at the end of a partition,
compare bytes vs bytes, not sectors vs bytes.

Problem discovered and fix tested by Michael Hitch.
2008-08-02 02:23:51 +00:00
oster
9921f79434 Make MSDOS filesystems work again after WAPBL merge. Fixes a quite
repeatable panic in fstrans_getstate() found while searching for a
different USB bug.  Also makes the code somewhat more readable.

Patch from Juergen Hannken-Illjes with a small rearrangement from me.

Approved by: hannken
2008-07-31 23:49:50 +00:00
hannken
85271ee0ad Ffs snapshots don't work (yet) with WAPBL:
- no snapshot creation on logging file systems.
- refuse to mount logging file systems with persistent snapshots.

Ok: Simon Burge <simonb@netbsd.org>
2008-07-31 15:37:56 +00:00
hannken
fb6f2c9b30 Resolve a deadlock when fs_nodealloccg() initializes more inodes on
an UFS2 file system.  With the current cylinder group buffer busy it
calls ffs_getblk().  This runs through copy-on-write and may need the
current cylinder group buffer to allocate a new block for the snapshot.

While here write the cylinder group buffer synchronously after
cg_initediblk was changed because fsck_ffs will trust it.

Reviewed by: Jason Thorpe <thorpej@netbsd.org>
2008-07-31 09:35:09 +00:00
simonb
0800f2fb9f Be consistent with #define<tab>. 2008-07-31 08:44:21 +00:00
simonb
36d65f1138 Merge the simonb-wapbl branch. From the original branch commit:
Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
   journaling code.  Originally written by Darrin B. Jewell while
   at Wasabi and updated to -current by Antti Kantee, Andy Doran,
   Greg Oster and Simon Burge.

OK'd by core@, releng@.
2008-07-31 05:38:04 +00:00
hannken
4685b65ada ffs_snapshot():
Release allocated indir blocks on non-softdep file systems instead
    of writing them twice.
    It is sufficient to clean dirty data pages to avoid UBC inconsistencies.

ffs_snapblkfree() and wrsnapblk():
    If a snapshots effective link count is zero there is no need
    to use synchronous writes.

ffs_copyonwrite():
    Defer locking the snapshots until there is a need to copy the block.

wrsnapblk():
    Use vn_rdwr() instead of bwrite() to write to the snapshots.
2008-07-30 10:09:30 +00:00
hannken
f67742b3c8 expunge_ufs*(): Use the buffer cache to update the inodes on the snapshot like
the rest of snapshot creation does.
2008-07-15 08:20:56 +00:00
simonb
27ae933bee Fix potential 32-bit overflow problem in the blockpref code.
mlelstv@ points out FreeBSD fixed the same thing a couple of years
ago - here's the commit message they used on rev 1.127:

   Fixes a bug that caused UFS2 filesystems bigger than 2TB to
   prematurely report that they were full and/or to panic the kernel
   with the message ``ffs_clusteralloc: allocated out of group''.

   Submitted by:   Henry Whincup <henry@jot.to>
2008-07-11 05:31:44 +00:00
ad
c4d3520a6e ufsdirhash_build: missing unlock in failure path. 2008-07-03 09:56:15 +00:00
rumble
4c59ea5a60 Fix lkm fallout from previous sysctl changes. This largely duplicates
sysctl creation code, but lkms are going away soon(ish) anyway.

Spotted by Chris Gilbert.
2008-06-28 15:50:20 +00:00
rumble
28f5ebd853 Create sysctl entries during module initialisation and destroy them
appropriately.

Many of these file systems are now ready for modularisation.
2008-06-28 01:34:05 +00:00
gmcgarry
838d34e828 fcntl(4) says the command is type int. lfs_fcntl() comment says u_long. The implementation says int. Synchronise comment with documentation and cast to int before comparison. 2008-06-24 10:47:32 +00:00
reinoud
f6a70673ba Mark a buffer busy in getnewbuf() when it came from the pool_cache since
its not on a free list.

Also change buf_init() to not automatically mark buffers `busy' since this
only makes sense for bufcache buffers.

Mark all buf_init'd buffers 'busy' on the places where they ought to be
flagged as such to not confuse the buffer cache.

Fixes PR 38923.
2008-06-17 14:53:10 +00:00
skd
66fcc9f90f Add some locking, runs with DIAGNOSTIC. 2008-06-16 02:36:27 +00:00
skd
18ccf49656 Fix two cases where we would panic locking against ourselves. 2008-06-15 21:18:06 +00:00
hannken
a618b33b8c ufs_blkatoff: Update comment. 2008-06-05 09:32:29 +00:00
ad
e89db9644e When setting DONE on the buffer, assert that there are no waiters in
biowait().
2008-06-04 17:46:21 +00:00
ad
06c343ac94 vm_page: put TAILQ_ENTRY into a union with LIST_ENTRY, so we can use both. 2008-06-04 12:41:40 +00:00
ad
6b01ae766f - Tidy up the locking a bit.
- Use atomics/kmem_alloc/pool_cache.
2008-06-04 11:33:19 +00:00
hannken
15e90e4bbc ufs/ffs: replace calls to getblk() with ffs_getblk(). Now all buffers
have been run through copy-on-write and async mounts work again.

Fixes PR kern/38820

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
2008-06-03 09:47:49 +00:00
ad
3a8db3158e Use atomics to maintain v_usecount. 2008-06-02 16:25:34 +00:00
ad
0d151db6c9 Don't needlessly acquire v_interlock. 2008-06-02 16:00:33 +00:00
christos
ee16aae1e5 Revert to using specfs_fsync(); using a do-nothing mfs_fsync() does not work
because the filesystem cannot be unmounted since ffs_fsync() will loop forever
trying to empty the v_dirtyblkhd list.
2008-06-02 00:24:28 +00:00
ad
c35a9dfad1 Put a TNF copyright on it. 2008-05-31 21:39:13 +00:00
ad
3592ae4882 XXX softdep:
If the number of deletes in progress is getting too high, newdirrem()
requests the syncer to flush faster, and in some cases will block to
prevent deletes accumulating faster than the disk can service them.

The syncer will try to lock vnodes that the remover holds locked, leading
to the syncer and remover proceeding in lockstep and making very little
overall forward progress.

Put a hook into ufs_rmdir() and ufs_remove() so that the softdep code
can pace itself without holding vnode locks if the number of deletes is
running out of control.
2008-05-31 21:37:08 +00:00
hannken
336f2a69f4 ffs_copyonwrite(): stop abusing ffs_balloc() to get a block address.
Use ufs_getlbns()/bread() instead.
Saves some reads and removes deep recursion with possible deadlock
when ffs_balloc() runs copy-on-write on the buffer returned.
2008-05-29 10:00:50 +00:00
nakayama
b70810493a s/log file system/log-structured file system/ 2008-05-24 18:14:24 +00:00
ad
9e6de4df51 Don't moan about LFS unless the mount succeeds. 2008-05-20 16:26:04 +00:00
ad
a0fd5bc68d Until these get fixed or replaced:
WARNING: the foo file system is experimental and may be unstable
2008-05-18 13:56:12 +00:00
hannken
5d2bff060a Make sure all cached buffers with valid, not yet written data have been
run through copy-on-write.  Call fscow_run() with valid data where possible.

The LP_UFSCOW hack is no longer needed to protect ffs_copyonwrite() against
endless recursion.

- Add a flag B_MODIFY to bread(), breada() and breadn().  If set the caller
  intends to modify the buffer returned.

- Always run copy-on-write on buffers returned from ffs_balloc().

- Add new function ffs_getblk() that gets a buffer, assigns a new blkno,
  may clear the buffer and runs copy-on-write.  Process possible errors
  from getblk() or fscow_run().  Part of PR kern/38664.

Welcome to 4.99.63

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
2008-05-16 09:21:59 +00:00
rumble
a1221b6d4a Convert file systems to dynamically attach with the new module interface.
Make VFS hooks dynamic while we're here and say farewell to VFS_ATTACH and
VFS_HOOKS_ATTACH linksets.

As a consequence, most of the file systems can now be loaded as new style
modules.

Quick sanity check by ad@.
2008-05-10 02:26:09 +00:00
ad
05b982d1c7 mfs doesn't need fsync. 2008-05-07 21:30:42 +00:00
ad
42d0626726 PR kern/38141 lookup/vfs_busy acquire rwlock recursively
Simplify the mount locking. Remove all the crud to deal with recursion on
the mount lock, and crud to deal with unmount as another weirdo lock.

Hopefully this will once and for all fix the deadlocks with this. With this
commit there are two locks on each mount:

- krwlock_t mnt_unmounting. This is used to prevent unmount across critical
  sections like getnewvnode(). It's only ever read locked with rw_tryenter(),
  and is only ever write locked in dounmount(). A write hold can't be taken
  on this lock if the current LWP could hold a vnode lock.

- kmutex_t mnt_updating. This is taken by threads updating the mount, for
  example when going r/o -> r/w, and is only present to serialize updates.
  In order to take this lock, a read hold must first be taken on
  mnt_unmounting, and the two need to be held across the operation.

One effect of this change: previously if an unmount failed, we would make a
half hearted attempt to back out of it gracefully, but that was unlikely to
work in a lot of cases. Now while an unmount that will be aborted is in
progress, new file operations within the mount will fail instead of being
delayed. That is unlikely to be a problem though, because if the admin
requests unmount of a file system then s(he) has made a decision to deny
access to the resource.
2008-05-06 18:43:44 +00:00
ad
e071d39c84 - Convert hashinit() to use kmem_alloc(). The hash tables can be large
and it's better to not have them in kmem_map.
- Convert a couple of minor items along the way to kmem_alloc().
- Fix some memory leaks.
2008-05-05 17:11:16 +00:00
ad
928a6b2096 PR kern/38135 vfs_busy/vfs_trybusy confusion
The previous fix worked, but it opened a window where mounts could have
disappeared from mountlist while the caller was traversing it using
vfs_trybusy(). Fix that.
2008-04-30 12:49:16 +00:00
ad
e3610f1886 kern/38135 vfs_busy/vfs_trybusy confusion
The symptom was that sometimes file systems would occasionally not appear
in output from 'df' or 'mount' if the system was busy. Resolution:

- Make mount locks work somewhat like vm_map locks.
- vfs_trybusy() now only fails if the mount is gone, or if someone is
  unmounting the file system. Simple contention on mnt_lock doesn't
  cause it to fail.
- vfs_busy() will wait even if the file system is being unmounted.
2008-04-29 23:51:04 +00:00
ad
baa3395f8f PR kern/38057 ffs makes assuptions about devvp file system
PR kern/33406 softdeps get stuck in endless loop

Introduce VFS_FSYNC() and call it when syncing a block device, if it
has a mounted file system.
2008-04-29 18:18:08 +00:00
martin
ce099b4099 Remove clause 3 and 4 from TNF licenses 2008-04-28 20:22:51 +00:00
ad
284c2b9aef Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.
2008-04-24 18:39:20 +00:00
ad
6d70f903e6 Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
  be sent from a hardware interrupt handler. Signal activity must be
  deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
  and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.
2008-04-24 15:35:27 +00:00
ad
d9bace2a92 Acquire kernel_lock directly in LFS syscalls. 2008-04-21 11:45:34 +00:00
hannken
2fd9e21242 Replace get/setspecific with a void pointer in struct ufsmount. Use explicit
initialization/finalization of snapshot private data on creation/deletion
of struct ufsmount.
Snapshot mounts no longer may fail silently because kmem_alloc() fails.

Welcome to 4.99.60

Ok: Andrew Doran <ad@netbsd.org>
2008-04-17 09:52:47 +00:00
ad
0701eb1ec7 newdirrem: if the number of deletes in progress is getting too high, start
pushing the syncer before considering rate limiting the deletes. We hold
vnodes locked and it's likely that the syncer will try to lock them while
flushing, leading to the syncer and remover proceeding in lockstep and
making very little forward progress. XXX this is not a solution.
2008-04-11 16:25:38 +00:00
ad
be04ac4896 Make rusage collection per-LWP and collate in the appropriate places.
cloned threads need a little bit more work but the locking needs to
be fixed first.
2008-03-27 19:06:51 +00:00
ad
021b86dd4b Changes for PR kern/38291 (panic unmounting MFS /tmp):
- Reference count the mfsnode to fix an aincent bug. Only destroy when
  reference count drops to zero. In mfs_start(), busy the mount and get
  a reference to the mfsnode to prevent it disappearing while the server
  is running. If the file system is gone already, vfs_busy() will fail.
- Always destroy the bufq.
- Use a global mfs_lock for simplicity.
- Replace use of malloc/free. Fixes broken MALLOC_TYPE change.
2008-03-26 14:19:43 +00:00
ad
a9ca7a3734 Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.
2008-03-21 21:54:58 +00:00
ad
110d5cc274 Make MFS MP-safe. Needed because of the funny tricks it plays. 2008-02-21 14:10:57 +00:00
matt
e2ca3f7504 Merge all the *different* definitions of bufqueues into one common one. 2008-02-20 17:13:29 +00:00
ad
fb00b83874 Give bbusy() an interlock argument. If the we need to wait for the buffer,
the interlock is dropped and reacquired when awoken. This allows for
busying buffers attached to a list that is not locked by bufcache_lock.
2008-02-15 13:46:04 +00:00
ad
b2fa822a33 The buffer LOCKED flag need not be under the protection of bufcache_lock,
BUSY is enough.
2008-02-15 13:30:56 +00:00
ad
648f07789f Do genfs_node_init() earlier. PR kern/36162. 2008-02-05 15:18:36 +00:00
hannken
a524d758da Make it work after lockmgr -> vlockmgr conversion:
- Initialize si_vnlock in si_mount_init().
  - Also initialize vl_recursecnt to zero.
- Destroy it only in si_mount_dtor().
- Simplify the v_lock <-> si_vnlock exchange.
- Don't abuse the overall error variable for LK_NOWAIT errors.
- ffs_snapremove: release the vnode one instead of three times.
2008-01-30 17:20:04 +00:00
ad
08da6a8594 Replace use of lockmgr. 2008-01-30 14:54:01 +00:00
ad
7356aff6af Replace use of LK_SLEEPFAIL. 2008-01-30 14:50:28 +00:00
ad
25153c3ec9 PR kern/37706 (forced unmount of file systems is unsafe):
- Do reference counting for 'struct mount'. Each vnode associated with a
  mount takes a reference, and in turn the mount takes a reference to the
  vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount
  locking inherited from 4.4BSD with a recursable rwlock.
2008-01-30 11:46:59 +00:00
ad
3490efcc63 Replace struct lock on vnodes with a simpler lock object built on
krwlock_t. This is a step towards removing lockmgr and simplifying
vnode locking. Discussed on tech-kern.
2008-01-30 09:50:19 +00:00
hannken
89cea1c2c4 - Always destroy si_vnlock after use.
- Take care of vnodes without file system data.
2008-01-28 17:49:06 +00:00
dholland
717e1785a5 Fix some race conditions in rename.
Introduce a per-FS rename lock and new vfsops to manipulate it.
Get this lock while renaming. Also add another relookup() in do_sys_rename,
which is a hack to kludge around some of the worst deficiencies of
ufs_rename.
reviewed-by: pooka (and an earlier rev by ad)
posted on tech-kern with no objections.
2008-01-28 14:31:15 +00:00
pooka
eb6074e41f Replace vrelel() 010101-mania with a flags parameter. However,
leave flags unimplemented for a while (no change in functionality).
2008-01-27 22:47:31 +00:00
ad
1997a1e1f4 Remove VOP_LEASE. Discussed on tech-kern. 2008-01-25 14:32:11 +00:00
ad
2c1cdef339 Ignore clean vnodes in a couple more places. 2008-01-25 14:16:41 +00:00
ad
51d8643120 - Fix a couple of locking botches.
- Remove calls to VOP_LEASE().
2008-01-25 13:31:22 +00:00
pooka
1bff00ecec Destroy extattr lock when destroying extattrs associated with the
mountpoint.  Make stopping extattrs always succesful to facilitate
always being able to free resources.
2008-01-25 10:49:32 +00:00
pooka
04177824be spec_node_init() mfs device vnode.
fixes PR kern/37867 by Steve Woodford
2008-01-25 10:30:20 +00:00
ad
927229e57b quotaoff: ignore clean vnodes. Fixes PR kern/37818. 2008-01-24 22:55:21 +00:00
ad
703069c0e9 specfs changes for PR kern/37717 (raidclose() is no longer called on
shutdown). There are still problems with device access and a PR will be
filed.

- Kill checkalias(). Allow multiple vnodes to reference a single device.

- Don't play dangerous tricks with block vnodes to ensure that only one
  vnode can describe a block device. Instead, prohibit concurrent opens of
  block devices. As a bonus remove the unreliable code that prevents
  multiple file system mounts on the same device. It's no longer needed.

- Track opens by vnode and by device. Issue cdev_close() when the last open
  goes away, instead of abusing vnode::v_usecount to tell if the device is
  open.
2008-01-24 17:32:52 +00:00
hannken
0409981d90 si_mount_dtor(): destroy si_vnlock before free. 2008-01-24 17:29:56 +00:00
hannken
ca2bbf9908 Fix a typo from the vmlocking2 merge: vmark() the right vnode. 2008-01-24 16:26:44 +00:00
pooka
c40d447080 Sprinkle comments about um_lock status on function entry and exit.
No functional change.
2008-01-21 23:36:26 +00:00
ad
42c90ece4c Fix dodgy tests of v_usecount. 2008-01-17 10:39:14 +00:00
ad
abb6491738 mfs_close: remove a broken assertion. 2008-01-17 10:24:05 +00:00
christos
20f21ddc4a fix locking botch; vunmark() needs the mountvnode lock, and the loop invariant
is to exit with the mountvnode lock held.
2008-01-15 21:30:46 +00:00
ad
08c770b434 Initialize caches at IPL_SOFTBIO (not IPL_NONE) so that we are allocating
from kmem_map.
2008-01-12 00:34:10 +00:00
ad
dd68496f43 Fix hangs on 'biolock' when creating a directory under / with softdep. 2008-01-09 18:20:54 +00:00
ad
0b52913dee Go back to freeing on disk inodes in the inactive routine. It would be
better not to do this, but it rules out potential side effects with softdep.
2008-01-09 16:15:22 +00:00
ad
52603a7d6d Fix 'panic: softdep_update_inodeblock: update failed'. 2008-01-07 16:56:27 +00:00
tnn
51f964a289 softdep_freefile: don't acquire ufsmount lock twice. 2008-01-07 05:20:25 +00:00
ad
e01dd1a1f8 Use pool_cache. 2008-01-03 19:28:48 +00:00
pooka
8e7631e799 make the UFS_EXTATTR case build 2008-01-03 02:23:27 +00:00
pooka
d53e261066 valloc -> vnalloc, vfree -> vnfree
Avoids collision with userland valloc(3).

no functional change
ad ok
2008-01-03 01:26:28 +00:00
ad
4a780c9ae2 Merge vmlocking2 to head. 2008-01-02 11:48:20 +00:00
perry
b6a2ef7569 Convert many of the uses of __attribute__ to equivalent
__packed, __unused and __dead macros from cdefs.h
2007-12-25 18:33:32 +00:00
dsl
7e2790cf6f Convert all the system call entry points from:
int foo(struct lwp *l, void *v, register_t *retval)
to:
    int foo(struct lwp *l, const struct foo_args *uap, register_t *retval)
Fixup compat code to not write into 'uap' and (in some cases) to actually
pass a correctly formatted 'uap' structure with the right name to the
next routine.
A few 'compat' routines that just call standard ones have been deleted.
All the 'compat' code compiles (along with the kernels required to test
build it).
98% done by automated scripts.
2007-12-20 23:02:38 +00:00
dyoung
c9b4d7e0b1 Call genfs_node_init a little earlier to avoid a vput()ing an
uninitialized node, later, which leads to a kernel panic.  Patch
by Antti Kantee.
2007-12-20 16:18:57 +00:00
he
cd15efd05d Fix a use of lfs_truncate() inside an #ifdef notyet (so no resulting change);
lfs_truncate() has lost its lwp argument.
2007-12-12 18:36:10 +00:00
he
8b0afebe4f Make this build again, as part of sys/lkm/dev/vnd/:
- lfs_truncate() has lost its lwp argument.
- Cast from void* to char* before doing pointer arithmetic.
2007-12-12 18:35:21 +00:00
lukem
2b0e4fae39 Move __KERNEL_RCSID() so that it's always available if this file is
compiled, even if DEBUG isn't defined.
(This matches the behaviour of various other source files that
provide functions only if DEBUG is enabled.)
2007-12-12 03:49:03 +00:00
ad
e38a5a204c Fix a stray brelse() that got missed. 2007-12-12 03:10:47 +00:00
lukem
85a77245b2 defflag LFS_KERNEL_RFW (in opt_lfs.h).
Note: lfs_rfw.c doesn't compile if you define the option; locking API fallout?
2007-12-12 02:56:03 +00:00
lukem
961afe5035 use __KERNEL_RCSID() instead of __RCSID() 2007-12-11 12:16:34 +00:00
pooka
db06a930e6 Remove cn_lwp from struct componentname. curlwp should be used
from on.  The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.
2007-12-08 19:29:36 +00:00
ad
5870896633 Grab ump->um_lock in another spot. 2007-12-08 15:23:32 +00:00
ad
c27abd2a99 Add some comments. 2007-12-08 15:21:19 +00:00
hannken
d556dc98b0 Fscow_run(): add a flag "bool data_valid" to note still valid data.
Buffers run through copy-on-write are marked B_COWDONE.  This condition
is valid until the buffer has run through bwrite() and gets cleared from
biodone().

Welcome to 4.99.39.

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
2007-12-02 13:56:15 +00:00
tsutsui
581d311161 Use e2fs_first_dblock in superblock to read/write group descriptor blocks. 2007-12-01 16:57:26 +00:00
pooka
61e8303e9d Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start.  In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern
2007-11-26 19:01:26 +00:00
tsutsui
e2199739d6 Misc cosmetics. 2007-11-26 15:46:48 +00:00
tsutsui
da1f43d88f Account e2fs_reserved_ngdb blocks accordingly in ext2fs_statvfs(). 2007-11-26 15:41:03 +00:00
dholland
9c31b45f5e Change the fs_clean member of the ffs superblock to be unsigned
(uint8_t instead of int8_t) - this prevents an ugly sign-extension
printing bug as well as formally undefined behavior when you mount an
unclean fs enough times.

From (my own) PR kern/28134; I've been carrying this patch for three
years, long enough to forget about it, and it's had no ill effects in
that time.

reviewed: pooka
2007-11-23 18:00:46 +00:00
pooka
2b2c7bcc21 Update comments: ufs_{rename,mkdir,rmdir} haven't been system calls
since 4.3BSD.
2007-11-23 14:18:01 +00:00
yamt
a3dbd70683 lfs_mountroot: use vfs_destroy. 2007-11-22 10:51:44 +00:00
tsutsui
d1112c275c Misc cosmetics. 2007-11-17 08:51:51 +00:00
tsutsui
b95d1c0be7 Some KNF and cosmetics. 2007-11-17 08:34:38 +00:00
tsutsui
bf39da4423 Also bswap recently added e2fs_reserved_ngdb in e2fs_sb_bswap(). 2007-11-17 03:43:18 +00:00
tsutsui
0192f3ddfc Add some definitions for resizefs features. 2007-11-15 12:59:17 +00:00
tsutsui
bed6c72507 - add some more constant definitions
- add some comments
- typo and some cosmetics
2007-11-13 13:50:18 +00:00
rmind
f499d5e662 Use PRI_BIO for kthreads instead of PINOD. Fixes a missed case of priority
inversion, which caused LFS to fire some assertions.

Reported by Kurt Schreiner on <current-users>.
2007-11-10 18:53:57 +00:00
ad
d18c6ca4de Merge from vmlocking:
- pool_cache changes.
- Debugger/procfs locking fixes.
- Other minor changes.
2007-11-07 00:23:13 +00:00
hannken
75bdfc0dac Avoid doing bawrite to initialize inode block while holding cylinder
group block buffer busy.  If filesystem has any active snapshots, bawrite
can come back trying to allocate new snapshot data block from the same
cylinder group and cause deadlock.

From FreeBSD Rev. 1.117
2007-11-01 06:31:59 +00:00
hannken
df70316e3f Ffs_blkfree() and ffs_freefile() take a devvp that may be a regular file whencalled from snapshot creation. Be sure to use the right mount.
Ok: Andrew Doran <ad@netbsd.org>
2007-10-18 17:39:04 +00:00
ad
fd7e1bf0d1 Sync with ffs: fix ufs_ihashlock / ufs_hash_lock deadlock.
From Sverre Froyen.
2007-10-17 17:34:25 +00:00
ad
4c92a21547 Remove LOCK_ASSERT(!simple_lock_held(&foo)); 2007-10-11 19:53:37 +00:00
ad
f103d731ed Fix DEBUG builds. 2007-10-10 22:38:00 +00:00
ad
7dad9f7391 Merge from vmlocking:
- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
2007-10-10 20:42:20 +00:00
ad
5c3b2b3f2d Merge ffs locking & brelse changes from the vmlocking branch. 2007-10-08 18:01:27 +00:00
hannken
3856acafe2 Update the file system copy-on-write handler.
- Instead of hooking the handler on the specdev of a mounted file system
  hook directly on the `struct mount'.

- Rename from `vn_cow_*' to `fscow_*' and move to `kern/vfs_trans.c'.  Use
  `mount_*specific' instead of clobbering `struct mount' or `struct specinfo'.

- Replace the hand-made reader/writer lock with a krwlock.

- Keep `vn_cow_*' functions and mark as obsolete.

- Welcome to NetBSD 4.99.32 - `struct specinfo' changed size.

Reviewed by: Jason Thorpe <thorpej@netbsd.org>
2007-10-07 13:38:53 +00:00
pooka
1b732f2a39 avoid variable size stack allocations 2007-09-25 15:13:14 +00:00
pooka
a51616225d BLKSIZE is always the same as blksize these days, so get rid of it. 2007-09-24 16:50:58 +00:00
pooka
094e51ae36 Fix comment inaccurate from prehistoric times: default MINFREE is 5, not 10 2007-09-24 16:20:50 +00:00
rumble
0ae0a486c7 Avoid stack allocation of large dirent structures in foo_readdir(). 2007-09-24 00:42:12 +00:00
pooka
b2c6279cb7 include sys/mount.h for export_args30 2007-09-10 23:47:23 +00:00
pooka
3f3cac88a3 Make bioops a pointer and point it to the softdeps struct in softdep
init.  Decouples "options SOFTDEP" from the main kernel and ffs code.
2007-09-01 23:40:21 +00:00
hannken
5657bcacb9 Modify ffs_lock() to take care for changed v_vnlock. Snapshots do not need
transferlockers() anymore.

From FreeBSD ffs_vnops.c Rev. 1.159

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
2007-08-21 09:27:33 +00:00
hannken
f43278a84f - Use a mutex to protect snapinfo.
- Move the snapshot lock to snapinfo.
- ffs_snapblkfree(),ffs_copyonwrite(): replace lockmgr() with VOP_LOCK().
2007-08-18 10:11:01 +00:00
hannken
7d3c830b7c Expunge traces of unlinked snapshot files when making a new snapshot.
From FreeBSD Rev. 1.123
2007-08-18 09:48:33 +00:00
hannken
d56e8d1456 Move the fstrans-aware lock vnops from ufs to ffs. Other ufs file systems
do not need them.

Ride on 4.99.28
2007-08-09 09:22:34 +00:00
pooka
3de9f5d391 Instead of having lfs muck directly about with vnode free lists,
introduce vrele2(), which allows to release vnodes the way lfs
sometimes wants it:
  + without calling inactive
  + inserting the vnode at the head of the freelist (this is a very
    questionable optimization that isn't even enabled by default,
    but I went along with the same semantics for now)
2007-08-09 08:51:21 +00:00
hannken
70cc392f1e Move snapshot per-mount data from struct ufsmount to mount specific data.
No functional changes.

Welcome to 4.99.28  (struct ufsmount changed size)
2007-08-09 07:34:27 +00:00
pooka
95bedce095 include assumed headers 2007-08-02 12:53:30 +00:00
pooka
8d1f899239 * nuke the nameidata parameter from VFS_MOUNT(). Nobody on tech-kern
knew what it was supposed to be used for and wrstuden gave a go-ahead
* while rototilling, convert file systems which went easily to
  use VFS_PROTOS() instead of manually prototyping the methods
2007-07-31 21:14:15 +00:00
hannken
a508ec1750 - Replace the freelist with a pool. Current code never freed its dquot's.
- Always call dqsync() with dq locked.
- Add some assertions to verify the lock held.
- Serialize quotaon()/quotaoff(), dqhashmtx becomes dqlock.  From ad@

Reviewed by: Andrew Doran <ad@netbsd.org>
2007-07-31 09:29:52 +00:00
ad
a0d1fd8d0c It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
2007-07-29 13:31:07 +00:00
yamt
662e7a9e40 use ubc_uiomove for read as well. 2007-07-27 10:00:42 +00:00
yamt
3822af7031 ubc_uiomove: add an "advice" argument rather than using UVM_ADV_RANDOM blindly. 2007-07-27 09:50:36 +00:00
pooka
1ce406a846 Change unused fflags parameter in VOP_MMAP to prot and pass in
desired vm protection.
2007-07-27 08:26:38 +00:00
pooka
d9970c8066 Use eopnotsupp() instead of vfs_stdsuspendctl() and retire the latter. 2007-07-26 22:57:36 +00:00
pooka
2189caca89 comment police: DIRBLKSIZE would be too chatty and therefore the
macro is known as DIRBLKSIZ
2007-07-23 14:58:04 +00:00
ad
f92f123709 Workaround the ufs_haslock/ufs_ihash_lock deadlock. From a patch
posted by Blair Sadewitz.
2007-07-23 09:05:02 +00:00
rumble
17b8c7717a Add missing RCSID. 2007-07-22 21:12:27 +00:00
christos
407114c830 make this compile again 2007-07-22 03:40:59 +00:00
ad
744a92f0f8 Don't depend on uvm_extern.h pulling in proc.h. 2007-07-21 19:06:20 +00:00
pooka
9137aeda4b In sync, skip over vnodes based on if they are clean rather than
if they have pages.
2007-07-20 16:46:43 +00:00
hannken
8169dcee9d Update and add locking to ufs_quota:
- Replace DQ_LOCK/DQ_WANT/sleep/wakeup with a mutex `dq_interlock'.  Use this
  mutex to protect all quota values and flags.
- Protect the hashtable with a mutex.
- Never update quotas for the quota files on the same file system.  Prevents
  a deadlock when dqsync() has to change the  quota file's size (PR #13942).

Reviewed by:	Andrew Doran <ad@netbsd.org>
		Bill Stouder-Studenmund <wrstuden@netbsd.org>
2007-07-19 09:33:04 +00:00
christos
c9d7b911ee Eliminate MFSNAMELEN 2007-07-17 21:26:41 +00:00
christos
785c01892b eliminate MFSNAMELEN 2007-07-17 21:20:43 +00:00
pooka
e24b0872a4 Make set_statvfs_info() take a parameter for the vfs name instead
of always retrieving it from mp->mnt_op->vfs_name

christos ok
2007-07-17 11:19:31 +00:00
pooka
395899ddd0 When allocating blocks, check minfree before asking kauth about
suser.  The latter has unknown cost and rarely needs to be called.
2007-07-16 14:26:08 +00:00