Commit Graph

135 Commits

Author SHA1 Message Date
tsutsui e7713433d4 Move declaration of ufs_hashlock into <ufs/ufs_extern.h> from each c source. 2009-09-13 05:17:36 +00:00
christos 461a86f9bd merge christos-time_t 2009-01-11 02:45:45 +00:00
hannken 5d2bff060a Make sure all cached buffers with valid, not yet written data have been
run through copy-on-write.  Call fscow_run() with valid data where possible.

The LP_UFSCOW hack is no longer needed to protect ffs_copyonwrite() against
endless recursion.

- Add a flag B_MODIFY to bread(), breada() and breadn().  If set the caller
  intends to modify the buffer returned.

- Always run copy-on-write on buffers returned from ffs_balloc().

- Add new function ffs_getblk() that gets a buffer, assigns a new blkno,
  may clear the buffer and runs copy-on-write.  Process possible errors
  from getblk() or fscow_run().  Part of PR kern/38664.

Welcome to 4.99.63

Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
2008-05-16 09:21:59 +00:00
ad 42d0626726 PR kern/38141 lookup/vfs_busy acquire rwlock recursively
Simplify the mount locking. Remove all the crud to deal with recursion on
the mount lock, and crud to deal with unmount as another weirdo lock.

Hopefully this will once and for all fix the deadlocks with this. With this
commit there are two locks on each mount:

- krwlock_t mnt_unmounting. This is used to prevent unmount across critical
  sections like getnewvnode(). It's only ever read locked with rw_tryenter(),
  and is only ever write locked in dounmount(). A write hold can't be taken
  on this lock if the current LWP could hold a vnode lock.

- kmutex_t mnt_updating. This is taken by threads updating the mount, for
  example when going r/o -> r/w, and is only present to serialize updates.
  In order to take this lock, a read hold must first be taken on
  mnt_unmounting, and the two need to be held across the operation.

One effect of this change: previously if an unmount failed, we would make a
half hearted attempt to back out of it gracefully, but that was unlikely to
work in a lot of cases. Now while an unmount that will be aborted is in
progress, new file operations within the mount will fail instead of being
delayed. That is unlikely to be a problem though, because if the admin
requests unmount of a file system then s(he) has made a decision to deny
access to the resource.
2008-05-06 18:43:44 +00:00
ad 928a6b2096 PR kern/38135 vfs_busy/vfs_trybusy confusion
The previous fix worked, but it opened a window where mounts could have
disappeared from mountlist while the caller was traversing it using
vfs_trybusy(). Fix that.
2008-04-30 12:49:16 +00:00
martin ce099b4099 Remove clause 3 and 4 from TNF licenses 2008-04-28 20:22:51 +00:00
ad d9bace2a92 Acquire kernel_lock directly in LFS syscalls. 2008-04-21 11:45:34 +00:00
ad 25153c3ec9 PR kern/37706 (forced unmount of file systems is unsafe):
- Do reference counting for 'struct mount'. Each vnode associated with a
  mount takes a reference, and in turn the mount takes a reference to the
  vfsops.
- Now that mounts are reference counted, replace the overcomplicated mount
  locking inherited from 4.4BSD with a recursable rwlock.
2008-01-30 11:46:59 +00:00
ad 3490efcc63 Replace struct lock on vnodes with a simpler lock object built on
krwlock_t. This is a step towards removing lockmgr and simplifying
vnode locking. Discussed on tech-kern.
2008-01-30 09:50:19 +00:00
ad 4a780c9ae2 Merge vmlocking2 to head. 2008-01-02 11:48:20 +00:00
dsl 7e2790cf6f Convert all the system call entry points from:
int foo(struct lwp *l, void *v, register_t *retval)
to:
    int foo(struct lwp *l, const struct foo_args *uap, register_t *retval)
Fixup compat code to not write into 'uap' and (in some cases) to actually
pass a correctly formatted 'uap' structure with the right name to the
next routine.
A few 'compat' routines that just call standard ones have been deleted.
All the 'compat' code compiles (along with the kernels required to test
build it).
98% done by automated scripts.
2007-12-20 23:02:38 +00:00
ad 7dad9f7391 Merge from vmlocking:
- Split vnode::v_flag into three fields, depending on field locking.
- simple_lock -> kmutex in a few places.
- Fix some simple locking problems.
2007-10-10 20:42:20 +00:00
ad 5c3b2b3f2d Merge ffs locking & brelse changes from the vmlocking branch. 2007-10-08 18:01:27 +00:00
christos 53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
ad 9abeea588a Replace some uses of lockmgr() / simplelocks. 2007-02-15 15:40:50 +00:00
ad b07ec3fc38 Merge newlock2 to head. 2007-02-09 21:55:00 +00:00
elad 1e70d64818 Consistent usage of KAUTH_GENERIC_ISSUSER. 2007-01-04 16:55:29 +00:00
christos 168cd830d2 __unused removal on arguments; approved by core. 2006-11-16 01:32:37 +00:00
christos 4d595fd7b1 - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
2006-10-12 01:30:41 +00:00
perseant 437e855235 Changes to help the roll-forward agent, to wit:
* Mark being-deleted files in the Ifile so we can finish deleting them
  at fs mount time.
* Flag the Ifile with "cleaner must clean" when writers are waiting for
  the cleaner, rather than relying solely on the cleaner's estimation of
  whether it should clean or not.
* Note partial segments written by a user agent (in particular,
  fsck_lfs) so that repeated rolls forward don't interfere with one
  another.
* Add a new fcntl, LFCNPASS, that allows the log to wrap exactly once,
  for better testing of the validity of checkpoints.
* Keep track of the on-disk nlink count when cleaning, so that we don't
  partially complete directory operations while cleaning.
* Ensure that every single Ifile inode write represents a consistent
  view of the filesystem.  In particular, the accounting for the segment
  we are writing the inode into must be correct, and the accounting for
  the segment that inode used to reside in must be correct.  Rather than
  just rewriting the inode if we wrote it wrong, rewrite the necessary
  ifile blocks before writing the inode so we never write it wrong.
* Don't unmark any VDIROP vnodes if we haven't written them to disk,
  avoiding yet another problem with the "wait for the cleaner" error
  return from lfs_putpages().

Also, move the last callback to an aiodone call, so we no longer do any
memory management from interrupt context.
2006-09-01 19:41:28 +00:00
ad f474dceb13 Use the LWP cached credentials where sane. 2006-07-23 22:06:03 +00:00
kardel de4337ab21 merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
  time.tv_sec -> time_second
- struct timeval mono_time is gone
  mono_time.tv_sec -> time_uptime
- access to time via
	{get,}{micro,nano,bin}time()
	get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
  Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
  NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
2006-06-07 22:33:33 +00:00
elad fc9422c9d9 integrate kauth. 2006-05-14 21:31:52 +00:00
perseant e52cd940c0 Get rid of the LFS_FORCE_WRITE case. We never really used it, and it could
panic the kernel if cleaner daemon passed the right combination of arguments.
Coverity CID 2741.
2006-04-18 22:42:33 +00:00
perseant 7c22dcc8a6 Several minor bug fixes:
* Correct (weak) segment lock assertions in lfs_fragextend and lfs_putpages.
* Keep IN_MODIFIED set if we run out of avail in lfs_putpages.
* Don't try to (re)write buffers on a VBLK vnode; fixes a panic I found
  while running with an LFS root.
* Raise priority of LFCNSEGWAIT to PVFS; PUSER is way too low for
  something the pagedaemon is relying on.
2006-04-07 23:59:28 +00:00
rtr aa6b2db95f init struct vnode *vp = NULL
coverity 2724 / run 6
XXX in future runs coverity may complain about deref NULL now but comment
    on line 382 indicates this should not be possible
2006-03-19 04:10:02 +00:00
tls a67eab5ee4 From Konrad Schroeder, in response to strange df output on anoncvs.netbsd.org:
We were returning the wrong value for free space.  Now we're not.
2006-03-17 23:21:01 +00:00
christos 95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
perseant 96f8f74d91 Don't update lfs_stats.segs_reclaimed if we're not keeping statistics.
Patch from Juan RP.
2005-05-25 01:50:01 +00:00
perseant 2ecd1730c0 Keep track of the number of segments reclaimed, since the cleaner doesn't
do this anymore (it hasn't for quite some time).  Add a couple of conditional
debugging messages to indicate why segments are not cleaned, in the event
that lfs_segclean is used.

Make the LFCNSEGWAITALL fcntl work again.
2005-05-20 19:48:25 +00:00
perseant 94decdd25d Use lfs_malloc() to manage the blkiov arrays that the cleaner functions use,
since the cleaner is likely to operate in a low-memory condition.
2005-04-16 17:28:37 +00:00
perseant 1ebfc508b6 Protect various per-fs structures with fs->lfs_interlock simple_lock, to
improve behavior in the multiprocessor case.  Add debugging segment-lock
assertion statements.
2005-04-01 21:59:46 +00:00
perseant eefd94b8e2 Straighten out the maze of ifdefs. Instead, consolidate all the debugging
stuff under '#ifdef DEBUG', and use sysctl knobs to turn on/off particular
parts of the debugging reporting (if DEBUG is enabled).  Re-enable the LFS
statistics in sysctl, while I'm there.  A bit of a rototill.
2005-03-08 00:18:19 +00:00
perry bcfcddbac1 nuke trailing whitespace 2005-02-26 22:31:44 +00:00
perseant 25f49c3c91 Various minor LFS improvements:
* Note when lfs_putpages(9) thinks it is not going to be writing any
  pages before calling genfs_putpages(9).  This prevents a situation in
  which blocks can be queued for writing without a segment header.
* Correct computation of NRESERVE(), though it is still a gross
  overestimate in most cases.  Note that if NRESERVE() is too high, it
  may be impossible to create files on the filesystem.  We catch this
  case on filesystem mount and refuse to mount r/w.
* Allow filesystems to be mounted whose block size is == MAXBSIZE.
* Somewhere along the line, ufs_bmaparray(9) started mangling UNWRITTEN
  entries in indirect blocks again, triggering a failed assertion "daddr
  <= LFS_MAX_DADDR".  Explicitly convert to and from int32_t to correct
  this.
* Add a high-water mark for the number of dirty pages any given LFS can
  hold before triggering a flush.  This is settable by sysctl, but off
  (zero) by default.
* Be more careful about the MAX_BYTES and MAX_BUFS computations so we
  shouldn't see "please increase to at least zero" messages.
* Note that VBLK and VCHR vnodes can have nonzero values in di_db[0]
  even though their v_size == 0.  Don't panic when we see this.
* Change lfs_bfree to a signed quantity.  The manner in which it is
  processed before being passed to the cleaner means that sometimes it
  may drop below zero, and the cleaner must be aware of this.
* Never report bfree < 0 (or higher than lfs_dsize) through
  lfs_statvfs(9).  This prevents df(1) from ever telling us that our full
  filesystems have 16TB free.
* Account space allocated through lfs_balloc(9) that does not have
  associated buffer headers, so that the pagedaemon doesn't run us out
  of segments.
* Return ENOSPC from lfs_balloc(9) when bfree drops to zero.
* Address a deadlock in lfs_bmapv/lfs_markv when the filesystem is being
  unmounted.  Because vfs_busy() is a shared lock, and
  lfs_bmapv/lfs_markv mark the filesystem vfs_busy(), the cleaner can be
  holding the lock that umount() is blocking on, then try to vfs_busy()
  again in getnewvnode().
2005-02-26 05:40:42 +00:00
yamt 3ea6756a92 use b_private rather than b_saveaddr.
XXX LFS_USE_B_INVAL
2003-12-04 14:57:47 +00:00
yamt 71602f6ec9 fix spec vnode aliasing. 2003-11-07 14:52:27 +00:00
yamt f80b24474d g/c CHECK_COPYIN. 2003-09-10 11:09:11 +00:00
agc aad01611e7 Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22364, verified by myself.
2003-08-07 16:26:28 +00:00
yamt 1bc98d3c14 - check EROFS earlier in lfs_markv.
- remove wrong error recovery code (fake buffers are never on bufqueue)
  and put a comment instead.
2003-07-30 12:38:53 +00:00
yamt 3bded40734 remove an unused definition of LFS_VREF_THRESHOLD. 2003-07-30 12:34:00 +00:00
yamt eb4e09d59f use queue.h macros. 2003-07-02 13:43:02 +00:00
fvdl d5aece61d6 Back out the lwp/ktrace changes. They contained a lot of colateral damage,
and need to be examined and discussed more.
2003-06-29 22:28:00 +00:00
thorpej a06b275edc Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget().  Turns out
  that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
  and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
  above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
  just to appease the above.
2003-06-29 18:43:21 +00:00
darrenr 960df3c8d1 Pass lwp pointers throughtout the kernel, as required, so that the lwpid can
be inserted into ktrace records.  The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
2003-06-28 14:20:43 +00:00
nakayama a6d8c9185d Avoid comparison is always false warning in gcc 3.3 w/ 64-bit size_t. 2003-05-17 01:44:39 +00:00
fvdl 42614ed3f3 Add support for UFS2. UFS2 is an enhanced FFS, adding support for
64 bit block pointers, extended attribute storage, and a few
other things.

This commit does not yet include the code to manipulate the extended
storage (for e.g. ACLs), this will be done later.

Originally written by Kirk McKusick and Network Associates Laboratories for
FreeBSD.
2003-04-02 10:39:19 +00:00
yamt 91ce94db76 fix "more than one fragment" panics;
direct and indirect block pointers are not valid in the case of shortlinks.
while i'm here, move duplicated code in lfs_vget/fastvget into a new
function, lfs_vinit.
2003-03-20 14:11:46 +00:00
perseant ea03a1ac09 Add simple_lock protection for lfs_seglock and lfs_subsys_pages; these will
be expanded to cover other per-fs and subsystem-wide data as well.

Fix a case of IN_MODIFIED being set without updating lfs_uinodes, resulting
in a "lfs_uinodes < 0" panic.

Fix a deadlock in lfs_putpages arising from the need to busy all pages in a
block; unbusy any that had already been busied before starting over.
2003-03-15 06:58:49 +00:00
perseant a46b9ccf95 Only #define LFS if not already defined. 2003-03-08 23:18:54 +00:00