Commit Graph

241 Commits

Author SHA1 Message Date
christos
80ecd573c0 PR/1796: John Kohl: statfs misbehaves under chrooted environments.
- Under chroot it displays only the visible filesystems with appropriate paths.
- The statfs f_mntonname gets adjusted to contain the real path from root.
- While was there, fixed a bug in ext2fs, locking problems with vfs_getfsstat(),
  and factored out some of the vfsop statfs() code to copy_statfs_info(). This
  fixes the problem where some filesystems forgot to set fsid.
- Made coda look more like a normal fs.
2003-04-16 21:44:18 +00:00
jdolecek
98f212db7d use former genfs_eopnotsupp_rele() as genfs_eopnotsupp(), so that vnodes
are vput()/vrele()d as necessary - some filesystems did use the wrong
one for some ops, and it's just safer to not take the chance

based on suggestion by Bill Studenmund
2003-04-10 21:53:32 +00:00
jdolecek
1ac1ffed36 improve genfs_eopnotsupp_rele() so that's usable for vop_rename,
which uses WILLPUT for member which may be NULL
handle correctly dvp == vp case for WILLPUT members, so this works
  for vop_remove, vop_rename

thanks Bill Studenmund for code&comments on this
2003-04-10 21:34:12 +00:00
thorpej
eb14e86676 Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it.  This fixes a few places where either b_dep or b_interlock were
not properly initialized.
2003-02-25 20:35:31 +00:00
perseant
b397c875ae Add code to UBCify LFS. This is still behind "#ifdef LFS_UBC" for now
(there are still some details to work out) but expect that to go
away soon.  To support these basic changes (creation of lfs_putpages,
lfs_gop_write, mods to lfs_balloc) several other changes were made, to
wit:

* Create a writer daemon kernel thread whose purpose is to handle page
  writes for the pagedaemon, but which also takes over some of the
  functions of lfs_check().  This thread is started the first time an
  LFS is mounted.

* Add a "flags" parameter to GOP_SIZE.  Current values are
  GOP_SIZE_READ, meaning that the call should return the size of the
  in-core version of the file, and GOP_SIZE_WRITE, meaning that it
  should return the on-disk size.  One of GOP_SIZE_READ or
  GOP_SIZE_WRITE must be specified.

* Instead of using malloc(...M_WAITOK) for everything, reserve enough
  resources to get by and use malloc(...M_NOWAIT), using the reserves if
  necessary.  Use the pool subsystem for structures small enough that
  this is feasible.  This also obsoletes LFS_THROTTLE.

And a few that are not strictly necessary:

* Moves the LFS inode extensions off onto a separately allocated
  structure; getting closer to LFS as an LKM.  "Welcome to 1.6O."

* Unified GOP_ALLOC between FFS and LFS.

* Update LFS copyright headers to correct values.

* Actually cast to unsigned in lfs_shellsort, like the comment says.

* Keep track of which segments were empty before the previous
  checkpoint; any segments that pass two checkpoints both dirty and
  empty can be summarily cleaned.  Do this.  Right now lfs_segclean
  still works, but this should be turned into an effectless
  compatibility syscall.
2003-02-17 23:48:08 +00:00
pk
338f31f581 Make the buffer cache code MP-safe. 2003-02-05 21:38:38 +00:00
christos
3908d39e06 step 3. Assign lwp properly if null, so that we can PHOLD without segfaulting. 2003-01-21 00:01:14 +00:00
thorpej
b78f59b443 Merge the nathanw_sa branch. 2003-01-18 08:51:40 +00:00
yamt
bbbe3e07d7 genfs_compat_gop_write: set uio_iovcnt correctly. 2002-11-15 14:01:57 +00:00
yamt
ac3a01e67e use B_ASYNC for children of nested buffers in genfs_getpages.
ok'ed by Chuck Silvers.
2002-10-25 05:44:41 +00:00
jdolecek
e0cc03a09b merge kqueue branch into -current
kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
2002-10-23 09:10:23 +00:00
fvdl
eb485a7e27 Use B_ASYNC in the !PGO_SYNCIO case. Gets back most, if not all, NFS
read throughput performance lost since the introduction of UBC. Spotted
by YAMAMOTO Takashi, many thanks to him.
2002-10-21 15:21:35 +00:00
enami
9e1deeab34 Add missing pageq lock while uvm_pagefree() is called (either directly
or indirectly).  Reviewed by chuq.
2002-05-29 11:04:39 +00:00
enami
1578726840 Just give up to do readahead rather than waiting busy pages.
While I'm here, added few patchable variable so that one can
easily measure readahead behaviour.
2002-05-18 02:54:57 +00:00
perseant
3fa1c8abe9 Protect v_synclist with splbio(); note that LIST_REMOVE is not an idempotent
operation if more than one LIST_REMOVE happens on interrupt, so both the test
for VONWORKLIST and the LIST_REMOVE(vp, v_synclist) need to be in splbio().
2002-05-14 19:37:18 +00:00
enami
293906a53a Redo rev. 1.57 a bit different way; don't use `tpg' since it may be freed.
Subtract the number of pages behind us when calculating new offset instead.
2002-05-10 07:51:37 +00:00
enami
911c9febb2 Don't modify the local variable `n' in genfs_putpages(). It should contain
the number of elements in the page array at the beginning of every iteration.
2002-05-10 02:51:44 +00:00
enami
e3cc9c0682 When traversing by list, if the page next to us is a page in the cluster,
advance the pointer.
2002-05-09 07:22:09 +00:00
enami
fabaf9a730 - In genfs_putpages(), no need to restrict the cluster within the given
region.
- In uvm_aio_aiodone(), remove assertions no longer true.
2002-05-09 07:14:37 +00:00
enami
8876669f4c Since npages may includes number of pages behind us, we can't use it to
update current offset.  Instead, use the last page in the run of pages
to calculate new offset.
2002-05-06 00:42:22 +00:00
enami
e6513c283e Stylistic change; introduce new local variable and use it instead of
sprinkling different expression to test if we're pagedaemon.
2002-05-06 00:18:15 +00:00
enami
6335b88f05 We don't need to re-activate page in genfs_putpages() when GOP_WRITE returns
ENOMEM (temporary memory shortage) since it is already handled in
uvm_aio_aiodone() for both async/sync case.  Discussed with chuq.
2002-04-26 03:57:31 +00:00
enami
6cfcfb947c genfs_{compat_}getpages(): For PGO_LOCKED request, it is safe to return
read only page if it was due to read fault.  This avoid many unnecessary
read fault introduced by recent nfs_bio.c change.  Reviewed by chuq.
2002-04-16 06:05:05 +00:00
enami
08625200a0 KNF and other misc. cosmetic changes. 2002-04-16 06:00:46 +00:00
chs
72c455ce83 in genfs_compat_getpages(), clear any part of a page that
VOP_READ() doesn't fill in (eg. because it's past EOF).
2002-03-22 03:51:51 +00:00
atatat
31144d9976 Convert ioctl code to use EPASSTHROUGH instead of -1 or ENOTTY for
indicating an unhandled "command".  ERESTART is -1, which can lead to
confusion.  ERESTART has been moved to -3 and EPASSTHROUGH has been
placed at -4.  No ioctl code should now return -1 anywhere.  The
ioctl() system call is now properly restartable.
2002-03-17 19:40:26 +00:00
chs
a51be40dcb don't yield the cpu in genfs_putpages() if we're the pagedaemon.
pointed out by enami.  fixes PR 15784.
2002-03-02 06:58:01 +00:00
enami
9a623b9870 Don't use MALLOC for variable sized allocation. 2002-02-20 06:16:22 +00:00
chs
96f907f394 fix two problems:
- when yielding the cpu while using the vnode's page list, use a marker page
   to keep our place in the list (like the other cases where we drop the lock).
 - wait until no one else has the page busy before deciding if the page needs
   to be cleaned.  a page will be dirty while it's being initialized but will
   be marked clean before PG_BUSY is cleared.
both found by enami.
2002-02-19 15:49:39 +00:00
enami
fe24174a3b Don't bother to subtract 0. 2002-02-13 05:20:41 +00:00
enami
52a2a21502 Don't leave junk in pgs[] array since it will be passed to uvn_findpages()
again.
2002-02-12 01:08:12 +00:00
chs
0365a63944 in genfs_putpages():
- yield the cpu if we've taken too long.
 - when traversing by offset, skip over any pages that we clustered.
2002-01-26 02:44:27 +00:00
chs
03ea276e84 in genfs_gop_write(), actually set the B_ASYNC flag on buffers that we're
not going to wait for.  this doesn't matter for real devices since we call
VOP_STRATEGY() directly, but NFS uses this flag to decide whether or not
to hand the buffer off to an nfsiod thread.
2001-12-31 06:44:58 +00:00
chs
64b0c2adbb in genfs_putpages(), we must wait for any pending write i/os to complete
if the putpages request is synchronous.
2001-12-31 06:40:08 +00:00
chs
40bf5f0e12 add some compatibility routines to allow mmap() to work non-UBCified
filesystems (in the same non-coherent fashion that they worked before).
2001-12-18 07:49:36 +00:00
chs
4d14671458 add VOP_GETPAGES and VOP_PUTPAGES methods for layered filesystems.
drop the interlock on the upper layer, acquire the interlock on the
lower layer.
2001-12-06 04:29:23 +00:00
chs
5a690c92a1 add a VOP_PUTPAGES method for all the filesystems that don't have pages,
just unlock the interlock.
2001-12-06 04:27:40 +00:00
christos
420771d7cc PR/14781: Matthew Fredette: Clamp the number of read-ahead pages to 16 because
other code has this limit. Also while I am here, convert the magic 16 into
a #define constant and use it in the appropriate places. This is a temporary
fix, since all this read-ahead business is XXXUBC anyway.
2001-11-30 15:18:39 +00:00
lukem
2565646230 don't need <sys/types.h> when including <sys/param.h> 2001-11-15 09:47:59 +00:00
lukem
e4b00f433c add RCSIDs 2001-11-10 13:33:40 +00:00
enami
6e46b6ec2c s/genfs_do_putpages/genfs_gop_write/ in uvmhist. 2001-10-03 14:13:08 +00:00
chs
4111c37251 when zeroing pages past EOF, don't zero the page containing EOF if it
already contains valid data.  should fix PRs 13361 and 13436.
2001-09-21 07:52:25 +00:00
chs
5f5ac77eff add a forward decl for struct vm_page. 2001-09-15 22:38:40 +00:00
chs
099a6b5258 interfaces and structures used by new genfs_{get,put}pages(). 2001-09-15 21:33:05 +00:00
chs
64c6d1d2dc a whole bunch of changes to improve performance and robustness under load:
- remove special treatment of pager_map mappings in pmaps.  this is
   required now, since I've removed the globals that expose the address range.
   pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
   no longer any need to special-case it.
 - eliminate struct uvm_vnode by moving its fields into struct vnode.
 - rewrite the pageout path.  the pager is now responsible for handling the
   high-level requests instead of only getting control after a bunch of work
   has already been done on its behalf.  this will allow us to UBCify LFS,
   which needs tighter control over its pages than other filesystems do.
   writing a page to disk no longer requires making it read-only, which
   allows us to write wired pages without causing all kinds of havoc.
 - use a new PG_PAGEOUT flag to indicate that a page should be freed
   on behalf of the pagedaemon when it's unlocked.  this flag is very similar
   to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
   pageout fails due to eg. an indirect-block buffer being locked.
   this allows us to remove the "version" field from struct vm_page,
   and together with shrinking "loan_count" from 32 bits to 16,
   struct vm_page is now 4 bytes smaller.
 - no longer use PG_RELEASED for swap-backed pages.  if the page is busy
   because it's being paged out, we can't release the swap slot to be
   reallocated until that write is complete, but unlike with vnodes we
   don't keep a count of in-progress writes so there's no good way to
   know when the write is done.  instead, when we need to free a busy
   swap-backed page, just sleep until we can get it busy ourselves.
 - implement a fast-path for extending writes which allows us to avoid
   zeroing new pages.  this substantially reduces cpu usage.
 - encapsulate the data used by the genfs code in a struct genfs_node,
   which must be the first element of the filesystem-specific vnode data
   for filesystems which use genfs_{get,put}pages().
 - eliminate many of the UVM pagerops, since they aren't needed anymore
   now that the pager "put" operation is a higher-level operation.
 - enhance the genfs code to allow NFS to use the genfs_{get,put}pages
   instead of a modified copy.
 - clean up struct vnode by removing all the fields that used to be used by
   the vfs_cluster.c code (which we don't use anymore with UBC).
 - remove kmem_object and mb_object since they were useless.
   instead of allocating pages to these objects, we now just allocate
   pages with no object.  such pages are mapped in the kernel until they
   are freed, so we can use the mapping to find the page to free it.
   this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
2001-09-15 20:36:31 +00:00
chs
5a4fdb6ddb make genfs get/put work for block devices as well:
- the "fs bshift" for block devices is always DEV_BSHIFT.
 - retrieve the device vnode from VOP_BMAP() and use that to set b_dev
   in page i/o buffers.
2001-08-17 05:51:29 +00:00
assar
bec71dc090 change vop_symlink and vop_mknod to return vpp (the created node)
refed, so that the caller can actually use it.  update callers and
file systems that implement these vnode operations
2001-07-24 15:39:30 +00:00
chs
766dfc9b6f be sure to allocate dirty zeroed pages to cover blocks we allocate
to resolve a write fault.  fixes PR 13201.
also, be sure to allocate blocks for write faults to holes even if
the page is already in memory.  fixes PR 13189.
2001-06-14 08:22:14 +00:00
wiz
fa87a2091d Typos in comments (misc/13133 by Michael K. Sanders) 2001-06-07 13:32:46 +00:00
chs
45701591c6 add a genfs_mmap() and change all of the disk-based filesystems
to implement VOP_MMAP() with the genfs version, in preparation for
actually using this VOP.
2001-05-28 02:50:51 +00:00
chs
11a9651c8f replace vm_page_t with struct vm_page *. 2001-05-26 21:27:10 +00:00
chs
dd82ad8e2c eliminate the VM_PAGER_* error codes in favor of the traditional E* codes.
the mapping is:

VM_PAGER_OK		        0
VM_PAGER_BAD		        <unused>
VM_PAGER_FAIL		        <unused>
VM_PAGER_PEND		        0 (see below)
VM_PAGER_ERROR		        EIO
VM_PAGER_AGAIN		        EAGAIN
VM_PAGER_UNLOCK		        EBUSY
VM_PAGER_REFAULT	        ERESTART

for async i/o requests, it used to be possible for the request to
be convert to sync, and the pager would return VM_PAGER_OK or VM_PAGER_PEND
to indicate whether the caller should perform post-i/o cleanup.
this is no longer allowed; pagers must now return 0 to indicate that
the async i/o was successfully started, and the caller never needs to
worry about doing the post-i/o cleanup.
2001-03-10 22:46:45 +00:00
chs
667e1805e6 in genfs_getpages(), don't try to optimize zeroing past EOF.
fixes PR 12297.
2001-02-28 02:59:19 +00:00
chs
f87a22a66b distinguish between a file's in-memory EOF (which marks the offset at
which we disallow creation of page cache pages) and its on-disk EOF
(which marks the offset at which there is not (yet) data on disk that
we need to read when creating pages).  for requests with PGO_PASTEOF,
the in-memory EOF maybe be much larger than the on-disk EOF.
2001-02-27 02:57:02 +00:00
chs
1a5818b05e fix a couple more bugs:
- in genfs_getpages(), unbusy any pages that we don't free in the error path.
 - in genfs_putpages(), if we get a bmap error, record that in the master buf.
2001-02-18 15:03:42 +00:00
fvdl
f12c24a45c Oops, removal unintenionally commited debug code. 2001-02-12 19:12:10 +00:00
fvdl
dd32618956 Format arg nit. 2001-02-12 17:41:49 +00:00
chs
8c14e1d2db fix several bugs:
- in the cases where we skip over the i/o loop, increment npages by ridx
   so that when the cleanup code starts processing the pgs array at index 0
   it'll actually process all of the pages.
 - process the PG_RELEASED flag when unbusying pages.
 - add some missing MP locking.
 - use MIN() and MAX() instead of min() and max() since the latter are
   functions which take arguments of type "int" but we call them with
   values of type "off_t", so the values could be truncated.
 - in the PGO_PASTEOF case, use the larger of the current file size and the
   end of the requested range of pages as the file size for this request.
   this fixes some problems with sparsing writes to large offsets.
2001-02-05 12:26:08 +00:00
fvdl
f4ddf5e1b6 Cast lbn to off_t in a few places, to avoid daddr_t overflow and all sorts
of havoc. From Bill Sommerfeld.
2001-01-22 16:39:54 +00:00
chs
68b98ea45f several bugs:
- in genfs_getpages() don't start read-ahead if we get an error on the
   sync read, and always start read-ahead after the range of the sync read
   if we do any at all.
 - off-by-one error in genfs_size().
2000-12-27 04:47:43 +00:00
enami
0088605039 Don't cache a device vnode in a layer node cache once the layer node
is inactivated.  Otherwise, the device won't closed.
2000-12-21 03:51:02 +00:00
chs
f5878a3362 only zero the part of the page after EOF if we're actually
initializing the page.
2000-12-09 22:38:23 +00:00
chs
e9037d16c5 allow building without SOFTDEP by adding the pageiodone hook to bio_ops. 2000-11-27 18:26:38 +00:00
chs
aeda8d3b77 Initial integration of the Unified Buffer Cache project. 2000-11-27 08:39:39 +00:00
fvdl
db4108490a Adapt for VOP_FSYNC parameter change. 2000-09-19 22:01:59 +00:00
thorpej
7cc27a88c0 Convert namei pathname buffer allocation to use the pool allocator. 2000-08-03 20:41:05 +00:00
mycroft
7385963fc9 Stylistic change. 2000-05-29 18:59:51 +00:00
perseant
f0728fdce1 Change the sementics of the last parameter from a boolean ("waitfor") to
a set of flags ("flags").  Two flags are defined, UPDATE_WAIT and
UPDATE_DIROP.

Under the old semantics, VOP_UPDATE would block if waitfor were set,
under the assumption that directory operations should be done
synchronously.  At least LFS and FFS+softdep do not make this
assumption; FFS+softdep got around the problem by enclosing all relevant
calls to VOP_UPDATE in a "if(!DOINGSOFTDEP(vp))", while LFS simply
ignored waitfor, one of the reasons why NFS-serving an LFS filesystem
did not work properly.

Under the new semantics, the UPDATE_DIROP flag is a hint to the
fs-specific update routine that the call comes from a dirop routine, and
should be wait for, or not, accordingly.

Closes PR#8996.
2000-05-13 23:43:06 +00:00
augustss
bd842961d4 Register, begone! 2000-03-30 12:22:12 +00:00
simonb
08312317e7 Delete redundant decl of layer_node_create(), it's in layer_extern.h. 2000-03-30 02:19:16 +00:00
jdolecek
89015c4648 Add new VFS op routine - vfs_done and call it on filesystem detach
in vfs_detach(). vfs_done may free global filesystem's resources,
typically those allocated in respective filesystem's init function.
Needed so those filesystems which went in via LKM have a chance to
clean after themselves before unloading. This fixes random panics
when LKM for filesystem using pools was loaded and unloaded several
times.

For each leaf filesystem, add appropriate vfs_done routine.
2000-03-16 18:08:17 +00:00
soren
95054da1a1 Fix doubled 'the's in comments. 2000-03-13 23:52:25 +00:00
fvdl
0b1963121a Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that is has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O
1999-11-15 18:49:07 +00:00
wrstuden
092a6ee985 Since we don't put layered device nodes in the spechash hash chains,
initialize vp->v_hashchain to NULL.
1999-10-25 23:34:31 +00:00
fvdl
d369311766 Remove some mentioned members in the vop {un}lock args struct that we
do not actually have.
1999-10-23 19:34:50 +00:00
wrstuden
3bf14d81e9 Add support for fcntl(2) to generate VOP_FCNTL calls. Any fcntl
call with F_FSCTL set and F_SETFL calls generate calls to a new
fileop fo_fcntl. Add genfs_fcntl() and soo_fcntl() which return 0
for F_SETFL and EOPNOTSUPP otherwise. Have all leaf filesystems
use genfs_fcntl().

Reviewed by: thorpej
Tested by: wrstuden
1999-08-03 20:19:16 +00:00
wrstuden
a0f2937049 Define VLAYER and make layered fs's set this flag when creating their vnodes.
getnewvnode now checks this bit, and it if's set makes sure a vnode's not
locked before removing it from the free list.

Closes PR 7954 by Alan Barrett <apb@iafrica.com>.
1999-07-15 21:30:31 +00:00
wrstuden
b7f5310486 Fix tyop pointed out by Chuck Silvers <chuq@chuq.com>. 1999-07-12 16:37:03 +00:00
wrstuden
9866514df5 Introduce layer library in genfs. This set of files abstracts most of
the functionality of nullfs. The latter is now just a mount & unmount
routine, and a few tables. umapfs borrow most of this infrastructure.

Both fs's are now nfs-exportable.

All layered fs's share a common format to private mount & private
vnode structs (which a particular fs can extend).

Also add genfs_noerr_rele(), a vnode op which will vrele/vput
operand vnodes appropriately.
1999-07-08 01:18:59 +00:00
mycroft
b174019ccc Pass null pointers to VOP_UPDATE rather than having all the callers fetch the
current time themselves.
1999-03-05 21:09:48 +00:00
kleink
bf1863d17b Add genfs_einval(), which does the obvious thing. 1998-08-13 09:59:52 +00:00
matthias
574106c52b create miscfs/genfs/genfs_vnops.c:genfs_enoioctl and make all the other
filesystems use it instead of a private version.
1998-08-10 08:11:10 +00:00
thorpej
961f0708b1 - Rename nqnfs_vop_lease_check() to genfs_lease_check(). If NFSSERVER is
not in the kernel, genfs_lease_check() is simply a no-op.  This allows
  LKM'd file systems to be exported (previously did not work properly
  due to a compile-time decision based on -DNFSSERVER).
- defopt NFSSERVER
1998-06-25 22:15:28 +00:00
cgd
651b44e211 Rework the way kernel include files are installed. In the new method,
as with user-land programs, include files are installed by each directory
in the tree that has includes to install.  (This allows more flexibility
as to what gets installed, makes 'partial installs' easier, and gives us
more options as to which machines' includes get installed at any given
time.)  The old SYS_INCLUDES={symlinks,copies} behaviours are _both_
still supported, though at least one bug in the 'symlinks' case is
fixed by this change.  Include files can't be build before installation,
so directories that have includes as targets (e.g. dev/pci) have to move
those targets into a different Makefile.
1998-06-12 23:22:30 +00:00
kleink
a32a338757 * Convert fsync vnode operator implementations and usage from the old
waitfor argument and MNT_WAIT/MNT_NOWAIT to flags and FSYNC_WAIT.
* In genfs_fsync(), honor the FSYNC_NODATA flag.
1998-06-05 19:52:59 +00:00
fvdl
e5bc90f40c Merge with Lite2 + local changes 1998-03-01 02:20:01 +00:00
perry
1a80fd799d RCSID Police. 1998-01-05 19:19:41 +00:00
kleink
9c16cd8a46 Implement a POSIX compliant genfs VOP_SEEK() and use it in the appropriate
places; by Chris G. Demetriou and myself.
1997-04-11 21:52:00 +00:00
mycroft
2bc736661a Implement poll(2). 1996-09-07 12:40:22 +00:00
thorpej
2c02b8ec56 Remove some unused variables. 1996-09-05 09:26:14 +00:00
mycroft
c52352c819 Add a set of generic file system operations that most file systems use.
Also, fix some time stamp bogosities.
1996-09-01 23:47:48 +00:00