Commit Graph

93 Commits

Author SHA1 Message Date
yamt
a748ea88dd merge yamt-vop branch. remove following VOPs.
VOP_BLKATOFF
	VOP_VALLOC
	VOP_BALLOC
	VOP_REALLOCBLKS
	VOP_VFREE
	VOP_TRUNCATE
	VOP_UPDATE
2005-11-02 12:38:58 +00:00
yamt
baee927713 introduce "ufs_ops" and use it for ITIMES. 2005-09-27 06:48:55 +00:00
jmmv
2a3e5eeb7c Apply the NFS exports list rototill patch:
- Remove all NFS related stuff from file system specific code.
- Drop the vfs_checkexp hook and generalize it in the new nfs_check_export
  function, thus removing redundancy from all file systems.
- Move all NFS export-related stuff from kern/vfs_subr.c to the new
  file sys/nfs/nfs_export.c.  The former was becoming large and its code
  is always compiled, regardless of the build options.  Using the latter,
  the code is only compiled in when NFSSERVER is enabled.  While doing this,
  also make some functions in nfs_subs.c conditional to NFSSERVER.
- Add a new command in nfssvc(2), called NFSSVC_SETEXPORTSLIST, that takes a
  path and a set of export entries.  At the moment it can only clear the
  exports list or append entries, one by one, but it is done in a way that
  allows setting the whole set of entries atomically in the future (see the
  comment in mountd_set_exports_list or in doc/TODO).
- Change mountd(8) to use the nfssvc(2) system call instead of mount(2) so
  that it becomes file system agnostic.  In fact, all this whole thing was
  done to remove a 'XXX' block from this utility!
- Change the mount*, newfs and fsck* userland utilities to not deal with NFS
  exports initialization; done internally by the kernel when initializing
  the NFS support for each file system.
- Implement an interface for VFS (called VFS hooks) so that several kernel
  subsystems can run arbitrary code upon receipt of specific VFS events.
  At the moment, this only provides support for unmount and is used to
  destroy NFS exports lists from the file systems being unmounted, though it
  has room for extension.

Thanks go to yamt@, chs@, thorpej@, wrstuden@ and others for their comments
and advice in the development of this patch.
2005-09-23 12:10:31 +00:00
christos
c93a283e5f - access the ffs and ext2fs itimes functions through a pointer, so that
if the filesystem is not compiled in the kernel still links. Probably
  a better solution is to use weak symbols.
- move the filesystem-specific itime macros to the filesystem header files.
2005-09-12 20:23:03 +00:00
xtraeme
23ebf62d26 * Remove __P()
* Use ANSI function declarations on ext2fs and mfs
2005-08-30 22:01:12 +00:00
christos
0b0eb1328b Don't overload MAXNAMLEN, use a separate constant for each filesystem type. 2005-08-23 08:05:13 +00:00
yamt
b7bfe82866 update file timestamps for nfsd loaned-read and mmap.
PR/25279.  discussed on tech-kern@.
2005-07-23 12:18:41 +00:00
yamt
44d128fa8e - constify genfs_ops.
- use member designators.
2005-06-28 09:30:37 +00:00
christos
273df63602 - sprinkle const
- avoid shadow variables.
2005-05-29 21:25:24 +00:00
thorpej
e633e8b61b - Define a VFS_ATTACH() macro that places a reference to a vfsops structure
into the "vfsops" link set.
- Use VFS_ATTACH() where vfsops are declared for individual file systems.
- In vfsinit(), traverse the "vfsops" link set, rather than vfs_list_initial[].
2005-03-29 02:41:05 +00:00
perry
bcfcddbac1 nuke trailing whitespace 2005-02-26 22:31:44 +00:00
ws
5387f7217c Add support for large files (>2GB).
Like Linux, automagically convert old filesystem to use this,
if they are already at revision 1.
For revision 0, just punt (unlike Linux; makes me a bit too nervous.)

There should be an option to fsck_ext2fs to upgrade revision 0 to revision 1.

Reviewd by Manuel (bouyer@).
2005-02-09 23:02:10 +00:00
mycroft
7f1fe4e81f Rearrange some code slightly to avoid uninitialized variable warnings. 2005-01-11 00:19:36 +00:00
mycroft
e72fc6717e Whoops -- move the location of the VOP_OPEN()/VOP_CLOSE(), et al, from
foo_mountfs() to foo_mount(), to match the new mountroot API.
Also, for ext2fs and lfs, copy some restructuring from ffs to allow changing
file system parameters without specifying the device name.
(ntfs could use some more work.)
2005-01-09 09:27:17 +00:00
mycroft
0461b30ac3 Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions.  This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC.  I've been running this for a
long time.
2005-01-09 03:11:48 +00:00
thorpej
1c95472d01 Add the system call and VFS infrastructure for file system extended
attributes.

From FreeBSD.
2005-01-02 16:08:28 +00:00
christos
e98accb116 Put the correct fragment size in struct statvfs. From Kevin Lahey. 2004-11-11 01:32:12 +00:00
thorpej
11afd11faa Add a new VNODE_LOCKDEBUG option, which enables checks in the VOP_*()
calls to ensure that the vnode lock state is as expected when the VOP
call is made.  Modify vnode_if.src to set the expected state according
to the documenting lock table for each VOP.  Modify vnode_if.sh to emit
the checks.

Notes:
- The checks are only performed if the vnode has the VLOCKSWORK bit
  set.  Some file systems (e.g. specfs) don't even bother with vnode
  locks, so of course the checks will fail.
- We can't actually run with VNODE_LOCKDEBUG because there are so many
  vnode locking problems, not the least of which is the "use SHARED for
  VOP_READ()" issue, which screws things up for the entire call chain.

Inspired by similar changes in OpenBSD, but implemented differently.
2004-09-21 03:10:35 +00:00
mycroft
45a21b76f0 Fixing age old cruft:
* Rather than using mnt_maxsymlinklen to indicate that a file systems returns
  d_type fields(!), add a new internal flag, IMNT_DTYPE.

Add 3 new elements to ufsmount:
* um_maxsymlinklen, replaces mnt_maxsymlinklen (which never should have existed
  in the first place).
* um_dirblksiz, which tracks the current directory block size, eliminating the
  FS-specific checks littered throughout the code.  This may be used later to
  make the block size variable.
* um_maxfilesize, which is the maximum file size, possibly adjusted lower due
  to implementation issues.

Sync some bug fixes from FFS into ext2fs, particularly:
* ffs_lookup.c 1.21, 1.28, 1.33, 1.48
* ffs_inode.c 1.43, 1.44, 1.45, 1.66, 1.67
* ffs_vnops.c 1.84, 1.85, 1.86

Clean up some crappy pointer frobnication.
2004-08-15 07:19:54 +00:00
mycroft
bc25b30608 Add a new flag, IN_MODIFY. This is like IN_UPDATE|IN_CHANGE, but unlike
setting those flags, it does not cause the inode to be written in the periodic
sync.  This is used for writes to special files (devices and named pipes) and
FIFOs.

Do not preemptively sync updates to access times and modification times.  They
are now updated in the inode only opportunistically, or when the file or device
is closed.  (Really, it should be delayed beyond close, but this is enough to
help substantially with device nodes.)

And the most amusing part:
Trickle sync was broken on both FFS and ext2fs, in different ways.  In FFS, the
periodic call to VFS_SYNC(MNT_LAZY) was still causing all file data to be
synced.  In ext2fs, it was causing the metadata to *not* be synced.  We now
only call VOP_UPDATE() on the node if we're doing MNT_LAZY.  I've confirmed
that we do in fact trickle correctly now.
2004-08-14 01:08:02 +00:00
pk
a7c40722d8 Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().
2004-07-05 07:28:45 +00:00
hannken
8c21bc6224 Add ffs internal snapshots. Written by Marshall Kirk McKusick for FreeBSD.
- Not enabled by default. Needs kernel option FFS_SNAPSHOT.
- Change parameters of ffs_blkfree.
- Let the copy-on-write functions return an error so spec_strategy
    may fail if the copy-on-write fails.
- Change genfs_*lock*() to use vp->v_vnlock instead of &vp->v_lock.
- Add flag B_METAONLY to VOP_BALLOC to return indirect block buffer.
- Add a function ffs_checkfreefile needed for snapshot creation.
- Add special handling of snapshot files:
    Snapshots may not be opened for writing and the attributes are read-only.
    Use the mtime as the time this snapshot was taken.
    Deny mtime updates for snapshot files.
- Add function transferlockers to transfer any waiting processes from
  one lock to another.
- Add vfsop VFS_SNAPSHOT to take a snapshot and make it accessible through
  a vnode.
- Add snapshot support to ls, fsck_ffs and dump.

Welcome to 2.0F.

Approved by: Jason R. Thorpe <thorpej@netbsd.org>
2004-05-25 14:54:55 +00:00
atatat
53c625655c Sysctl descriptions under vfs subtree 2004-05-25 04:44:43 +00:00
atatat
1d3a6a329e Explicitly call pool_init() (and pool_destroy()) when being built as
an _LKM.

This adds pools to the list of things that lkms must do manually
because they're set up with link sets.  Not that there's anything
wrong with link sets, but that we need to try harder to remember that
lkms are second class citizens.  Of a sort.
2004-05-20 05:39:34 +00:00
wiz
8cd26a6026 Fix typo in error message, reported by Piotr Meyer in PR 25418. 2004-05-02 06:59:20 +00:00
simonb
b5d0e6bf06 Initialise (most) pools from a link set instead of explicit calls
to pool_init.  Untouched pools are ones that either in arch-specific
code, or aren't initialiased during initial system startup.

 Convert struct session, ucred and lockf to pools.
2004-04-25 16:42:40 +00:00
christos
6bd1d6d4db Replace the statfs() family of system calls with statvfs().
Retain binary compatibility.
2004-04-21 01:05:31 +00:00
atatat
19af35fd0d Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
2004-03-24 15:34:46 +00:00
bouyer
35decc6ed6 Fix disclaimer in my copyright. Pointed out by Thomas Klausner. 2004-03-22 19:23:08 +00:00
atatat
13f8d2ce5f Dynamic sysctl.
Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al.  Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.

Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded.  Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.

All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.

PS - I'm sorry, but there's a distinct lack of documentation at the
moment.  I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
2003-12-04 19:38:21 +00:00
dbj
fe7c786886 add mnt_iflag field to struct mount for internal flags
mv MNT_GONE, MNT_UNMOUNT and MNT_WANTRDWR to this field
additonally add mnt_writeopcountupper and mnt_writeopcountlower fields
in preparation for pending write suspension support work
bump kernel version to 1.6ZD
2003-10-14 14:02:56 +00:00
bouyer
7b066791c8 Remove references to University of California from my copyright notices. 2003-10-05 17:48:49 +00:00
agc
aad01611e7 Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22364, verified by myself.
2003-08-07 16:26:28 +00:00
fvdl
d5aece61d6 Back out the lwp/ktrace changes. They contained a lot of colateral damage,
and need to be examined and discussed more.
2003-06-29 22:28:00 +00:00
thorpej
a06b275edc Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget().  Turns out
  that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
  and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
  above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
  just to appease the above.
2003-06-29 18:43:21 +00:00
darrenr
960df3c8d1 Pass lwp pointers throughtout the kernel, as required, so that the lwpid can
be inserted into ktrace records.  The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
2003-06-28 14:20:43 +00:00
christos
80ecd573c0 PR/1796: John Kohl: statfs misbehaves under chrooted environments.
- Under chroot it displays only the visible filesystems with appropriate paths.
- The statfs f_mntonname gets adjusted to contain the real path from root.
- While was there, fixed a bug in ext2fs, locking problems with vfs_getfsstat(),
  and factored out some of the vfsop statfs() code to copy_statfs_info(). This
  fixes the problem where some filesystems forgot to set fsid.
- Made coda look more like a normal fs.
2003-04-16 21:44:18 +00:00
fvdl
85d5aca005 Actually get an ext2fs_dinode structure from the pool before using it. 2003-04-05 14:01:56 +00:00
fvdl
42614ed3f3 Add support for UFS2. UFS2 is an enhanced FFS, adding support for
64 bit block pointers, extended attribute storage, and a few
other things.

This commit does not yet include the code to manipulate the extended
storage (for e.g. ACLs), this will be done later.

Originally written by Kirk McKusick and Network Associates Laboratories for
FreeBSD.
2003-04-02 10:39:19 +00:00
dsl
bd99e3429d Use 'void *' instead of 'caddr_t' in prototypes of VOP_IOCTL, VOP_FCNTL
and VOP_ADVLOCK, delete casts from callers (and some to copyin/out).
2003-03-21 23:11:19 +00:00
fvdl
a3ff3a3038 Bump daddr_t to 64 bits. Replace it with int32_t in all places where
it was used on-disk, so that on-disk formats remain the same.
Remove ufs_daddr_t and ufs_lbn_t for the time being.
2003-01-24 21:55:02 +00:00
christos
6f3945a88d MNT_GETARGS support 2002-09-21 18:10:34 +00:00
gehenna
77a6b82b27 Merge the gehenna-devsw branch into the trunk.
This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches is defined as a constant structure in device drivers.

- The new grammer ``device-major'' is introduced to ``files''.

	device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
  by using this grammer.

- Added the new naming convention.
  The name of the device switch must be <prefix>_[bc]devsw for auto-generation
  of device switch tables.

- The backward compatibility of loading block/character device
  switch by LKM framework is broken. This is necessary to convert
  from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
  We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
  the LKM framework will refer it to assign device major number dynamically.
2002-09-06 13:18:43 +00:00
soren
178d83d503 Die, qaddr_t, die! - mnt_data in struct mount is already effectively
a void *, so stop pretending otherwise.
2002-07-30 07:40:07 +00:00
thorpej
a180cee23b Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map).  Try to deal with this:

* Group all information about the backend allocator for a pool in a
  separate structure.  The pool references this structure, rather than
  the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
  to become available, but will still fail if it cannot callocate KVA
  space for the pages.  If this happens, carefully drain all pools using
  the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
  some pages, and use that information to make draining easier and more
  efficient.
* Get rid of PR_URGENT.  There was only one use of it, and it could be
  dealt with by the caller.

From art@openbsd.org.
2002-03-08 20:48:27 +00:00
lukem
ec6245465a add RCSID 2001-11-08 02:39:06 +00:00
simonb
915441c64b Use the sector size from the partition info, not a hard-coded value. 2001-11-06 07:09:39 +00:00
simonb
ecf09baab6 Remove a variable that is set but never used. 2001-11-06 07:08:09 +00:00
chs
64c6d1d2dc a whole bunch of changes to improve performance and robustness under load:
- remove special treatment of pager_map mappings in pmaps.  this is
   required now, since I've removed the globals that expose the address range.
   pager_map now uses pmap_kenter_pa() instead of pmap_enter(), so there's
   no longer any need to special-case it.
 - eliminate struct uvm_vnode by moving its fields into struct vnode.
 - rewrite the pageout path.  the pager is now responsible for handling the
   high-level requests instead of only getting control after a bunch of work
   has already been done on its behalf.  this will allow us to UBCify LFS,
   which needs tighter control over its pages than other filesystems do.
   writing a page to disk no longer requires making it read-only, which
   allows us to write wired pages without causing all kinds of havoc.
 - use a new PG_PAGEOUT flag to indicate that a page should be freed
   on behalf of the pagedaemon when it's unlocked.  this flag is very similar
   to PG_RELEASED, but unlike PG_RELEASED, PG_PAGEOUT can be cleared if the
   pageout fails due to eg. an indirect-block buffer being locked.
   this allows us to remove the "version" field from struct vm_page,
   and together with shrinking "loan_count" from 32 bits to 16,
   struct vm_page is now 4 bytes smaller.
 - no longer use PG_RELEASED for swap-backed pages.  if the page is busy
   because it's being paged out, we can't release the swap slot to be
   reallocated until that write is complete, but unlike with vnodes we
   don't keep a count of in-progress writes so there's no good way to
   know when the write is done.  instead, when we need to free a busy
   swap-backed page, just sleep until we can get it busy ourselves.
 - implement a fast-path for extending writes which allows us to avoid
   zeroing new pages.  this substantially reduces cpu usage.
 - encapsulate the data used by the genfs code in a struct genfs_node,
   which must be the first element of the filesystem-specific vnode data
   for filesystems which use genfs_{get,put}pages().
 - eliminate many of the UVM pagerops, since they aren't needed anymore
   now that the pager "put" operation is a higher-level operation.
 - enhance the genfs code to allow NFS to use the genfs_{get,put}pages
   instead of a modified copy.
 - clean up struct vnode by removing all the fields that used to be used by
   the vfs_cluster.c code (which we don't use anymore with UBC).
 - remove kmem_object and mb_object since they were useless.
   instead of allocating pages to these objects, we now just allocate
   pages with no object.  such pages are mapped in the kernel until they
   are freed, so we can use the mapping to find the page to free it.
   this allows us to remove splvm() protection in several places.

The sum of all these changes improves write throughput on my
decstation 5000/200 to within 1% of the rate of NetBSD 1.5
and reduces the elapsed time for "make release" of a NetBSD 1.5
source tree on my 128MB pc to 10% less than a 1.5 kernel took.
2001-09-15 20:36:31 +00:00
chs
adf5d360a7 add a new VFS op, vfs_reinit, which is called when desiredvnodes is
adjusted via sysctl.  file systems that have hash tables which are
sized based on the value of this variable now resize those hash tables
using the new value.  the max number of FFS softdeps is also recalculated.

convert various file systems to use the <sys/queue.h> macros for
their hash tables.
2001-09-15 16:12:54 +00:00