Commit Graph

258 Commits

Author SHA1 Message Date
dholland cba87bb8e7 Abolish all the silly indirection macros for initializing vnode ops tables.
These are things of the form #define foofs_op genfs_op, or #define
foofs_op genfs_eopnotsupp, or similar. They serve no purpose besides
obfuscation, and have gotten cutpasted all over everywhere.

Part 3; cvs randomly didn't commit all the files the first time, still
hunting down the files it skipped.
2021-07-19 01:33:53 +00:00
dholland d819c3614f Use macros for the canned parts of device and fifo vnode op tables.
Add GENFS_SPECOP_ENTRIES and GENFS_FIFOOP_ENTRIES macros that contain
the portion of the vnode ops table declaration that is
(conservatively) the same in every fs. Use these in every fs that
supports devices and/or fifos with separate ops tables.

Note that ptyfs works differently (it has one type of vnode with
open-coded dispatch to the specfs code, which I haven't changed in
this commit) and rump/librump/rumpvfs/rumpfs.c has an indirect dynamic
dispatch that already does more or less the same thing, which I also
haven't changed.

Also note that this anticipates a few bits in the next changeset here
and there, and adds missing but unreachable calls in some cases (e.g.
most fses weren't defining whiteout on devices and fifos, but it isn't
reachable there), and it changes parsepath on devices and fifos to
genfs_badop from genfs_parsepath (but it's not reachable there
either).

It appears that devices in kernfs were missing kqfilter, so it's
possible that if you try to use kqueue on /kern/rootdev that it'll
explode.

And finally note that the ops declaration tables aren't
order-dependent. (Other than vop_default_desc has to come first.)
Otherwise this wouldn't work.
2021-07-18 23:56:12 +00:00
dholland 37ce85c5f7 Fix perms on /kern/{r,}rootdev. 2021-07-06 03:23:03 +00:00
dholland 285b06fd69 Add missing VOP_KQFILTER to kernfs.
Not sure if lack of it can be used for local DoS or not, but best to
fix.
2021-07-06 03:22:44 +00:00
dholland c6c16cd073 - Add a new vnode op: VOP_PARSEPATH.
- Move namei_getcomponent to genfs_vnops.c and call it genfs_parsepath.
 - Add a parsepath entry to every vnode ops table.

VOP_PARSEPATH takes a directory vnode to be searched and a complete
following path and chooses how much of that path to consume. To begin
with, all parsepath calls are genfs_parsepath, which locates the first
'/' as always.

Note that the call doesn't take the whole struct componentname, only
the string. The other bits of struct componentname should not be
needed and there's no reason to cause potential complications by
exposing them.
2021-06-29 22:34:05 +00:00
chs e3beb37645 VOP_BMAP() may be called via ioctl(FIOGETBMAP) on any vnode that applications
can open.  change various pseudo-fs *_bmap methods return an error instead of
panic.

Reported-by: syzbot+8289a3eaf2ba60958c87@syzkaller.appspotmail.com
2021-06-28 17:52:12 +00:00
christos 79e3c74f8e Introduce genfs_pathconf() and use it for the default case in all filesystems. 2020-06-27 17:29:17 +00:00
bouyer ecb2afc2be Add need-flags for kernfs.
Compile Xen kernfs support only if kernfs is compiled in the kernel.
Should fix MODULAR build.
2020-05-26 10:37:24 +00:00
christos 9aa2a9c323 Add ACL support for FFS. From FreeBSD. 2020-05-16 18:31:45 +00:00
jdolecek 8d1e4d9c00 switch to kmem_zalloc() instead of malloc() for struct kernfs_mount 2020-04-07 08:35:49 +00:00
jdolecek 8a3ee72648 switch KERNFS_ALLOCENTRY() to use kmem_zalloc() instead of malloc() 2020-04-07 08:14:42 +00:00
pgoyette 9120d4511b Use the module subsystem's ability to process SYSCTL_SETUP() entries to
automate installation of sysctl nodes.

Note that there are still a number of device and pseudo-device modules
that create entries tied to individual device units, rather than to the
module itself.  These are not changed.
2020-03-16 21:20:09 +00:00
ad bada6a544d v_interlock -> vmobjlock 2020-02-24 20:44:25 +00:00
riastradh b26fba762e Use specfs vnops for specnodes in kernfs.
While here, don't filter out rootdev and rrootdev merely because
they're not cached.

Fixes the elusive /kern/rootdev and /kern/rrootdev nodes, which only
appeared sometimes when they felt like it, and fixes operations on
/kern/rootdev and /kern/rrootdev always returning EOPNOTSUPP.

We didn't seem to have a single PR for these issues but the following
PRs are all relevant:

PR bin/13564
PR kern/38265
PR kern/38778
PR kern/45974

XXX pullup-9, pullup-8, pullup-7, pullup-6, pullup-5, pullup-4, pullup-3, pullup-2, pullup-1.4T...
2020-02-04 04:19:24 +00:00
ad c2e9cb9413 VFS_VGET(), VFS_ROOT(), VFS_FHTOVP(): give them a "int lktype" argument, to
allow us to get shared locks (or no lock) on the returned vnode.  Matches
FreeBSD.
2020-01-17 20:08:06 +00:00
thorpej d6c967bb85 - Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
  functions (naming mirrors that of other time access functions in kern_tc.c).
  It returns the (maybe-converted) value of timebasebin, which also tracks
  our estimate of when the system was booted (i.e. the legacy "boottime" was
  redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes.  At least now the problem is centralized in one location.
2020-01-02 15:42:26 +00:00
hannken 18d25e92d6 Add missing operation VOP_GETPAGES() returning EFAULT.
Without this operation posix_fadvise(..., POSIX_FADV_WILLNEED)
would leave the v_interlock held.

Observed by maxv@
2019-08-29 06:43:13 +00:00
riastradh d1579b2d70 Rename min/max -> uimin/uimax for better honesty.
These functions are defined on unsigned int.  The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER!  Some subsystems have

	#define min(a, b)	((a) < (b) ? (a) : (b))
	#define max(a, b)	((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX.  Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate.  But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all.  (Who knows, maybe in some cases integer
truncation is actually intended!)
2018-09-03 16:29:22 +00:00
christos 2be0b0c7db factor out some repeated code and simplify the logputchar function. 2018-03-31 23:12:01 +00:00
riastradh 7f7aad09bd Make VOP_RECLAIM do the last unlock of the vnode.
VOP_RECLAIM naturally has exclusive access to the vnode, so having it
locked on entry is not strictly necessary -- but it means if there
are any final operations that must be done on the vnode, such as
ffs_update, requiring exclusive access to it, we can now kassert that
the vnode is locked in those operations.

We can't just have the caller release the last lock because some file
systems don't use genfs_lock, and require the vnode to remain valid
for VOP_UNLOCK to work, notably unionfs.
2017-05-26 14:20:59 +00:00
riastradh 87fb32292e Make VOP_INACTIVE preserve vnode lock on return.
Discussed on tech-kern:
https://mail-index.netbsd.org/tech-kern/2017/04/01/msg021751.html

Ride 7.99.68, a bumpy bus of incremental vfs improvements!
2017-04-11 14:24:59 +00:00
hannken 326db3aaf6 Add generic genfs_suspendctl() and use it for all file systems.
Layered file systems need work.
2017-02-17 08:31:23 +00:00
hannken 7139aab724 Remove now obsolete operation vcache_remove().
Welcome to 7.99.36
2016-08-20 12:37:06 +00:00
riastradh 46e71c7d57 Make VOP_LINK return directory still locked and referenced.
Ride 7.99.10 bump.
2015-04-20 22:59:19 +00:00
uebayasi fe9a32c84e Define filesystem attributes with vfs dependency. 2014-10-11 06:42:18 +00:00
dholland 05d075b3ae Add VOP_FALLOCATE and VOP_FDISCARD to every vnode ops table I can
find.

The filesystem ones all call genfs_eopnotsupp - right now I am only
implementing the plumbing and we can implement fallocate and/or
fdiscard for files later.

The device ones call spec_fallocate (which is also genfs_eopnotsupp)
and spec_fdiscard, which dispatches to the device-level op.

The fifo ones all call vn_fifo_bypass, which also ends up being
EOPNOTSUPP.
2014-07-25 08:20:51 +00:00
hannken 98fe1e89dc Change kernfs from hashlist to vcache. 2014-07-20 13:58:04 +00:00
hannken 73c1268ff3 Remove another KAME IPSEC residue, "struct secasvar" and "struct secpolicy". 2014-07-20 13:17:37 +00:00
hannken 3fd06797df Finish KAME IPSEC removal:
- Remove field kfs_value, it is always zero. Compute the hash from kt_tag.
- Remove stray definitions kernfs_revoke_sa and kernfs_revoke_sp.

While here, remove kfs_type from allocvp(), it is always kt->kt_tag.
2014-07-17 08:21:34 +00:00
christos 79c480ed3b From Ilya Zykov: Unbreak kernfs which was broken by this commit
|Make the spec_node table implementation private to spec_vnops.c.
|To retrieve a spec_node, two new lookup functions (by device or by mount)
|are implemented.  Both return a referenced vnode, for an opened block device
|the opened vnode is returned so further diagnostic checks "vp == ... sd_bdevvp"
|will not fire.  Otherwise any vnode matching the criteria gets returned.
|No objections on tech-kern.

The effect was that ls /kernfs appeared empty in most cases.
2014-04-08 17:56:10 +00:00
hannken 6d285189fb Change all vfsops to use C99 designated initializers.
No functional changes intended.
2014-03-23 15:21:15 +00:00
hannken 2b6ec89863 The current implementation of vn_lock() is racy. Modification of
the vnode operations vector for active vnodes is unsafe because it
is not known whether deadfs or the original file system will be
called.

- Pass down LK_RETRY to the lock operation (hint for deadfs only).

- Change deadfs lock operation to return ENOENT if LK_RETRY is unset.

- Change all other lock operations to check for dead vnode once
  the vnode is locked and unlock and return ENOENT in this case.

With these changes in place vnode lock operations will never succeed
after vclean() has marked the vnode as VI_XLOCK and before vclean()
has changed the operations vector.

Adresses PR kern/37706 (Forced unmount of file systems is unsafe)

Discussed on tech-kern.

Welcome to 6.99.33
2014-02-27 16:51:37 +00:00
pooka 4f6fb3bf35 Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
2014-02-25 18:30:08 +00:00
hannken 97834f7ba0 Change vnode operation lookup to return the resulting vnode *vpp unlocked.
Change cache_lookup() to return an unlocked vnode.

Discussed on tech-kern@

Welcome to 6.99.31
2014-02-07 15:29:20 +00:00
hannken 04c776e5c8 Change vnode operations create, mknod, mkdir and symlink to return
the resulting vnode *vpp unlocked.

Discussed on tech-kern@

Welcome to 6.99.30
2014-01-23 10:13:55 +00:00
hannken 1139274440 Change vnode operations create, mknod, mkdir and symlink to keep the
directory node dvp locked on return.

Discussed on tech-kern@

Welcome to 6.99.29
2014-01-17 10:55:01 +00:00
plunky 5ec364d4d9 C99 section 6.7.2.3 (Tags) Note 3 states that:
A type specifier of the form

	enum identifier

  without an enumerator list shall only appear after the type it
  specifies is complete.

which means that we cannot pass an "enum vtype" argument to
kauth_access_action() without fully specifying the type first.
Unfortunately there is a complicated include file loop which
makes that difficult, so convert this minimal function into a
macro (and capitalize it).

(ok elad@)
2013-03-18 19:35:35 +00:00
drochner 364a06bb29 remove KAME IPSEC, replaced by FAST_IPSEC 2012-03-22 20:34:37 +00:00
elad 0c9d8d15c9 Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

    http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
    http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
    http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.
2012-03-13 18:40:26 +00:00
njoly 973e485533 Start making fs read(2) fail with EISDIR if the implementation does
not allow read on directories (kernfs, rumpfs, ptyfs and sysvbfs).
Adjust man page accordingly, and add a small corresponding vfs
testcase.
2011-12-12 19:11:21 +00:00
christos a3c0501888 define KERNFS_MAXNAMLEN and use it.` 2011-09-27 01:23:05 +00:00
rmind e225b7bd09 Welcome to 5.99.53! Merge rmind-uvmplock branch:
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
  New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
  the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
  Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
  kernel-lock on some ports).  Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
2011-06-12 03:35:36 +00:00
hannken fb62bef947 Make holding v_interlock mandatory for callers of vget().
Announced some time ago on tech-kern.
2010-07-21 17:52:09 +00:00
hannken 1664eae7f3 Using vfinddev() leads to vnode races as it returns an unreferenced
vnode that may disappear before the caller has a chance to reference it.

Reference the vnode while the specfs cache is locked.

Welcome to 5.99.37.

No objections on tech-kern.
2010-07-21 09:06:37 +00:00
hannken 245651a23d Remove vlockmgr(). Generic vnode lock operations now use a rwlock located
in the vnode.  All LK_* flags move from sys/lock.h to sys/vnode.h.  Calls
to vlockmgr() in file systems get replaced with VOP_LOCK() or VOP_UNLOCK().

Welcome to 5.99.34.

Discussed on tech-kern.
2010-07-01 13:00:54 +00:00
hannken 1423e65b26 Clean up vnode lock operations pass 2:
VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
2010-06-24 12:58:48 +00:00
pooka 19599ea184 If msgbuf is not enabled, do not report the node in readdir. That
way ls -l won't report funny errors because getattr for a readdir
result fails.

XXX: lookup for msgbuf still succeeds even if not enabled
2010-03-31 01:27:05 +00:00
pooka a8ae7feaa2 You have found a scroll of genocide --More--
What class of monsters do you wish to genocide? --More--
> fs_foo.h
Wiped out all fs_foo.h
2010-03-03 01:26:01 +00:00
njoly d343885518 Remove unneeded strlen() call in KFShostname case. 2010-01-22 22:46:00 +00:00
pooka c3183f3251 The VATTR_NULL/VREF/VHOLD/HOLDRELE() macros lost their will to live
years ago when the kernel was modified to not alter ABI based on
DIAGNOSTIC, and now just call the respective function interfaces
(in lowercase).  Plenty of mix'n match upper/lowercase has creeped
into the tree since then.  Nuke the macros and convert all callsites
to lowercase.

no functional change
2010-01-08 11:35:07 +00:00