Commit Graph

325 Commits

Author SHA1 Message Date
schmonz fee42fd1d9 NFS client: fix interop with macOS 14 servers.
Symptom: a bunch of "Cannot open `.' (Invalid argument)".

thorpej@ analysis and fix: on the first request to read a given
directory, make sure READDIR and READDIRPLUS cookie verifiers are
being set to 0. This is in RFC1813 and macOS must have gotten
stricter about it.

Verified on 10.0_RC1/aarch64 to fix the reproducers in PR kern/57691 as
well as the original use case in which I met the bug: pkg_rr once again
runs to completion.
2023-12-10 18:16:08 +00:00
andvar 9f4a9600be fix various typos in comments, docs and log messages. 2022-05-24 06:27:59 +00:00
christos ef2dbbad7b restructure so we abort/unlock properly on failure. 2022-03-30 10:52:59 +00:00
christos 6a3c4a6f4f add a kauth vnode check for creating links 2022-03-27 16:24:57 +00:00
thorpej 982ae832c3 Overhaul of the EVFILT_VNODE kevent(2) filter:
- Centralize vnode kevent handling in the VOP_*() wrappers, rather than
  forcing each individual file system to deal with it (except VOP_RENAME(),
  because VOP_RENAME() is a mess and we currently have 2 different ways
  of handling it; at least it's reasonably well-centralized in the "new"
  way).
- Add support for NOTE_OPEN, NOTE_CLOSE, NOTE_CLOSE_WRITE, and NOTE_READ,
  compatible with the same events in FreeBSD.
- Track which kevent notifications clients are interested in receiving
  to avoid doing work for events no one cares about (avoiding, e.g.
  taking locks and traversing the klist to send a NOTE_WRITE when
  someone is merely watching for a file to be deleted, for example).

In support of the above:

- Add support in vnode_if.sh for specifying PRE- and POST-op handlers,
  to be invoked before and after vop_pre() and vop_post(), respectively.
  Basic idea from FreeBSD, but implemented differently.
- Add support in vnode_if.sh for specifying CONTEXT fields in the
  vop_*_args structures.  These context fields are used to convey information
  between the file system VOP function and the VOP wrapper, but do not
  occupy an argument slot in the VOP_*() call itself.  These context fields
  are initialized and subsequently interpreted by PRE- and POST-op handlers.
- Version VOP_REMOVE(), uses the a context field for the file system to report
  back the resulting link count of the target vnode.  Return this in tmpfs,
  udf, nfs, chfs, ext2fs, lfs, and ufs.

NetBSD 9.99.92.
2021-10-20 03:08:16 +00:00
dholland 4171507047 Abolish all the silly indirection macros for initializing vnode ops tables.
These are things of the form #define foofs_op genfs_op, or #define
foofs_op genfs_eopnotsupp, or similar. They serve no purpose besides
obfuscation, and have gotten cutpasted all over everywhere.
2021-07-18 23:57:13 +00:00
dholland d819c3614f Use macros for the canned parts of device and fifo vnode op tables.
Add GENFS_SPECOP_ENTRIES and GENFS_FIFOOP_ENTRIES macros that contain
the portion of the vnode ops table declaration that is
(conservatively) the same in every fs. Use these in every fs that
supports devices and/or fifos with separate ops tables.

Note that ptyfs works differently (it has one type of vnode with
open-coded dispatch to the specfs code, which I haven't changed in
this commit) and rump/librump/rumpvfs/rumpfs.c has an indirect dynamic
dispatch that already does more or less the same thing, which I also
haven't changed.

Also note that this anticipates a few bits in the next changeset here
and there, and adds missing but unreachable calls in some cases (e.g.
most fses weren't defining whiteout on devices and fifos, but it isn't
reachable there), and it changes parsepath on devices and fifos to
genfs_badop from genfs_parsepath (but it's not reachable there
either).

It appears that devices in kernfs were missing kqfilter, so it's
possible that if you try to use kqueue on /kern/rootdev that it'll
explode.

And finally note that the ops declaration tables aren't
order-dependent. (Other than vop_default_desc has to come first.)
Otherwise this wouldn't work.
2021-07-18 23:56:12 +00:00
dholland c6c16cd073 - Add a new vnode op: VOP_PARSEPATH.
- Move namei_getcomponent to genfs_vnops.c and call it genfs_parsepath.
 - Add a parsepath entry to every vnode ops table.

VOP_PARSEPATH takes a directory vnode to be searched and a complete
following path and chooses how much of that path to consume. To begin
with, all parsepath calls are genfs_parsepath, which locates the first
'/' as always.

Note that the call doesn't take the whole struct componentname, only
the string. The other bits of struct componentname should not be
needed and there's no reason to cause potential complications by
exposing them.
2021-06-29 22:34:05 +00:00
riastradh 9fc453562f Round of uvm.h cleanup.
The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
  to query whether curlwp is the pagedaemon, which should maybe be
  exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
  by it.  We should split up uvm_extern.h but this will serve for now
  to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
  UVMHIST(ubchist), since ubchist is declared in uvm.h but the
  reference evaporates if UVMHIST is not defined, so we reduce header
  file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
  here.

ok chs@
2020-09-05 16:30:10 +00:00
christos 79e3c74f8e Introduce genfs_pathconf() and use it for the default case in all filesystems. 2020-06-27 17:29:17 +00:00
christos 9aa2a9c323 Add ACL support for FFS. From FreeBSD. 2020-05-16 18:31:45 +00:00
ad 23bf88000c Replace most uses of vp->v_usecount with a call to vrefcnt(vp), a function
that hides the details and does atomic_load_relaxed().  Signature matches
FreeBSD.
2020-04-13 19:23:17 +00:00
ad d2a0ebb67a UVM locking changes, proposed on tech-kern:
- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart.  v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap.  Others to follow later.
2020-02-23 15:46:38 +00:00
christos 9a1f52751e remove NCHNAMLEN optimization 2019-09-10 23:19:34 +00:00
riastradh d1579b2d70 Rename min/max -> uimin/uimax for better honesty.
These functions are defined on unsigned int.  The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER!  Some subsystems have

	#define min(a, b)	((a) < (b) ? (a) : (b))
	#define max(a, b)	((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX.  Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate.  But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all.  (Who knows, maybe in some cases integer
truncation is actually intended!)
2018-09-03 16:29:22 +00:00
riastradh 6fa7b15833 Change VOP_REMOVE and VOP_RMDIR to preserve lock/ref on dvp.
No change to vp -- the plan is to replace the node by the
componentname in the vop parameters, and let all directory vops do
lookups internally.

Proposed on tech-kern with no objections:
https://mail-index.netbsd.org/tech-kern/2017/04/17/msg021825.html
2017-04-26 03:02:47 +00:00
hannken e49191eb15 Return an error if NFSPROC_LOOKUP returns the file handle of the current
directory.  Treating it as DOT lookup would put garbage into the name
cache and could panic on future lookups.

Seen with ZFS file system exported from OmniOS, an OpenSolaris derivative.

Fixes PR kern/50664 "cd .." over NFS/ZFS can panic kernel
2016-01-19 10:56:59 +00:00
chs 6dafee9c7b in nfs_writerpc(), avoid a signed/unsigned problem in computing the
number of bytes to back up in the uio when we need to resend a write RPC
(eg. after a server crash) on a 64-bit platform.  should fix PR 35448.
2015-05-14 17:35:54 +00:00
riastradh 46e71c7d57 Make VOP_LINK return directory still locked and referenced.
Ride 7.99.10 bump.
2015-04-20 22:59:19 +00:00
dholland 05d075b3ae Add VOP_FALLOCATE and VOP_FDISCARD to every vnode ops table I can
find.

The filesystem ones all call genfs_eopnotsupp - right now I am only
implementing the plumbing and we can implement fallocate and/or
fdiscard for files later.

The device ones call spec_fallocate (which is also genfs_eopnotsupp)
and spec_fdiscard, which dispatches to the device-level op.

The fifo ones all call vn_fifo_bypass, which also ends up being
EOPNOTSUPP.
2014-07-25 08:20:51 +00:00
hannken 0d77f6a387 Use vcache_rekey_* for nfs_lookitup() in the "*npp != NULL" case. 2014-07-05 09:33:41 +00:00
hannken 97834f7ba0 Change vnode operation lookup to return the resulting vnode *vpp unlocked.
Change cache_lookup() to return an unlocked vnode.

Discussed on tech-kern@

Welcome to 6.99.31
2014-02-07 15:29:20 +00:00
hannken 04c776e5c8 Change vnode operations create, mknod, mkdir and symlink to return
the resulting vnode *vpp unlocked.

Discussed on tech-kern@

Welcome to 6.99.30
2014-01-23 10:13:55 +00:00
hannken 1139274440 Change vnode operations create, mknod, mkdir and symlink to keep the
directory node dvp locked on return.

Discussed on tech-kern@

Welcome to 6.99.29
2014-01-17 10:55:01 +00:00
nisimura c33cfaf7a6 add one more __unused attribute to shut gcc4.8 off. 2013-11-15 14:39:53 +00:00
martin 6fd75b6342 Backout wildcard pragma to kill warnings and instead sprinkle a few dozen
__unused attributes.
Requested by joerg@
2013-09-14 22:29:08 +00:00
plunky 5ec364d4d9 C99 section 6.7.2.3 (Tags) Note 3 states that:
A type specifier of the form

	enum identifier

  without an enumerator list shall only appear after the type it
  specifies is complete.

which means that we cannot pass an "enum vtype" argument to
kauth_access_action() without fully specifying the type first.
Unfortunately there is a complicated include file loop which
makes that difficult, so convert this minimal function into a
macro (and capitalize it).

(ok elad@)
2013-03-18 19:35:35 +00:00
macallan 635775b851 fix crash in nfs client lookups, dholland says 'my fault' 2012-11-07 02:31:48 +00:00
dholland 35ed690545 Excise struct componentname from the namecache.
This uglifies the interface, because several operations need to be
passed the namei flags and cache_lookup also needs for the time being
to be passed cnp->cn_nameiop. Nonetheless, it's a net benefit.

The glop should be able to go away eventually but requires structural
cleanup elsewhere first.

This change requires a kernel bump.
2012-11-05 17:27:37 +00:00
dholland 1617a81dd1 Disentangle the namecache from the internals of namei.
- Move the namecache's hash computation to inside the namecache code,
instead of being spread out all over the place. Remove cn_hash from
struct componentname and delete all uses of it.

 - It is no longer necessary (if it ever was) for cache_lookup and
cache_lookup_raw to clear MAKEENTRY from cnp->cn_flags for the cases
that cache_enter already checks for.

 - Rearrange the interface of cache_lookup (and cache_lookup_raw) to
make it somewhat simpler, to exclude certain nonexistent error
conditions, and (most importantly) to make it not require write access
to cnp->cn_flags.

This change requires a kernel bump.
2012-11-05 17:24:09 +00:00
rmind d65753d972 Move some the test for MAKEENTRY into the cache_enter(9). Make some
variables in vfs_cache.c static, __read_mostly, etc.

No objection on tech-kern@.
2012-07-22 00:53:18 +00:00
drochner 937cb76bdb fix access permission check which got broken by some kauth rework
in March, affected mostly systems with NFS root fs
2012-04-27 18:12:01 +00:00
tls f27d6532f5 Remove arc4random() and arc4randbytes() from the kernel API. Replace
arc4random() hacks in rump with stubs that call the host arc4random() to
get numbers that are hopefully actually random (arc4random() keyed with
stack junk is not).  This should fix some of the currently failing anita
tests -- we should no longer generate duplicate "random" MAC addresses in
the test environment.
2011-11-28 08:05:05 +00:00
christos 0d448f6c48 use NFS_MAXPATHLEN instead of MAXPATHLEN 2011-09-27 01:05:08 +00:00
rmind e225b7bd09 Welcome to 5.99.53! Merge rmind-uvmplock branch:
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
  New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
  the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
  Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
  kernel-lock on some ports).  Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
2011-06-12 03:35:36 +00:00
rmind 800683e30d sys_link: prevent hard links on directories (cross-mount operations are
already prevented).  File systems are no longer responsible to check this.
Clean up and add asserts (note that dvp == vp cannot happen in vop_link).

OK dholland@
2011-04-24 21:35:29 +00:00
cegger dce33a3d49 Initialize mutex and cv after sanity checks 2010-12-14 16:58:58 +00:00
cegger c26582e6b3 back out rev. 1.285. The problem I try to hunt down
in PR 42455 is not in the network stack as shown by PR 44206.
2010-12-14 16:25:18 +00:00
dholland 14402d0ff1 Abolish the SAVENAME and HASBUF flags. There is now always a buffer,
so the path in a struct componentname is now always valid during VOP
calls.
2010-11-30 10:43:01 +00:00
dholland d4eb05390d Abolish struct componentname's cn_pnbuf. Use the path buffer in the
pathbuf object passed to namei as work space instead. (For now a pnbuf
pointer appears in struct nameidata, to support certain unclean things
that haven't been fixed yet, but it will be going away in the future.)

This removes the need for the SAVENAME and HASBUF namei flags.
2010-11-30 10:29:57 +00:00
cegger cd68b811ed Add diagnostic check which hits when PR 42455 is reproduced.
Idea from hans@
2010-10-26 11:44:53 +00:00
hannken 1423e65b26 Clean up vnode lock operations pass 2:
VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.
2010-06-24 12:58:48 +00:00
pooka 242bf1c3e7 Stop exposing fifofs internals and leave only fifo_vnodeop_p visible. 2010-03-29 13:11:32 +00:00
pooka c3183f3251 The VATTR_NULL/VREF/VHOLD/HOLDRELE() macros lost their will to live
years ago when the kernel was modified to not alter ABI based on
DIAGNOSTIC, and now just call the respective function interfaces
(in lowercase).  Plenty of mix'n match upper/lowercase has creeped
into the tree since then.  Nuke the macros and convert all callsites
to lowercase.

no functional change
2010-01-08 11:35:07 +00:00
rmind 40cf6f3659 Remove uarea swap-out functionality:
- Addresses the issue described in PR/38828.
- Some simplification in threading and sleepq subsystems.
- Eliminates pmap_collect() and, as a side note, allows pmap optimisations.
- Eliminates XS_CTL_DATA_ONSTACK in scsipi code.
- Avoids few scans on LWP list and thus potentially long holds of proc_lock.
- Cuts ~1.5k lines of code.  Reduces amd64 kernel size by ~4k.
- Removes __SWAP_BROKEN cases.

Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on
acorn26 (thanks to <bjh21>).

Discussed on <tech-kern>, reviewed by <ad>.
2009-10-21 21:11:57 +00:00
apb 72648fe99b Use pid_t, not short, for a pid.
Part of PR 41255 from Kurt Lidl.
2009-07-14 20:59:54 +00:00
elad 870920260d Move the implementation of vaccess() to genfs_can_access(), in line with
the other routines of the same spirit.

Adjust file-system code to use it.

Keep vaccess() for KPI compatibility and to keep element of least
surprise. A "diagnostic" message warning that vaccess() is deprecated will
be printed when it's used (obviously, only in DIAGNOSTIC kernels).

No objections on tech-kern@:

	http://mail-index.netbsd.org/tech-kern/2009/06/21/msg005310.html
2009-06-23 19:36:38 +00:00
yamt 8276e4de1a nfs_lookup: vn_lock the vnode returned by cache_lookup_raw
before feeding it to VOP_GETATTR.  it's necessary because the vnode might
be being cleaned by getcleanvnode.

it's an instance of more general races between vnode reclaim and
unlocked VOPs.  however, this one happens somewhat often because it can be
triggered by getnewvnode rather than revoke.
2009-05-10 05:18:26 +00:00
yamt dd0402b09c restore lines, esp. a vrele() call, which i mistakenly removed
in the previous.  (rev.1.276)
2009-05-10 03:51:43 +00:00
yamt fc99505dcc nfs_lookup: handle the case where the vnode returned cache_lookup_raw is
being reclaimed by another thread.  after recent changes in cache_lookup_raw,
there's a race between cache_lookup_raw/vtryget and getcleanvnode/vclean.
PR/41028.
2009-05-04 05:59:35 +00:00