Commit Graph

1381 Commits

Author SHA1 Message Date
riastradh bfe8d26204 nfs: Simplify assertion. No functional change intended. 2023-04-09 12:33:58 +00:00
riastradh 9bb8bd4829 nfs: Avoid free of uninitialized on bad name size in create, mknod.
XXX These error branches are a nightmare and need to be more
systematically cleaned up.  Even if they are correct now, they are
impossible to audit and extremely fragile in case anyone ever needs
to make other changes to them.

XXX pullup-8
XXX pullup-9
XXX pullup-10
2023-03-23 19:53:01 +00:00
riastradh c6ac016a00 nfs: Use unsigned name lengths so we don't trip over negative ones.
- nfsm_strsiz is only used with uint32_t in callers, but let's not
  leave it as a rake to step on.

- nfsm_srvnamesiz is abused with signed s.  The internal conversion
  to unsigned serves to reject both negative and too-large values in
  such callers.

  XXX Should make all callers use unsigned, rather than flipping back
  and forth between signed and unsigned for name lengths.

XXX pullup-8
XXX pullup-9
XXX pullup-10
2023-03-23 19:52:52 +00:00
riastradh 9b69ffe4d3 nfs: Avoid integer overflow in nfs_namei bounds check.
XXX pullup-8
XXX pullup-9
XXX pullup-10
2023-03-23 19:52:42 +00:00
riastradh bb406b518d nfs: Use unsigned fhlen so we don't trip over negative values.
XXX pullup-8
XXX pullup-9
XXX pullup-10
2023-03-23 19:52:33 +00:00
christos 93c8adad8b PR/57279: Izumi Tsutsui: Fix some {int,long} -> time_t. Still things will
break eventually because parts of the nfs protocol assume time_t will fit
in 32 bits.
2023-03-21 15:47:46 +00:00
mlelstv 873be8eff4 Avoid overflow of nfs_commitsize on machines with > 32GB RAM. 2023-03-17 00:46:35 +00:00
andvar 9c474f3292 s/reqest/request/ in comment. 2022-12-24 15:37:50 +00:00
hannken 218437a441 When partitioning a mbuf chain with m_split() the last mbuf of the returned
tail chain is not necessarily the same as the last mbuf of the initial chain.

Always set "slp->ns_rawend" to the last mbuf of the tail chain to prevent
mbuf leaks and corruption.
2022-12-20 09:40:09 +00:00
knakahara bf4e44ab86 Remove routes on an address removal if the routes referencing to the address. Implemented by ozaki-r@n.o.
A route that has a gateway is on a connected route can be invalid if the
connected route is deleted, i.e., an associated address is removed.
Traditionally NetBSD doesn't sweep such a route on the address removal.  Sending
packets over the route fails with "No route to host".  Also the route holds an
orphan ifaddr as rt_ifa that is destructed say by in_purgeaddr.

If the same address is assgined again in such a state, there can be two
different ifaddr objects with the same address.  Until recently it's not a
big problem because we can send packets anyway.  However after MP-ification
of the network stack, we can't send packets because we strictly check if rt_ifa
(i.e., the (old) ifaddr) is valid.

This change automatically removes such routes on a removal of an associated
address to avoid keeping inconsistent routes.
2022-09-20 02:23:37 +00:00
hannken 2b69930f08 Remove an incorrect assertion.
Just issue a readahead near the end of the vnode and enqueue an async read.
Now let nfs_setattr() truncate the vnode, set its new size and
nfs_vinvalbuf() waits for the pages from the readahead to become unbusy.

The async read gets processed and returns with uio_resid > 0 because there
is a hole and no write after the hole has been pushed yet.  As the vnode
size already got truncated to the new size the KASSERT() incorrectly fires.
2022-06-24 16:50:00 +00:00
andvar 9f4a9600be fix various typos in comments, docs and log messages. 2022-05-24 06:27:59 +00:00
hannken 5f1791352a As VOP_GETATTR() needs a shared lock at least move the preopattr lookup
inside nfs_namei() where we may lock the start directory without violating
the lock order.
2022-04-27 17:38:52 +00:00
christos ef2dbbad7b restructure so we abort/unlock properly on failure. 2022-03-30 10:52:59 +00:00
christos 6a3c4a6f4f add a kauth vnode check for creating links 2022-03-27 16:24:57 +00:00
hannken 0ed40efe13 Revert the hack from the last commit now that VOP_UNLOCK()
no longer may hold v_interlock or vmobjlock.
2022-02-28 08:45:36 +00:00
andvar bee56d9ee9 s/ony/only/ 2022-02-09 21:50:24 +00:00
christos 3b2f8cc103 This is a temporary hack to avoid nfs crashes related to nfs_delaytruncate. 2022-01-14 19:19:34 +00:00
msaitoh 8194593f85 s/runable/runnable/ 2021-12-05 07:35:17 +00:00
andvar 6f8dc1509f fix various typos, mainly in comments, but also in man pages and log messages. 2021-10-21 13:21:53 +00:00
thorpej 982ae832c3 Overhaul of the EVFILT_VNODE kevent(2) filter:
- Centralize vnode kevent handling in the VOP_*() wrappers, rather than
  forcing each individual file system to deal with it (except VOP_RENAME(),
  because VOP_RENAME() is a mess and we currently have 2 different ways
  of handling it; at least it's reasonably well-centralized in the "new"
  way).
- Add support for NOTE_OPEN, NOTE_CLOSE, NOTE_CLOSE_WRITE, and NOTE_READ,
  compatible with the same events in FreeBSD.
- Track which kevent notifications clients are interested in receiving
  to avoid doing work for events no one cares about (avoiding, e.g.
  taking locks and traversing the klist to send a NOTE_WRITE when
  someone is merely watching for a file to be deleted, for example).

In support of the above:

- Add support in vnode_if.sh for specifying PRE- and POST-op handlers,
  to be invoked before and after vop_pre() and vop_post(), respectively.
  Basic idea from FreeBSD, but implemented differently.
- Add support in vnode_if.sh for specifying CONTEXT fields in the
  vop_*_args structures.  These context fields are used to convey information
  between the file system VOP function and the VOP wrapper, but do not
  occupy an argument slot in the VOP_*() call itself.  These context fields
  are initialized and subsequently interpreted by PRE- and POST-op handlers.
- Version VOP_REMOVE(), uses the a context field for the file system to report
  back the resulting link count of the target vnode.  Return this in tmpfs,
  udf, nfs, chfs, ext2fs, lfs, and ufs.

NetBSD 9.99.92.
2021-10-20 03:08:16 +00:00
thorpej bc827931b3 Mark the EVFILT_VNODE filters MP-safe. 2021-10-11 01:49:08 +00:00
thorpej e305e37ace Setting EV_EOF requires modifying kn->kn_flags. However, that relies on
holding the kq_lock of that note's kq.  Rather than exposing this directly,
add new knote_set_eof() and knote_clear_eof() functions that handle the
necessary locking and don't leak as many implementation details to modules.

NetBSD 9.99.91
2021-10-11 01:07:36 +00:00
thorpej 7825206807 Must hold kn->kn_kq->kq_lock to modify kn->kn_flags. 2021-10-10 23:46:22 +00:00
thorpej 12ae65d98c Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89
2021-09-26 01:16:07 +00:00
andvar b780d9b67b fix various typos, mainly in comments. 2021-09-16 20:17:46 +00:00
andvar d4eac28cae s/directry/directory/ 2021-08-12 20:25:26 +00:00
dholland 4171507047 Abolish all the silly indirection macros for initializing vnode ops tables.
These are things of the form #define foofs_op genfs_op, or #define
foofs_op genfs_eopnotsupp, or similar. They serve no purpose besides
obfuscation, and have gotten cutpasted all over everywhere.
2021-07-18 23:57:13 +00:00
dholland d819c3614f Use macros for the canned parts of device and fifo vnode op tables.
Add GENFS_SPECOP_ENTRIES and GENFS_FIFOOP_ENTRIES macros that contain
the portion of the vnode ops table declaration that is
(conservatively) the same in every fs. Use these in every fs that
supports devices and/or fifos with separate ops tables.

Note that ptyfs works differently (it has one type of vnode with
open-coded dispatch to the specfs code, which I haven't changed in
this commit) and rump/librump/rumpvfs/rumpfs.c has an indirect dynamic
dispatch that already does more or less the same thing, which I also
haven't changed.

Also note that this anticipates a few bits in the next changeset here
and there, and adds missing but unreachable calls in some cases (e.g.
most fses weren't defining whiteout on devices and fifos, but it isn't
reachable there), and it changes parsepath on devices and fifos to
genfs_badop from genfs_parsepath (but it's not reachable there
either).

It appears that devices in kernfs were missing kqfilter, so it's
possible that if you try to use kqueue on /kern/rootdev that it'll
explode.

And finally note that the ops declaration tables aren't
order-dependent. (Other than vop_default_desc has to come first.)
Otherwise this wouldn't work.
2021-07-18 23:56:12 +00:00
dholland c6c16cd073 - Add a new vnode op: VOP_PARSEPATH.
- Move namei_getcomponent to genfs_vnops.c and call it genfs_parsepath.
 - Add a parsepath entry to every vnode ops table.

VOP_PARSEPATH takes a directory vnode to be searched and a complete
following path and chooses how much of that path to consume. To begin
with, all parsepath calls are genfs_parsepath, which locates the first
'/' as always.

Note that the call doesn't take the whole struct componentname, only
the string. The other bits of struct componentname should not be
needed and there's no reason to cause potential complications by
exposing them.
2021-06-29 22:34:05 +00:00
mlelstv 387503e686 Don't pretend that files are limited to 1TB on NFSv3. 2021-06-13 10:25:11 +00:00
hannken 5b8c1df03b Add flag/command NFSSVC_REPLACEEXPORTSLIST to nfssvc(2) system call.
Works like NFSSVC_SETEXPORTSLIST but supports "mel_nexports > 1"
and will atomically update the complete exports list for a file system.
2021-06-04 10:44:58 +00:00
simonb 70da67d08d Remove nfs_putpages() prototype; it's not defined anywhere. 2021-05-27 08:58:29 +00:00
christos 05909ab8d5 Set f_namemax during mount time like all the other filesystems so that
it does gets the right data in copy_statvfs_info(). Otherwise f_namemax
can end up being 0. To reproduce: unmount the remote filesystem, remount
it, and kill -HUP mountd to refresh exports.
2021-04-02 03:07:54 +00:00
riastradh 9fc453562f Round of uvm.h cleanup.
The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
  to query whether curlwp is the pagedaemon, which should maybe be
  exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
  by it.  We should split up uvm_extern.h but this will serve for now
  to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
  UVMHIST(ubchist), since ubchist is declared in uvm.h but the
  reference evaporates if UVMHIST is not defined, so we reduce header
  file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
  here.

ok chs@
2020-09-05 16:30:10 +00:00
christos 79e3c74f8e Introduce genfs_pathconf() and use it for the default case in all filesystems. 2020-06-27 17:29:17 +00:00
ad 4bfe043955 - Alter the convention for uvm_page_array slightly, so the basic search
parameters can't change part way through a search: move the "uobj" and
  "flags" arguments over to uvm_page_array_init() and store those with the
  array.

- With that, detect when it's not possible to find any more pages in the
  tree with the given search parameters, and avoid repeated tree lookups if
  the caller loops over uvm_page_array_fill_and_peek().
2020-05-25 21:15:10 +00:00
ad 0eaaa024ea Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
2020-05-23 23:42:41 +00:00
ad ff872804dc Start trying to reduce cache misses on vm_page during fault processing.
- Make PGO_LOCKED getpages imply PGO_NOBUSY and remove the latter.  Mark
  pages busy only when there's actually I/O to do.

- When doing COW on a uvm_object, don't mess with neighbouring pages.  In
  all likelyhood they're already entered.

- Don't mess with neighbouring VAs that have existing mappings as replacing
  those mappings with same can be quite costly.

- Don't enqueue pages for neighbour faults unless not enqueued already, and
  don't activate centre pages unless uvmpdpol says its useful.

Also:

- Make PGO_LOCKED getpages on UAOs work more like vnodes: do gang lookup in
  the radix tree, and don't allocate new pages.

- Fix many assertion failures around faults/loans with tmpfs.
2020-05-17 19:38:16 +00:00
christos 9aa2a9c323 Add ACL support for FFS. From FreeBSD. 2020-05-16 18:31:45 +00:00
hannken d5cb0dea34 Resolve delayed truncation from nfs_inactive() too.
Should prevent "locking against self" from nfs_unlock().
2020-05-01 08:43:00 +00:00
ad f5ad84fdb3 PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)
- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
  somewhere.  Use it to decide whether to do direct-mapped copy, rather than
  poking around directly in the vnode in ubc_uiomove(), which is ugly and
  doesn't work for tmpfs.  It would be nicer to contain all this in UVM but
  the filesystem provides the needed locking here (VV_MAPPED) and to
  reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS().  Pass in UBC_ISMAPPED where
  appropriate.
2020-04-23 21:47:07 +00:00
ad 23bf88000c Replace most uses of vp->v_usecount with a call to vrefcnt(vp), a function
that hides the details and does atomic_load_relaxed().  Signature matches
FreeBSD.
2020-04-13 19:23:17 +00:00
mlelstv 3679de0323 NFSv2 is limited to use only 32bit in metadata. Prevent that larger
metadata values are simply truncated.

-> clamp filesystem block counts to signed 32bit.
-> clamp file sizes to signed 32bit (*)

Some NFSv2 clients also have problems to handle buffer sizes larger
than (signed) 16bit.
-> clamp buffer sizes to signed 16bit for better compatibility.

(*) This can lead to erroneous behaviour for files larger than 2GB
that NFSv2 cannot handle but it is still better than before.
An alternative would be to (partially) reject operations on files
larger than 2GB, but which causes other problems.
2020-04-04 07:07:20 +00:00
ad 1d7848ad43 Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core.  Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.
2020-03-22 18:32:41 +00:00
pgoyette 9120d4511b Use the module subsystem's ability to process SYSCTL_SETUP() entries to
automate installation of sysctl nodes.

Note that there are still a number of device and pseudo-device modules
that create entries tied to individual device units, rather than to the
module itself.  These are not changed.
2020-03-16 21:20:09 +00:00
ad 16d4fad635 - Hide the details of SPCF_SHOULDYIELD and related behind a couple of small
functions: preempt_point() and preempt_needed().

- preempt(): if the LWP has exceeded its timeslice in kernel, strip it of
  any priority boost gained earlier from blocking.
2020-03-14 18:08:38 +00:00
mgorny 35f46e0f22 Update NFS errno mapping and add assert for correctness
Add the mapping for errno values missing in nfsrv_v2errmap[].  While
at it, add a compile-time assert to make sure that the array does not
become out-of-date again.
2020-03-08 22:12:42 +00:00
ad bf79731039 Tighten up the locking around vp->v_iflag a little more after the recent
split of vmobjlock & v_interlock.
2020-02-27 22:12:53 +00:00
ad bfc37e9217 v_interlock -> vmobjlock 2020-02-24 20:11:45 +00:00