1348 Commits

Author SHA1 Message Date
christos
05909ab8d5 Set f_namemax during mount time like all the other filesystems so that
it gets the right data in copy_statvfs_info(). Otherwise f_namemax
can end up being 0. To reproduce: unmount the remote filesystem, remount
it, and kill -HUP mountd to refresh exports.
2021-04-02 03:07:54 +00:00
riastradh
9fc453562f Round of uvm.h cleanup.
The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
  to query whether curlwp is the pagedaemon, which should maybe be
  exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
  by it.  We should split up uvm_extern.h but this will serve for now
  to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
  UVMHIST(ubchist), since ubchist is declared in uvm.h but the
  reference evaporates if UVMHIST is not defined, so we reduce header
  file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
  here.

ok chs@
2020-09-05 16:30:10 +00:00
christos
79e3c74f8e Introduce genfs_pathconf() and use it for the default case in all filesystems. 2020-06-27 17:29:17 +00:00
ad
4bfe043955 - Alter the convention for uvm_page_array slightly, so the basic search
parameters can't change part way through a search: move the "uobj" and
  "flags" arguments over to uvm_page_array_init() and store those with the
  array.

- With that, detect when it's not possible to find any more pages in the
  tree with the given search parameters, and avoid repeated tree lookups if
  the caller loops over uvm_page_array_fill_and_peek().
2020-05-25 21:15:10 +00:00
ad
0eaaa024ea Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
2020-05-23 23:42:41 +00:00
ad
ff872804dc Start trying to reduce cache misses on vm_page during fault processing.
- Make PGO_LOCKED getpages imply PGO_NOBUSY and remove the latter.  Mark
  pages busy only when there's actually I/O to do.

- When doing COW on a uvm_object, don't mess with neighbouring pages.  In
  all likelihood they're already entered.

- Don't mess with neighbouring VAs that have existing mappings as replacing
  those mappings with same can be quite costly.

- Don't enqueue pages for neighbour faults unless not enqueued already, and
  don't activate centre pages unless uvmpdpol says it's useful.

Also:

- Make PGO_LOCKED getpages on UAOs work more like vnodes: do gang lookup in
  the radix tree, and don't allocate new pages.

- Fix many assertion failures around faults/loans with tmpfs.
2020-05-17 19:38:16 +00:00
christos
9aa2a9c323 Add ACL support for FFS. From FreeBSD. 2020-05-16 18:31:45 +00:00
hannken
d5cb0dea34 Resolve delayed truncation from nfs_inactive() too.
Should prevent "locking against self" from nfs_unlock().
2020-05-01 08:43:00 +00:00
ad
f5ad84fdb3 PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)
- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
  somewhere.  Use it to decide whether to do direct-mapped copy, rather than
  poking around directly in the vnode in ubc_uiomove(), which is ugly and
  doesn't work for tmpfs.  It would be nicer to contain all this in UVM but
  the filesystem provides the needed locking here (VV_MAPPED) and to
  reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS().  Pass in UBC_ISMAPPED where
  appropriate.
2020-04-23 21:47:07 +00:00
ad
23bf88000c Replace most uses of vp->v_usecount with a call to vrefcnt(vp), a function
that hides the details and does atomic_load_relaxed().  Signature matches
FreeBSD.
2020-04-13 19:23:17 +00:00
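The accessor pattern described in the vrefcnt() entry above can be sketched in user-space C. This is only an illustration: C11 `atomic_load_explicit()` with `memory_order_relaxed` stands in for the kernel's atomic_load_relaxed(), the one-field `struct vnode` is a hypothetical stand-in, and the `int` return type is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's struct vnode; one field only. */
struct vnode {
	atomic_uint	v_usecount;
};

/*
 * Accessor in the spirit of vrefcnt(): callers read the count through
 * one function instead of touching vp->v_usecount directly, so the
 * representation can change later without editing every caller.
 */
static int
vrefcnt(struct vnode *vp)
{
	return (int)atomic_load_explicit(&vp->v_usecount,
	    memory_order_relaxed);
}

/* Usage: the count is only ever observed via the accessor. */
void
vrefcnt_example(void)
{
	struct vnode vn;

	atomic_init(&vn.v_usecount, 2);
	assert(vrefcnt(&vn) == 2);
}
```

A relaxed load gives a snapshot with no ordering guarantees, which suits diagnostics and heuristics; it is not a substitute for holding a reference.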
mlelstv
3679de0323 NFSv2 is limited to 32-bit metadata fields. Prevent larger
metadata values from simply being truncated.

-> clamp filesystem block counts to signed 32bit.
-> clamp file sizes to signed 32bit (*)

Some NFSv2 clients also have problems handling buffer sizes larger
than (signed) 16bit.
-> clamp buffer sizes to signed 16bit for better compatibility.

(*) This can lead to erroneous behaviour for files larger than 2GB
that NFSv2 cannot handle but it is still better than before.
An alternative would be to (partially) reject operations on files
larger than 2GB, but that causes other problems.
2020-04-04 07:07:20 +00:00
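The difference between clamping and truncating in the NFSv2 entry above can be shown with a minimal sketch. The helper names `clamp_to_int32()` and `clamp_to_int16()` are hypothetical and are not taken from the NFS sources; they only illustrate saturating a 64-bit value at the signed 32-bit and signed 16-bit maxima.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: saturate at the largest signed 32-bit value. */
static uint32_t
clamp_to_int32(uint64_t v)
{
	return v > INT32_MAX ? (uint32_t)INT32_MAX : (uint32_t)v;
}

/* Hypothetical helper: saturate at the largest signed 16-bit value. */
static uint32_t
clamp_to_int16(uint64_t v)
{
	return v > INT16_MAX ? (uint32_t)INT16_MAX : (uint32_t)v;
}

void
clamp_example(void)
{
	uint64_t size = UINT64_C(5000000000);	/* a file of about 5 GB */

	/* Plain truncation keeps the low 32 bits: a misleading small size. */
	assert((uint32_t)size == UINT32_C(705032704));
	/* Clamping reports the largest representable size instead. */
	assert(clamp_to_int32(size) == UINT32_C(2147483647));
	/* Values that already fit pass through unchanged. */
	assert(clamp_to_int32(1234) == 1234);
	assert(clamp_to_int16(65536) == 32767);
}
```

A clamped value is still wrong for a file larger than 2GB, as the commit message notes, but it is wrong in a predictable direction.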
ad
1d7848ad43 Process concurrent page faults on individual uvm_objects / vm_amaps in
parallel, where the relevant pages are already in-core.  Proposed on
tech-kern.

Temporarily disabled on MP architectures with __HAVE_UNLOCKED_PMAP until
adjustments are made to their pmaps.
2020-03-22 18:32:41 +00:00
pgoyette
9120d4511b Use the module subsystem's ability to process SYSCTL_SETUP() entries to
automate installation of sysctl nodes.

Note that there are still a number of device and pseudo-device modules
that create entries tied to individual device units, rather than to the
module itself.  These are not changed.
2020-03-16 21:20:09 +00:00
ad
16d4fad635 - Hide the details of SPCF_SHOULDYIELD and related behind a couple of small
functions: preempt_point() and preempt_needed().

- preempt(): if the LWP has exceeded its timeslice in kernel, strip it of
  any priority boost gained earlier from blocking.
2020-03-14 18:08:38 +00:00
mgorny
35f46e0f22 Update NFS errno mapping and add assert for correctness
Add the mapping for errno values missing in nfsrv_v2errmap[].  While
at it, add a compile-time assert to make sure that the array does not
become out-of-date again.
2020-03-08 22:12:42 +00:00
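A compile-time check of the kind described in the errno mapping entry above might look like the following sketch. The enums and the `errmap[]` table are hypothetical stand-ins, not the contents of nfsrv_v2errmap[]; the point is only that adding a local error number without extending the table stops the build.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical local error numbers, standing in for errno values. */
enum local_err { LE_PERM = 1, LE_NOENT, LE_IO, LE_LAST = LE_IO };

/* Hypothetical on-the-wire error codes. */
enum wire_err { WE_PERM = 1, WE_NOENT = 2, WE_IO = 5 };

/* One entry per local error, indexed by (local error - 1). */
static const short errmap[] = {
	WE_PERM,	/* LE_PERM */
	WE_NOENT,	/* LE_NOENT */
	WE_IO,		/* LE_IO */
};

#define ARRAY_LEN(a)	(sizeof(a) / sizeof((a)[0]))

/* Fails at compile time if LE_LAST grows and the table does not. */
_Static_assert(ARRAY_LEN(errmap) == (size_t)LE_LAST,
    "errmap[] is out of date");

/* Map a local error to a wire code; unknown values become WE_IO. */
static short
map_err(int e)
{
	if (e < 1 || e > LE_LAST)
		return WE_IO;
	return errmap[e - 1];
}

void
map_err_example(void)
{
	assert(map_err(LE_NOENT) == WE_NOENT);
	assert(map_err(99) == WE_IO);
}
```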
ad
bf79731039 Tighten up the locking around vp->v_iflag a little more after the recent
split of vmobjlock & v_interlock.
2020-02-27 22:12:53 +00:00
ad
bfc37e9217 v_interlock -> vmobjlock 2020-02-24 20:11:45 +00:00
ad
d2a0ebb67a UVM locking changes, proposed on tech-kern:
- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart.  v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap.  Others to follow later.
2020-02-23 15:46:38 +00:00
ad
c2e9cb9413 VFS_VGET(), VFS_ROOT(), VFS_FHTOVP(): give them a "int lktype" argument, to
allow us to get shared locks (or no lock) on the returned vnode.  Matches
FreeBSD.
2020-01-17 20:08:06 +00:00
ad
05a3457e85 Merge from yamt-pagecache (after much testing):
- Reduce unnecessary page scan in putpages esp. when an object has a ton of
  pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
  precisely in uvm layer.
2020-01-15 17:55:43 +00:00
thorpej
d6c967bb85 - Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
  functions (naming mirrors that of other time access functions in kern_tc.c).
  It returns the (maybe-converted) value of timebasebin, which also tracks
  our estimate of when the system was booted (i.e. the legacy "boottime" was
  redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes.  At least now the problem is centralized in one location.
2020-01-02 15:42:26 +00:00
ad
7d06f3305f Make mntvnode_lock per-mount, and address false sharing of struct mount. 2019-12-22 19:47:34 +00:00
ad
881d12e6f2 Merge from yamt-pagecache:
- do gang lookup of pages using radixtree.
- remove now unused uvm_object::uo_memq and vm_page::listq.queue.
2019-12-15 21:11:34 +00:00
ad
5978ddc663 Break the global uvm_pageqlock into a per-page identity lock and a private
lock for use of the pagedaemon policy code.  Discussed on tech-kern.

PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
2019-12-13 20:10:21 +00:00
msaitoh
c56890eeef s/initalize/initialize/ in comment or printf message. 2019-10-18 04:09:01 +00:00
christos
9a1f52751e remove NCHNAMLEN optimization 2019-09-10 23:19:34 +00:00
kamil
4067fe4673 Appease GCC and initialize arps_ip
Fixes the build: GCC errors out with a maybe-uninitialized warning that
is a false positive.
2019-06-29 17:42:36 +00:00
hannken
3c4b857dd5 Bracket do_sys_renameat() and nfsrv_rename() with fstrans.
The v_mount field for vnodes on the same file system as "from"
is now stable for referenced vnodes.

VFS_RENAMELOCK can no longer use a lock from an unreferenced and
freed "struct mount".
2019-02-20 10:05:20 +00:00
mrg
fbffadb9f8 - add or adjust /* FALLTHROUGH */ where appropriate
- add __unreachable() after functions that can return but won't in
  this case, and thus can't be marked __dead easily
2019-02-03 03:19:25 +00:00
maxv
5b040abec8 Replace M_ALIGN and MH_ALIGN by m_align. 2018-12-22 14:28:56 +00:00
maxv
b1305a6d63 Replace: M_MOVE_PKTHDR -> m_move_pkthdr. No functional change, since the
former is a macro to the latter.
2018-12-22 13:11:37 +00:00
riastradh
d1579b2d70 Rename min/max -> uimin/uimax for better honesty.
These functions are defined on unsigned int.  The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER!  Some subsystems have

	#define min(a, b)	((a) < (b) ? (a) : (b))
	#define max(a, b)	((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX.  Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate.  But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all.  (Who knows, maybe in some cases integer
truncation is actually intended!)
2018-09-03 16:29:22 +00:00
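The truncation hazard behind the uimin/uimax rename above can be reproduced in a few lines. In this sketch `uimin()` is written out locally with the unsigned int signature the message describes, `MIN()` is the usual macro, and the 32-bit width of unsigned int is an assumption that the code checks.

```c
#include <assert.h>
#include <stdint.h>

/* The example relies on unsigned int being 32 bits wide. */
_Static_assert(sizeof(unsigned int) == 4, "example assumes 32-bit unsigned int");

/* Function form: both arguments are converted to unsigned int on entry. */
static unsigned int
uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form: no conversion, but arguments may be evaluated twice. */
#define MIN(a, b)	((a) < (b) ? (a) : (b))

void
uimin_example(void)
{
	uint64_t big = (UINT64_C(1) << 32) + 5;	/* does not fit in 32 bits */

	/* big is silently reduced modulo 2^32 to 5 before the comparison. */
	assert(uimin(big, 100) == 5);
	/* The macro compares at full width, so 100 is the minimum. */
	assert(MIN(big, 100) == 100);
}
```

With the old name, the first call would have read as `min(big, 100)` and looked harmless; the new name at least states which type the comparison is done in.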
msaitoh
61e1eb0d0b - Cleanup for dynamic sysctl:
  - Remove unused *_NAMES macros for sysctl.
  - Remove unused *_MAXID for sysctls.
- Move CTL_MACHDEP sysctl definitions for m68k into m68k/include/cpu.h and
  use them on all m68k machines.
2018-08-22 01:05:21 +00:00
chs
e406c140eb add a genfs method to allow a file system to limit the range of pages
that are given to a single GOP_WRITE() call.  needed by ZFS.
2018-05-28 21:04:37 +00:00
thorpej
e832c294bb Default NFS mounts to using TCP transport instead of UDP.
PR kern/53166
2018-05-17 02:34:31 +00:00
maxv
0039128179 Use M_MOVE_PKTHDR. 2018-05-08 16:47:58 +00:00
hannken
6c7fda3054 nfsrv_readlink: stop attaching a zero-length mbuf for zero length symlinks. 2018-05-03 07:28:43 +00:00
maxv
2679f01cd0 Hum. This should be M_READONLY, not M_ROMAP.
M_ROMAP tells us whether the mbuf storage is mapped on a read-only page.
But an mbuf can still be read-only in the sense that the storage is
shared with other mbufs.
2018-04-26 20:10:44 +00:00
christos
8cb1b0010b PR/53103: Timo Buhrmester: linux emulation of sendto(2) broken
The sockargs refactoring broke it, because sockargs only works with a user
address. Added an argument to sockargs to indicate where the address is
coming from. Welcome to 8.99.14.
2018-03-16 17:25:04 +00:00
riastradh
977eeed81e Use a random opaque cookie, not kva pointer, for nfssvc(2).
(What were they smoking?!)

I suspect most of this is actually dead code that wasn't properly
amputated along with the rest of the gangrene of NFSKERB a decade
ago, but I'm out of time to investigate further.  If someone else
wants to kill NFSSVC_AUTHIN/NFSSVC_AUTHINFAIL and the rest of the
tentacular kerberosity, be my guest.

Noted by Silvio Cesare of InfoSect.
2018-01-25 17:14:36 +00:00
christos
ef140c5bfd PR/40491: From Tobias Ulmer in tech-kern@:
1. Protect the nfs request queue with its own mutex
2. make the nfs_receive queue check for signals so that intr mounts
   can be interrupted.
XXX: pullup-8
2018-01-21 20:36:49 +00:00
maya
18b796d442 Use C99 initializer for filterops
Mostly done with spatch with touchups for indentation

@@
expression a;
identifier b,c,d;
identifier p;
@@
const struct filterops p =
- 	{ a, b, c, d
+ 	{
+ 	.f_isfd = a,
+ 	.f_attach = b,
+ 	.f_detach = c,
+ 	.f_event = d,
};
2017-10-25 08:12:37 +00:00
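The effect of the semantic patch above on a single declaration can be sketched as follows. The field names come from the patch; the field types, the opaque `struct knote`, and the `demo_*` callbacks are assumptions made so the example compiles on its own.

```c
#include <assert.h>

struct knote;	/* opaque here; only pointers to it are used */

/* Field names follow the patch; the field types are assumptions. */
struct filterops {
	int	f_isfd;
	int	(*f_attach)(struct knote *);
	void	(*f_detach)(struct knote *);
	int	(*f_event)(struct knote *, long);
};

/* Hypothetical callbacks, present only so the initializers have values. */
static int  demo_attach(struct knote *kn) { (void)kn; return 0; }
static void demo_detach(struct knote *kn) { (void)kn; }
static int  demo_event(struct knote *kn, long hint) { (void)kn; (void)hint; return 1; }

/* Before: positional, correct only while the field order never changes. */
static const struct filterops demo_filtops_old =
	{ 1, demo_attach, demo_detach, demo_event };

/* After: each value is bound to a named field. */
static const struct filterops demo_filtops_new = {
	.f_isfd = 1,
	.f_attach = demo_attach,
	.f_detach = demo_detach,
	.f_event = demo_event,
};

void
filterops_example(void)
{
	/* Both forms produce the same object today. */
	assert(demo_filtops_old.f_isfd == demo_filtops_new.f_isfd);
	assert(demo_filtops_old.f_attach == demo_filtops_new.f_attach);
	assert(demo_filtops_old.f_detach == demo_filtops_new.f_detach);
	assert(demo_filtops_old.f_event == demo_filtops_new.f_event);
}
```

The designated form keeps working if a field is later added or reordered in the structure, whereas the positional form would silently assign values to the wrong members or fail to compile.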
riastradh
93562e3f53 Eliminate crusty debugging sludge.
We have a mostly sane vnode lifecycle now.  If this needs debugging,
it should be done once at the call site of VOP_RECLAIM.
2017-05-26 14:34:19 +00:00
riastradh
7f7aad09bd Make VOP_RECLAIM do the last unlock of the vnode.
VOP_RECLAIM naturally has exclusive access to the vnode, so having it
locked on entry is not strictly necessary -- but it means if there
are any final operations that must be done on the vnode, such as
ffs_update, requiring exclusive access to it, we can now kassert that
the vnode is locked in those operations.

We can't just have the caller release the last lock because some file
systems don't use genfs_lock, and require the vnode to remain valid
for VOP_UNLOCK to work, notably unionfs.
2017-05-26 14:20:59 +00:00
riastradh
6fa7b15833 Change VOP_REMOVE and VOP_RMDIR to preserve lock/ref on dvp.
No change to vp -- the plan is to replace the node by the
componentname in the vop parameters, and let all directory vops do
lookups internally.

Proposed on tech-kern with no objections:
https://mail-index.netbsd.org/tech-kern/2017/04/17/msg021825.html
2017-04-26 03:02:47 +00:00
hannken
20bb034f5b Remove unused argument "nextp" from vfs_busy() and vfs_unbusy().
Remove argument "keepref" from vfs_unbusy() and add vfs_ref() where needed.
2017-04-17 08:32:00 +00:00
hannken
ebb8f73b4b Add vfs_ref(mp) and vfs_rele(mp) to add or remove a reference to
struct mount.  Rename vfs_destroy(mp) to vfs_rele(mp) and replace
incrementing mp->mnt_refcnt with vfs_ref(mp).
2017-04-17 08:31:01 +00:00
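The reference-counting interface named in the entry above can be sketched in user-space C. Only the names vfs_ref(), vfs_rele(), and mnt_refcnt come from the log; the reduced `struct mount`, the `mnt_destroyed` flag, and the use of C11 atomics are assumptions for illustration. That the last release destroys the mount is inferred from vfs_destroy() having been renamed to vfs_rele().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct mount; only mnt_refcnt is from the log. */
struct mount {
	atomic_uint	mnt_refcnt;
	bool		mnt_destroyed;	/* illustration only */
};

/* Take an additional reference to the mount. */
static void
vfs_ref(struct mount *mp)
{
	atomic_fetch_add(&mp->mnt_refcnt, 1);
}

/* Drop a reference; the last one out tears the mount down. */
static void
vfs_rele(struct mount *mp)
{
	/* atomic_fetch_sub returns the old value: 1 means last reference. */
	if (atomic_fetch_sub(&mp->mnt_refcnt, 1) == 1)
		mp->mnt_destroyed = true;	/* a kernel would free mp here */
}

void
mount_ref_example(void)
{
	struct mount m = { .mnt_destroyed = false };

	atomic_init(&m.mnt_refcnt, 1);	/* the creator's reference */
	vfs_ref(&m);			/* a second holder appears */
	vfs_rele(&m);			/* and goes away again */
	assert(!m.mnt_destroyed);
	vfs_rele(&m);			/* creator drops the last reference */
	assert(m.mnt_destroyed);
}
```

Routing every increment through vfs_ref() instead of open-coded `mp->mnt_refcnt++` keeps the counting policy in one place, which is the stated purpose of the change.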
riastradh
87fb32292e Make VOP_INACTIVE preserve vnode lock on return.
Discussed on tech-kern:
https://mail-index.netbsd.org/tech-kern/2017/04/01/msg021751.html

Ride 7.99.68, a bumpy bus of incremental vfs improvements!
2017-04-11 14:24:59 +00:00
riastradh
30509f8074 KASSERT(mutex_owned(vp->v_interlock)) in vnode iterator selector. 2017-04-01 19:35:56 +00:00
hannken
326db3aaf6 Add generic genfs_suspendctl() and use it for all file systems.
Layered file systems need work.
2017-02-17 08:31:23 +00:00