Commit Graph

836 Commits

Author SHA1 Message Date
andvar 1cd43426d5 Fix various typos in comments, log messages and documentation. 2024-02-10 18:43:51 +00:00
andvar 8bc05faa38 s/defaut/default/ in comments. 2023-08-24 14:56:03 +00:00
mrg cab2d18424 don't assign struct pointers to smaller then structure regions of memory.
in all cases here, the later parts of the structure are not actually
accessed, so there are no existing bugs here beyond general UB.  for the
ufs ones, this also removes some casts.

found by GCC 12.
2023-08-10 20:49:19 +00:00
riastradh fb591c4d97 ufs: Nix trailing whitespace and tidy up some other minor KNF. 2023-02-22 21:49:45 +00:00
andvar 59f23108d5 s/isses/issues/ 2023-01-30 13:45:26 +00:00
chs 87ba0e2a31 Restore backward compatibility of UFS2 with previous NetBSD releases by
disabling support in UFS2 for extended attributes (including ACLs).
Add a new variant of UFS2 called "UFS2ea" that does support extended attributes.
Add new	fsck_ffs operations "-c	ea" and	"-c no-ea" to convert file systems
from UFS2 to UFS2ea and	vice-versa (both of which delete all existing extended
attributes in the process).
2022-11-17 06:40:38 +00:00
simonb 03d50941f1 If UFS or LFS dirhash is enabled in the kernel, set the dirhash cache
size dependant on memory size.  If less than 128MB of memory, default
to no cache.  With 128MB of memory or more, use a maximum cache size of
1/64th of memory; cap maximum default cache size to 32MB (for systems
with 2GB of memory or more).

The dirhash cache sizes are still explicityly setable by sysctl(8) or
by adding relevant entry(s) to sysctl.conf(5).
2022-08-07 02:33:47 +00:00
andvar 6478b40555 s/blity/bility/ in various words, mainly in comments. 2022-08-06 18:26:41 +00:00
andvar 22d402f681 s/grabing/grabbing/ in comments. 2022-05-28 22:08:46 +00:00
hannken 9cac7aaf66 Keep flag "UFS_QUOTA" set until the last quota is closed.
Prevents a live lock when dqrele() finds a struct with "dq_cnt == 1"
and flag "DQ_MOD" and cannot sync as flag UFS_QUOTA is unset.
2022-04-26 15:37:25 +00:00
christos 6a3c4a6f4f add a kauth vnode check for creating links 2022-03-27 16:24:57 +00:00
andvar a41f6947a1 fix few typos for word "previous(ly)" in comments. 2022-03-23 13:06:06 +00:00
hannken 371e1d9d1b Fix wrong assertion, the negatiopn of "a && b" is "!a || !b" so we
need "DIP(ip, blocks) != 0" here.

Should fix PR kern/56725 (Panic when ls directory with device nodes
on an older ffs)
2022-02-21 17:07:45 +00:00
christos 1f3957b4a3 use MNT_NFS4ACLS instead of MNT_ACLS (which was changed before to mean
MNT_POSIX1EACLS)
2021-11-26 17:35:12 +00:00
thorpej 982ae832c3 Overhaul of the EVFILT_VNODE kevent(2) filter:
- Centralize vnode kevent handling in the VOP_*() wrappers, rather than
  forcing each individual file system to deal with it (except VOP_RENAME(),
  because VOP_RENAME() is a mess and we currently have 2 different ways
  of handling it; at least it's reasonably well-centralized in the "new"
  way).
- Add support for NOTE_OPEN, NOTE_CLOSE, NOTE_CLOSE_WRITE, and NOTE_READ,
  compatible with the same events in FreeBSD.
- Track which kevent notifications clients are interested in receiving
  to avoid doing work for events no one cares about (avoiding, e.g.
  taking locks and traversing the klist to send a NOTE_WRITE when
  someone is merely watching for a file to be deleted, for example).

In support of the above:

- Add support in vnode_if.sh for specifying PRE- and POST-op handlers,
  to be invoked before and after vop_pre() and vop_post(), respectively.
  Basic idea from FreeBSD, but implemented differently.
- Add support in vnode_if.sh for specifying CONTEXT fields in the
  vop_*_args structures.  These context fields are used to convey information
  between the file system VOP function and the VOP wrapper, but do not
  occupy an argument slot in the VOP_*() call itself.  These context fields
  are initialized and subsequently interpreted by PRE- and POST-op handlers.
- Version VOP_REMOVE(), uses the a context field for the file system to report
  back the resulting link count of the target vnode.  Return this in tmpfs,
  udf, nfs, chfs, ext2fs, lfs, and ufs.

NetBSD 9.99.92.
2021-10-20 03:08:16 +00:00
andvar 279d5541d3 fix typos in comments. 2021-10-15 22:32:28 +00:00
thorpej db54703ccf Use VN_KNOTE() to send our NOTE_ATTRIB and NOTE_REVOKE events. 2021-10-10 23:02:10 +00:00
andvar 60f3bf6a63 s/memry/memory+s/softare/software/+s/grapics/graphics+s/ouput/output 2021-08-19 20:56:36 +00:00
dholland 4171507047 Abolish all the silly indirection macros for initializing vnode ops tables.
These are things of the form #define foofs_op genfs_op, or #define
foofs_op genfs_eopnotsupp, or similar. They serve no purpose besides
obfuscation, and have gotten cutpasted all over everywhere.
2021-07-18 23:57:13 +00:00
dholland 723d09ce8e Add containment for the cloning devices hack in vn_open.
Cloning devices (and also things like /dev/stderr) work by allocating
a struct file, stuffing it in the file table (which is a layer
violation), stuffing the file descriptor number for it in a magic
field of struct lwp (which is gross), and then "failing" with one of
two magic errnos, EDUPFD or EMOVEFD.

Before this commit, all callers of vn_open in the kernel (there are
quite a few) were expected to check for these errors and handle the
situation. Needless to say, none of them except for open() itself did,
resulting in internal negative errnos being returned to userspace.

This hack is fairly deeply rooted and cannot be eliminated all at
once. This commit adds logic to handle the magic errnos inside
vn_open; now on success vn_open returns either a vnode or an integer
file descriptor, along with a flag that says whether the underlying
code requested EDUPFD or EMOVEFD. Callers not prepared to cope with
file descriptors can pass NULL for the extra return values, in which
case if a file descriptor would be produced vn_open fails with
EOPNOTSUPP.

Since I'm rearranging vn_open's signature anyway, stop exposing struct
nameidata. Instead, take three arguments: an optional vnode to use as
the starting point (like openat()), the path, and additional namei
flags to use, restricted to NOCHROOT and TRYEMULROOT. (Other namei
behavior, e.g. NOFOLLOW, can be requested via the open flags.)

This change requires a kernel bump. Ride the one an hour ago.
(That was supposed to be coordinated; did not intend to let an hour
slip by. My fault.)
2021-06-29 22:40:53 +00:00
nia 479d6bd7ae Avoid potentially accessing an array with an index out of range.
Reported-by: syzbot+8832f540234b996bc5a9@syzkaller.appspotmail.com
Reported-by: syzbot+0b785dd10d987350ecb3@syzkaller.appspotmail.com
2020-12-25 10:00:40 +00:00
thorpej 33e6765fa8 Remove unnecessary inclusion of <sys/timevar.h>. 2020-12-05 17:33:53 +00:00
riastradh 9fc453562f Round of uvm.h cleanup.
The poorly named uvm.h is generally supposed to be for uvm-internal
users only.

- Narrow it to files that actually need it -- mostly files that need
  to query whether curlwp is the pagedaemon, which should maybe be
  exposed by an external header.

- Use uvm_extern.h where feasible and uvm_*.h for things not exposed
  by it.  We should split up uvm_extern.h but this will serve for now
  to reduce the uvm.h dependencies.

- Use uvm_stat.h and #ifdef UVMHIST uvm.h for files that use
  UVMHIST(ubchist), since ubchist is declared in uvm.h but the
  reference evaporates if UVMHIST is not defined, so we reduce header
  file dependencies.

- Make uvm_device.h and uvm_swap.h independently includable while
  here.

ok chs@
2020-09-05 16:30:10 +00:00
riastradh 13a2b7334d Revert "ufs: Prevent mkdir from choking on deleted directories."
This change made no sense and should not have been committed.
2020-09-05 02:55:38 +00:00
riastradh 4b73975959 ufs: Prevent mkdir from choking on deleted directories.
Fix some missing uvm_vnp_setsize in screw cases while here.
2020-09-05 02:47:48 +00:00
christos 45debbd051 Don't cache id's for vnodes that have ACLs. ok chs@ 2020-08-20 20:28:13 +00:00
chs 2b91209889 pull in a bit more FreeBSD code to allow specifying truncation of
the regular bmap (IO_NORMAL) independently of the extattr bmap (IO_EXT).
fixes fs corruption when removing extattrs in UFS2.
2020-07-26 00:21:24 +00:00
hannken 3e262909c5 Assert ufs_strategy() always gets used while current thread
holds a fstrans lock.
2020-05-18 08:28:44 +00:00
christos 9aa2a9c323 Add ACL support for FFS. From FreeBSD. 2020-05-16 18:31:45 +00:00
ad d6844897f1 cache_enter_id(): give it a boolean parameter to indicate whether the cached
identity is valid.
2020-05-12 23:17:41 +00:00
hannken 58ef9101ad There is no difference between a zero-sized and not yet
reclaimed directory vnode and a non-existent vnode.

Teach ufs_fhtovp() to treat zero-sized directories as stale.

PR kern/55211 (fs/vfs/t_vnops:nfs_dir_rmdirdotdot test fails)
2020-05-01 08:43:37 +00:00
ad f5ad84fdb3 PR kern/54759 (vm.ubc_direct deadlock when read()/write() into mapping of itself)
- Add new flag UBC_ISMAPPED which tells ubc_uiomove() the object is mmap()ed
  somewhere.  Use it to decide whether to do direct-mapped copy, rather than
  poking around directly in the vnode in ubc_uiomove(), which is ugly and
  doesn't work for tmpfs.  It would be nicer to contain all this in UVM but
  the filesystem provides the needed locking here (VV_MAPPED) and to
  reinvent that would suck more.

- Rename UBC_UNMAP_FLAG() to UBC_VNODE_FLAGS().  Pass in UBC_ISMAPPED where
  appropriate.
2020-04-23 21:47:07 +00:00
christos 9b3a9b6911 handle negative small block numbers for extattr 2020-04-20 03:57:02 +00:00
christos 0c792cd1aa Extended attribute support for ffsv2, from FreeBSD. 2020-04-18 19:18:33 +00:00
ad 23bf88000c Replace most uses of vp->v_usecount with a call to vrefcnt(vp), a function
that hides the details and does atomic_load_relaxed().  Signature matches
FreeBSD.
2020-04-13 19:23:17 +00:00
jdolecek 09a46b6e04 remove noncompilable WAPBL_DEBUG_INODES
PR kern/49554 by Thomas Klausner
2020-04-11 17:43:54 +00:00
ad c90f9c8c81 Merge the remaining changes from the ad-namecache branch, affecting namei()
and getcwd():

- push vnode locking back as far as possible.
- do most lookups directly in the namecache, avoiding vnode locks & refs.
- don't block new refs to vnodes across VOP_INACTIVE().
- get shared locks for VOP_LOOKUP() if the file system supports it.
- correct lock types for VOP_ACCESS() / VOP_GETATTR() in a few places.

Possible future enhancements:

- make the lookups lockless.
- support dotdot lookups by being lockless and inferring absence of chroot.
- maybe make it work for layered file systems.
- avoid vnode references at the root & cwd.
2020-04-04 20:49:30 +00:00
pgoyette 9120d4511b Use the module subsystem's ability to process SYSCTL_SETUP() entries to
automate installation of sysctl nodes.

Note that there are still a number of device and pseudo-device modules
that create entries tied to individual device units, rather than to the
module itself.  These are not changed.
2020-03-16 21:20:09 +00:00
ad 16d4fad635 - Hide the details of SPCF_SHOULDYIELD and related behind a couple of small
functions: preempt_point() and preempt_needed().

- preempt(): if the LWP has exceeded its timeslice in kernel, strip it of
  any priority boost gained earlier from blocking.
2020-03-14 18:08:38 +00:00
chs 7bc1eaaf87 in ufsdirhash_free(), only examine dh->dh_onlist after taking the
dirhashlist lock.  if we skip the lock then we might see that
dh_onlist is zero while ufsdirhash_recycle() is still working on
the dirhash.  the symptom I saw was that ufsdirhash_free() would
try to destroy the dh_lock mutex while it was still held.
2020-03-08 00:23:59 +00:00
riastradh df1f0e7940 Revert "Include opt_diagnostic.h for DIAGNOSTIC."
This did not do what I thought it did.  opt_diagnostic.h is only for
the unused _DIAGNOSTIC, which seems like an abortive attempt to
incrementally convert DIAGNOSTIC to an opt_*.h option rather than a
command-line option.
2020-03-05 15:18:54 +00:00
riastradh 57c472be02 Include opt_diagnostic.h for DIAGNOSTIC.
...at least, in header files, which may not have already included
libkern.h.
2020-03-05 08:08:32 +00:00
maxv b1f6fd939e Zero out the padding in 'd_namlen', to prevent info leaks. Same logic as
ufs_makedirentry().

Found by kMSan: the unzeroed bytes of the pool_cache were getting copied
to the disk via a DMA write operation, and there kMSan was noticing
uninitialized memory leaving the system.

Reported-by: syzbot+382c9dffc06a9683abb5@syzkaller.appspotmail.com
2020-02-26 18:00:12 +00:00
ad d2a0ebb67a UVM locking changes, proposed on tech-kern:
- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart.  v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap.  Others to follow later.
2020-02-23 15:46:38 +00:00
ad c2e9cb9413 VFS_VGET(), VFS_ROOT(), VFS_FHTOVP(): give them a "int lktype" argument, to
allow us to get shared locks (or no lock) on the returned vnode.  Matches
FreeBSD.
2020-01-17 20:08:06 +00:00
ad 05a3457e85 Merge from yamt-pagecache (after much testing):
- Reduce unnecessary page scan in putpages esp. when an object has a ton of
  pages cached but only a few of them are dirty.

- Reduce the number of pmap operations by tracking page dirtiness more
  precisely in uvm layer.
2020-01-15 17:55:43 +00:00
ad 94843b1390 - Add and use wrapper functions that take and acquire page interlocks, and pairs
of page interlocks.  Require that the page interlock be held over calls to
  uvm_pageactivate(), uvm_pagewire() and similar.

- Solve the concurrency problem with page replacement state.  Rather than
  updating the global state synchronously, set an intended state on
  individual pages (active, inactive, enqueued, dequeued) while holding the
  page interlock.  After the interlock is released put the pages on a 128
  entry per-CPU queue for their state changes to be made real in batch.
  This results in in a ~400 fold decrease in contention on my test system.
  Proposed on tech-kern but modified to use the page interlock rather than
  atomics to synchronise as it's much easier to maintain that way, and
  cheaper.
2019-12-31 22:42:50 +00:00
msaitoh a5effc3ce9 s/inital/initial/ 2019-12-27 09:25:57 +00:00
ad 7d06f3305f Make mntvnode_lock per-mount, and address false sharing of struct mount. 2019-12-22 19:47:34 +00:00
ad 5978ddc663 Break the global uvm_pageqlock into a per-page identity lock and a private
lock for use of the pagedaemon policy code.  Discussed on tech-kern.

PR kern/54209: NetBSD 8 large memory performance extremely low
PR kern/54210: NetBSD-8 processes presumably not exiting
PR kern/54727: writing a large file causes unreasonable system behaviour
2019-12-13 20:10:21 +00:00