I'm leaving in the conditional around the legacy membar_enters
(store-before-load, store-before-store) in kern_mutex.c and in
kern_lock.c because they may still matter: store-before-load barriers
tend to be the most expensive kind, so eliding them is probably
worthwhile on x86. (It also may not matter; I just don't care to do
measurements right now, and it's a single valid and potentially
justifiable use case in the whole tree.)
However, membar_release/acquire can be mere instruction barriers on
all TSO platforms including x86, so there's no need to go out of our
way with a bad API to conditionalize them. If the procedure call
overhead is measurable, we could just change them to macros on x86
that expand to __insn_barrier.
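For concreteness, such a macro might look like this (a sketch only: the
conditional is hypothetical, while __insn_barrier() is the existing
compiler barrier from <sys/cdefs.h>):

	#ifdef __HYPOTHETICAL_X86_TSO	/* made-up conditional */
	#define	membar_acquire()	__insn_barrier()
	#define	membar_release()	__insn_barrier()
	#endif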
Discussed on tech-kern:
https://mail-index.netbsd.org/tech-kern/2023/02/23/msg028729.html
Currently there is a voluntary yield only every 100ms, which is a long
time. This should help avoid hogging the CPU while flushing lots of
data to big disks on systems without kpreemption.
- Parallel mounts may get the same fsid. Always increment "xxxfs_mntid"
  to make this unlikely.
- Directly walk "mountlist" to prevent a rare deadlock where one thread
  holds a vnode locked, calls vfs_getnewfsid() and the iterator has to
  wait for a suspended file system while the suspending thread needs
  this vnode lock.
This just goes through my recent reference count membar audit and
changes membar_exit to membar_release and membar_enter to
membar_acquire -- this should make everything cheaper on most CPUs
without hurting correctness, because membar_acquire is generally
cheaper than membar_enter.
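Concretely, for the release pattern discussed below, the rename amounts
to this (a sketch; obj and refcnt are illustrative names):

	membar_release();	/* was membar_exit() */
	if (atomic_dec_uint_nv(&obj->refcnt) != 0)
		return;
	membar_acquire();	/* was membar_enter() */
	free_stuff(obj);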
vdevgone relies on this to ensure that if there is a concurrent
revoke in progress, it will wait for that revoke to finish -- that
way, it can guarantee all I/O operations have completed and the
device is closed.
If two threads are using an object that is freed when the reference
count goes to zero, we need to ensure that all memory operations
related to the object happen before freeing the object.
Using atomic_dec_uint_nv(&refcnt) == 0 ensures that only one thread
takes responsibility for freeing, but it is not enough to ensure that
the other thread's memory operations happen before the freeing.
Consider:

	Thread A                        Thread B
	obj->foo = 42;                  obj->baz = 73;
	mumble(&obj->bar);              grumble(&obj->quux);
	/* membar_exit(); */            /* membar_exit(); */
	atomic_dec -- not last          atomic_dec -- last
	                                /* membar_enter(); */
	                                KASSERT(invariant(obj->foo,
	                                    obj->bar));
	                                free_stuff(obj);
The memory barriers ensure that

	obj->foo = 42;
	mumble(&obj->bar);

in thread A happens before

	KASSERT(invariant(obj->foo, obj->bar));
	free_stuff(obj);

in thread B. Without them, this ordering is not guaranteed.
So in general it is necessary to do

	membar_exit();
	if (atomic_dec_uint_nv(&obj->refcnt) != 0)
		return;
	membar_enter();
to release a reference, for the `last one out hit the lights' style
of reference counting. (This is in contrast to the style where one
thread blocks new references and then waits under a lock for existing
ones to drain with a condvar -- no membar needed thanks to mutex(9).)
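For comparison, the blocking style needs no membars because mutex(9)
provides the ordering; a minimal sketch (field names illustrative,
using the mutex(9) and condvar(9) APIs):

	mutex_enter(&obj->lock);
	obj->dying = true;		/* block new references */
	while (obj->refcnt != 0)
		cv_wait(&obj->cv, &obj->lock);
	mutex_exit(&obj->lock);
	free_stuff(obj);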
I searched for atomic_dec to find all these. Obviously we ought to
have a better abstraction for this because there's so much copypasta.
This is a stop-gap measure to fix actual bugs until we have that. It
would be nice if an abstraction could gracefully handle the different
styles of reference counting in use -- some years ago I drafted an
API for this, but making it cover everything got a little out of hand
(particularly with struct vnode::v_usecount) and I ended up setting
it aside to work on psref/localcount instead for better scalability.
I got bored of adding #ifdef __HAVE_ATOMIC_AS_MEMBAR everywhere, so I
only put it on things that look performance-critical on 5sec review.
We should really adopt membar_enter_preatomic/membar_exit_postatomic
or something (except they are applicable only to atomic r/m/w, not to
atomic_load/store_*, making the naming annoying) and get rid of all
the ifdefs.
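For reference, the ifdef dance in question looks like this (the membars
and the release pattern are as above; only the conditionals are the
point here):

	#ifndef __HAVE_ATOMIC_AS_MEMBAR
		membar_exit();
	#endif
		if (atomic_dec_uint_nv(&obj->refcnt) != 0)
			return;
	#ifndef __HAVE_ATOMIC_AS_MEMBAR
		membar_enter();
	#endif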
locked and referenced across the call to swap_off() and finally
use it from vfs_unmountall1() to remove swap after unmounting
the last file system.
Addresses PR kern/54969 (Disk cache is no longer flushed on shutdown)
disks before the raid:
forcefully unmounting / (/dev/raid0a)...
sd1: detached
sd0: detached
raid0: cache flush to component /dev/sd0a failed.
raid0: cache flush to component /dev/sd1a failed.
fatal page fault in supervisor mode
Stopped in pid 2356.2356 (reboot) at netbsd:sdstrategy+0x36
Reopens PR kern/54969: Disk cache is no longer flushed on shutdown
which relied on taking extra vnode refs.
Benchmarking various experimental changes over the past few months
suggests that it's better to avoid vnode refs as much as possible.
Making cwdi_lock an RW lock already did that to some extent for
getcwd() and will permit the same for namei() too.
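For context, the read-side pattern this enables looks roughly like this
(a sketch using the rwlock(9) API; the lookup itself is elided and
error handling is omitted):

	struct cwdinfo *cwdi = l->l_proc->p_cwdi;
	struct vnode *dvp;

	rw_enter(&cwdi->cwdi_lock, RW_READER);
	/*
	 * cwdi_cdir cannot change while the reader lock is held,
	 * so the lookup can start from it without taking a vnode
	 * reference.
	 */
	dvp = cwdi->cwdi_cdir;
	/* ... resolve the path from dvp ... */
	rw_exit(&cwdi->cwdi_lock);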
- Have a stab at clustering the members of vnode_t and vnode_impl_t in a
more cache-conscious way. With that done, go back to adjusting v_usecount
with atomics and keep vi_lock directly in vnode_impl_t (saves KVA).
- Allow VOP_LOCK(LK_NONE) for the benefit of VFS_VGET() and VFS_ROOT().
Make sure LK_UPGRADE always comes with LK_NOWAIT.
- Make use of cwdinfo mostly lockless.
Don't take a mount reference for fstrans as it gets notified about the
release. Defer the final free of the mount to fstrans_mount_dtor(),
which runs once fstrans has released all references to this mount.
This prevents the mount's memory from being reused as a new mount
before fstrans has released all references.
Addresses PR kern/53928: modules/t_builtin:disable test case randomly fails.
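The deferred free reduces to something like this sketch (illustrative
only; the real fstrans code keeps its own bookkeeping, and the
reference count field shown here is made up):

	static void
	fstrans_mount_dtor(struct mount *mp)
	{

		/*
		 * Called each time fstrans drops a reference; the last
		 * one out frees the mount, so its memory cannot be
		 * recycled while fstrans still references it.
		 */
		if (atomic_dec_uint_nv(&mp->mnt_fstrans_refcnt) > 0)
			return;
		kmem_free(mp, sizeof(*mp));
	}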
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.
Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.
use FSTRANS_SHARED as the lock type, so remove the lock type argument.
File system state FSTRANS_SUSPENDING is now unused, so remove it.
Regen vnode_if files.
Ride 8.99.1 less than an hour ago.
kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()
All of these paths include an assertion that the allocation has not
failed, so callers should not assert that again.
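For illustration, the now-redundant pattern looks like this (a sketch;
sc is a hypothetical allocation):

	sc = kmem_zalloc(sizeof(*sc), KM_SLEEP);
	KASSERT(sc != NULL);	/* redundant: KM_SLEEP cannot fail */

and can simply become

	sc = kmem_zalloc(sizeof(*sc), KM_SLEEP);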