NetBSD

Commit Graph

Author	SHA1	Message	Date
riastradh	86535c9941	specfs: KNF. No functional change intended.	2023-04-22 15:32:49 +00:00
hannken	149aa6be82	Remove unused specdev member sd_rdev. Ride 10.99.4	2023-04-22 14:30:16 +00:00
riastradh	ab579ad815	genfs: KASSERT(A && B) -> KASSERT(A); KASSERT(B)	2023-04-09 12:26:36 +00:00
hannken	f8079ac547	Fix genfs_can_chtimes() to also handle the condition: If the time pointer is null, then write permission on the file is also sufficient. From FreeBSD. Should fix PR kern/57246 "NFS group permissions regression"	2023-03-03 10:02:51 +00:00
hannken	6017299f5f	Set IMNT_MPSAFE only if the lower layer has it set.	2023-02-06 10:32:58 +00:00
hannken	85cb97f0d7	Harden layered file systems usage of field "mnt_lower" against forced unmounts of the lower layer. - Don't allow "dead_rootmount" as lower layer. - Take file system busy before a vfs operation walks down the stack. Reported-by: syzbot+27b35e5675b1753cec03@syzkaller.appspotmail.com Reported-by: syzbot+99071492e3de2eff49e9@syzkaller.appspotmail.com	2022-12-09 10:33:18 +00:00
hannken	cfe3f2c399	Add a helper to set or clear lower mount and use it. Always add a reference to the lower mount. Ride 9.99.105	2022-11-04 11:20:39 +00:00
riastradh	f29311a918	miscfs/fifofs/fifo.h: New home for extern fifo_vnodeop_opv_desc. Add include guard and fix missing includes while here too.	2022-10-26 23:40:20 +00:00
riastradh	fa76fa97ef	miscfs/specfs/specdev.h: New home for extern spec_vnodeop_opv_desc. Also use it for extern spec_vnodeop_p, which is already there.	2022-10-26 23:40:08 +00:00
riastradh	26784725ee	miscfs/deadfs/deadfs.h: New home for deadfs-related externs. XXX regen sys/kern/vnode_if.c and the others	2022-10-26 23:39:43 +00:00
riastradh	d86da41736	specfs(9): Attribute blame by stack trace for write to r/o medium.	2022-10-15 15:20:46 +00:00
riastradh	ae1d5f78a6	specfs(9): XXX comment: what if read downgrades lock?	2022-09-21 10:59:10 +00:00
riastradh	2ab9954344	specfs: Refuse to open a closing-in-progress block device. We could wait for close to complete, but if this happened ever so slightly earlier it would lead to EBUSY anyway, so there's no point in adding logic for that -- either way the caller neglected to wait for the last close to finish before trying to open the device again. https://mail-index.netbsd.org/current-users/2022/08/09/msg042800.html Reported-by: syzbot+4388f20706ec8a4c8db0@syzkaller.appspotmail.com https://syzkaller.appspot.com/bug?id=47c67ab6d3a87514d0707882a9ad6671beaa8642 Reported-by: syzbot+0f1756652dce4cb341ed@syzkaller.appspotmail.com https://syzkaller.appspot.com/bug?id=a632ce762d64241fc82a9bc57230b7b7c7095d1a	2022-08-12 21:25:39 +00:00
riastradh	a5fdd92957	specfs: Assert !closing on successful open. - If there's a prior concurrent close, it must have interrupted this open. - If there's a new concurrent close, it must wait until this open has released device_lock before it can revoke.	2022-08-12 17:06:01 +00:00
riastradh	938c8ecc70	specfs: Assert opencnt>0 on successful open.	2022-08-12 17:05:49 +00:00
riastradh	25af48d0b5	specfs: Sprinkle opencnt/opened/closing assertions. There seems to be a bug here but I'm not sure what it is yet: https://mail-index.netbsd.org/current-users/2022/08/09/msg042800.html https://syzkaller.appspot.com/bug?id=47c67ab6d3a87514d0707882a9ad6671beaa8642 The decision to actually invoke d_close is serialized under device_lock, so it should not be possible for more than one process to close at the same time, but syzbot and kre found a way for sd_closing to be false later in spec_close. Let's make sure it's false when we're making what should be the exclusive decision to close. We can't assert !sd_opened before cancel and spec_io_drain, because those are necessary to interrupt and wait for pending opens that might later set sd_opened, but we can assert !sd_opened afterward because once sd_closing is true nothing should set sd_opened.	2022-08-11 12:52:24 +00:00
thorpej	75d451f371	Make kqueue event status for vnodes shareable, and for stacked file systems like nullfs, make the upper vnode share that status with the lower vnode. And, lo, NetBSD 9.99.99. Fixes PR kern/56713.	2022-07-18 04:30:30 +00:00
hannken	4c6398a93d	Make dead vfs ops "vfs_statvfs" and "vfs_vptofh" return EOPNOTSUPP. Both operations may originate from (possibly dead) vnodes. Reported-by: syzbot+eceb203d44457742be3b@syzkaller.appspotmail.com	2022-07-08 07:44:17 +00:00
hannken	cc261bb6d7	Don't use LK_RETRY as we need an active vnode here.	2022-07-08 07:43:48 +00:00
hannken	4ad9c3956a	Handle IMNT_GONE on the file system we want suspended not its lowest mount we really suspend.	2022-07-08 07:42:05 +00:00
shm	49bf48f547	Add missing permission check	2022-06-17 14:30:37 +00:00
andvar	75d2abaeb1	fix various typos in comments and output/log messages.	2022-04-10 09:50:44 +00:00
riastradh	c8ea12a8d3	driver(9): New devsw d_cancel op to interrupt I/O before close. If specified, when revoking a device node or closing its last open node, specfs will: 1. Call d_cancel, which should return promptly without blocking. 2. Wait for all concurrent d_read/write/ioctl/&c. to drain. 3. Call d_close. Otherwise, specfs will: 1. Call d_close. 2. Wait for all concurrent d_read/write/ioctl/&c. to drain. This fallback is problematic because often parts of d_close rely on concurrent devsw operations to have completed already, so it is up to each driver to have its own mechanism for waiting, and the extra step in (2) is almost redundant. But it is still important to ensure that devsw operations are not active by the time a module tries to invoke devsw_detach, because only d_open is protected against that. The signature of d_cancel matches d_close, mostly so we don't raise questions about `why is this different?'; the lwp argument is not useful but we should remove it from open/cancel/close all at the same time. The only way d_cancel should fail, if it does at all, is with ENODEV, meaning the driver doesn't support cancelling outstanding I/O, and will take responsibility for that in d_close. I would make it return void and only have bdev_cancel and cdev_cancel possibly return ENODEV so specfs can detect whether a driver supports it, but this would break the pattern around devsw operation types. Drivers are allowed to omit it from struct bdevsw, struct cdevsw -- if so, it is as if they used a function that just returns ENODEV. XXX kernel ABI change to struct bdevsw/cdevsw requires bump	2022-03-28 12:39:10 +00:00
riastradh	24d512b12a	specfs: Reorder struct specnode members to save padding. Shrinks from 40 bytes to 32 bytes on LP64 systems this way.	2022-03-28 12:38:04 +00:00
riastradh	51a3f758d3	specfs: Remove specnode from hash table in spec_node_revoke. Previously, it was possible for spec_node_lookup_by_dev to handle a specnode that a concurrent spec_node_destroy is about to remove from the hash table and then free, as soon as spec_node_lookup_by_dev releases device_lock. Now, the ordering is: 1. Remove specnode from hash table in spec_node_revoke. At this point, no _new_ vnode references are possible (other than possibly one acquired by vcache_vget under v_interlock), but there may be existing ones. 2. Mark vnode reclaimed so vcache_vget will fail. 3. The last vrele (or equivalent logic in vcache_vget) will then free the specnode in spec_node_destroy. This way, _if_ a thread in spec_node_lookup_by_dev finds a specnode in the hash table under device_lock/v_interlock, _then_ it will not be freed until the thread completes vcache_vget. This change requires calling spec_node_revoke unconditionally for device special nodes, not just for active ones. Might introduce slightly more contention on device_lock but not much because we already have to take it in this path anyway a little later in spec_node_destroy.	2022-03-28 12:37:56 +00:00
riastradh	a2155d69ea	specfs: Let spec_node_lookup_by_dev wait for reclaim to finish. vdevgone relies on this to ensure that if there is a concurrent revoke in progress, it will wait for that revoke to finish -- that way, it can guarantee all I/O operations have completed and the device is closed.	2022-03-28 12:37:46 +00:00
riastradh	b89ee9efb4	specfs: Assert opencnt is nonzero before decrementing.	2022-03-28 12:37:35 +00:00
riastradh	bd75be3e3d	specfs: Take an I/O reference across bdev/cdev_open. - Revoke is used to invalidate all prior access control checks when device permissions are changing, so it must wait for .d_open to exit so any new access must go through new access control checks. - Revoke is used by vdevgone in xyz_detach to wait until all use of the driver's data structures have completed before xyz_detach frees them. So we need to make sure spec_close waits for .d_open too.	2022-03-28 12:37:26 +00:00
riastradh	a6a4e9ecd6	specfs: Wait for last close in spec_node_revoke. Otherwise, revoke -- and vdevgone, in the detach path of removable devices -- may complete while I/O operations are still running concurrently.	2022-03-28 12:37:18 +00:00
riastradh	daba8dd87b	specfs: Prevent new opens while close is waiting to drain. Otherwise, bdev/cdev_close could have cancelled all _existing_ opens, and waited for them to complete (and freed resources used by them) -- but a new one could start, and hang (e.g., a tty), at the same time spec_close tries to drain all pending I/O operations, one of which (the new open) is now hanging indefinitely. Preventing the new open from even starting until bdev/cdev_close is finished and all I/O operations have drained avoids this deadlock.	2022-03-28 12:37:09 +00:00
riastradh	c7aa557804	specfs: Take an I/O reference in spec_node_setmountedfs. This is not quite correct. We _should_ require the caller to hold a vnode lock around spec_node_getmountedfs, and an exclusive vnode lock around spec_node_setmountedfs, so that it is only necessary to check whether revoke has already happened, not hold an I/O reference. Unfortunately, various callers in various file systems don't follow this sensible rule. So let's at least make sure the vnode can't be revoked in spec_node_setmountedfs, while we're in bdev_ioctl, and leave a comment explaining what the sorry state of affairs is and how to fix it later.	2022-03-28 12:37:01 +00:00
riastradh	66ae10f9a4	specfs: Drain all I/O operations after last .d_close call. New kind of I/O reference on specdevs, sd_iocnt. This could be done with psref instead; I chose a reference count instead for now because we already have to take a per-object lock anyway, v_interlock, for vdead_check, so another atomic is not likely to hurt much more. We can always change the mechanism inside spec_io_enter/exit/drain later on. Make sure every access to vp->v_rdev or vp->v_specnode and every call to a devsw operation is protected either: - by the vnode lock (with vdead_check if we unlocked/relocked), - by positive sd_opencnt, - by spec_io_enter/exit, or - by sd_opencnt management in open/close.	2022-03-28 12:36:51 +00:00
riastradh	71a1e06a17	specfs: Resolve a race between close and a failing reopen.	2022-03-28 12:36:42 +00:00
riastradh	eb8abdf33b	specfs: Document sn_opencnt, sd_opencnt, sd_refcnt.	2022-03-28 12:36:34 +00:00
riastradh	820dfcc78f	specfs: Paranoia: Assert opencnt is zero on reclaim.	2022-03-28 12:36:26 +00:00
riastradh	aa0e9abd05	specfs: Omit needless vdead_check in spec_fdiscard. The vnode lock is held, so the vnode cannot be revoked without also changing v_op so subsequent uses under the vnode lock will go to deadfs's VOP_FDISCARD instead (which is genfs_eopnotsupp).	2022-03-28 12:36:18 +00:00
riastradh	eecc7d184f	specfs: Add a comment and assertion to spec_close about refcnts.	2022-03-28 12:36:09 +00:00
riastradh	8068be2d26	specfs: If sd_opencnt is zero, sn_opencnt had better be zero.	2022-03-28 12:36:00 +00:00
riastradh	442b916b2c	specfs: Factor KASSERT out of switch in spec_open. No functional change.	2022-03-28 12:35:52 +00:00
riastradh	4241b4d4c6	specfs: sn_gone cannot be set while we hold the vnode lock. Revoke runs with the vnode lock too, which is exclusive. Add an assertion to this effect in spec_node_revoke to make it clear.	2022-03-28 12:35:44 +00:00
riastradh	ce78318abc	specfs: Reorganize D_DISK tail of spec_open and explain what's up. No functional change intended.	2022-03-28 12:35:35 +00:00
riastradh	39225fd515	specfs: Factor VOP_UNLOCK/vn_lock out of switch for clarity. No functional change.	2022-03-28 12:35:26 +00:00
riastradh	1ba4ba4825	specfs: Factor common device_lock out of switch for clarity. No functional change.	2022-03-28 12:35:17 +00:00
riastradh	39e7464227	specfs: Delete bogus comment about .d_open/.d_close at same time. Annoying as it is that .d_open and .d_close can run at the same time, it is also necessary for tty semantics, where open can block indefinitely, and it is the responsibility of close (called via revoke) to interrupt it.	2022-03-28 12:35:08 +00:00
riastradh	c466e4b35b	specfs: Split spec_open switch into three sections. The sections are now: 1. Acquire open reference. 1a (intermezzo). Set VV_ISTTY. 2. Drop the vnode lock to call .d_open and autoload modules if necessary. 3. Handle concurrent revoke if it happened, or release open reference if .d_open failed. No functional change. Sprinkle comments about problems.	2022-03-28 12:34:59 +00:00
riastradh	3213d6e5bd	specfs: Factor common kauth check out of switch in spec_open. No functional change.	2022-03-28 12:34:51 +00:00
riastradh	8f24965bf7	specfs: Assert v_type is VBLK or VCHR in spec_open. Nothing else makes sense. Prune dead branches (and replace default case by panic).	2022-03-28 12:34:42 +00:00
riastradh	7005cbf9c7	specfs: Call bdev_open without the vnode lock. There is no need for it to serialize opens, because they are already serialized by sd_opencnt which for block devices is always either 0 or 1. There's not obviously any other reason why the vnode lock should be held across bdev_open, other than that it might be nice to avoid dropping it if not necessary. For character devices we always have to drop the vnode lock because open might hang indefinitely, when opening a tty, which is not allowed while holding the vnode lock.	2022-03-28 12:34:34 +00:00
riastradh	0760adde4b	specfs: Note lock order for vnode lock, device_lock, v_interlock.	2022-03-28 12:34:25 +00:00
riastradh	04c6cbac06	driver(9): Eliminate D_MCLOSE. D_MCLOSE was introduced a few years ago by mistake for audio(4), which should have used -- and now does use -- fd_clone to create per-open state. The semantics was originally to call close once every time the device node is closed, not only for the last close. Nothing uses it any more, and it complicates reasoning about the system, so let's simplify it away.	2022-03-28 12:34:17 +00:00

1446 Commits