This allows us to return EEXIST instead of EPERM at higher securelevels.
My use case was to stop npfctl complaining that it could not load bpfjit
on ERLITE when it was compiled into the kernel.
It then went on to warn that NPF performance would be degraded,
which is clearly not the case.
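
A hedged illustration of the caller-side effect; load_bpfjit() and
module_load_stub() are hypothetical names, not npfctl's actual code:

    #include <errno.h>

    /* module_load_stub() is a hypothetical stand-in for the real call. */
    static int module_load_stub(const char *name) { (void)name; return EEXIST; }

    /*
     * Sketch: with EEXIST the caller can tell "already built in" apart
     * from a real refusal (EPERM) and stay silent instead of warning.
     */
    static int
    load_bpfjit(void)
    {
        int error = module_load_stub("bpfjit");

        if (error == EEXIST)
            return 0;       /* compiled into the kernel; fine */
        return error;       /* EPERM etc. remain real failures */
    }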
When called from vrecycle() or vgone(), there is a window where the refcount
is greater than zero and another thread could get and release a reference
that would miss VOP_INACTIVE(), as the refcount doesn't drop to zero.
Adjust test fs/puffs/t_basic: test VOP_INACTIVE count being greater than zero.
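
A toy model of that window, using plain C11 atomics rather than the
kernel's vnode code:

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Toy model of the window, not kernel code: v_usecount as an atomic. */
    static atomic_int usecount = 2;     /* vgone()'s ref + thread B's ref */

    /* VOP_INACTIVE() would run only when the count drops to zero. */
    static bool
    vrele_model(void)
    {
        return atomic_fetch_sub(&usecount, 1) == 1;
    }

    /*
     * While vgone()/vrecycle() still hold their reference, thread B's
     * paired vget()/vrele() goes 2 -> 1, vrele_model() returns false,
     * and that use of the vnode misses VOP_INACTIVE().
     */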
- Make vrecycle() more robust by checking v_usecount first and preventing
  further references across vn_lock() (see the sketch after this list).
  Fixes a deadlock where one thread starts unmount, a second thread locks
  a directory and allocates a vnode, and the first thread tries to
  vrecycle() the directory.
  The first thread holds vfs_busy and wants the vnode; the second thread
  holds the vnode and wants vfs_busy.
- With these fixes in place change cleanvnode() to use vget()/vrecycle()
to reclaim the vnode.
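
A minimal sketch of the reworked check order, assuming toy types and
elided locking (the real interlock and vn_lock() handling differs):

    #include <stdbool.h>

    /* Toy types; names are illustrative, not the kernel's. */
    struct vnode_toy {
        int  v_usecount;
        bool v_blocked;     /* "no new references" marker */
    };

    /*
     * Sketch of the reworked order: check v_usecount first, and block
     * further references *before* sleeping in vn_lock(), so the
     * unmount-vs-lookup deadlock above cannot form.
     */
    static bool
    vrecycle_sketch(struct vnode_toy *vp)
    {
        if (vp->v_usecount != 0)
            return false;       /* someone still holds a reference */
        vp->v_blocked = true;   /* a concurrent vget() now waits/fails */
        /* ... vn_lock(), then reclaim the vnode ... */
        return true;
    }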
xc_lowpri and xc_thread are racy and xc_wait may return before all xcall
callbacks have executed, resulting in a kernel panic at worst.
xc_lowpri serializes multiple jobs with a mutex and a cv. Once all xcall
callbacks are done, xc_wait returns and xc_lowpri accepts the next job.
The problem is that the counter of finished xcall callbacks is
incremented *before* a callback is actually executed (see xc_tailp++ in
xc_thread). So xc_lowpri accepts the next job before all callbacks of
the current job have completed, and the next job begins to run its own
xcall callbacks.
Even worse, the counter is global and shared between jobs, so when an
xcall callback of the next job completes, the shared counter is
incremented, which convinces xc_wait of the previous job that all of its
callbacks are done, and xc_wait returns before they have actually executed.
How to fix: for historical reasons (I guess) there are actually two
counters of finished xcall callbacks for low-priority xcalls: xc_tailp
and xc_low_pri.xc_donep. xc_low_pri.xc_donep is incremented correctly,
while xc_tailp is incremented wrongly, i.e., before a callback is
executed. We can fix the issue by dropping xc_tailp and using only
xc_low_pri.xc_donep, as modeled below.
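
A toy model of the fixed counting, with the lock and condvar elided
(xc_donep follows the commit; the other names are illustrative):

    #include <stdint.h>

    /* Toy xcall state: one done-counter per low-priority state, bumped
     * only AFTER the callback has run, so a waiter can't wake early. */
    struct xc_toy {
        uint64_t xc_headp;  /* callbacks issued */
        uint64_t xc_donep;  /* callbacks completed */
    };

    static void
    xc_thread_model(struct xc_toy *xc, void (*func)(void *), void *arg)
    {
        (*func)(arg);       /* run the callback first... */
        xc->xc_donep++;     /* ...then count it as finished */
        /* real code: done under the xcall lock, then cv_broadcast() */
    }

    static int
    xc_wait_model(const struct xc_toy *xc, uint64_t where)
    {
        return xc->xc_donep >= where;   /* all callbacks up to 'where' ran */
    }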
PR kern/51632
various sysctl/procfs interfaces that allow it to be interrogated.
(This is rather than the temporary parent's pid when a process is
being traced and has been reparented.)
XXX The ppid in elf32 core files has not been similarly adjusted.
XXX Should it be?
Only change it when we are being permanently reparented to init. Since
p_ppid is only used as a cached value to retrieve the parent's process id
from userland, this change makes it correct at all times. Idea from kre@
Revert specialized logic from getpid/getppid now that it is not needed.
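
A hedged sketch of the rule, with toy types and a hypothetical helper
(the real change lives in the kernel's reparenting path):

    #include <stdbool.h>

    /* Toy proc; only the fields relevant to the sketch. */
    struct proc_toy {
        struct proc_toy *p_pptr;    /* current parent */
        int p_pid;
        int p_ppid;                 /* cached parent pid for userland */
    };

    /*
     * Sketch: p_ppid is only a cached value handed to userland, so
     * refresh it solely on a permanent reparent (to init), never on a
     * temporary ptrace reparent.
     */
    static void
    reparent_sketch(struct proc_toy *child, struct proc_toy *parent,
        bool permanent)
    {
        child->p_pptr = parent;
        if (permanent)
            child->p_ppid = parent->p_pid;
    }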
before recursing into lower blocks, to make sure that it will be removed after
all its referenced blocks are removed
fixes the 'ffs_blkfree_common: freeing free block' panic triggered by
ufs_truncate_retry() when just the upper indirect block registration failed
and the code tried to free the lower blocks again after a wapbl flush
problem found by hannken@, thank you
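
A minimal sketch of the ordering under toy names; register_dealloc_toy()
stands in for the real wapbl registration:

    /* Toy indirect block; names are illustrative only. */
    struct iblk_toy {
        struct iblk_toy **ptrs;     /* referenced lower blocks */
        int nptrs;
    };

    static int
    register_dealloc_toy(struct iblk_toy *b)
    {
        (void)b;
        return 0;   /* the real wapbl call may fail here */
    }

    /*
     * Sketch of the ordering fix: register the indirect block's own
     * deallocation BEFORE recursing, so if registration fails nothing
     * below it has been freed yet and no block is ever freed twice.
     */
    static int
    indir_trunc_sketch(struct iblk_toy *indir)
    {
        int error = register_dealloc_toy(indir);    /* parent first */
        if (error)
            return error;       /* lower blocks untouched */
        for (int i = 0; i < indir->nptrs; i++) {
            error = indir_trunc_sketch(indir->ptrs[i]); /* then children */
            if (error)
                return error;
        }
        return 0;
    }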
Revert 1.264 - that was intended to fix 51600, but didn't; it just
hid the problem, and caused 51606. This fixes 51606.
Handle waiting on a process that has been detached from its parent
because it is being ptrace'd by some other process. This fixes 51600.
("Handle" here means that the wait() hangs, or with WNOHANG returns 0,
since we cannot actually wait on a process that is not currently an
attached child.)
Note: the detached process waiting is not yet perfect (it fails to
take account of options like WALLSIG and WALTSIG); support for those
(that is, ignoring a detached child that one of those options will
later cause to be ignored when the process is re-attached) is still
to come.
For now, other than when waiting for a specific process ID, when a
process does a wait() syscall (any of them), has no applicable attached
children that can be returned, and has at least one detached child,
we do a linear search of all processes to look for a suitable detached
child, as sketched below. This is likely to be slow - but very rare.
Eventually it might be better to keep a list of detached children
per process.
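
A sketch of that fallback with toy types; the real code walks the
kernel's process list under the proper locks:

    #include <stdbool.h>
    #include <stddef.h>

    /* Toy process-table entry; fields are illustrative. */
    struct procent_toy {
        int  real_ppid;     /* pid of the original parent */
        bool detached;      /* currently reparented for tracing */
    };

    /*
     * Linear scan of every process for one whose original parent is us
     * but which is detached right now.
     */
    static struct procent_toy *
    find_detached_child(struct procent_toy *procs, int nprocs, int mypid)
    {
        for (int i = 0; i < nprocs; i++) {
            if (procs[i].real_ppid == mypid && procs[i].detached)
                return &procs[i];   /* a child we may wait for */
        }
        return NULL;    /* no detached children either */
    }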
- Move _VFS_VNODE_PRIVATE protected operations into vnode_impl.h.
- Move struct vnode_impl definition and operations into vnode_impl.h.
- Include vnode_impl.h where we include vnode.h with _VFS_VNODE_PRIVATE defined.
- Get rid of _VFS_VNODE_PRIVATE.
- Rename struct vcache_node to vnode_impl, start its fields with vi_.
- Rename enum vcache_state to vnode_state, start its elements with VS_.
- Rename macros VN_TO_VP and VP_TO_VN to VIMPL_TO_VNODE and VNODE_TO_VIMPL
  (sketched below).
- Add typedef struct vnode_impl vnode_impl_t.
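
A hedged sketch of the resulting shape; the names come from the list
above, while the embed-first-member/cast layout is an assumption for
illustration:

    /* Abridged sketch: the impl structure embeds the public vnode
     * first, so the conversions stay cheap. */
    struct vnode { int v_usecount; /* ... */ };

    typedef struct vnode_impl {
        struct vnode vi_vnode;  /* public part; vi_ field prefix */
        int vi_state;           /* a vnode_state, e.g. VS_ACTIVE */
    } vnode_impl_t;

    #define VIMPL_TO_VNODE(vip) (&(vip)->vi_vnode)
    #define VNODE_TO_VIMPL(vp)  ((vnode_impl_t *)(vp))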
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE
* Add module glue for #1 and #2 (see the sketch after this list). Both
  modules will be built-in to the kernel if "options PTRACE" is included
  in the config file (this is the default, defined in sys/conf/std).
* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).
* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.
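
A hedged sketch of what the module glue for #1 looks like; MODULE() and
the modcmd hooks are NetBSD's, but the body here is abridged and
approximate:

    #include <sys/param.h>
    #include <sys/module.h>

    /* Built-in module glue; depends on the common code module (#2). */
    MODULE(MODULE_CLASS_EXEC, ptrace, "ptrace_common");

    static int
    ptrace_modcmd(modcmd_t cmd, void *arg)
    {
        (void)arg;      /* unused in this sketch */

        switch (cmd) {
        case MODULE_CMD_INIT:
            return 0;   /* hook up the ptrace(2) syscall here */
        case MODULE_CMD_FINI:
            return 0;   /* and detach it here */
        default:
            return ENOTTY;
        }
    }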
XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.
succeed; change wapbl_register_deallocation() to return EAGAIN
rather than panic when the code hits the limit
callers changed to either loop calling ffs_truncate() via the new
utility ufs_truncate_retry() if their semantics require it (a retry
sketch follows below), or just ignore the failure; remove
ufs_wapbl_truncate()
this fixes a possible user-triggerable panic during truncate, and
resolves a WAPBL performance issue with truncates of large files
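
A minimal sketch of the retry shape, with toy helpers standing in for
the real ffs_truncate() and journal flush:

    #include <errno.h>

    /* Toy stand-ins for the real routines. */
    static int ffs_truncate_toy(long newsize) { (void)newsize; return 0; }
    static void wapbl_flush_toy(void) { }

    /*
     * Keep truncating until the journal stops returning EAGAIN,
     * flushing to make room in between.
     */
    static int
    truncate_retry_sketch(long newsize)
    {
        int error;

        while ((error = ffs_truncate_toy(newsize)) == EAGAIN)
            wapbl_flush_toy();  /* free journal space, then retry */
        return error;
    }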
PR kern/47146 and kern/49175
Marking up some zeroes with a type suffix while not marking others in
the very same function does nothing but place cognitive burden on the
reader.
Spelling "clear bits" as "&~" is actually not uncommon (and some say
is more readable).
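
For illustration (mine, not the committer's), the two spellings are
identical to the compiler:

    /* Both tokenize as "&" then "~" and compile the same way. */
    unsigned clear_a(unsigned flags, unsigned mask) { return flags & ~mask; }
    unsigned clear_b(unsigned flags, unsigned mask) { return flags &~ mask; }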