399 Commits

Author SHA1 Message Date
thorpej adc241a744 sig_filtops is MPSAFE. 2021-09-26 17:34:19 +00:00
thorpej 12ae65d98c Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89
2021-09-26 01:16:07 +00:00
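The rename described above can be sketched as follows. This is only an illustration: the `ex_`-prefixed struct names and the flag value of 1 are assumptions made for the example, not the kernel's actual definitions. It shows why replacing a boolean-like field with a flags field of the same type and position leaves layout unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define EX_FILTEROP_ISFD 0x1	/* assumed to equal the old f_isfd value */

struct ex_filterops_old {
	int f_isfd;		/* nonzero if the filter is tied to an fd */
};

struct ex_filterops_new {
	int f_flags;		/* bit mask; EX_FILTEROP_ISFD replaces f_isfd */
};

/* Same size and same field offset, so the layout does not change. */
_Static_assert(sizeof(struct ex_filterops_old) ==
    sizeof(struct ex_filterops_new), "size must not change");
_Static_assert(offsetof(struct ex_filterops_old, f_isfd) ==
    offsetof(struct ex_filterops_new, f_flags), "offset must not change");

/* New-style test: check the flag bit instead of the whole field. */
static inline int
ex_filter_is_fd(const struct ex_filterops_new *ops)
{
	assert(ops != NULL);
	return (ops->f_flags & EX_FILTEROP_ISFD) != 0;
}
```

Source that previously initialized `.f_isfd = 1` would, under these assumptions, initialize `.f_flags = EX_FILTEROP_ISFD` instead.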
simonb bef679e4ab CTASSERT that NSIG <= 128. There are many hard-coded assumptions that
there are <= 4 x 32bit signal mask bits.
2021-04-03 11:19:11 +00:00
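A minimal sketch of the kind of compile-time check described above, using C11 `_Static_assert` in place of the kernel's CTASSERT. The `EX_NSIG` value, the `ex_sigset_t` type, and the helper names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the kernel's definitions. */
#define EX_NSIG		64	/* assumed number of signals */
#define EX_MASK_WORDS	4	/* 4 x 32-bit mask words */

/* Fails the build if a signal number could fall outside the mask. */
_Static_assert(EX_NSIG <= EX_MASK_WORDS * 32,
    "signal numbers must fit in 4 x 32-bit mask words");

typedef struct {
	uint32_t bits[EX_MASK_WORDS];
} ex_sigset_t;

/* Signals are numbered from 1, so signal n lives at bit n - 1. */
static inline void
ex_sigaddset(ex_sigset_t *s, int signo)
{
	assert(signo >= 1 && signo < EX_NSIG);
	s->bits[(signo - 1) >> 5] |= UINT32_C(1) << ((signo - 1) & 31);
}

static inline bool
ex_sigismember(const ex_sigset_t *s, int signo)
{
	assert(signo >= 1 && signo < EX_NSIG);
	return ((s->bits[(signo - 1) >> 5] >> ((signo - 1) & 31)) & 1u) != 0;
}
```

With the word index computed as `(signo - 1) >> 5`, any signal number above 128 would index past the fourth word, which is the assumption the assertion guards.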
skrll 63292dfbcd Trailing whitespace 2021-01-11 17:18:51 +00:00
pgoyette 575be43dde Separate the compat_netbsd32_coredump from the compat_netbsd32 and
coredump modules, into its own module.

Welcome to 9.99.75 !!!
2020-11-01 18:51:02 +00:00
christos 0c5f517c54 fix indentation 2020-10-30 22:19:00 +00:00
christos 92d8eebacc Depend directly on EXEC_ELF{32,64} to determine which versions of the coredump
code are available.
2020-10-26 17:35:39 +00:00
christos 184e35ae6c Fix build for _LP64 machines that don't have COMPAT_NETBSD32 (alpha, ia64) 2020-10-20 13:16:26 +00:00
christos 3d852e0ec8 Arrange so that no options COREDUMP and no options PTRACE work together.
Thanks to Paul Goyette for testing.
2020-10-19 19:33:01 +00:00
ad 0eaaa024ea Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.
2020-05-23 23:42:41 +00:00
kamil 48b46ced17 Introduce new ptrace(2) operations: PT_SET_SIGPASS and PT_GET_SIGPASS
They deliver the logic of bypassing selected signals directly to the
debuggee, without informing the debugger.

This can be used to implement the QPassSignals GDB/LLDB protocol.

This call can be useful to avoid signal races in ATF ptrace tests.
2020-05-14 13:32:15 +00:00
kamil 9d3d9bd2de On debugger attach to a prestarted process don't report SIGTRAP
Introduce PSL_TRACEDCHILD that indicates tracking of birth of a process.
A freshly forked process checks whether it is traced and if so, reports
SIGTRAP + TRAP_CHLD event to a debugger as a result of tracking forks-like
events. There is a time window when a debugger can attach to a newly
created process and receive SIGTRAP + TRAP_CHLD instead of SIGSTOP.

Fixes races in t_ptrace_wait* tests when a test hangs or misbehaves,
especially the ones reported in tracer_sysctl_lookup_without_duplicates.
2020-05-07 20:02:34 +00:00
kamil 848901a664 Reintroduce struct proc::p_oppid
Relying on p_opptr is not safe as there is a race between:
 - spawner giving birth to a child process and being killed
 - spawnee accessing p_opptr and reporting TRAP_CHLD

PR kern/54786 by Andreas Gustafsson
2020-04-06 08:20:05 +00:00
christos 1d19491032 - Untangle spawn_return by splitting it up to sub-functions.
- Merge the eventswitch parent notification code which was copied in two
  places (eventswitchchild)
- Fix bugs in the eventswitch parent notification code:
  1. p_slflags should be accessed holding both proc_lock and p->p_lock
  2. p->p_opptr can be NULL if the parent was PSL_CHTRACED and exited.

Fixes random crashes in the posix_spawn_kill_spawner unit test which tried
to dereference a NULL pptr.
2020-04-05 20:53:17 +00:00
ad b00d9a59ce sigpost(): check for LSZOMB, not l_refcnt == 0. 2020-03-26 21:25:26 +00:00
riastradh 17201b1c03 Load struct fdfile::ff_file with atomic_load_consume.
Exceptions: when we're only testing whether it's there, not about to
dereference it.

Note: We do not use atomic_store_release to set it because the
preceding mutex_exit should be enough.

(That said, it's not clear the mutex_enter/exit is needed unless
refcnt > 0 already, in which case maybe it would be a win to switch
from the membar implied by mutex_enter to the membar implied by
atomic_store_release -- which I would generally expect to be much
cheaper.  And a little clearer without a long comment.)
2020-02-01 02:23:23 +00:00
riastradh 8e6cd4ce57 Load struct filedesc::fd_dt with atomic_load_consume.
Exceptions: when fd_refcnt <= 1, or when holding fd_lock.

While here:

- Restore KASSERT(mutex_owned(&fdp->fd_lock)) in fd_unused.
  => This is used only in fd_close and fd_abort, where it holds.
- Move bounds check assertion in fd_putfile to where it matters.
- Store fd_dt with atomic_store_release.
- Move load of fd_dt under lock in knote_fdclose.
- Omit membar_consumer in fdesc_readdir.
  => atomic_load_consume serves the same purpose now.
  => Was needed only on alpha anyway.
2020-02-01 02:23:03 +00:00
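The store-release / load-consume pairing mentioned in the two commits above can be approximated in portable C11 as below. This is a user-space analogue, not the kernel code: `atomic_store_explicit` and `atomic_load_explicit` stand in for the kernel's atomic_store_release and atomic_load_consume, and the `ex_` names are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a descriptor table. */
struct ex_table {
	int nfiles;
};

/* Shared pointer to the current table; NULL until one is published. */
static _Atomic(struct ex_table *) ex_current;

/*
 * Writer: finish initializing the table, then publish the pointer with
 * release ordering so the initialization is visible before the pointer.
 */
static void
ex_publish(struct ex_table *t, int nfiles)
{
	t->nfiles = nfiles;
	atomic_store_explicit(&ex_current, t, memory_order_release);
}

/*
 * Reader: load the pointer with consume ordering before dereferencing it,
 * so the dependent read of nfiles sees the writer's initialization.
 */
static int
ex_read_nfiles(void)
{
	struct ex_table *t =
	    atomic_load_explicit(&ex_current, memory_order_consume);

	return t == NULL ? -1 : t->nfiles;
}
```

A reader that only tests the pointer for NULL without dereferencing it does not need the ordering, which matches the exceptions the commit messages list.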
ad 29bbec1956 PAX_SEGVGUARD doesn't seem to work properly in testing for me, but at least
make it not cause problems:

- Cover it with exec_lock so the updates are not racy.
- Using fileassoc is silly.  Just hang a pointer off the vnode.
2020-01-23 10:21:14 +00:00
ad 4477d28d73 Make it possible to call mi_switch() and immediately switch to another CPU.
This seems to take about 3us on my Intel system.  Two changes required:

- Have the caller of mi_switch() be responsible for calling spc_lock().
- Avoid using l->l_cpu in mi_switch().

While here:

- Add a couple of calls to membar_enter()
- Have the idle LWP set itself to LSIDL, to match softint_thread().
- Remove unused return value from mi_switch().
2019-12-06 21:36:10 +00:00
ad e57dd2ba56 - lwp_need_userret(): only do it if ONPROC and !curlwp, and explain why.
- Use signotify() in a couple more places.
2019-11-21 18:17:36 +00:00
pgoyette 1d577fe379 Move all non-emulation-specific coredump code into the coredump module,
and remove all #ifdef COREDUMP conditional compilation.  Now, the
coredump module is completely separated from the emulation modules, and
they can all be independently loaded and unloaded.

Welcome to 9.99.18 !
2019-11-20 19:37:51 +00:00
pgoyette 7384474702 Convert the coredump_vec modular function pointer to use the new
compat_hook mechanism.

XXX Should be pulled up to -9 despite the kernel <--> module ABI
XXX change.
2019-11-10 14:20:50 +00:00
pgoyette 7b0f5c9e07 Convert the sendsig_sigcontext_16 function pointer to use the new
compat_hook mechanism.

XXX Despite being a kernel<-->module abi change, this should be
XXX pulled up to -9
2019-11-10 13:28:06 +00:00
mgorny 050caffe42 Fix a race condition when handling concurrent LWP signals and add a test
Fix a race condition that caused PT_GET_SIGINFO to return incorrect
information when multiple signals were delivered concurrently
to different LWPs.  Add a regression test that verifies that when 50
threads concurrently use pthread_kill() on themselves, the debugger
receives all signals with correct information.

The kernel uses separate signal queues for each LWP.  However,
the signal context used to implement PT_GET_SIGINFO is stored in 'struct
proc' and therefore common to all LWPs in the process.  Previously,
this member was filled in kpsignal2(), i.e. when the signal was sent.
This meant that if another LWP managed to send another signal
concurrently, the data was overwritten before the process was stopped.

As a result, PT_GET_SIGINFO did not report the correct LWP and signal
(it could even report a different signal than wait()).  This can be
quite reliably reproduced with 20 LWPs; however, it can also occur
with 10.

This patch moves setting of signal context to issignal(), just before
the process is actually stopped.  The data is taken from per-LWP
or per-process signal queue.  The added test confirms that the debugger
correctly receives all signals, and PT_GET_SIGINFO reports both correct
LWP and signal number.

Reviewed by kamil.
2019-10-21 17:07:00 +00:00
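The overwrite described above can be modeled with a small single-threaded sketch. It is a simplification under stated assumptions: all `ex_` names are hypothetical, two LWPs are modeled as array slots, and the interleaving is fixed by the order of calls rather than by real concurrency.

```c
#include <assert.h>

/* Hypothetical stand-ins; not the kernel's data structures. */
struct ex_info {
	int lwp;	/* LWP the signal was sent to */
	int signo;	/* signal number */
};

struct ex_proc {
	struct ex_info shared;		/* one slot per process (like struct proc) */
	struct ex_info pending[2];	/* one pending entry per LWP */
};

/* Queue a signal; the old scheme also fills the shared slot at send time. */
static void
ex_send(struct ex_proc *p, int lwp, int signo, int record_at_send)
{
	assert(lwp == 0 || lwp == 1);
	p->pending[lwp] = (struct ex_info){ lwp, signo };
	if (record_at_send)
		p->shared = p->pending[lwp];
}

/*
 * Stop the process on behalf of one LWP and return what the debugger
 * would read. The new scheme fills the shared slot here, from the
 * per-LWP entry, just before stopping.
 */
static struct ex_info
ex_stop_for(struct ex_proc *p, int lwp, int record_at_send)
{
	assert(lwp == 0 || lwp == 1);
	if (!record_at_send)
		p->shared = p->pending[lwp];
	return p->shared;
}
```

If two sends happen before the first stop, recording at send time leaves the shared slot describing the second signal, while recording at stop time reports the signal that actually caused the stop.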
christos 176ada4b2b Add and use __FPTRCAST, requested by uwe@ 2019-10-16 18:29:49 +00:00
christos d2348edc56 Add void * function pointer casts. There are different ways to "fix" those
warnings:
    1. this one: add a void * cast (which I think is the least intrusive)
    2. add pragmas to elide the warning
    3. add intermediate inline conversion functions
    4. change the called function prototypes, adding unused arguments and
       converting some of the pointer arguments to void *.
    5. make the functions variadic (which defeats the purpose of checking)
    6. pass command line flags to elide the warning
I did try 3 and 4 and I was not pleased with the result (sys_ptrace_common.c)
(3) added too much code and defines, and (4) made the regular use clumsy.
2019-10-16 15:27:38 +00:00
kamil 29be9f8e91 Remove the short-circuit lwp_exit() path from sigswitch()
sigswitch() can be called from exit1() through:

   ttywait()->ttysleep()->cv_timedwait_sig()->sleepq_block()->issignal()->sigswitch()

lwp_exit() called for the last LWP triggers exit1() and this causes a panic.

The debugger related signals have short-circuit demise paths in
eventswitch() and other functions, before calling sigswitch().

This change restores the original behavior, but there is an open question
whether the kernel crash is a red herring of misbehavior of ttywait().

This should fix PR kern/54618 by David H. Gutteridge
2019-10-15 13:59:57 +00:00
kamil 305335a1e9 Avoid double lwp_exit() in eventswitch()
For the PTRACE_LWP_EXIT event, the eventswitch() call is triggered from
lwp_exit(). In the case of setting the program status to PS_WEXIT, do not
try to demise in place, by calling lwp_exit() as it causes panic.

In this scenario bail out from the function and resume the lwp_exit()
procedure.
2019-10-13 03:50:26 +00:00
kamil 130e572a10 Fix one of the root causes of unreliability of the ptrace(2)ed threads
In case of sigswitching away in issignal() and continuing the execution on
PT_CONTINUE (or equivalent call), there is a time window when another
thread could cause the process state to be changed to PS_STOPPING.

In the current logic, a thread would receive signal 0 (no-signal) and exit
from issignal(), returning to userland and never finishing the process of
stopping all LWPs. This causes hangs in waitpid() waiting for SIGCHLD and
the callout polling for the state of the process in an infinite loop.

Instead of prompting for a returned signal from a debugger, repeat the
issignal() loop, this will cause checking the PS_STOPPING flag again and
sigswitching away in the scenario of stopping the process.
2019-10-13 03:19:57 +00:00
kamil 0998dd273e Add sigswitch_unlock_and_switch_away(), extracted from sigswitch()
Use sigswitch_unlock_and_switch_away() whenever there is no need for
sigswitch().
2019-10-13 03:10:22 +00:00
kamil 1249b6bf7e Refactor sigswitch()
Make the function static as it is now local to kern_sig.c.

Rename the 'relock' argument to 'proc_lock_held' as it is more verbose.
This was suggested by mjg@freebsd. While there this flips the users between
true<->false.

Add additional KASSERT(9) calls here to validate whether proc_lock is used
accordingly.
2019-10-12 19:57:09 +00:00
kamil b3bca7a74f Remove p_oppid from struct proc
This field is not needed as it duplicates p_opptr, which is already safe to
use unless proven otherwise.

eventswitch() already contained a check for != initproc (pid1).

Ride ABI bump for 9.99.16.
2019-10-12 10:55:23 +00:00
kamil f3a317a980 Enhance reliability of ptrace(2) in a debuggee with multiple LWPs
Stop competing between threads which one emits event signal quicker and
overwriting the signal from another thread.

This fixes missed in action signals.

NetBSD truss can now report reliably all TRAP_SCE/SCX/etc events without
reports of missed ones.

This was one of the reasons why debuggee with multiple threads misbehaved
under a debugger.


This change is v.2 of the previously reverted commit for the same fix.

This version contains a recovery path that stops triggering event SIGTRAP
for a detached debugger.
2019-10-08 18:02:46 +00:00
kamil a35a4fe3b8 Separate flag for suspended by _lwp_suspend and suspended by a debugger
Once a thread has been stopped with ptrace(2), the userland process must not
be able to unstop it deliberately or by accident.

This was a Windows-style behavior that makes thread tracing fragile.
2019-10-03 22:48:44 +00:00
kamil 5e4bbc4985 Move TRAP_CHLD/TRAP_LWP ptrace information from struct proc to siginfo
Storing struct ptrace_state information inside struct proc was vulnerable
to synchronization bugs, as multiple events emitted at the same time were
overwriting each other.

Cache the original parent process id in p_oppid. Reusing p_opptr here is
in theory prone to a slight race condition.

Change the semantics of PT_GET_PROCESS_STATE, returning EINVAL for calls
prompting for the value in cases when no appropriate event was
registered.

Add an alternative approach to check the ptrace_state information, directly
from the siginfo_t value returned from PT_GET_SIGINFO. The original
PT_GET_PROCESS_STATE approach is kept for compat with older NetBSD and
OpenBSD. New code is recommended to keep using PT_GET_PROCESS_STATE.

Add a couple of compile-time asserts for assumptions in the code.

No functional change intended in existing ptrace(2) software.

All ATF ptrace(2) and ATF GDB tests pass.

This change improves reliability of the threading ptrace(2) code.
2019-09-30 21:13:33 +00:00
kamil bd08c835ff Revert previous
There is fallout in gdb that will be investigated before relanding this.
2019-06-21 04:28:12 +00:00
kamil 14d51c2ac0 Enhance reliability of ptrace(2) in a debuggee with multiple LWPs
Stop competing between threads which one emits event signal quicker and
overwriting the signal from another thread.

This fixes missed in action signals.

NetBSD truss can now report reliably all TRAP_SCE/SCX/etc events without
reports of missed ones.

This was one of the reasons why debuggee with multiple threads misbehaved
under a debugger.
2019-06-21 04:02:57 +00:00
kamil 21a72dea25 Eliminate PS_NOTIFYSTOP remnants from the kernel
This flag used to be useful in /proc (BSD4.4-style) debugging semantics.
Traced child events were notified without signaling the parent.

This property was removed in NetBSD-8.0 and had no users.

This change simplifies the signal code, removing dead branches.

NFCI
2019-06-21 01:03:51 +00:00
kamil f32d6f14d4 Add support for KTR logs of SIGTRAP for TRAP_CHILD events
Previously it was disabled due to vfork(2) synchronization issues.
These problems are now gone.

While there, set l_vforkwaiting to false in posix_spawn. This is not
strictly needed, but it does no harm to keep it initialized explicitly.
2019-06-18 23:53:55 +00:00
kamil 293d38fbef Correct inverted condition for dying process in sigswitch()
If a process is exiting and it was not asked to relock proc_lock, do not
free the mutex as it causes panic. This bug is a timing bug as the faulty
condition is not deterministic and fires only sometimes, but is quickly
triggerable when executed in an infinite loop.

Detected and reported with LLDB test-suite by <mgorny>
2019-06-13 00:07:19 +00:00
kamil bcb2d04797 Stop trying to inform debugger about events from an exiting child
Do not emit signals to parent if a process is demising:

 - fork/vfork/similar
 - lwp created/exited
 - exec
 - syscall entry/exit

With these changes Go applications can be traced without a clash under
a debugger, at least without deadlocking always. The culprit reason was
an attempt to inform a debugger in the middle of exit1() call about
a dying LWP. Go applications perform exit(2) without collecting threads
first. Verified with GDB and picotrace-based utilities like sigtracer.

PR kern/53120
PR port-arm/51677
PR bin/54060
PR bin/49662
PR kern/52548
2019-06-04 11:54:03 +00:00
kamil 7dee562254 Ship syscall information with SIGTRAP TRAP_SCE/TRAP_SCX for tracers
Expand siginfo_t (struct size not changed) to new values for
SIGTRAP TRAP_SCE/TRAP_SCX events.

 - si_sysnum  -- syscall number (int)
 - si_retval  -- return value (2 x int)
 - si_error   -- error code (int)
 - si_args    -- syscall arguments (8 x uint64_t)

TRAP_SCE delivers si_sysnum and si_args.

TRAP_SCX delivers si_sysnum, si_retval, si_error and si_args.

Users: debuggers (like GDB) and syscall tracers (like strace, truss).

This MI interface is similar to the Linux kernel proposal of
PTRACE_GET_SYSCALL_INFO by the strace developer team.
2019-05-06 08:05:03 +00:00
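The field list above can be pictured with the following sketch. The field types follow the commit message, but the struct itself, its layout, and the `ex_` names are hypothetical; the real members live inside siginfo_t and are not reproduced here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; not the real siginfo_t layout. */
enum ex_event {
	EX_TRAP_SCE,	/* syscall entry */
	EX_TRAP_SCX	/* syscall exit */
};

struct ex_syscall_info {
	int sysnum;		/* syscall number */
	int retval[2];		/* return value (2 x int) */
	int error;		/* error code */
	uint64_t args[8];	/* syscall arguments */
};

/*
 * Fill the record for one event. Entry events carry the syscall number
 * and arguments; exit events additionally carry retval and error.
 * Unused fields are left zeroed.
 */
static void
ex_fill(struct ex_syscall_info *si, enum ex_event ev, int sysnum,
    const uint64_t args[8], const int retval[2], int error)
{
	memset(si, 0, sizeof(*si));
	si->sysnum = sysnum;
	memcpy(si->args, args, sizeof(si->args));
	if (ev == EX_TRAP_SCX) {
		si->retval[0] = retval[0];
		si->retval[1] = retval[1];
		si->error = error;
	}
}
```

A tracer reading such a record on an entry event would see zeroed `retval` and `error`, and the populated values only on the matching exit event.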
kamil efd4138069 Register KTR events for debugger related signals
Register signals for:

 - crashes (FPE, SEGV, FPE, ILL, BUS)
 - LWP events
 - CHLD (FORK/VFORK/VFORK_DONE) events -- temporarily disabled
 - EXEC events

While there refactor related functions in order to simplify the code.

Add missing comment documentation for recently added kernel functions.
2019-05-03 22:34:21 +00:00
kamil ac37cdce0c Introduce fixes for ptrace(2)
Stop disabling LWP create and exit events for PT_SYSCALL tracing.
PT_SYSCALL disabled EXEC reporting for legacy reasons, there is no need
to repeat it for LWP and CHLD events.

Pass full siginfo from trapsignal events (SEGV, BUS, ILL, TRAP, FPE).
This adds missing information about signals like fault address.

Set ps_lwp always.

Before passing siginfo to userland through p_sigctx.ps_info, make sure
that it was zeroed for unused bytes. LWP and CHLD events do not set si_addr
and si_trap, these pieces of information are passed for crashes (like
software breakpoint).

LLDB crash reporting works now correctly:

(lldb) r
Process 552 launched: '/tmp/a.out' (x86_64)
Process 552 stopped
* thread #1, stop reason = signal SIGSEGV: invalid address (fault address: 0x123456)
2019-05-02 22:23:49 +00:00
kamil 6386bed56f Assert that debugger event is triggered only for userland LWP
All passing ATF ptrace(2) tests still pass.
2019-05-01 21:52:35 +00:00
kamil f93c604b79 Correct handling of corner cases in fork1(9) code under a debugger
Correct detaching and SIGKILLing forker and vforker in the middle of its
operation.
2019-05-01 18:01:54 +00:00
kamil d55073af77 Add eventswitch() in signal code
Route all crash and debugger related signals through eventswitch(), that
calls sigswitch() with preprocessed arguments.

This code avoids code duplication and allows introducing changes that
will affect all callers of sigswitch() in debugger-related events.

No functional change intended.
2019-05-01 17:21:55 +00:00
kamil 1d035fe56b Remove support for early SIGTRAP (fork related) signals in kpsignal2()
This function is no longer used to handle early SIGTRAP signals for
fork-related events for ptrace(2).
2019-04-03 08:34:33 +00:00
kamil 0d862b3599 Stop resetting signal context on a trap signal under a debugger
In case of a crash signal, notify debugger immediately passing the signal
regardless of signal masking/ignoring.

While there pass signals emitted by a debugger to debuggee. Debugger calls
proc_unstop() that sets p_stat to SACTIVE and this signal wasn't passed
to tracee.

This scenario appeared to be triggered in recently added crash signal ATF
ptrace(2) tests.
2019-03-08 23:32:30 +00:00
maxv 590d64ce40 Fix kernel info leak, 4 bytes of padding at the end of struct sigaction.
+ Possible info leak: [len=32, leaked=4]
	| #0 0xffffffff80baf327 in kleak_copyout
	| #1 0xffffffff80bd9ca8 in sys___sigaction_sigtramp
	| #2 0xffffffff80259c42 in syscall
2018-11-29 10:27:36 +00:00