any threads are created turned out to be not such a good idea.
there are stronger requirements on what has to work in a forked child
while a process is still single-threaded. so take all that stuff
back out and fix the problems with single-threaded programs that
are linked with libpthread differently, by checking if the library
has been started and doing completely different stuff if it hasn't been:
- for pthread_rwlock_timedrdlock(), just fail with EDEADLK immediately.
- for sem_wait(), the only thing that can unlock the semaphore is a
signal handler, so use sigsuspend() to wait for a signal.
- for pthread_mutex_lock_slow(), just go into an infinite loop
waiting for signals.
I also noticed that there's a "sem2" test that has never worked in its
single-threaded form. the problem there is that a signal handler tries
to take a sem_t interlock which is already held when the signal is received.
fix this too, by adding a single-threaded case for sig_trywait() that
blocks signals instead of using the userland interlock.
call pthread__start() if it hasn't already been called. this avoids
an internal assertion from the library if these routines are used
before any threads are created and they need to sleep.
fixes PR 20256, PR 24241, PR 25722, PR 26096.
- enable concurrency according to environment variable PTHREAD_CONCURRENCY
- add idle VP wakeup if there are additional jobs and idle VPs
- make reidlequeue per VP
- enable spinning for locks
- fix race condition in alarm processing
- fix race condition in mutex locking
- make debugging output line buffered and add VP prefix to debug lines
officially have undefined behavior, from returning an error code to raising
an assertion failure.
Also, don't bother to explicitly test for (illegal) null pointers and return
an error; they'll bomb out soon enough.
front of the sleep queue rather than the back. This is more
cache-friendly behavior and within the (lack of) constraints on wakeup
ordering imposed on equal-priority threads.
* Use a double-checked locking technique to avoid taking
the interlock in pthread_mutex_unlock().
* In pthread_mutex_lock() and pthread_mutex_trylock(), only store the
stack pointer, not the thread ID, in ptm_owner. Do the translation
to a thread ID in the slow-lock, errorcheck, and recursive mutex
cases rather than in the common path.
* Juggle where pthread__self() is called, to move it out of the fast path.
Overall, this means that neither pthread_self() nor
pthread_spin[un]lock() are used in the course of locking and unlocking
an uncontested mutex. Speeds up the fast path by 40-50%, and
eliminates about 98% of spinlocks used by a couple of large threaded
applications.
(Still a GET_MUTEX_PRIVATE() in the fast path... perhaps the type
should be in the main body of the mutex).