Chops another ~10% off create/join in a loop on i386.
- Disable low-level debugging now that the code is stable. Improves
benchmarks across the board by a small percentage. Uncontended mutex
acquire and release in a loop becomes about 8% quicker.
- Minor cleanup.
hint pointer, but do so in a way that remains compatible with older
pthread libraries. This can be used to wake another thread before the
calling thread goes to sleep, saving at least one syscall + involuntary
context switch. This turns out to be a fairly large win on the condvar
benchmarks that I have tried.
detach/join.
- Make mutex acquire spin for a short time, as done with spinlocks.
- Make the number of spins controllable with the env var PTHREAD_NSPINS.
- Reduce the amount of time that libpthread internal spinlocks are held.
- Rely more on the barrier effects of park/unpark to avoid taking spinlocks.
- Simplify the locking around pthreads and the global queues.
- Align per-thread sync data on a 128 byte boundary.
- Offset thread stacks by a small amount to try and reduce cache thrash.
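
The spin-before-block idea and the PTHREAD_NSPINS knob above can be sketched
roughly as follows. This is only an illustration under stated assumptions, not
the libpthread code: spin_mutex_t, nspins_from_env(), mutex_spin_acquire(),
mutex_release(), DEFAULT_NSPINS and the upper bound on the parsed value are
all hypothetical names and values chosen for the example. A real
implementation would park the thread when the spin budget runs out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define DEFAULT_NSPINS 100	/* hypothetical default, not the real one */

/* Hypothetical lock type used only for this sketch. */
typedef struct {
	atomic_flag locked;
} spin_mutex_t;

/* Read the spin count from PTHREAD_NSPINS; fall back on bad input. */
static int
nspins_from_env(void)
{
	const char *s = getenv("PTHREAD_NSPINS");
	char *end;
	long v;

	if (s == NULL || *s == '\0')
		return DEFAULT_NSPINS;
	v = strtol(s, &end, 10);
	if (*end != '\0' || v < 0 || v > 1000000)
		return DEFAULT_NSPINS;
	return (int)v;
}

/*
 * Make one attempt plus up to nspins retries. Returns true if the lock
 * was taken. On false the caller would go to sleep (not shown here).
 */
static bool
mutex_spin_acquire(spin_mutex_t *m, int nspins)
{
	assert(m != NULL);
	for (int i = 0; i <= nspins; i++) {
		if (!atomic_flag_test_and_set_explicit(&m->locked,
		    memory_order_acquire))
			return true;
	}
	return false;
}

static void
mutex_release(spin_mutex_t *m)
{
	assert(m != NULL);
	atomic_flag_clear_explicit(&m->locked, memory_order_release);
}
```

A caller would typically read the spin count once at startup with
nspins_from_env() and pass it to every mutex_spin_acquire() call.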
After resuming execution, the thread must check to see if it
has been restarted as a result of pthread_cond_signal(). If it
has, but cannot take the wakeup (because of, e.g., a pending Unix
signal or timeout) then try to ensure that another thread sees
it. This is necessary because there may be multiple waiters,
and at least one should take the wakeup if possible.
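
The check described above might look something like the following sketch. It
is a single-threaded model of the decision only, with no locking or parking,
and every name in it (cv_waiter_t, first_unsignalled(), resolve_wakeup()) is
hypothetical and not taken from libpthread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical waiter record for this sketch. */
typedef struct cv_waiter {
	struct cv_waiter *next;
	bool signalled;		/* a pthread_cond_signal() wakeup is pending */
} cv_waiter_t;

/* Return the first waiter with no pending wakeup, or NULL if none. */
static cv_waiter_t *
first_unsignalled(cv_waiter_t *head)
{
	for (cv_waiter_t *w = head; w != NULL; w = w->next) {
		if (!w->signalled)
			return w;
	}
	return NULL;
}

/*
 * Called by a thread after it resumes. Returns true if the thread
 * consumes the wakeup itself. If it was signalled but cannot take the
 * wakeup (can_take is false, e.g. a timeout or Unix signal is pending),
 * the wakeup is passed to another waiter when one is available.
 */
static bool
resolve_wakeup(cv_waiter_t *self, cv_waiter_t *others, bool can_take)
{
	cv_waiter_t *w;

	if (!self->signalled)
		return false;	/* resumed for some other reason */
	self->signalled = false;
	if (can_take)
		return true;
	w = first_unsignalled(others);
	if (w != NULL)
		w->signalled = true;	/* pass the wakeup along */
	return false;
}
```

Skipping waiters that already have a wakeup pending keeps a passed-on wakeup
from being lost on a thread that is about to run anyway.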
so avoid making them.
- When parking an LWP on a condition variable, point the hint argument at
the mutex's waiters queue. Chances are we will be awoken from that later.
the mutex that the waiters are using to synchronise, then transfer them
to the mutex's waiters list so that the wakeup is deferred until release
of the mutex. Improves the timings for CV sleep/wakeup by 30-100%
in tests conducted locally on a UP system. There can be a penalty for MP
systems when only one thread is being awoken, but in practice I think it
won't be an issue.
- pthread_cond_signal: search for a thread that does not have a pending wakeup.
Threads can have a pending wakeup and still be on the waiters list if we
clash with an earlier pthread_cond_broadcast().
be used only as a hint. Clear the pointer when releasing the mutex.
- When releasing a mutex, wake all waiters. Makes it possible to transfer
waiters from another object to a mutex.
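
As a rough illustration of moving waiters from one object's queue to a
mutex's queue and then waking them all on release, consider the sketch below.
The queue layout and the names waiter_t, waitq_t, waitq_transfer() and
waitq_wake_all() are assumptions made for the example; setting a flag stands
in for actually unparking a thread, and no locking is shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical waiter and queue types for this sketch. */
typedef struct waiter {
	struct waiter *next;
	int woken;		/* stand-in for "thread has been unparked" */
} waiter_t;

typedef struct {
	waiter_t *head;
	waiter_t **tail;	/* points at the last next pointer */
} waitq_t;

static void
waitq_init(waitq_t *q)
{
	q->head = NULL;
	q->tail = &q->head;
}

static void
waitq_push(waitq_t *q, waiter_t *w)
{
	assert(w != NULL);
	w->next = NULL;
	*q->tail = w;
	q->tail = &w->next;
}

/* Append every waiter on src to dst; returns how many were moved. */
static size_t
waitq_transfer(waitq_t *dst, waitq_t *src)
{
	size_t n = 0;

	for (waiter_t *w = src->head; w != NULL; w = w->next)
		n++;
	if (src->head != NULL) {
		*dst->tail = src->head;
		dst->tail = src->tail;
		waitq_init(src);
	}
	return n;
}

/* On release: empty the queue and wake every waiter that was on it. */
static size_t
waitq_wake_all(waitq_t *q)
{
	size_t n = 0;
	waiter_t *w = q->head;

	waitq_init(q);
	while (w != NULL) {
		waiter_t *next = w->next;
		w->next = NULL;
		w->woken = 1;
		n++;
		w = next;
	}
	return n;
}
```

Because release wakes everything on the queue, waiters moved over from a
condition variable need no separate wakeup path.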