- set the concurrency level from the environment variable PTHREAD_CONCURRENCY
- add idle VP wakeup if there are additional jobs and idle VPs
- make reidlequeue per VP
- enable spinning for locks
- fix race condition in alarm processing
- fix race condition in mutex locking
- make debugging output line-buffered and add a VP prefix to debug lines
- add PTHREAD_PID_DEBUG which prints the pid before each debuglog line
- output thread returned in pthread__next
- add asserts in pthread__sched akin to the asserts in pthread__sched_bulk:
  check that the scheduled thread is at the front/end of the queue
- pthread__upcall: output event/interrupted LWP count instead of LWPid
  of the first event/interrupted LWP (since unblock upcalls can have
  multiple event LWPs).
- pthread__find_interrupted: output LWPid here
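
The PTHREAD_CONCURRENCY item above could be read roughly as follows. This
is only a sketch: the helper name concurrency_from_env and the fallback
behaviour for unset or invalid values are assumptions, not taken from the
library source.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return the concurrency level requested through
 * PTHREAD_CONCURRENCY, or `fallback` if the variable is unset, empty,
 * or not a positive integer that fits in an int.
 */
static int
concurrency_from_env(int fallback)
{
	const char *s = getenv("PTHREAD_CONCURRENCY");
	char *end;
	long v;

	if (s == NULL || *s == '\0')
		return fallback;

	errno = 0;
	v = strtol(s, &end, 10);
	/* reject overflow, trailing garbage, and non-positive values */
	if (errno != 0 || *end != '\0' || v < 1 || v > INT_MAX)
		return fallback;

	return (int)v;
}
```

Falling back silently keeps a bad environment setting from breaking
startup; a real implementation might prefer to log the rejected value.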
It is still possible for multiple threads to write into the same space, but
they shouldn't be able to corrupt the write pointer in the process.
Also, check for pointer-lapping a bit more carefully in the wrap
vs. non-wrap case.
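
One way to keep a shared write pointer consistent under concurrent
writers is a compare-and-swap loop, sketched below. This is an
illustration under assumptions, not the committed code: it uses C11
atomics, and LOG_SIZE, log_buf, log_wr, log_reserve and log_write are
hypothetical names. As in the note above, two writers can still end up
copying into overlapping space after a wrap, but the pointer itself
always holds a valid offset.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define LOG_SIZE 64			/* hypothetical buffer size */

static char log_buf[LOG_SIZE];
static _Atomic size_t log_wr;		/* next write offset, always < LOG_SIZE */

/* Reserve len bytes (len <= LOG_SIZE) and return the start offset. */
static size_t
log_reserve(size_t len)
{
	size_t old, start, next;

	old = atomic_load(&log_wr);
	do {
		/* wrap case: the record does not fit before the end */
		start = (old + len > LOG_SIZE) ? 0 : old;
		/* non-wrap case that ends exactly at the end resets to 0 */
		next = (start + len == LOG_SIZE) ? 0 : start + len;
		/* on failure, old is reloaded and both cases are rechecked */
	} while (!atomic_compare_exchange_weak(&log_wr, &old, next));

	return start;
}

/* Copy a record into the space reserved for it, truncating if needed. */
static void
log_write(const char *msg, size_t len)
{
	if (len > LOG_SIZE)
		len = LOG_SIZE;
	memcpy(&log_buf[log_reserve(len)], msg, len);
}
```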