- Back out the previous workaround now that the sleep queue code has
been changed to never let the queue become empty if there are valid
waiters.
- Use sleepq_hashlock() to improve clarity.
- Sprinkle some assertions.
PR kern/38761.
The condvar code must access the sleepq with the sleepq lock held; not
doing so is causing inconsistent sleepq state to be read.
This is because some accesses to the sleepq don't come via the cv code,
but are calls directly into sleepq_changepri and sleepq_lendpri, which take
the sleepq lock and remove then re-insert lwps into the sleepq.
Running a build.sh with -j8 now completes on my quad-core; also tested by
Simon@ on an 8-core server and matt@ on a quad-core.
I believe there is room to be more efficient with this, as we now take the
sleepq lock for all cv_broadcast and cv_signal calls. I'll look into this
and post a diff to tech-kern.
to scale more gracefully when there are thousands of active threads.
Proposed on tech-kern@.
- Use LOCKDEBUG to catch some errors in the use of condition variables:
  freeing an active CV
  re-initializing an active CV
  using multiple distinct mutexes during concurrent waits
  not holding the interlocking mutex when calling cv_broadcast/cv_signal
  waking waiters and destroying the CV before they run and exit it
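The checks listed above can be illustrated with a small userland sketch.
This is not the kernel's LOCKDEBUG interface: every name below (dbg_cv,
cv_check, dbg_cv_*) is hypothetical, and the bookkeeping is reduced to a
waiter count plus the address of the interlock that the current waiters
share.

```c
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; this is not the kernel's LOCKDEBUG API. */
enum cv_check {
	CV_OK = 0,
	CV_ERR_ACTIVE,		/* freed or re-initialized with waiters present */
	CV_ERR_MUTEX_MISMATCH,	/* concurrent waits used distinct mutexes */
	CV_ERR_NOT_HELD		/* wakeup issued without the interlock held */
};

struct dbg_cv {
	const void *interlock;	/* mutex shared by the current waiters */
	unsigned nwaiters;	/* waiters that have not yet exited the CV */
	bool initialized;
};

/* The structure must start out zeroed so the re-init check is meaningful. */
enum cv_check
dbg_cv_init(struct dbg_cv *cv)
{
	if (cv->initialized && cv->nwaiters > 0)
		return CV_ERR_ACTIVE;
	cv->interlock = NULL;
	cv->nwaiters = 0;
	cv->initialized = true;
	return CV_OK;
}

enum cv_check
dbg_cv_destroy(struct dbg_cv *cv)
{
	/* Woken waiters still count until they call dbg_cv_wait_exit(). */
	if (cv->nwaiters > 0)
		return CV_ERR_ACTIVE;
	cv->initialized = false;
	return CV_OK;
}

enum cv_check
dbg_cv_wait_enter(struct dbg_cv *cv, const void *mtx)
{
	if (cv->nwaiters > 0 && cv->interlock != mtx)
		return CV_ERR_MUTEX_MISMATCH;
	cv->interlock = mtx;
	cv->nwaiters++;
	return CV_OK;
}

void
dbg_cv_wait_exit(struct dbg_cv *cv)
{
	if (cv->nwaiters > 0 && --cv->nwaiters == 0)
		cv->interlock = NULL;
}

/* 'held' is the mutex the caller holds, or NULL if it holds none. */
enum cv_check
dbg_cv_wakeup(const struct dbg_cv *cv, const void *held)
{
	if (cv->nwaiters > 0 && cv->interlock != held)
		return CV_ERR_NOT_HELD;
	return CV_OK;
}
```

In this sketch a waiter is counted from dbg_cv_wait_enter() until
dbg_cv_wait_exit(), so destroying the CV after a wakeup but before the
waiter has run is reported as CV_ERR_ACTIVE.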
existing behaviour: the unsleep method unlocks and wakes the swapper if
need be. If false, the caller is doing a batch operation and will take
care of that later. This is kind of ugly, but it's difficult for the caller
to know which lock to release in some situations.
tech-kern:
- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.
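As a rough illustration of the inverted priority space and the banding
described above, the sketch below maps an old-style priority (where a
smaller number means more urgent) onto a space where zero is the lowest
priority, then classifies the result into a band. The constant
EX_PRI_COUNT, the band names other than 'kernel real time', and the band
boundaries are all assumptions made for the example, not values taken
from this change.

```c
/* Hypothetical size of the priority space; an assumption for this sketch. */
enum { EX_PRI_COUNT = 224 };

/* Hypothetical bands, ordered from least to most urgent. */
enum ex_band {
	EX_BAND_USER,
	EX_BAND_KERNEL,
	EX_BAND_USER_RT,
	EX_BAND_KERNEL_RT,
	EX_BAND_COUNT
};

/* Lowest priority value in each band (illustrative boundaries only). */
static const int ex_band_base[EX_BAND_COUNT] = { 0, 64, 128, 192 };

/*
 * Convert an old-style priority, where 0 is the most urgent, to the
 * inverted space, where 0 is the lowest priority.
 */
int
ex_invert_pri(int old_pri)
{
	return (EX_PRI_COUNT - 1) - old_pri;
}

/* Return the band that contains a priority in the inverted space. */
enum ex_band
ex_band_of(int pri)
{
	int b = EX_BAND_COUNT - 1;

	/* Walk down from the top band until its base is at or below pri. */
	while (b > 0 && pri < ex_band_base[b])
		b--;
	return (enum ex_band)b;
}
```

With zero as the lowest priority, comparing two LWPs reduces to a plain
"greater means more urgent" test, and a band can be checked against a
single lower bound.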
from doc/BRANCHES:
idle lwp, and some changes depending on it.
1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.
- cv_wait and friends: after resuming execution, check to see if we have
been restarted as a result of cv_signal. If we have, but cannot take
the wakeup (because of, e.g., a pending Unix signal or timeout) then try to
ensure that another LWP sees it. This is necessary because there may
be multiple waiters, and at least one should take the wakeup if possible.
Prompted by a discussion with pooka@.
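The wakeup hand-off described above can be modelled in a few lines. This
is a simplified userland sketch, not the kernel code: struct ex_cv,
ex_cv_signal() and ex_cv_wait_finish() are hypothetical names, and the
sleep queue is reduced to a count of LWPs still asleep.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a condvar: only the number of sleepers is kept. */
struct ex_cv {
	unsigned nwaiters;	/* LWPs still asleep on the CV */
};

/* Wake one sleeper, if any. Returns true if a sleeper was woken. */
bool
ex_cv_signal(struct ex_cv *cv)
{
	if (cv->nwaiters == 0)
		return false;
	cv->nwaiters--;
	return true;
}

/*
 * Called by a waiter after it resumes execution.
 * woken_by_signal: the waiter was restarted by ex_cv_signal().
 * error: non-zero if the waiter cannot take the wakeup, for example
 * EINTR for a pending signal or EWOULDBLOCK for a timeout.
 */
int
ex_cv_wait_finish(struct ex_cv *cv, bool woken_by_signal, int error)
{
	/*
	 * The wakeup was aimed at this waiter, but it is going to return
	 * an error instead of consuming it, so pass it on to another
	 * sleeper rather than letting it be lost.
	 */
	if (woken_by_signal && error != 0)
		(void)ex_cv_signal(cv);
	return error;
}
```

If two LWPs wait and the one chosen by the signal also has a pending
Unix signal, the second LWP is woken in its place, so one wakeup is
still delivered.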
- typedef struct lwp lwp_t;
- int -> bool, struct lwp -> lwp_t in a few places.