use both types of list.
- Make page coloring and idle zero state per-CPU.
- Maintain per-CPU page freelists. When freeing, put pages onto the local
CPU's lists and the global lists. When allocating, prefer to take pages
from the local CPU. If none are available, take from the global list as
done now. Proposed on tech-kern@.
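A minimal sketch of that allocation policy, under assumed names
(pcpu_freelist, global_freelist, struct page, MAXCPUS); locking and page
coloring are left out, and this is not the real uvm code:

/* Illustration only: per-CPU freelist preference with global fallback. */
#include <stddef.h>

#define MAXCPUS 8                       /* example value */

struct page {
    struct page *next;
};

struct freelist {
    struct page *head;
};

static struct freelist global_freelist;
static struct freelist pcpu_freelist[MAXCPUS];

/* Freeing: return the page to the local CPU's list. */
static void
page_free(struct page *pg, unsigned curcpu)
{
    pg->next = pcpu_freelist[curcpu].head;
    pcpu_freelist[curcpu].head = pg;
}

/* Allocating: prefer the local CPU's list, otherwise fall back to the
 * global list as is done today. */
static struct page *
page_alloc(unsigned curcpu)
{
    struct page *pg;

    if ((pg = pcpu_freelist[curcpu].head) != NULL) {
        pcpu_freelist[curcpu].head = pg->next;
        return pg;
    }
    if ((pg = global_freelist.head) != NULL)
        global_freelist.head = pg->next;
    return pg;
}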
mi_switch(): migration for LSONPROC is now performed via the idle loop.
Handles/fixes the on-CPU case in lwp_migrate(), plus misc.
Closes PR/38169; the idea of migration via the idle loop is by Andrew Doran.
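A rough sketch of the idea, using invented names (l_target_cpu, runq_insert,
migrate_pending): a currently-running (LSONPROC) LWP is only marked for
migration, and the owning CPU completes the move from its idle loop /
mi_switch() path once the LWP has left the CPU. Illustration only, not the
code referenced above:

#include <stddef.h>

#define LSONPROC 7                      /* placeholder value for the sketch */

struct cpu;

struct lwp {
    int         l_stat;                 /* LSONPROC, LSRUN, ... */
    struct cpu *l_cpu;                  /* CPU the LWP runs on */
    struct cpu *l_target_cpu;           /* requested migration target */
};

static void runq_insert(struct cpu *, struct lwp *);

/* Request a migration; a running LWP is only marked, not moved. */
static void
lwp_migrate_sketch(struct lwp *l, struct cpu *target)
{
    if (l->l_stat == LSONPROC) {
        l->l_target_cpu = target;       /* owning CPU acts on it later */
        return;
    }
    runq_insert(target, l);             /* not running: move it now */
}

/* Run by the owning CPU once the LWP is off-CPU: finish the migration. */
static void
migrate_pending(struct lwp *l)
{
    struct cpu *target = l->l_target_cpu;

    if (target != NULL) {
        l->l_target_cpu = NULL;
        runq_insert(target, l);
    }
}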
- Fix the performance regression introduced by the workaround by making job
stealing a lot simpler: if the local run queue is empty, let the CPU enter
the idle loop. In the idle loop, try to steal a job from another CPU's run
queue. If we succeed, re-enter mi_switch() immediately to dispatch the job
(see the sketch after this list).
- When stealing jobs, consider a remote CPU to have one less job in its
queue if it's currently in the idle loop. It will dispatch the job soon,
so there's no point sloshing it about.
- Introduce a few event counters to monitor what's happening with the run
queues.
- Revert the idle CPU bitmap change. It's pointless considering NUMA.
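A minimal sketch of the stealing policy above, using made-up structures and
helpers (struct cpu, runq_steal, dispatch, ncpu); an idle victim is charged
one fewer job, as noted:

/* Illustration only; nothing here is the actual scheduler code. */
#include <stddef.h>

struct lwp;

struct cpu {
    unsigned rq_count;                  /* jobs on this CPU's run queue */
    int      idling;                    /* nonzero while in the idle loop */
};

extern struct cpu cpus[];
extern unsigned   ncpu;

static struct lwp *runq_steal(struct cpu *);
static void        dispatch(struct lwp *);   /* re-enters mi_switch() */

/* Called from the idle loop when the local run queue is empty. */
static void
idle_steal(struct cpu *self)
{
    struct cpu *victim = NULL;
    unsigned i, n, best = 0;

    for (i = 0; i < ncpu; i++) {
        struct cpu *ci = &cpus[i];

        if (ci == self)
            continue;
        n = ci->rq_count;
        if (ci->idling && n > 0)
            n--;                        /* it will dispatch one job itself */
        if (n > best) {
            best = n;
            victim = ci;
        }
    }
    if (victim != NULL) {
        struct lwp *l = runq_steal(victim);

        if (l != NULL)
            dispatch(l);                /* re-enter mi_switch() at once */
    }
}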
tech-kern:
- Invert priority space so that zero is the lowest priority. Rearrange the
number and type of priority levels into bands. Add new bands like
'kernel real time' (see the sketch after this list).
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.
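An illustrative layout of the inverted priority space; the band boundaries
and constant names below are invented example values, not the real ones.
Zero is the lowest priority and larger numbers are more urgent:

/* Example only: band boundaries and names here are made up. */

#define PRI_LOWEST      0
#define PRI_COUNT       224             /* example size of the space */

/* Bands, from least to most urgent (example boundaries). */
#define PRI_USER_TS     0               /* time-sharing user threads */
#define PRI_USER_RT     64              /* user real-time            */
#define PRI_KERNEL      128             /* kernel / sleep priorities */
#define PRI_KERNEL_RT   192             /* "kernel real time" band   */

/* With this convention, the more urgent LWP simply has the larger value. */
static inline int
pri_more_urgent(int a, int b)
{
    return a > b;
}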
from doc/BRANCHES:
idle lwp, and some changes depending on it.
1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.