This was a very nice win in my tests on a 48 CPU box.
- Reorganise cpu_data slightly according to usage.
- Put cpu_onproc into struct cpu_info alongside ci_curlwp (now is ci_onproc).
- On x86, put some items in their own cache lines according to usage, like
the IPI bitmask and ci_want_resched.
- sched_tick: cpu_need_resched is no longer the correct thing to do here.
All we need to do is OR the request into the local ci_want_resched.
- sched_resched_cpu: we need to set RESCHED_UPREEMPT even on softint LWPs,
especially in the !__HAVE_FAST_SOFTINTS case, because the LWP with the
LP_INTR flag could be running via softint_overlay() - i.e. it has been
temporarily borrowed from a user process, and it needs to notice the
resched after it has stopped running softints.
These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.
HOWEVER! Some subsystems have
#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))
even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.
To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.
I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:
cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))
It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.
Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
- Change minimal time-quantum to ~20 ms.
- Thus remove unneeded pool in M2, and unused sched_lwp_exit().
- Do not increase l_slptime twice for SCHED_4BSD (regression fix).
mandatory. Remove the 4BSD run queue code. Effects:
- Pluggable scheduler is only responsible for co-ordinating timeshared jobs.
- All systems run with per-CPU run queues.
- 4BSD scheduler gets processor sets / affinity.
- 4BSD scheduler gets a significant peformance boost on some workloads.
Discussed on tech-kern@.
never sleep. Should fix PR/37245 by <yamt>.
- Fix a regression - dissalow catching of bound threads. Also, allow
migration of non-bound kthreads, this restriction seems pointless.
- Few micro-optimisations, misc.
Fixes the problems with systems having > 2GB of memory;
From <drochner>, thanks for catching this!
- Convert pool to pool-cache;
- Adjust copyright while here;
- Include PRI_HIGHEST_TS value into the increasion range,
it was missed previously by mistake;
- More KASSERTs to handle invalid priorities of threads;
- Remove PRI_DEFAULT;
- Misc;
Add schedctl(8) - a program to control scheduling of processes and threads.
Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;
Proposed on: <tech-kern>. Reviewed by: <ad>.
- Re-enqueue the thread when priority changes and it is in LSRUN state;
- Handle the __HAVE_FAST_SOFTINTS case in sched_curcpu_runnable_p();
- Few minor changes;
tech-kern:
- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.
of CPU is non-power of 2. Fixes PR/37244.
- sched_enqueue: initialize sl_lrtime, when it is zero (new thread).
Part of PR/37245.
- Fix the mints/maxts sysctl helpers, use mstohz() for the checks. Also,
I meant miliseconds, not microseconds. Found by <bjs>.
- Fix inverted logic with r_mcount in M2;
- setrunnable: perform sched_takecpu() when making the LWP runnable;
- setrunnable: l_mutex cannot be spc_mutex here;
This makes cpuctl(8) work with SCHED_M2.
OK by <ad>.
on the original approach of SVR4 with some inspirations about balancing
and migration from Solaris. It implements per-CPU runqueues, provides a
real-time (RT) and time-sharing (TS) queues, ready to support a POSIX
real-time extensions, and also prepared for the support of CPU affinity.
The following lines in the kernel config enables the SCHED_M2:
no options SCHED_4BSD
options SCHED_M2
The scheduler seems to be stable. Further work will come soon.
http://mail-index.netbsd.org/tech-kern/2007/10/04/0001.htmlhttp://www.netbsd.org/~rmind/m2/mysql_bench_ro_4x_local.png
Thanks <ad> for the benchmarks!