Commit Graph

86 Commits

Author SHA1 Message Date
martin
d505b18964 Make sure to include opt_foo.h if a defflag option FOO is used. 2003-06-23 11:00:59 +00:00
drochner
0256604827 nuke unnecessary #include <sys/dkstat.h> 2003-06-12 14:44:36 +00:00
thorpej
1b84adbe5f New callout implementation. This is based on callwheel implementation
done by Artur Grabowski and Thomas Nordin for OpenBSD, which is more
efficient in several ways than the callwheel implementation that it is
replacing.  It has been adapted to our pre-existing callout API, and
also provides the slightly more efficient (and much more intuitive)
API (adapted to the callout_*() naming scheme) that the OpenBSD version
provides.

Among other things, this shaves a bunch of cycles off rescheduling-in-
the-future a callout which is already scheduled, which the common case
for TCP timers (notably REXMT and KEEP).

The API has been simplified a bit, as well.  The (very confusing to
a good many people) "ACTIVE" state for callouts has gone away.  There
is now only "PENDING" (scheduled to fire in the future) and "EXPIRED"
(has fired, and the function called).

Kernel version bump not done; we'll ride the 1.6N bump that happened
with the malloc(9) change.
2003-02-04 01:21:03 +00:00
pk
5e14aa69a8 There's a locking order issue with the scheduler and the callwheel locks
as ltsleep() may call callout_reset() with the scheduler lock held.
So, prevent interrupts that may take the scheduler lock while holding
the callwheel lock.
2003-01-27 22:38:24 +00:00
thorpej
e0d8d366df Merge the nathanw_sa branch. 2003-01-18 10:06:22 +00:00
perry
6858187df6 /*CONTCOND*/ while (0)'ed macros 2002-11-02 07:20:42 +00:00
briggs
0b956d0b8b Implement pmc(9) -- An interface to hardware performance monitoring
counters.  These counters do not exist on all CPUs, but where they
do exist, can be used for counting events such as dcache misses that
would otherwise be difficult or impossible to instrument by code
inspection or hardware simulation.

pmc(9) is meant to be a general interface.  Initially, the Intel XScale
counters are the only ones supported.
2002-08-07 05:14:47 +00:00
simonb
21d2b8b53d We don't need to include <uvm/uvm_extern.h> before <sys/sysctl.h> anymore. 2002-03-17 11:10:43 +00:00
lukem
adc783d537 add RCSIDs 2001-11-12 15:25:01 +00:00
enami
163c9dd7c1 Defopt CALLWHEEL_STATS. 2001-09-13 05:22:16 +00:00
thorpej
16c229ea7c Optimization suggested by Bill Sommerfeld: Keep a hint as to the
"earliest" firing callout in a bucket.  This allows us to skip
the scan up the bucket if no callouts are due in the bucket.

A cheap O(1) hint update is done at callout insertion (if new callout
is earlier than hint) and removal (is bucket empty).  A thorough
refresh of the hint is done when the bucket is traversed.

This doesn't matter much on machines with small values of hz
(e.g. i386), but on systems with large values of hz (e.g. Alpha),
it has a definite positive effect.

Also, keep the callwheel stats in evcnts, so that you can view them
with "vmstat -e".
2001-09-11 04:32:19 +00:00
simonb
cbbd901bdc Declare schedhz. 2001-05-06 13:46:34 +00:00
thorpej
2f89e3d744 Explicitly include <machine/intr.h> if __HAVE_GENERIC_SOFT_INTERRUPTS. 2001-01-17 18:21:41 +00:00
thorpej
d74e432ed3 Make softclock a generic soft interrupt of the API is available,
adding the requisite void * argument to softclock().
2001-01-15 20:19:50 +00:00
mycroft
66610a4779 Introduce PROC_PC(), which is used to get a process's user PC. If this is
defined, call addupc_intr() directly from statclock() in the system time case,
using the same P_OWEUPC path if the copyin/copyout fails.
Use this in i386 to remove profiling code from the normal userret() path.
2000-12-10 19:29:30 +00:00
sommerfeld
340951f9d1 On second thought.. pass cpu_info * to roundrobin() explicitly. 2000-08-26 04:01:16 +00:00
sommerfeld
ec08310fab More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
 - Make pscnt per-cpu.
 - Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.
2000-08-26 03:34:36 +00:00
thorpej
f759220f40 Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.
2000-08-22 17:28:28 +00:00
eeh
3787c3f7fd Should use an intrptr_t' for address calculations rather than int'. 2000-08-22 16:44:51 +00:00
thorpej
25fe521af4 Fix a locking glitch in callwheel_slock handling. Noted by Bill Sommerfeld. 2000-08-22 15:30:59 +00:00
thorpej
14c0be9cd4 Protect hardclock_ticks and softclock_ticks with the callwheel
lock to prevent a race between hardclock() and callout_reset().
2000-08-21 23:51:33 +00:00
thorpej
b7e86fa7a8 spllowersoftclock() is already void; no need to cast it. 2000-08-21 23:43:30 +00:00
thorpej
012500bf1f Add a lock for the callwheel (callout facility), and only go to
splclock() while holding it.
2000-08-21 23:40:56 +00:00
thorpej
cd32ace8bb ANSI'ify. 2000-08-01 04:57:28 +00:00
thorpej
c0c8481a2a New hzto() function from FreeBSD and Artur Grabowski <art@stacken.kth.se>.
Stops sleeps from returning early (by up to a clock tick), and return 0
ticks for timeouts that should happen now or in the past.

Returning 0 is different from the legacy hzto() interface, and callers
need to check for it.
2000-07-13 17:06:15 +00:00
mrg
32aa199ccf remove include of <vm/vm.h> 2000-06-27 17:41:07 +00:00
thorpej
5b281c5932 Move schedticks and cp_time into schedstate_percpu. Also, allow
non-primary CPUs to call hardclock(), but make them bail about
before updating global timekeeping state (that's the job of the
primary CPU).
2000-06-03 20:42:42 +00:00
simonb
38cc1b3975 Add new sysctl node "KERN_SYSVIPC_INFO" with "KERN_SYSVIPC_MSG_INFO",
"KERN_SYSVIPC_SEM_INFO" and "KERN_SYSVIPC_SHM_INFO" to return the
info and data structures for the relevent SysV IPC types.  The return
structures use fixed-size types and should be compat32 safe.  All
user-visible changes are protected with
	#if !defined(_POSIX_C_SOURCE) && !defined(_XOPEN_SOURCE)

Make all variable declarations extern in msg.h, sem.h and shm.h and
add relevent variable declarations to sysv_*.c and remove unneeded
header files from those .c files.

Make compat14 SysV IPC conversion functions and sysctl_file() static.

Change the data pointer to "void *" in sysctl_clockrate(),
sysctl_ntptime(), sysctl_file() and sysctl_doeproc().
2000-06-02 15:53:03 +00:00
mycroft
da42c608fe Use a better multiplier for the 60Hz case. 2000-05-29 23:48:33 +00:00
mycroft
7513b8e18d Update an outdated comment.
Allow all powers of 2 from 2^0 to 2^16 for hz.
Enable hz==1200.
2000-05-29 15:05:10 +00:00
mycroft
8dcf08ff77 Improve the time_adj multiplier for the 100Hz and 1000Hz cases, and add a
1200Hz case.
2000-05-29 14:58:59 +00:00
augustss
264f1d27c6 Get rid of register declarations. 2000-03-30 09:27:11 +00:00
enami
f9c7a69ff5 Call the routine to calculate callwheelsize from allocsys() instead of
main() since some port like alpha and mips calls allocsys() before main()
is called.  While I'm here, I renamed some function.
2000-03-24 11:57:14 +00:00
thorpej
2b58edac40 Remove the CALLWHEEL_SORT code. It was implemented just for experimenting,
and I had no plans to ever enable it.  A record of the code is now in the
CVS history of the file, so we can unclutter now.
2000-03-23 20:51:09 +00:00
thorpej
b667a5a357 New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
  resource allocation.
- Insertion and removal of callouts is constant time, important as
  this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
2000-03-23 06:30:07 +00:00
thorpej
a0397a2573 Move callout initialization to a single location; no need to duplicate
that code all over the place.
2000-01-19 20:05:30 +00:00
sommerfeld
c450ebbbe7 If using kernel PLL (for NTP), initialize "fixtick" to a reasonable
approximation of reality if the MD code doesn't.  This variable is the
equivalent of "tickfix" for the non-NTP path.

This allows an alpha kernel (where hz=1024) with "options NTP" to
synch up quite nicely (as opposed to having an frequency error of
~560ppm, which is outside the capture range of the PLL).
1999-09-06 20:44:02 +00:00
thorpej
eb20bbc780 Change the semantics of splsoftclock() to be like other spl*() functions,
that is priority is rasied.  Add a new spllowersoftclock() to provide the
atomic drop-to-softclock semantics that the old splsoftclock() provided,
and update calls accordingly.

This fixes a problem with using the "rnd" pseudo-device from within
interrupt context to extract random data (e.g. from within the softnet
interrupt) where doing so would incorrectly unblock interrupts (causing
all sorts of lossage).

XXX 4 platforms do not have priority-raising capability: newsmips, sparc,
XXX sparc64, and VAX.  This platforms still have this bug until their
XXX spl*() functions are fixed.
1999-08-05 18:08:08 +00:00
christos
a32f7169fc Align struct timeval time to the same alignment requirements of a quad.
This broke the sparc elf kernel which in microtime uses ldd to load both
words at the same time. The a.out kernel, just got lucky.
1999-05-04 16:16:54 +00:00
ross
e47e3c9f45 schedclk() -> schedclock(), for consistency with hardclock(), statclock(), ...
update comments for recent scheduler mods
1999-02-28 18:14:57 +00:00
mycroft
d4955ba8a9 While I'm on a fixed point kick, improve the NTP clock factor correction to
give <.1% error in all (supported) cases.  It doesn't cost us much.
1999-02-23 17:41:48 +00:00
ross
b4a33c4e60 Scheduler bug fixes and reorganization
* fix the ancient nice(1) bug, where nice +20 processes incorrectly
  steal 10 - 20% of the CPU, (or even more depending on load average)
* provide a new schedclk() mechanism at a new clock at schedhz, so high
  platform hz values don't cause nice +0 processes to look like they are
  niced
* change the algorithm slightly, and reorganize the code a lot
* fix percent-CPU calculation bugs, and eliminate some no-op code

=== nice bug === Correctly divide the scheduler queues between niced and
compute-bound processes. The current nice weight of two (sort of, see
`algorithm change' below) neatly divides the USRPRI queues in half; this
should have been used to clip p_estcpu, instead of UCHAR_MAX.  Besides
being the wrong amount, clipping an unsigned char to UCHAR_MAX is a no-op,
and it was done after decay_cpu() which can only _reduce_ the value.  It
has to be kept <= NICE_WEIGHT * PRIO_MAX - PPQ or processes can
scheduler-penalize themselves onto the same queue as nice +20 processes.
(Or even a higher one.)

=== New schedclk() mechansism === Some platforms should be cutting down
stathz before hitting the scheduler, since the scheduler algorithm only
works right in the vicinity of 64 Hz. Rather than prescale hz, then scale
back and forth by 4 every time p_estcpu is touched (each occurance an
abstraction violation), use p_estcpu without scaling and require schedhz
to be generated directly at the right frequency. Use a default stathz (well,
actually, profhz) / 4, so nothing changes unless a platform defines schedhz
and a new clock.  Define these for alpha, where hz==1024, and nice was
totally broke.

=== Algorithm change === The nice value used to be added to the
exponentially-decayed scheduler history value p_estcpu, in _addition_ to
be incorporated directly (with greater wieght) into the priority calculation.
At first glance, it appears to be a pointless increase of 1/8 the nice
effect (pri = p_estcpu/4 + nice*2), but it's actually at least 3x that
because it will ramp up linearly but be decayed only exponentially, thus
converging to an additional .75 nice for a loadaverage of one. I killed
this, it makes the behavior hard to control, almost impossible to analyze,
and the effect (~~nothing at for the first second, then somewhat increased
niceness after three seconds or more, depending on load average) pointless.

=== Other bugs === hz -> profhz in the p_pctcpu = f(p_cpticks) calcuation.
Collect scheduler functionality. Try to put each abstraction in just one
place.
1999-02-23 02:56:03 +00:00
jonathan
6c0abe64fc defopt NTP and PPS_SYNC, in preparation for adding PPS support. 1998-04-22 07:08:11 +00:00
ross
7516424fe6 Teach the NTP PLL how to lock when hz == 1000. 1998-01-31 10:42:11 +00:00
gwr
55f621803a Moved db_show_callout() to ddb/db_xxx.c 1997-05-21 19:55:45 +00:00
tls
02697a5d47 add case for 512Hz in NTP code 1997-05-05 19:25:26 +00:00
mycroft
85b2440284 Use splclock() to block time updates, not splhigh(). 1997-02-28 04:45:35 +00:00
cgd
008f5aedf9 apply patch from PR 2788 (from Dennis Ferguson <dennis@jnx.com>) to
more smoothly apply "tickfix"es microsecond deltas (when compensating
for clocks running at > 1000Hz).
1997-01-15 04:59:39 +00:00
cgd
99fb250b5c fix from PR 2787 (from Dennis Ferguson <dennis@jnx.com>): when adjtime
is running (and NTP is not enabled), the adjtime()-handling code clobbers
any tickfix that may be necessary for systems with clocks with frequency
greater than 1000Hz.
1997-01-15 04:27:35 +00:00
cgd
30ca7eaa8d clean up a few spaces vs. tabs bogons 1996-11-15 23:51:23 +00:00