- Ditch the cross-CPU calibration stuff. It didn't work properly, and it's
near impossible to synchronize the CPUs in a running system, because bus
traffic will interfere with any calibration attempt, messing up the
timings.
- Only enable the TSC on CPUs where we are sure it does not drift. If we are
on a known good CPU, give the TSC high timecounter quality, making it the
default.
- When booting CPUs, detect TSC skew and account for it. Most Intel MP
systems have synchronized counters, but that need not be true if the
system has a complicated bus structure. As far as I know, AMD systems
do not have synchronized TSCs and so we need to handle skew.
- While an AP is waiting to be set running, try and make the TSC drift by
entering a reduced power state. If we detect drift, ensure that the TSC
does not get a high timecounter quality. This should not happen and is
only for safety.
- Make cpu_counter() stuff LKM safe.
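
The skew handling described above can be illustrated with a minimal sketch.
This is not the committed code: the function names, the idea of bracketing
the AP's sample between two boot-CPU reads, and the signed skew value are
all assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper: estimate the skew of an AP's TSC relative to the
 * boot CPU. bsp_before and bsp_after are boot-CPU TSC reads taken just
 * before and just after the AP sampled its own counter (ap_tsc); the
 * midpoint approximates the boot-CPU TSC at the moment of the AP sample.
 */
static int64_t
tsc_skew_estimate(uint64_t bsp_before, uint64_t bsp_after, uint64_t ap_tsc)
{
	uint64_t bsp_mid = bsp_before + (bsp_after - bsp_before) / 2;

	/* Positive: the AP counter is ahead of the boot CPU. */
	return (int64_t)(ap_tsc - bsp_mid);
}

/*
 * Hypothetical helper: convert a raw per-CPU TSC reading into the boot
 * CPU's time base by subtracting the recorded skew. Unsigned wraparound
 * makes this work for negative skews as well.
 */
static uint64_t
tsc_apply_skew(uint64_t raw_tsc, int64_t skew)
{
	return raw_tsc - (uint64_t)skew;
}
```

With these assumptions, an AP that reads 1550 while the boot CPU reads 1000
and then 1100 is taken to be 500 ticks ahead, and its later readings are
corrected by that amount.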
the auich auich_calibrate() function to get the wrong ac97 freq
(may cause audio to play at wrong speed on some systems). this
error was inadvertently introduced in rev 1.98 of the old
src/sys/arch/i386/isa/clock.c (2006/09/03) and manifests itself
on systems that do not use an alternate timecounter (e.g. ACPI-Fast).
the basic problem is that the code that handled when the i8254
counter wrapped was firing in cases when it shouldn't have,
causing the counter to run fast. a more detailed discussion
can be found here:
http://mail-index.netbsd.org/tech-kern/2008/01/15/0001.html
http://mail-index.netbsd.org/tech-kern/2008/01/16/0000.html
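
To illustrate the kind of wrap handling involved, here is a minimal sketch of
measuring elapsed ticks on a down-counter such as the i8254. It is not the
code from clock.c: the function name, the parameters, and the assumption that
at most one reload happens between two reads are all hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: ticks elapsed between two reads of a counter that
 * counts down from `reload` to 0 and then reloads. Assumes at most one
 * reload happened between the reads.
 */
static uint32_t
downcounter_elapsed(uint32_t prev, uint32_t cur, uint32_t reload)
{
	if (cur <= prev) {
		/* No wrap: the counter simply counted down. */
		return prev - cur;
	}
	/* Wrap: counted prev down to 0, reloaded, then down to cur. */
	return prev + (reload - cur);
}
```

Treating a read as wrapped when it was not adds roughly a full reload period
to the elapsed count, which is the sort of error that makes a clock appear to
run fast.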
two muls and a shift, which needs at most 2ms on a 25MHz i386 and should
end up as fast as delay(1) was before due to using a remainder of 2.
Discussed with ad@.
delay function were wildly inaccurate due to multiple CPUs competing
in DELAY() during calibration, confusing the clock chip.
- Use i8254_delay() explicitly in a few more places.
argument. Use this and replace the inline assembly (mul + div using the
64bit intermediate result) with normal 32bit multiplication and
division. The compiler can turn the division into a multiplication and
shift, making it even cheaper than the original assembly. For extremely
long delays, just use 64bit arithmetic.
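
A minimal sketch of the conversion described above, turning a delay in
microseconds into i8254 ticks. The 1193182 Hz input clock is the standard
i8254 frequency; the function and macro names here are assumptions for
illustration and need not match the committed code.

```c
#include <stdint.h>

#define TIMER_FREQ_HZ	1193182u	/* i8254 input clock */

/*
 * Hypothetical helper: convert microseconds to i8254 ticks. Uses plain
 * 32bit multiplication and division while the product fits in 32 bits
 * (the compiler may turn the constant division into a multiply and
 * shift), and falls back to 64bit arithmetic for long delays.
 */
static uint32_t
usec_to_i8254_ticks(uint32_t usec)
{
	if (usec <= UINT32_MAX / TIMER_FREQ_HZ)
		return usec * TIMER_FREQ_HZ / 1000000u;

	return (uint32_t)((uint64_t)usec * TIMER_FREQ_HZ / 1000000u);
}
```

The 32bit path covers delays up to 3599 microseconds; anything longer takes
the 64bit path.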
This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too many to list here.
TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated.
amigappc was not attempted.
NOTES:
pmppc was removed as an arch, and moved to an evbppc target.
- Replace most inline assembly with proper functions. As a side effect
this reduces the size of amd64 GENERIC by about 120kB, and i386 by a
smaller amount. Nearly all of the inlines did something slow, or something
that does not need to be fast.
- Make curcpu() and curlwp proper functions, unless __GNUC__ && _KERNEL.
In that case make them inlines. Makes curlwp LKM and preemption safe.
- Make bus_space and bus_dma more LKM friendly.
- Share a few more files between the ports.
- Other minor changes.
int _bus_dmatag_subregion(bus_dma_tag_t tag,
bus_addr_t min_addr,
bus_addr_t max_addr,
bus_dma_tag_t *newtag,
int flags)
void _bus_dmatag_destroy(bus_dma_tag_t tag)
that allow a (normally broken/limited) device to restrict the bus address
range it can talk to. this is used by bce(4) to limit DMA addresses to
a 1GB range, the maximum the chip can address.
all this is from Yorick Hardy <yhardy@uj.ac.za> with input from several
people on tech-kern.
XXX: bus_dma(9) needs an update still.
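
the idea of restricting a tag to a subregion can be modelled with a small
sketch. this is a toy model only: the toy_* names and the struct layout are
hypothetical, the functions return by value instead of filling in a *newtag,
and there is no flags argument. it shows the range narrowing, not the kernel
implementation.

```c
#include <stdint.h>

typedef uint64_t toy_addr_t;

/* Toy stand-in for a DMA tag: just the inclusive address window. */
struct toy_dma_tag {
	toy_addr_t min_addr;
	toy_addr_t max_addr;
};

/* Hypothetical constructor for a tag covering [min_addr, max_addr]. */
static struct toy_dma_tag
toy_tag(toy_addr_t min_addr, toy_addr_t max_addr)
{
	struct toy_dma_tag t = { min_addr, max_addr };
	return t;
}

/*
 * Hypothetical model of a subregion request: the new tag is the
 * intersection of the parent's window and the requested one.
 * Assumes the two windows overlap.
 */
static struct toy_dma_tag
toy_dmatag_subregion(struct toy_dma_tag parent, toy_addr_t min_addr,
    toy_addr_t max_addr)
{
	toy_addr_t lo = min_addr > parent.min_addr ? min_addr : parent.min_addr;
	toy_addr_t hi = max_addr < parent.max_addr ? max_addr : parent.max_addr;

	return toy_tag(lo, hi);
}

/* Nonzero if addr falls inside the tag's window. */
static int
toy_dma_addr_ok(struct toy_dma_tag tag, toy_addr_t addr)
{
	return addr >= tag.min_addr && addr <= tag.max_addr;
}
```

a device limited to the first 1GB would request the window 0 to 0x3fffffff
from its parent tag; addresses above that are then rejected by the narrowed
tag.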
- distinguish paddr_t and bus_addr_t.
for xen, use bus_addr_t in the sense of machine address.
- move _X86_BUS_DMA_PRIVATE part of bus.h into bus_private.h.
- remove special handling of xen_shm. we can always grab
machine address from pte.
Make ALLOCNOW the default iff bouncing might be needed (this has
no effect on i386 because ISA DMA devices already had to use
ALLOCNOW, and PCI isn't bounced (yet), since we don't do > 4G
at this point for i386).