Also do the missed rename of multiboot_ksyms_init to
multiboot_ksyms_addsyms_elf to compile with MULTIBOOT set.
This caused a minor and a more serious bug in the past:
- dmesg did not contain the information about the loader
- /dev/ksyms did not work when the kernel was booted with a
multiboot bootloader (grub for example)
Ok by jmcneill, joerg.
into modules. By and large this commit:
- shuffles header files and ifdefs
- splits code out where necessary to be modular
- adds module glue for each of the components
- adds/replaces hooks for things that can be installed at runtime
Patch provided by Robert Swindell with fixes for the command line
parsing and addition of passing module options from me. The kernel now
always gets the full string for modules like for the command line,
everything before the first space/tab is the path name of the module.
for ioapics and faked up for others. Add it to "struct ioapic_softc"
for now, until device/softc get split.
This required all typecasts between "struct pic" and "struct ioapic_softc"
to be replaced, I hope I got them all.
functionally tested on i386, compile-tested on xen, untested on amd64
- Ditch the cross-CPU calibration stuff. It didn't work properly, and it's
near impossible to synchronize the CPUs in a running system, because bus
traffic will interfere with any calibration attempt, messing up the
timings.
- Only enable the TSC on CPUs where we are sure it does not drift. If we are
On a known good CPU, give the TSC high timecounter quality, making it the
default.
- When booting CPUs, detect TSC skew and account for it. Most Intel MP
systems have synchronized counters, but that need not be true if the
system has a complicated bus structure. As far as I know, AMD systems
do not have synchronized TSCs and so we need to handle skew.
- While an AP is waiting to be set running, try and make the TSC drift by
entering a reduced power state. If we detect drift, ensure that the TSC
does not get a high timecounter quality. This should not happen and is
only for safety.
- Make cpu_counter() stuff LKM safe.
Select the Xen idle routine for Xen, mwait if supported by the CPU and
it is not AMD and halt otherwise. As reported by Christoph Egger,
AMD Barcelona keeps the CPU in C0 state with MWAIT, contrary to HLT,
which uses C1 and therefore much less power.
- I have seen one isolated panic in the x86 pmap, but otherwise i386
seems stable with preemption enabled.
- amd64 is missing the FPU handling changes and it's not yet safe to
enable it there.
- The usual level for kern.sched.kpreempt_pri will be 128 once enabled
by default. For testing, setting it to 0 helps to shake out bugs.
way identcpu.c and machdep.c are not cluttered with foreign code.
The driver is built by default as before, but the sysctl subtree will
only be created if longrun is detected and not always as the old code
did. This matches what the FreeBSD code does.
Ok by christos@.
- atomic_cas_ni() does an implicit membar_exit()
- all other atomic operations do an implicit membar_sync()
While this might seem kind of arbitrary it's the basis for some important
optimizations.
(domU only). PAE support is enabled by 'options PAE', see the new XEN3PAE_DOMU
and INSTALL_XEN3PAE_DOMU kernel config files.
See the comments in arch/i386/include/{pte.h,pmap.h} to see how it works.
In short, we still handle it as a 2-level MMU, with the second level page
directory being 4 pages in size. pmap switching is done by switching the
L2 pages in the L3 entries, instead of loading %cr3. This is almost required
by Xen, which handle the last L2 page (the one mapping 0xc0000000 - 0xffffffff)
in a very special way. But this approach should also work for native PAE
support if ever supported (in fact, the pmap should almost suport native
PAE, what's missing is bootstrap code in locore.S).
- use a hash rather than SPLAY trees.
SPLAY tree is a wrong algorithm to use here.
will be revisited if it slows down anything other than
micro-benchmarks.
- optimize the single mapping case (it's a common case) by
embedding an entry into mdpage.
- don't keep a pmap pointer as it can be obtained from ptp.
(discussed on port-i386 some years ago.)
ideally, a single paddr_t should be enough to describe a pte.
but it needs some more thoughts as it can increase computational
costs.
- pmap_enter: simplify and fix races with pmap_sync_pv.
- don't bother to lock pm_obj[i] where i > 0, unless DIAGNOSTIC.
- kill mp_link to save space.
- add many KASSERTs.