problem reported here but is part of the solution):
xen_kpm_sync() is not working as expected,
leading to races between CPUs.
1 the check (xpq_cpu != &x86_curcpu) is always false because we
have different x86_curcpu symbols with different addresses in the kernel.
Fortunably, all addresses dissaemble to the same code.
Because of this we always use the code intended for bootstrap, which doesn't
use cross-calls or lock.
2 once 1 above is fixed, xen_kpm_sync() will use xcalls to sync other CPUs,
which cause it to sleep and pmap.c doesn't like that. It triggers this
KASSERT() in pmap_unmap_ptes():
KASSERT(pmap->pm_ncsw == curlwp->l_ncsw);
3 pmap->pm_cpus is not safe for the purpose of xen_kpm_sync(), which
needs to know on which CPU a pmap is loaded *now*:
pmap->pm_cpus is cleared before cpu_load_pmap() is called to switch
to a new pmap, leaving a window where a pmap is still in a CPU's
ci_kpm_pdir but not in pm_cpus. As a virtual CPU may be preempted
by the hypervisor at any time, it can be large enough to let another
CPU free the PTP and reuse it as a normal page.
To fix 2), avoid cross-calls and IPIs completely, and instead
use a mutex to update all CPU's ci_kpm_pdir from the local CPU.
It's safe because we just need to update the table page, a tlbflush IPI will
happen later. As a side effect, we don't need a different code for bootstrap,
fixing 1). The mutex added to struct cpu needs a small headers reorganisation.
to fix 3), introduce a pm_xen_ptp_cpus which is updated from
cpu_pmap_load(), whith the ci_kpm_mtx mutex held. Checking it with
ci_kpm_mtx held will avoid overwriting the wrong pmap's ci_kpm_pdir.
While there I removed the unused pmap_is_active() function;
and added some more details to DIAGNOSTIC panics.
Reflect this change in the suspend/resume routines so they can cope with
domU CPU suspend, instead of setting their cpu_info pointer to NULL.
Avoid copy/pasting by using the resume routines during attachement.
ok releng@.
No regression observed, and allows domU to suspend successfully again.
Restore is a different beast as PD/PT flags are marked "invalid" by Xen-4
hypervisor, and blocks resuming. Looking into it.
According to "Intel 64 and IA-32 Architectures Software Developer's Manual"
Vol. 3, May 2011, Order Number: 325384-039US, Section 10.6.1:
LEVEL_ASSERT is bit #14, bit #13 is reserved.
With this change NetBSD now boots multiple processors under CentOS 6.2/kvm.
Implemented and enabled via CPU_UCODE kernel config option
for x86 and Xen Dom0.
Tested on different AMD machines with different
CPU families.
ok wiz@ for the manpages
ok releng@
ok core@ via releng@
from the bootloader. This can fix the problem of poor quality keys
for other kernel modules which call arc4random() early in kernel startup
(NFS startup, in particular, causes this).
We continue to rely on the etc/rc.d/random_seed script to save entropy
to the seed file at shutdown and erase the seed file at startup.
Boot loader support implemented only for i386 and amd64 ports for now but
it should be easy for other ports to do the same or similar.
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:
An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.
A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.
The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.
An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.
A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.
An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.
In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.
The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.
The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.
The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.
Manual pages for the new kernel interfaces are forthcoming.
http://mail-index.netbsd.org/source-changes/2011/10/22/msg028271.html
From the Log:
Log Message:
Various interrupt fixes, mainly:
keep a per-cpu mask of enabled events, and use it to get pending events.
A cpu-specific event (all of them at this time) should not be ever masked
by another CPU, because it may prevent the target CPU from seeing it
(the clock events all fires at once for example).
speaking we are also running with PAE enabled in long mode under amd64,
so this variable will be used in various places across x86 machdep to
branch at runtime to functions that require extra handling for PAE mode.
in a safe way by handling the fault that might trigger for certain
register <> CPU/arch combos.
Requested by Jukka. Patch adapted from one found in DragonflyBSD.
Goal: save/restore support in NetBSD domUs, for i386, i386 PAE and amd64.
Executive summary:
- split all Xen drivers (xenbus(4), grant tables, xbd(4), xennet(4))
in two parts: suspend and resume, and hook them to pmf(9).
- modify pmap so that Xen hypervisor does not cry out loud in case
it finds "unexpected" recursive memory mappings
- provide a sysctl(7), machdep.xen.suspend, to command suspend from
userland via powerd(8). Note: a suspend can only be handled correctly
when dom0 requested it, so provide a mechanism that will prevent
kernel to blindly validate user's commands
The code is still in experimental state, use at your own risk: restore
can corrupt backend communications rings; this can completely thrash
dom0 as it will loop at a high interrupt level trying to honor
all domU requests.
XXX PAE suspend does not work in amd64 currently, due to (yet again!)
page validation issues with hypervisor. Will fix.
XXX secondary CPUs are not suspended, I will write the handlers
in sync with cherry's Xen MP work.
Tested under i386 and amd64, bear in mind ring corruption though.
No build break expected, GENERICs and XEN* kernels should be fine.
./build.sh distribution still running. In any case: sorry if it does
break for you, contact me directly for reports.
of the memory & I/O space reserved by the PCI BIOS for PCI devices
(including bridges) and recording that information for later use.
The code takes between 13k and 50k (depends on the architecture and,
bizarrely, the kernel configuration) so I am going to move it from
pci_machdep.c into its own module on Monday.
Interrupts). Successfully tested with hdaudio and "wpi" wireless
ethernet.
notes:
-There seem to be buggy chips around which announce MSI support
but don't correctly implement it. Thus the final word whether MSIs
can be used should be by the driver.
-Only a single vector is supported. For multiple vectors, the IDT
allocation code would have to be changed. (And we would possibly
run into problems due to the limited number of vectors supported
by the current code.)
-The code is "#if NIOAPIC > 0" because it uses the ioapic_edge
interrupt stubs. These actually don't touch any ioapic, so this
is somewhat a misnomer.
-MSIs can't be identified by a "pin" but only by a cpu/vector
pair. Common intr code soesn't deal well with this yet.
-Drivers need to take care of saving/restoring MSI data in the device's
config space on suspend/resume.
if it is "-1" -- this is a hack to allow MSIs which don't have a concept
of pin numbers, and are generally not shared
(This doesn't give us sensible event names for statistics display. The
whole abstraction has more exceptions than regular cases, it should
be redesigned imho.)
<http://mail-index.netbsd.org/tech-kern/2010/04/02/msg007941.html>,
divide each machine's bus.h into bus_defs.h (constants & data types)
and bus_funcs.h (macro implementations of bus_space(9) routines and MD
prototypes).
Note that some bus_space(9) routines' implementation will move to .c
files from inline subroutines or macros in .h files.
I've only made the split for machine architectures where there is PCI.
All of the non-PCI-having architectures will require a similar split.
These #include files are not referenced by any (committed) Makefiles or
header files, yet. Changes to Makefiles, to <sys/bus.h>, and to some
more machine-dependent files will dribble in before I throw the switch.
driver to the main acpi(4) stack. Follow Linux and evaluate it early.
Should fix PR port-amd64/42895, possibly also PR kern/42583, and many
other comparable bugs.
A common sense explanation is that Intel supplies additional CPU tables to
OEMs. BIOS writers do not bother to modify their DSDTs, but instead load
these extra tables dynamically as secondary SSDT tables. The actual Load()
happens when the _PDC method is invoked, and thus namespace errors occur
when the CPU-specific ACPI methods are not yet present but referenced in the
AML by various drivers, including, but not limited to, acpitz(4).
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.
- Simplify locking in some pmap(9) modules by removing P->V locking.
- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).
- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.
- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.
Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
or do not link in subr_userconf.c and x86_userconf.c.
Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().
Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.
pci_find_rom(), pci_intr_map(9), pci_enumerate_bus(), nor the match
predicate passed to pciide_compat_intr_establish() should ever modify
their pci_attach_args argument, so make their pci_attach_args arguments
const and deal with the fallout throughout the kernel.
For the most part, these changes add a 'const' where there was no
'const' before, however, some drivers and MD code used to modify
pci_attach_args. Now those drivers either copy their pci_attach_args
and modify the copy, or refrain from modifying pci_attach_args:
Xen: according to Manuel Bouyer, writing to pci_attach_args in
pci_intr_map() was a leftover from Xen 2. Probably a bug. I
stopped writing it. I have not tested this change.
siside(4): sis_hostbr_match() needlessly wrote to pci_attach_args.
Probably a bug. I use a temporary variable. I have not tested this
change.
slide(4): sl82c105_chip_map() overwrote the caller's pci_attach_args.
Probably a bug. Use a local pci_attach_args. I have not tested
this change.
viaide(4): via_sata_chip_map() and via_sata_chip_map_new() overwrote the
caller's pci_attach_args. Probably a bug. Make a local copy of the
caller's pci_attach_args and modify the copy. I have not tested
this change.
While I'm here, make pci_mapreg_submap() static.
With these changes in place, I have tested the compilation of these
kernels:
alpha GENERIC
amd64 GENERIC XEN3_DOM0
arc GENERIC
atari HADES MILAN-PCIIDE
bebox GENERIC
cats GENERIC
cobalt GENERIC
evbarm-eb NSLU2
evbarm-el ADI_BRH ARMADILLO9 CP3100 GEMINI GEMINI_MASTER GEMINI_SLAVE GUMSTIX
HDL_G IMX31LITE INTEGRATOR IQ31244 IQ80310 IQ80321 IXDP425 IXM1200
KUROBOX_PRO LUBBOCK MARVELL_NAS NAPPI SHEEVAPLUG SMDK2800 TEAMASA_NPWR
TEAMASA_NPWR_FC TS7200 TWINTAIL ZAO425
evbmips-el AP30 DBAU1500 DBAU1550 MALTA MERAKI MTX-1 OMSAL400 RB153 WGT624V3
evbmips64-el XLSATX
evbppc EV64260 MPC8536DS MPC8548CDS OPENBLOCKS200 OPENBLOCKS266
OPENBLOCKS266_OPT P2020RDB PMPPC RB800 WALNUT
hp700 GENERIC
i386 ALL XEN3_DOM0 XEN3_DOMU
ibmnws GENERIC
macppc GENERIC
mvmeppc GENERIC
netwinder GENERIC
ofppc GENERIC
prep GENERIC
sandpoint GENERIC
sgimips GENERIC32_IP2x
sparc GENERIC_SUN4U KRUPS
sparc64 GENERIC
As of Sun Apr 3 15:26:26 CDT 2011, I could not compile these kernels
with or without my patches in place:
### evbmips-el GDIUM
nbmake: nbmake: don't know how to make /home/dyoung/pristine-nbsd/src/sys/arch/mips/mips/softintr.c. Stop
### evbarm-el MPCSA_GENERIC
src/sys/arch/evbarm/conf/MPCSA_GENERIC:318: ds1672rtc*: unknown device `ds1672rtc'
### ia64 GENERIC
/tmp/genassym.28085/assym.c: In function 'f111':
/tmp/genassym.28085/assym.c:67: error: invalid application of 'sizeof' to incomplete type 'struct pcb'
/tmp/genassym.28085/assym.c:76: error: dereferencing pointer to incomplete type
### sgimips GENERIC32_IP3x
crmfb.o: In function `crmfb_attach':
crmfb.c:(.text+0x2304): undefined reference to `ddc_read_edid'
crmfb.c:(.text+0x2304): relocation truncated to fit: R_MIPS_26 against `ddc_read_edid'
crmfb.c:(.text+0x234c): undefined reference to `edid_parse'
crmfb.c:(.text+0x234c): relocation truncated to fit: R_MIPS_26 against `edid_parse'
crmfb.c:(.text+0x2354): undefined reference to `edid_print'
crmfb.c:(.text+0x2354): relocation truncated to fit: R_MIPS_26 against `edid_print'
rdmsr instruction (CPUID_MSR is not there).
Introduce a cpu_counter_serializing() function to remplace rdmsr(MSR_TSC)
calls, which does a rdmsr(MSR_TSC) if available and cpu_counter() otherwise.
This makes the cpu counter useable on vortex86 CPUs.
OK ad@
existing CPU power management implementations could peacefully coexist with
the acpicpu(4) driver. The following options can not be used with acpicpu(4):
ENHANCED_SPEEDSTEP, INTEL_ONDEMAND_CLOCKMOD, POWERNOW_K7, and POWERNOW_K8.
reviews.
In kernel, it matches the 'i386_use_pae' variable (0: kernel does not use
PAE, 1: kernel uses PAE). Will be used by i386 kvm(3) to know the functions
that should get called for VA => PA translations.
the acpicpu(4) driver should attach even if the existing frequency management
code fails to attach, mainly because ACPI is the only proper way to deal
with EST on new Intel system.
Use a more drastic hack to deal with this: when acpicpu(4) attachs, it tears
down any existing sysctl(8) controls and installs identical ones in place.
Upon detachment, the initialization function of the existing EST is called.
Remarks:
1. All processors (x86 or not) for which the vendor has implemented
ACPI I/O access routines are supported. Native instructions are
currently supported only for Intel's "Enhanced Speedstep". Code for
"PowerNow!" (AMD) will be merged later. Native support for VIA's
"PowerSaver" will be investigated.
2. Backwards compatibility with existing userland code is maintained.
Comparable to the case with cpu_idle(9), the ACPI CPU driver
installs alternative functions for the existing sysctl(8) controls.
The "native" behavior (if any) is restored upon detachment.
3. The dynamic nature of ACPI-provided P-states needs more investigation.
The maximum frequency induced (but not forced) by the firmware may
change dynamically. Currently, the sysctl(8) controls error out with
a value larger than the dynamic maximum. The code itself does not
however yet react to the notifications from the firmware by changing
the frequencies in-place. Presumably the system administrator should
be able to choose whether to use dynamic or static frequencies.
Initialize mutexes and condition variables in attach and not in the
asynchronously started kernel thread.
Increase BMC spin timeout from 5ms to 15ms, this is necessary to detect
the BMC in a HP ML110G4 reliably.
Implement non-linear sensors as defined in IPMIv2.0 with some crude
32.32 fixed point arithmetic. This adds some small errors as logarithm
and power functions are only approximated.
Fix sensor index mapping so that sensor limits are computed correctly.
This patch is inspired by work previously done by Jeremy Morse, ported by me
to -current, merged with the work previously done for port-xen, together with
additionals fixes and improvements.
PAE option is disabled by default in GENERIC (but will be enabled in ALL in
the next few days).
In quick, PAE switches the CPU to a mode where physical addresses become
36 bits (64 GiB). Virtual address space remains at 32 bits (4 GiB). To cope
with the increased size of the physical address, they are manipulated as
64 bits variables by kernel and MMU.
When supported by the CPU, it also allows the use of the NX/XD bit that
provides no-execution right enforcement on a per physical page basis.
Notes:
- reworked locore.S
- introduce cpu_load_pmap(), used to switch pmap for the curcpu. Due to the
different handling of pmap mappings with PAE vs !PAE, Xen vs native, details
are hidden within this function. This helps calling it from assembly,
as some features, like BIOS calls, switch to pmap_kernel before mapping
trampoline code in low memory.
- some changes in bioscall and kvm86_call, to reflect the above.
- the L3 is "pinned" per-CPU, and is only manipulated by a
reduced set of functions within pmap. To track the L3, I added two
elements to struct cpu_info, namely ci_l3_pdirpa (PA of the L3), and
ci_l3_pdir (the L3 VA). Rest of the code considers that it runs "just
like" a normal i386, except that the L2 is 4 pages long (PTP_LEVELS is
still 2).
- similar to the ci_pae_l3_pdir{,pa} variables, amd64's xen_current_user_pgd
becomes an element of cpu_info (slowly paving the way for MP world).
- bootinfo_source struct declaration is modified, to cope with paddr_t size
change with PAE (it is not correct to assume that bs_addr is a paddr_t when
compiled with PAE - it should remain 32 bits). bs_addrs is now a
void * array (in bootloader's code under i386/stand/, the bs_addrs
is a physaddr_t, which is an unsigned long).
- fixes in multiboot code (same reason as bootinfo): paddr_t size
change. I used Elf32_* types, use RELOC() where necessary, and move the
memcpy() functions out of the if/else if (I do not expect sym and str tables
to overlap with ELF).
- 64 bits atomic functions for pmap
- all pmap_pdirpa access are now done through the pmap_pdirpa macro. It
hides the L3/L2 stuff from PAE, as well as the pm_pdirpa change in
struct pmap (it now becomes a PDP_SIZE array, with or without PAE).
- manipulation of recursive mappings ( PDIR_SLOT_{,A}PTEs ) is done via
loops on PDP_SIZE.
See also http://mail-index.netbsd.org/port-i386/2010/07/17/msg002062.html
No objection raised on port-i386@ and port-xen@R for about a week.
XXX kvm(3) will be fixed in another patch to properly handle both PAE and !PAE
kernel dumps (VA => PA macros are slightly different, and need proper 64 bits
PA support in kvm_i386).
XXX Mixing PAE and !PAE modules may lead to unwanted/unexpected results. This
cannot be solved easily, and needs lots of thinking before being declared
safe (paddr_t/bus_addr_t size handling, PD/PT macros abstractions).
also known as C-states. The code is modular and provides an easy way to add
the remaining functionality later (namely throttling and P-states).
Remarks:
1. Commented out in the GENERICs; more testing exposure is needed.
2. The C3-state is disabled for the time being because it turns off
timers, among them the local APIC timer. This may not be universally
true on all x86 processors; define ACPICPU_ENABLE_C3 to test.
3. The algorithm used to choose a power state may need tuning. When
evaluating the appropriate state, the implementation uses the
previous sleep time as an indicator. Additional hints would include
for example the system load.
Also bus master activity is evaluated when choosing a state. The
usb(4) stack is notorious for such activity even when unused.
Typically it must be disabled in order to reach the C3-state,
but it may also prevent the use of C2.
4. While no extensive empirical measurements have been carried out, the
power savings are somewhere between 1-2 W with C1 and C2, depending
on the processor, firmware, and load. With C3 even up to 4 W can be
saved. The less something ticks, the more power is saved.
ok jmcneill@, joerg@, and discussed with various people.
reworked by me to support 32-bit processes as well.
we now keep %fs and %gs loaded with the user values
while in the kernel, which means we don't need to
reload them when returning to user mode.
Add MI flags PMAP_WRITE_COMBINE, PMAP_WRITE_BACK, PMAP_NOCACHE_OVR.
Update pmap(9) manpage.
hppa: Remove MD PMAP_NOCACHE flag as it exists as MI flag
mips: Rename MD PMAP_NOCACHE to PGC_NOCACHE.
x86: Implement new MI flags using Page-Attribute Tables.
x86: Implement BUS_SPACE_MAP_PREFETCHABLE.
Patch presented on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2010/06/30/msg008458.html
No comments on this last version.
kernels, and use them in the bus_space(9) implementation instead of ugly
Xen #ifdef-age. In a non-Xen kernel, the _ma() functions either call or
alias the equivalent _pa() functions.
Reviewed on port-xen@netbsd.org and port-i386@netbsd.org. Passes
rmind@'s and bouyer@'s inspection. Tested on i386 and on Xen DOMU /
DOM0.
bus_space_tag. For now, bus_space_tag's only member is
bst_type, the type of space, which is either X86_BUS_SPACE_IO
or X86_BUS_SPACE_MEM. In the future, new bus_space_tag members
will refer to override-functions installed by a new function,
bus_space_tag_create(9).
Add pointers to constant struct bus_space_tag, x86_bus_space_io and
x86_bus_space_mem. Use them to replace most uses of X86_BUS_SPACE_IO
and X86_BUS_SPACE_MEM.
Add an x86-specific bus_space_is_equal(9) implementation that compares
the two tags' bst_type.
per-page execution right was disabled (therefore leading to the inability
of the kernel to detect fraudulent use of memory mappings marked as not
being executable).
- replace cpu_feature and ci_feature_flags variables by cpu_feature and
ci_feat_val arrays. This makes it cleaner and brings kernel code closer
to the design of cpuctl(8). A warning will be raised for each CPU that
does not expose the same features as the Boot Processor (BP).
- the blacklist of CPU features is now a macro defined in the
specialreg.h header, instead of hardcoding it inside MD initialization
code; fix comments.
- replace checks against CPUID_TSC with the cpu_hascounter() function.
- clean up the code in init_x86_64(), as cpu_feature variables are set
inside cpu_probe().
- use cpu_init_msrs() for i386. It will be eventually used later for NX
feature under i386 PAE kernels.
- remove code that checks for CPUID_NOX in amd64 mptramp.S, this is already
performed by cpu_hatch() through cpu_init_msrs().
- remove cpu_signature and feature_flags members from struct mpbios_proc
(they were never used).
This patch was tested with i386 MONOLITHIC, XEN3PAE_DOM0 and XEN3_DOM0 under
a native i386 host, and amd64 GENERIC, XEN3_DOM0 via QEMU virtual machines.
XXX Should kernel rev be bumped?
XXX A similar patch should be pulled-up for NetBSD-5, hopefully tomorrow.
to the tag that pc inherits its behavior from. Add code to deal with
pc.pc_super.
Pull identical declarations out of xen/include/pci_machdep.h and
x86/include/pci_machdep.h into x86/include/pci_machdep_common.h.
code to override the default pci(9) behavior by creating a non-NULL
pci_attach_args_t (on x86, pci_attach_args_t is always NULL) containing
one or more non-NULL function pointers.
pci_chipset_tag *pci_chipset_tag_t' for an improvement in type safety.
(Back when I did the same for cardbus_chipset_tag_t, it helped to turn
up some bugs!)
that initializes pci_mode, which I have moved to the top.
Make pci_mode private to pci_machdep.c.
Provide pci_mode_set() for pcibios.c to configure the PCI Configuration
Mechanism. KASSERT() in pci_mode_set() that the mechanism is not
changing from anything but the "don't know" value, -1.
Probably needed for all C7 and Nano processors, but to be safe only use
this alternate method on Esther for now.
Now est on my C7-M 1.6GHz properly reports frequencies from 1600 to 400,
instead of 2133 to 533.
are now compiled in by default.
Note that MSR support in Xen depends on its version. rdmsr() should always
succeed, but wrmsr() to certain registers can end in a NOOP. In that case,
the error will be logged (see xm dmesg).
Setting CPU frequency (SpeedStep) requires Xen 3.3 with the option
cpufreq="dom0-kernel" passed down to hypervisor during boot.
Compiled and tested for SpeedStep under i386 for XEN3_DOM0 and XEN3PAE_DOM0
by jym@. amd64 was tested by Joel Carnat.
See also http://mail-index.netbsd.org/port-xen/2009/08/02/msg005213.html .
Commit requested by bouyer@.
device registers with a mutex. Convert tsleep/wakeup calls to
cv_wait/cv_signal.
Do not repeatedly malloc/free tiny buffers for sending/receiving
commands, but reserve a command buffer in the softc.
Tickle the watchdog in the sensors-refreshing thread.
I am fairly certain that after the device is attached, every register
access happens in the sensors-refreshing thread. Moreover, no
software interrupt touches any register, now. So I may get rid of
the mutex that protects register accesses, sc_cmd_mtx.
activity of other threads will perform the TLB flush for the processes using
emap as a side effect. To track that, global and per-CPU generation numbers
are used. This idea was suggested by Andrew Doran; various improvements to
it by me. Notes:
- For now, zero-copy on pipe is not yet enabled.
- TCP socket code would likely need more work.
- Additional UVM loaning improvements are needed.
Proposed on <tech-kern>, silence there.
Quickly reviewed by <ad>.
Call rtc_set_ymdhms() from xen/xen/clock.c:xen_rtc_set() for xen3 dom0
kernels as the Xen3 hypervisor doesn't write the new date/time to the CMOS
by itself.
Now a XEN3_DOM0 kernel properly updates the CMOS time.