i386 Xen PV domU get spurious fpudna traps from userland. Older eager FPU
contact switching code took care of ignoring them. When transitioning
from eager switching to awlays switching, this special handling was
removed, causing "fpudna from userland" panics.
This change restores the previosu behavior where fpudna traps from
userland are ignored on Xen PV domU.
On this architecture vmt(4) used to search for a node "/hypervisor" in the
FDT and probed the VMware hypervisor call only when the node was
found. However, things appear to have changed and VMware no longer provides
the FDT node.
Since vmt(4) doesn't actually need to read anything from FDT, and the
hypervisor call logically resides in virtual CPUs themselves, it would be
better to attach it directly to cpu, just like how it's probed on x86.
documented cpuid instruction and eax register.
This approach is adapted from linux via-cputemp.c, no official documentation is
currently available. However, msr value seems to work on all tested CPUs while
documented cpuid instruction typically reports 0, even for my C7-D CPU.
msr value seems to have temperature in Celsius in lower 24-bits without fraction
(thus "msr & 0xffffff;" is used).
Tested on my personal systems based on CPUs below (i386 and amd64):
C7-D 1.6GHz (i386 only), Nano X2 L4350E, Nano X2 U4300, U2300 Nano, KX-U6580.
Also got one response via email which was based on Nano X2 L4050 (VE-900).
Nano reports independent values for each core.
KX-U6580 seems to show the same value for all cores but more testing is needed.
Since it works on amd64 capable CPUs, adding driver to GENERIC kernel config.
Also moving viac7temp man page to x86 instead of i386 (with updates).
In theory the change should add support for all VIA Nano CPUs and Zhaoxin CPUs
at least up to KX-6000(G) series.
In the future I may need to introduce amd64 kernel module as well.
Plan to pullup to at least netbsd-10.
Patch mainly reviewed by riastradh.
(Path from Paolo Pisati in current_users@)
While here:
Simplify mp_cpu_start() ifdefs. MULTIPROCESSOR and HYPERV code falls under
NLAPIC > 0, thus just combine all blocks under this guard.
Rearrange opt_acpi.h include alphabetically.
r. 1.39 introduced a regression where instead of applying a reasonable
default maximum (as was done prior to that change), incorrect values
were accepted and applied, as failures to retrieve an expected MSR
value weren't accounted for.
Apply different logic for unexpectedly low vs. high maximums, with
distinct warnings for each. Also add another warning about a retrieval
failure right at the outset (which also just uses the default, then).
This change fundamentally doesn't address the fact that
__SHIFTOUT(msr, MSR_TEMP_TARGET_READOUT)
doesn't necessarily return a valid value. It just restores prior
behaviour, which is more reasonable than applying a zero value, which
started happening on some older hardware. (I infer this is most likely
an issue with dated generations of Intel hardware with this feature.)
The challenge is that this evidently isn't all documented properly
anywhere. Various "magic values" in this driver need further
investigation.
While here, also fix output so warnings are cleanly formatted, rather
than the slightly scrambled way they were appearing.
Tested on older Intel hardware I had on hand:
E7500 (now falls back to default 100 rather than 0)
E5540 (successfully retrieves 97, as before)
i5-3340M (successfully retrieves 105, as before)
It seems that even though both these platforms have 12-byte floats
that are pretty much the same representation and both allegedly
IEEE-compliant, they manifest the top bit of the mantissa and then
differ slightly in the behavior of the extra encodings this permits.
Thanks to riastradh@ for helping sort this out.
from the new comment:
* This requires disabling CC6 power level, which can be a performance
* issue since it stops full turbo in some implementations (eg, half the
* cores must be in CC6 to achieve the highest boost level.) Set a timer
* to fire in 1000 days -- except NetBSD timers end up having a signed
* 32-bit hz-based value, which rolls over in under 25 days with HZ=1000,
* and doing xcall(9) or kthread(9) from a callout is not allowed anyway,
* so just have a kthread wait 1 day for 1000 times.
documented in:
https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/revision-guides/56323-PUB_1_01.pdf
zenbleed is reported as "erratum 65535" currently, this adds a name
for it, and enables the name for any others as well.
pull logging into a function with a tag message.
console on EFI-only hardware.
Add a xen_genfb_getbtinfo() function which will return a btinfo_framebuffer
structure, filled in with parameters provided by Xen
when runing as a Xen dom0, call xen_genfb_getbtinfo() instead of
lookup_bootinfo(BTINFO_FRAMEBUFFER) when adding properties to the
PCI graphic device (when genfb is attached) and in x86_genfb_init()
when genfb is used as console.
x86/x86/consinit.c: If running as a Xen dom0, use xen_genfb_getbtinfo()
to check if we have a genfb console
xen/x86/consinit.c: support genfb as possible console
xen/x86/consinit.c: use the hypervior IO as console until a better one
is found. If the hypervisor is using a serial port for boot messages,
we'll get NetBSD's boot message on the serial port too until
the real console takes over.
xen/x86/autoconf.c: rework device_register() to be closer to the x86 version.
Especially make sure that device_pci_register() is called.
This shouldn't break any existing systems (for real this time), but
it should make the failure mode more obvious on systems that are
already broken.
PR kern/57661
XXX pullup-10
XXX pullup-9
XXX pullup-8
int acpi_md_vesa_modenum;
int acpi_md_vbios_reset;
struct vcons_screen x86_genfb_console_screen;
in genfb_machdep.h instead of locally as extern in various .c files.
This is apparently so broken that the error check for what should
have been a safe size fails, which is breaking boot on x86 all the
way back to Sandy Bridge at this point. Grrr.
We need to expand savefpu so that it supports the maximum size
instead.
Ideally this wouldn't panic, but the alternative right now is to
crash in a memset later -- or silently corrupt kernel memory -- so
this doesn't make the situation worse than it was before.
PR kern/57661
XXX pullup-10
XXX pullup-9
XXX pullup-8
syscall/trap entry, eliminating a test+branch on every syscall/trap.
This wasn't possible in the 3.99.x timeframe when l->l_cred came about
because there wasn't a reliable/timely way to force an ONPROC LWP running on
a remote CPU into the kernel (which is just about the only new thing in
this scheme).
the 0x80000001 CPUID result needs some parsing to match against
actual family/model/stepping values. 4-bit 'family' values of
15 or 6 change how to parse the 4-bit extended model and 8-bit
extended family value - for family 6 or 15, the extended model
bits (4) are concatenated with the base 4-bits to create an
8-bit value, and for family 15, the family value is addition
of the family value and the 8-bit extended-family value, giving
a range of 0 to 15 + 0xff aka 270.
use a CPUREV(family, model, stepping) macro that builds the
relevant bit-representation of a CPUID, making it far easier
to understand what each entry means, and to add new ones too.
i have confirmed that the emitted cpurevs[] array has the same
values before/after this change, ie, NFCI or observed.