Add support for the forthcoming AVX-512 registers.
Code compiled with -mavx seems to work, but I've not tested context
switches with live ymm registers.
There is a small cost on fork/exec (a larger area is copied/zeroed),
but I don't think the ymm registers are read/written unless they
have been used.
The code uses XSAVE on all cpus; I'm not brave enough to enable XSAVEOPT.
being read.
The most obvious side effect is that the anita tests failed to detect they
were running under qemu, and so reported failures under qemu for things
that qemu doesn't support.
x87 control word.
This means that nothing outside fpu.c cares about the internals of the
fpu save area.
New kernel modules won't load with the old kernel - but that won't matter.
The result is only written to sysctl nodes at the moment.
I see:
machdep.fpu_save = 3 (implies xsaveopt)
machdep.xsave_size = 832
machdep.xsave_features = 7
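The two values above are consistent with the standard XSAVE layout: a 512-byte FXSAVE-compatible region, a 64-byte header, and 256 bytes for the upper halves of the sixteen ymm registers, with feature bits for x87, SSE and AVX. A minimal sketch of that arithmetic follows; the macro and function names are hypothetical, and a real kernel would presumably take the size from cpuid rather than compute it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the bit positions follow the XCR0 layout. */
#define XSAVE_FEAT_X87	0x1u	/* x87 FPU state */
#define XSAVE_FEAT_SSE	0x2u	/* xmm registers and MXCSR */
#define XSAVE_FEAT_AVX	0x4u	/* upper halves of the ymm registers */

#define XSAVE_LEGACY_SIZE	512u	/* FXSAVE-compatible region */
#define XSAVE_HEADER_SIZE	64u	/* XSAVE header */
#define XSAVE_AVX_SIZE		256u	/* 16 registers * 16 bytes */

/* Size of a standard-format save area for x87, SSE and optionally AVX. */
static uint32_t
xsave_area_size(uint32_t features)
{
	uint32_t size = XSAVE_LEGACY_SIZE + XSAVE_HEADER_SIZE;

	if (features & XSAVE_FEAT_AVX)
		size += XSAVE_AVX_SIZE;
	return size;
}
```

With features = 7 (x87 | SSE | AVX) this gives 512 + 64 + 256 = 832, matching the sysctl output.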
Completely common up the i386 and amd64 machdep sysctl creation.
for the normal and extended leafs.
(The 'normal' one might be lurking in the global cpulevel.)
Read the 'extended feature' from cpuid.80000001.%ecx/edx into
ci_feat_val[3/2] just after saving cpuid.1.%ecx/edx in ci_feat_val[1/0]
instead of doing it separately for amd k678 and via c3 processors
in their probe functions and repeating it for all cpus a few instructions
later when x86_cpu_topology() is called.
x86_cpu_topology() is only called from cpu_probe() and really doesn't
deserve its own source file. Chasing the setup code is bad enough anyway.
kernel stack to the top.
Change the pcb layouts so that fpu save area is at the end and is
64-byte aligned ready for xsave (saving the ymm registers).
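The layout constraint can be sketched as below. This is only an illustration: the structure and field names are hypothetical stand-ins, not the real pcb definition, and the aligned attribute is the GCC/Clang spelling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the FPU save area (x87 + SSE + AVX). */
struct fpu_save_sketch {
	uint8_t bytes[512 + 64 + 256];
} __attribute__((aligned(64)));

/* Hypothetical stand-in for a pcb with the save area as last member. */
struct pcb_sketch {
	uintptr_t pcb_regs[8];			/* other per-thread state */
	int pcb_flags;
	struct fpu_save_sketch pcb_savefpu;	/* at the end, 64-byte aligned */
};

/* Compile-time checks: aligned offset, and nothing follows the save area. */
_Static_assert(offsetof(struct pcb_sketch, pcb_savefpu) % 64 == 0,
    "save area must be 64-byte aligned for xsave");
_Static_assert(offsetof(struct pcb_sketch, pcb_savefpu) +
    sizeof(struct fpu_save_sketch) == sizeof(struct pcb_sketch),
    "save area must be the last member");

/* Run-time check on an actual object. */
static int
save_area_is_aligned(const struct pcb_sketch *pcb)
{
	return ((uintptr_t)&pcb->pcb_savefpu % 64) == 0;
}
```

Keeping the save area last means a larger area can be substituted later without moving the other fields.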
Welcome to 6.99.32
Add the file back so that the firefox source doesn't have to depend
on the version of netbsd it is being compiled for.
(The i386 version doesn't play the same games in its SIGFPE handler.)
helper functions in arch/x86/x86/fpu.c
They (hopefully) ensure that we write to the entire buffer and don't load
values that might cause faults in kernel.
Also zero out the 'pad' field of the i386 mcontext fp area that I think
once contained the registers of any Weitek fpu.
Dunno why it wasn't part of the union.
Some of these copies could be removed if the code directly copied the save
area to/from userspace addresses.
The cost of zeroing the save area on exec is minimal.
This stops the FP registers of a random process being used the first
time an lwp uses the fpu.
sendsig_siginfo() and get_mcontext() now unconditionally copy the FP
registers.
I'll remove the double-copy for signal handlers soon.
get_mcontext() might have been leaking kernel memory to userspace - and
may still do so if i386_use_fxsave is false (short copies).
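The leak concern with short copies can be illustrated as follows: if only a 108-byte fsave image is copied into a buffer sized for the 512-byte fxsave image, the remaining bytes keep whatever was there before. A minimal sketch, with hypothetical structure and function names, that zeroes the whole destination first:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FSAVE_SIZE	108	/* legacy fsave image */
#define FXSAVE_SIZE	512	/* fxsave image */

/* Hypothetical stand-in for the FP part of an mcontext. */
struct fp_context_sketch {
	unsigned char fpregs[FXSAVE_SIZE];
};

/*
 * Copy 'len' bytes of saved state into the context. The whole
 * destination is zeroed first so that a short copy cannot expose
 * stale bytes to the consumer.
 */
static void
export_fpregs(struct fp_context_sketch *dst, const void *save_area,
    size_t len)
{
	if (len > sizeof(dst->fpregs))
		len = sizeof(dst->fpregs);
	memset(dst, 0, sizeof(*dst));
	memcpy(dst->fpregs, save_area, len);
}
```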
This changes the trap10 and trap13 code to call directly into fpu.c,
removing all the code for T_ARITHTRAP, T_XMM and T_FPUNDA from i386/trap.c
Not all of the code that appeared to handle fpu traps was ever called!
Most of the changes just replace the include of machine/npx.h with x86/fpu.h
(or remove it entirely).
Move the checks for fpu traps in kernel into x86/fpu.c.
Remove the code from amd64/trap.c related to fpu traps (they've not gone
there for ages - except to panic in kernel mode).
In fpudna():
- Don't actually enable hardware interrupts unless we need to
allow in IPIs.
- There is no point in enabling them when they are blocked in software
(by splhigh()).
- Keep the splhigh() to avoid a load of the KASSERTS() firing.
inside the ucontext structure passed to signal handlers to modify the
xmm registers.
This should make the code compile - I'm not at all sure it works as expected,
the interactions between FP and signal handlers aren't at all clear.
AFAICT the FP state is saved on the user stack when the handler is called,
however the FP trap code can already have done odd things to the FPU....
definitions match those of i386.
Mostly just structure and field renames, in addition:
1) process_xmm_to_s87() and process_s87_to_xmm() moved into
x86/convert_xmm_s87.c so they can be used by amd64's netbsd32 code.
2) The linux signal code simplified to use a structure copy for the fxsave
data - it matches the hardware definition and won't change.
Rename the associated ci_fpsaving field to 'unused'.
I'm not sure they could ever happen, you could get unwanted calls into
the fpu trap code while saving state when using INT13 - but these are
different.
The return value from the i386 fpudna() was always 1 - possibly a historic
relic of the kernel fp emulation. Remove and don't check in trap.S.
The amd64 and i386 fpudna() code is now almost identical.
Set default CR0 so that the FPU is enabled (unset CR0_EM) and initialise
i386_fpu_present to 1.
No need to call the npx trap indirectly, rename to fpudna() to match amd64.
Remove the i386_fpu_exception variable and sysctl (It used to indicate
which irq was used for fpu exceptions, but we only support 'internal'
now). Hopefully no one cares.
fpuinit() now only needs to clear TS before the fninit(). Apart from the
checks for 486SX and the 'fdiv bug' this matches the amd64 version.
Exclude fpuinit() from XEN kernels, they don't call it - which rather begs
the question as to whether it is needed at all!
compatible method of handling floating point exceptions.
Make kernel support for the fpu non-optional (486SX should still work).
Only 386 cpus support external fpu, and i386 support was removed years ago.
This means that the npx code no longer uses port 0xf0 or interrupt 13.
All the "npx at isa" lines go from the configs, arch/i386/isa/npx.c
is now mandatory for all i386 kernels.
I've renamed npxinit() to fpuinit() and npxinit_cpu() to fpuinit_cpu()
to match the very similar amd64 functions.
The fpu of the boot cpu is now initialised by a direct call from
cpu_configure(), this enables FP emulation for a 486SX.
(for amd64 the cr0 values are set in locore.S and similar).
This fixes a long-standing bug in linux_setregs() - which did not
save the fpu registers if they were active.
I've test booted a single cpu i386 kernel (using anita).
amd64 builds - none of the changes should affect it.
The i386 XEN kernels build, but I'm not sure where they set cr0, and
it might have got lost!
- Fix a bug where the puc cn mechanism didn't use the UART's frequency
from pucdata.c's table.
- Add a new option PUC_CNAUTO. If this option is set, consinit() in
x86/x86/consinit.c checks puc com device to use it as console.
Without this option, the behavior is the same as before.
- Add a new config parameter PUC_CNBUS. The old code scans bus #0 only.
If PUC_CNBUS is set, the bus with the specified number is scanned instead.
- Rename comcnprobe() to puc_cnprobe() to make it clear.
- Rename comcninit() to puc_cninit() to make it clear.
- Add code for devices whose com registers are MMIO (#if 0'ed).
I haven't studied the code, but I'm concerned that not initializing
sf->sf_edi could potentially leak a few bytes of information to a new
userspace process.
of the fp save area to all the process_read_fpregs() and
process_write_fpregs() functions.
None of the functions have been modified to use the new parameters.
The size is set for all the writes, but some of the arch-specific reads
just pass NULL.
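Since none of the functions use the new parameter yet, the following is only a guess at how a reader might honour it later. The function name, the structure and the in/out semantics of the size pointer are all assumptions made for illustration; the only behaviour taken from the text is that callers may pass NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size register image. */
struct fpreg_sketch {
	unsigned char data[512];
};

/*
 * Hypothetical reader. If 'size' is NULL the caller's buffer is taken
 * to hold a full struct fpreg_sketch. Otherwise '*size' is the buffer
 * size on entry and the number of bytes written on return.
 */
static int
read_fpregs_sketch(const struct fpreg_sketch *saved, void *buf,
    size_t *size)
{
	size_t len = sizeof(*saved);

	if (size != NULL) {
		if (*size < len)
			len = *size;	/* never write past the buffer */
		*size = len;		/* report how much was written */
	}
	memcpy(buf, saved, len);
	return 0;
}
```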
The amd64 (and i386) need variable sized fp register save areas in order
to support AVX and other enhanced register areas.
These functions are rarely called - so the extra argument won't matter.
from 'void *' to the actual type 'struct coredump_iostate *'.
In most of the code the contents of the structure are still unknown.
This just stops the wrong type of pointer being passed to the 'void *'
parameter.
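The benefit of the typed pointer can be sketched as below. Only the name struct coredump_iostate comes from the change itself; the field and the function are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration: most callers only ever see an opaque type. */
struct coredump_iostate;

/* Hypothetical writer; the typed parameter replaces a 'void *' cookie. */
static int coredump_write_sketch(struct coredump_iostate *io,
    const void *data, size_t len);

/* Only the implementation knows the layout (the field is hypothetical). */
struct coredump_iostate {
	size_t io_offset;
};

static int
coredump_write_sketch(struct coredump_iostate *io, const void *data,
    size_t len)
{
	(void)data;		/* a real writer would emit the bytes */
	io->io_offset += len;	/* track how much has been written */
	return 0;
}
```

With a 'void *' parameter any object pointer converts silently; with the typed parameter, passing some other structure pointer draws a compiler diagnostic.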
I hope I've found everything, amd64 GENERIC and i386 GENERIC & ALL compile.
Adjust pcu(9) to this xcall(9) change. This may fix the problems after
x86 FPU was converted to use PCU, since it avoids heavy contention at the
lower levels (particularly, IPL_SOFTNET). This is a good illustration of why
software interrupts should generally avoid any blocking on locks.
the document (AMD64 Architecture Programmer's Manual Volume 3: General-Purpose
and System Instructions. Document revision 3.20)
- "s/MXX/MMXX/" because this bit is "MMX eXtension".
to reduce code duplication and to avoid bugs.
CPUID_TO_STEPPING(cpuid) (not changed)
CPUID_TO_FAMILY(cpuid) (new)
CPUID_TO_MODEL(cpuid) (new)
Return the display family and the display model.
The macro names are the same as FreeBSD.
CPUID_TO_BASEFAMILY(cpuid) (The old name was CPUID2FAMILY)
CPUID_TO_BASEMODEL(cpuid) (The old name was CPUID2MODEL)
Only for the base field.
CPUID_TO_EXTFAMILY(cpuid) (The old name was CPUID2EXTFAMILY)
CPUID_TO_EXTMODEL(cpuid) (The old name was CPUID2EXTMODEL)
Only for the extended field.
See http://mail-index.netbsd.org/port-amd64/2013/11/12/msg001978.html
The function checks bits above bit 3 of the cpu_family variable, so the
variable is assumed to hold the display family (base family + extended
family) rather than the base family.
CPUID2EXTMODEL() must be used only when the CPUID2FAMILY() macro returns
0xf or 0x6. Also fix a bug where the CPUID2EXTMODEL() value was _ADDED_. The
correct way is to shift the return value of CPUID2EXTMODEL() left by 4 bits
and _OR_ it in.
The CPUID2MODEL() macro returns only the low 4 bits, so the check against 0x17
doesn't work correctly. The correct way is to use the display model.
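The display family and display model computations described in these entries can be sketched as follows. The helper names are hypothetical and are not the kernel macros; the field positions are those of cpuid leaf 1 %eax, and the 0x6/0xf condition for the extended model is the one stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Raw fields of cpuid.1.%eax (hypothetical helper names). */
static uint32_t base_family(uint32_t id) { return (id >> 8) & 0xf; }
static uint32_t base_model(uint32_t id)  { return (id >> 4) & 0xf; }
static uint32_t ext_family(uint32_t id)  { return (id >> 20) & 0xff; }
static uint32_t ext_model(uint32_t id)   { return (id >> 16) & 0xf; }

/* Display family: the extended family is added only when base is 0xf. */
static uint32_t
display_family(uint32_t id)
{
	uint32_t family = base_family(id);

	if (family == 0xf)
		family += ext_family(id);
	return family;
}

/* Display model: the extended model is shifted left 4 bits and ORed in. */
static uint32_t
display_model(uint32_t id)
{
	uint32_t family = base_family(id);
	uint32_t model = base_model(id);

	if (family == 0x6 || family == 0xf)
		model |= ext_model(id) << 4;
	return model;
}
```

For the signature 0x00010676 the base model is 0x7 while the display model is 0x17, which is why a comparison of the base model against 0x17 can never match.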
Remove incorrect extmodel check. Same as FreeBSD.
- avoid running over the end of an array (this is a real bug, but
i didn't really look closely at what memory is clobbered. it
may not actually matter.)
- move variables inside their #if usage.
This reduces the amount of MD code enormously, and makes it easier
to implement support for newer CPU features which require more fpu
state, or for fpu usage by the kernel.
For access to FPU state across CPUs, an xcall kthread is used now
rather than a dedicated IPI.
No user visible changes intended.
this change is intended to mirror what ipmitool does.
(their macros for these bits are IS_READING_UNAVAILABLE and
IS_SCANNING_DISABLED.)
see also:
second-gen-interface-spec-v2-rev1-4
Table 35-15, Get Sensor Reading Command
might fix PR/46833 from Francois Tigeot
reviewed by Masanobu SAITOH and Tom Ivar Helbekkmo
tested by Tom Ivar Helbekkmo
can become headless after the first reboot (sadly, e.g. Intel AMT presents
as a com_puc, but doesn't appear in the BIOS serial port table, so you need
a keyboard and monitor to install and set the installboot parameters first).
Fix com_puc console on devices with offset BARs.
when using a temporary mp_intr_map, initialize the "flags" field
as well as "redir" since apic_set_redir() uses both. fix how
the flags field is changed when applying an override, the trigger
and polarity sub-fields aren't just one bit like they are in redir.
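Updating a multi-bit sub-field needs a clear-then-set rather than toggling a single bit. A minimal sketch, assuming the MP-table layout of the interrupt flags (polarity in bits 1:0, trigger mode in bits 3:2); the macro and function names are hypothetical, not those used by the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Two-bit sub-fields of the interrupt flags (hypothetical names). */
#define INTR_POL_SHIFT	0
#define INTR_POL_MASK	0x3u
#define INTR_TRIG_SHIFT	2
#define INTR_TRIG_MASK	0x3u

#define INTR_TRIG_EDGE	0x1u
#define INTR_TRIG_LEVEL	0x3u

/* Replace one sub-field, leaving all other bits untouched. */
static uint32_t
intr_set_field(uint32_t flags, uint32_t mask, unsigned int shift,
    uint32_t value)
{
	flags &= ~(mask << shift);		/* clear the whole field */
	flags |= (value & mask) << shift;	/* then install the value */
	return flags;
}
```

Changing a level-triggered entry (field value 3) to edge (field value 1) has to clear both bits of the field; ORing in the new value alone would leave it reading as level.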
Add periodic clock synchronization to vmt(4) so that the guest clock
remains synchronized even when the host is suspended (which is a very
typical situation in a laptop).
Do this by default once per minute, but provide a sysctl to tune this
value (machdep.vmt0.clock_sync.period).
Sent to tech-kern@ for review and addressed a couple of issues.
Taken from the August 2012 Intel SDM (intel_x86_325462.pdf).
Split all the snprintb() format strings to make them (almost) readable.
Fix CPUID_AMD_FLAGS4 to not try to print bits \41 and \42.
broke the build for x86 systems that have MULTIPROCESSOR but which do not
include MPBIOS. So let's try to untangle things just a bit. Presented
on current-users (and referenced on source-changes-d) without any comment.
XXX We really should find a better method to select kernel options; #ifdef
spaghetti is rather sub-optimal.