99ed8445ea
Latest Intel platform Granite Rapids has introduced a new instruction - AMX-FP16, which performs dot-products of two FP16 tiles and accumulates the results into a packed single precision tile. AMX-FP16 adds FP16 capability and allows a FP16 GPU trained model to run faster without loss of accuracy or added SW overhead. The bit definition: CPUID.(EAX=7,ECX=1):EAX[bit 21] Add CPUID definition for AMX-FP16. Signed-off-by: Jiaxi Chen <jiaxi.chen@linux.intel.com> Signed-off-by: Tao Su <tao1.su@linux.intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-Id: <20230303065913.1246327-3-tao1.su@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
---|---|---|
.. | ||
alpha | ||
arm | ||
avr | ||
cris | ||
hexagon | ||
hppa | ||
i386 | ||
loongarch | ||
m68k | ||
microblaze | ||
mips | ||
nios2 | ||
openrisc | ||
ppc | ||
riscv | ||
rx | ||
s390x | ||
sh4 | ||
sparc | ||
tricore | ||
xtensa | ||
Kconfig | ||
meson.build |