https://software.intel.com/sites/default/files/managed/c6/a9/319433-020.pdf
Most of the instructions are implemented, more on the way.
+ few bugfixes for legacy AVX-512 emulation
AVX-512: Fixed bug in VCMPPS masked instruction implementation with 512-bit data size
AVX-512: Fixed AVX-512 masked convert instructions with non-k0 mask (behaved as non masked versions)
AVX-512: Fixed missed #UD due to invalid EVEX prefix fields for several AVX-512 opcodes (VFIXUPIMMSS/SD, FMA)
Fixed memory access size for AVX shift instructions with shift count in memory
Do not allow to encode with EVEX.b instructions which do not support implicit broadcast
softfloat: prepare float32/64 to uint32 conversion functions
Debugger: if AVX-512 if not supported by current configuration do not print high256 of vector registers and zmm15..zmm31 in AVX command
Implement VBROADCASTF64x4, VBROADCASTF32x4, VBROADCASTFI64x4, VBROADCASTI32x4 AVX-512 instructions
Fetchdecode optimizations and bugfixes
Implemented VCOMISS/VCOMISD/VUCOMISS/VUCOMISD AVX512 instructions
Fix vector length values for AVX-512 (512-bit vector should have length 4)
support mis-alignment #GP exception for VMOVAPS/PD/DQA32/DQ64 AVX512 instructions
move AVX512 load/store and register move operations into dedicated file avx512_move.cc
The code already knows to disasm most of the opcodes with their operands.
- Split according to OSIZE opcodes RDFSBASE/WRFSBASE / RDGSBASE/WRGSBASE both for disasm and performance
- Minimize amount of opcode forms in ia_opcodes.h again.
For example Udq means the same as Wdq but with no memory form.
Fixed bug in stack optimization in 64-bit mode (should result in some speedup)
ia_opcode.h - eliminate some OP_M cases when they actually meant "value of specific type in the memory"
example: "MOVBE Md, Gd" actually means "MOVBE Ed, Gd"which just not have reg/reg form.
The end goal will be also merging of disasm and cpu decoder to one module and remove the disasm.
Two bug fixes on the way:
TBM: fixed 64-bit TBM instructions with memory access (did 32-bit load instead of 64-bit)
BMI2: fixed operands order for PEXT/PDEP instructions
AVX2: fixed gather instruction decoding bug from decoder alias commit