Bochs/bochs/cpu
Stanislav Shwartsman 01af7f5346 Implemented VRSQRT14 AVX-512 instructions & optimized legacy SSE RSQRTSS/PS instructions handling
//
// The table lookup was reverse-engineered from VRSQRT14SS instruction implementation available
// in the Intel Software Development Emulator rev6.20 (released February 13, 2014)
// http://software.intel.com/en-us/articles/intel-software-development-emulator/
//

// TODO: find better way to emulate these instructions, I am sure the HW doesn't have 64K entry lookup tables

The only AVX-512 opcodes still missing are:

512.66.0F38.W0 2C VSCALEFPS
512.66.0F38.W1 2C VSCALEFPD
NDS.LIG.66.0F38.W0 2D VSCALEFSS
NDS.LIG.66.0F38.W1 2D VSCALEFSD

512.66.0F3A.W0 08 VRNDSCALEPS
512.66.0F3A.W1 09 VRNDSCALEPD
NDS.LIG.66.0F3A.W0 0A VRNDSCALESS
NDS.LIG.66.0F3A.W1 0B VRNDSCALESD
2014-02-25 18:57:49 +00:00
..
cpudb various fixes 2013-08-29 19:43:15 +00:00
fpu bugfix and code cleanup 2014-02-12 20:31:22 +00:00
3dnow.cc use shorter opcode names in the debug prints (skip the BX_IA_ prefix) 2013-12-02 20:06:59 +00:00
access32.cc first infrastructure changes to support EVEX prefix and AVX-512 extensions recently published by Intel 2013-07-26 12:50:56 +00:00
access64.cc fixed segfault in AVX emulation 2013-12-22 21:16:10 +00:00
access.cc move canonical check of high part of page split access to another function to fix code duplication 2013-12-21 21:56:55 +00:00
aes.cc Implemented VCMPPS/PD/SS/SD AVX512 instructions 2013-12-03 15:44:23 +00:00
apic.cc Move INTR, Local APIC INTR and SVM VINTR into new event interface (hardest part) 2012-10-03 20:24:29 +00:00
apic.h preparations for apic regs virtualization feature described in SDM rev044 2012-09-06 15:21:08 +00:00
arith8.cc CMPXCHG should always write to memory dest - affects APIC virtualization VMEXIT conditions 2013-07-24 21:06:24 +00:00
arith16.cc Infrastructure change to support disasm of BxInstruction_c directly (without calling disasm) 2013-09-24 05:21:00 +00:00
arith32.cc Infrastructure change to support disasm of BxInstruction_c directly (without calling disasm) 2013-09-24 05:21:00 +00:00
arith64.cc VMX: CMPXCHG instructions should always write to the memory destination, even if the value is unchanged - it affects VMEXIT conditions for the full apic virtualization 2013-08-04 19:37:04 +00:00
avx2.cc Implemented VPMOVSX*/VPMOVZX* AVX-512 instructions 2014-02-02 19:56:08 +00:00
avx512_cvt.cc Implemented VCVTPS2PH AVX-512 instruction 2014-02-15 19:21:08 +00:00
avx512_fma.cc implement EVEX SAE (suppress all exceptions) control, implement AVX512 INSERTPS/EXTRACTPS opcodes 2013-12-14 12:45:06 +00:00
avx512_mask16.cc avx512 implementation fixes and next steps 2013-10-08 18:31:18 +00:00
avx512_move.cc Implemented VPMOVSX*/VPMOVZX* AVX-512 instructions 2014-02-02 19:56:08 +00:00
avx512_pfp.cc Implemented VRCP14 AVX-512 instructions. 2014-02-24 21:31:52 +00:00
avx512_rcp14.cc Implemented VRSQRT14 AVX-512 instructions & optimized legacy SSE RSQRTSS/PS instructions handling 2014-02-25 18:57:49 +00:00
avx512_rsqrt14.cc Implemented VRSQRT14 AVX-512 instructions & optimized legacy SSE RSQRTSS/PS instructions handling 2014-02-25 18:57:49 +00:00
avx512.cc bugfix 2014-02-11 17:47:52 +00:00
avx_cvt.cc remove code duplication, prepare for 512-bit version of cvtps2ph 2014-02-08 19:18:17 +00:00
avx_fma.cc implement EVEX SAE (suppress all exceptions) control, implement AVX512 INSERTPS/EXTRACTPS opcodes 2013-12-14 12:45:06 +00:00
avx_pfp.cc implement DPPS/DPPD ops using existing primitives; added some missing defs 2014-02-02 18:57:25 +00:00
avx.cc infrastructure change for avx-512: before going to more new instructions modelling 2014-01-10 19:40:38 +00:00
bcd.cc reword all the CPU code in preparation for future CPU speedup implementation. 2011-07-06 20:01:18 +00:00
bit16.cc optimize POPCNT implementation 2012-09-21 14:56:56 +00:00
bit32.cc optimize POPCNT implementation 2012-09-21 14:56:56 +00:00
bit64.cc optimize POPCNT implementation 2012-09-21 14:56:56 +00:00
bit.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
bmi32.cc add dedicated 8bit low register accessor 2013-12-01 22:18:38 +00:00
bmi64.cc add dedicated 8bit low register accessor 2013-12-01 22:18:38 +00:00
call_far.cc stack direct access optimization - 5% emu speedup to all 32-bit guests, for 64-bit guests speedup is less because they have fewer stack accesses 2012-03-25 11:54:32 +00:00
cpu.cc added lock prefix used info into bxInstruction_c and use it in disasm 2013-11-08 21:43:21 +00:00
cpu.h rewritten xsave/xrstor for more modular functionality. todo: replace walk through state using simple for loop 2014-02-22 21:00:47 +00:00
cpuid.h added definitions (CPUID bit, VMX fields and VMXEXIT reasons, etc) from recently published Intel SDM rev049 2014-02-06 17:05:20 +00:00
crc32.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
crregs.cc use shorter opcode names in the debug prints (skip the BX_IA_ prefix) 2013-12-02 20:06:59 +00:00
crregs.h more avx-512 instructions implemented 2013-12-01 19:39:18 +00:00
ctrl_xfer16.cc use shorter opcode names in the debug prints (skip the BX_IA_ prefix) 2013-12-02 20:06:59 +00:00
ctrl_xfer32.cc use shorter opcode names in the debug prints (skip the BX_IA_ prefix) 2013-12-02 20:06:59 +00:00
ctrl_xfer64.cc use shorter opcode names in the debug prints (skip the BX_IA_ prefix) 2013-12-02 20:06:59 +00:00
ctrl_xfer_pro.cc - Do not compile support for alignment check (#AC exception) by default 2012-03-25 19:07:17 +00:00
data_xfer8.cc Infrastructure change to support disasm of BxInstruction_c directly (without calling disasm) 2013-09-24 05:21:00 +00:00
data_xfer16.cc Infrastructure change to support disasm of BxInstruction_c directly (without calling disasm) 2013-09-24 05:21:00 +00:00
data_xfer32.cc Infrastructure change to support disasm of BxInstruction_c directly (without calling disasm) 2013-09-24 05:21:00 +00:00
data_xfer64.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
debugstuff.cc fixed compilation error with vs2008 2013-10-25 05:36:10 +00:00
descriptor.h fixed 64-bit segment print from internal debugger 2012-06-14 18:56:47 +00:00
disasm.cc fix of compilation err 2014-02-17 16:19:43 +00:00
event.cc update (c) for few files 2013-09-05 18:40:14 +00:00
exception.cc Added VMEXIT instrumentation callback 2013-10-23 21:18:19 +00:00
fetchdecode64.cc implementation of AVX-512 compressed displacement feature which is required for AVX-512 emu correctness (first step). todo: fix rest of EVEX opcodes 2014-02-10 21:12:08 +00:00
fetchdecode_avx.h implemented few more avx-512 cvt opcodes 2014-01-21 21:00:40 +00:00
fetchdecode_evex.h added template for missing avx-512 instructions 2014-02-17 20:21:58 +00:00
fetchdecode_sse.h finish sse tables cleanup in disasm and fetchdecode 2013-10-11 20:09:51 +00:00
fetchdecode_x87.h disasm fixes 2013-10-07 19:02:53 +00:00
fetchdecode_xop.h Debugger: fixed param tree access to 64-bit variables (need to use get64() instead of get()) 2013-12-05 19:17:16 +00:00
fetchdecode.cc complete compressed displ feature support, bugfixes in AVX-512 code 2014-02-11 16:10:31 +00:00
fetchdecode.h cover some more opcodes with compressed displ 2014-02-10 21:34:26 +00:00
flag_ctrl_pro.cc Move INTR, Local APIC INTR and SVM VINTR into new event interface (hardest part) 2012-10-03 20:24:29 +00:00
flag_ctrl.cc Move INTR, Local APIC INTR and SVM VINTR into new event interface (hardest part) 2012-10-03 20:24:29 +00:00
fpu_emu.cc reword all the CPU code in preparation for future CPU speedup implementation. 2011-07-06 20:01:18 +00:00
gather.cc implemented few more AVX-512 floating point convert instructions 2014-01-18 20:10:05 +00:00
generic_cpuid.cc updates to AVX512 decoding and CPUID 2013-10-07 20:39:34 +00:00
generic_cpuid.h remove unused leafs from generic_cpuid 2012-05-11 06:51:04 +00:00
i387.h Adding Id and Rev property to all files 2011-02-24 21:54:04 +00:00
ia_opcodes.h added template for missing avx-512 instructions 2014-02-17 20:21:58 +00:00
icache.cc fixes for disasm 2013-10-02 19:23:34 +00:00
icache.h Thanks to advanced trace linking, the 256K-entry ICache is not needed anymore. 2013-06-29 10:25:56 +00:00
init.cc Debugger: fixed param tree access to 64-bit variables (need to use get64() instead of get()) 2013-12-05 19:17:16 +00:00
instr.h new function for disasm. todo: support it independently of CPU 2014-01-26 20:01:50 +00:00
io.cc updated + fixed instrumentation example for instr histogram, code cleanup in the cpu 2012-03-28 21:11:19 +00:00
iret.cc stack direct access optimization - 5% emu speedup to all 32-bit guests, for 64-bit guests speedup is less because they have fewer stack accesses 2012-03-25 11:54:32 +00:00
jmp_far.cc - Implemented Task Switch intercept in SVM, cleanup in task switch handling code 2012-01-11 20:21:29 +00:00
lazy_flags.h small optimization in lazy flags code 2012-09-06 19:49:14 +00:00
load.cc fixed bug in LOAD_BROADCAST_MASK_Half_VectorD method 2014-02-11 20:13:42 +00:00
logical8.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
logical16.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
logical32.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
logical64.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
Makefile.in Implemented VRSQRT14 AVX-512 instructions & optimized legacy SSE RSQRTSS/PS instructions handling 2014-02-25 18:57:49 +00:00
mmx.cc Implemented VPCMP* AVX512 instructions 2013-12-02 18:05:18 +00:00
msr.cc do not recognize MTRR MSRs when mtrr is not enabled 2013-04-17 19:59:56 +00:00
mult8.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
mult16.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
mult32.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
mult64.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
paging.cc move canonical check of high part of page split access to another function to fix code duplication 2013-12-21 21:56:55 +00:00
proc_ctrl.cc use shorter opcode names in the debug prints (skip the BX_IA_ prefix) 2013-12-02 20:06:59 +00:00
protect_ctrl.cc SVM: implemented missed RSM, LDTR READ/WRITE, TR READ/WRITE and IRET intercepts 2013-02-25 19:36:41 +00:00
rdrand.cc Add RDRAND/RDSEED instructions support (+ disasm) 2012-10-09 15:16:48 +00:00
resolver.cc Adding Id and Rev property to all files 2011-02-24 21:54:04 +00:00
ret_far.cc stack direct access optimization - 5% emu speedup to all 32-bit guests, for 64-bit guests speedup is less because they have fewer stack accesses 2012-03-25 11:54:32 +00:00
segment_ctrl_pro.cc - Do not compile support for alignment check (#AC exception) by default 2012-03-25 19:07:17 +00:00
segment_ctrl.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
sha.cc properly added sha.cc to the tree 2013-07-24 18:56:37 +00:00
shift8.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
shift16.cc fixed comments for SHLD/SHRD instructions and made the code a little clearer 2012-09-09 17:44:42 +00:00
shift32.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
shift64.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
simd_compare.h Implemented VPERMILPS/PD AVX512 instructions 2013-12-04 18:30:44 +00:00
simd_int.h fixed compilation err without avx 2014-01-23 17:08:30 +00:00
simd_pfp.h implemented avx-512 getexp instructions 2014-01-27 21:25:07 +00:00
smm.cc implementation of virtual NMI 2013-03-05 21:12:43 +00:00
smm.h Fixed SF bug [3548109] VMX State Not Restored After Entering SMM on 32-bit Systems 2012-07-27 08:13:39 +00:00
soft_int.cc use shorter opcode names in the debug prints (skip the BX_IA_ prefix) 2013-12-02 20:06:59 +00:00
sse_move.cc Implemented VPCMP* AVX512 instructions 2013-12-02 18:05:18 +00:00
sse_pfp.cc implemented AVX-512 version of VCVTPH2PS 2014-02-04 20:32:54 +00:00
sse_rcp.cc Implemented VRSQRT14 AVX-512 instructions & optimized legacy SSE RSQRTSS/PS instructions handling 2014-02-25 18:57:49 +00:00
sse_string.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
sse.cc make use of new accessor 2013-12-01 22:21:55 +00:00
stack16.cc small optimization 2014-02-01 19:23:41 +00:00
stack32.cc Infrastructure change to support disasm of BxInstruction_c directly (without calling disasm) 2013-09-24 05:21:00 +00:00
stack64.cc Infrastructure change to support disasm of BxInstruction_c directly (without calling disasm) 2013-09-24 05:21:00 +00:00
stack.cc properly added sha.cc to the tree 2013-07-24 18:56:37 +00:00
stack.h stack direct access optimization - 5% emu speedup to all 32-bit guests, for 64-bit guests speedup is less because they have fewer stack accesses 2012-03-25 11:54:32 +00:00
string.cc fixes for disasm 2013-10-15 17:19:18 +00:00
svm.cc use shorter opcode names in the debug prints (skip the BX_IA_ prefix) 2013-12-02 20:06:59 +00:00
svm.h updates in CPUID defines after new published AMD SDM 2013-05-17 19:41:57 +00:00
tasking.cc hw task switch tempdr6 handling fix 2013-03-15 08:26:22 +00:00
tbm32.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
tbm64.cc Standardization of Bochs instruction handlers. 2012-08-05 13:52:40 +00:00
todo added masked operations to simd_pfp.h, optimize simd_int.h, rewrite dpps instr using new masked op from simd_pfp.h 2013-09-17 20:49:26 +00:00
vapic.cc fixed compilation issue 2012-11-05 06:41:10 +00:00
vm8086.cc - Do not compile support for alignment check (#AC exception) by default 2012-03-25 19:07:17 +00:00
vmcs.cc rename some VMX controls to match intel docs. added missed VMX consistency check 2013-02-24 20:22:22 +00:00
vmexit.cc use shorter opcode names in the debug prints (skip the BX_IA_ prefix) 2013-12-02 20:06:59 +00:00
vmfunc.cc implemented virtualization exception feature 2013-01-28 16:30:25 +00:00
vmx.cc downgrade VMEXIT message to BX_DEBUG 2014-01-24 18:58:57 +00:00
vmx.h moved (c) to year 2014 for few files 2014-02-06 17:06:25 +00:00
xmm.h complete compressed displ feature support, bugfixes in AVX-512 code 2014-02-11 16:10:31 +00:00
xop.cc implemented few more AVX-512 floating point convert instructions 2014-01-18 20:10:05 +00:00
xsave.cc rewritten xsave/xrstor for more modular functionality. todo: replace walk through state using simple for loop 2014-02-22 21:00:47 +00:00