Stanislav Shwartsman
d8d4d2f0c1
Implemented VPSRLVW/VPSRAVW/VPSLLVW AVX512BW instructions
...
The only missing AVX512BW/AVX512DQ opcodes are now:
"512.66.0F3A.W1 0F VPALIGNR"
"NDS.66.0F3A.W0 42 VDBPSADBW"
"NDS.512.66.0F3A.W0 50 VRANGEPS
NDS.512.66.0F3A.W1 50 VRANGEPD"
"NDS.512.66.0F3A.W0 51 VRANGESS
NDS.512.66.0F3A.W1 51 VRANGESD"
"NDS.512.66.0F3A.W0 56 VREDUCEPS
NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-07-25 21:15:48 +00:00
Stanislav Shwartsman
79456eb7e1
Implemented VPCMP* AVX512 instructions
...
Implemented VMOVNTPS/PD/DQ AVX512 instructions
Implemented VMOVNTDQA AVX512 instruction
Bugfixes for AVX-512
2013-12-02 18:05:18 +00:00
Stanislav Shwartsman
940c2a1c8e
fixes for disasm
2013-10-15 17:19:18 +00:00
Stanislav Shwartsman
ff79cbd596
Infrstructure change to support disasm of BxInstruction_c directly (without calling disasm)
...
The end goal will be also merging of disasm and cpu decoder to one module and remove the disasm.
Two bug fixes on the way:
TBM: fixed 64-bit TBM instructions with memory access (did 32-bit load instead of 64-bit)
BMI2: fixed operands order for PEXT/PDEP instructions
AVX2: fixed gather instruction decoding bug from decoder alias commit
2013-09-24 05:21:00 +00:00
Stanislav Shwartsman
df90b80352
set of small cpu fixes
2012-08-09 13:11:25 +00:00
Stanislav Shwartsman
cc694377b9
Standartization of Bochs instruction handlers.
...
Bochs instruction emulation handlers won't refer to direct fields of instructions like MODRM.NNN or MODRM.RM anymore.
Use generic source/destination indications like SRC1, SRC2 and DST.
All handlers are modified to support new notation. In addition fetchDecode module was modified to assign sources to instructions properly.
Immediate benefits:
- Removal of several duplicated handlers (FMA3 duplicated with FMA4 is a trivial example)
- Simpler to understand fetch-decode code
Future benefits:
- Integration of disassembler into Bochs CPU module, ability to disasm bx_instruction_c instance (planned)
Huge patch. Almost all source files wre modified.
2012-08-05 13:52:40 +00:00
Stanislav Shwartsman
002c86660a
reword all the CPU code in preparation for future CPU speedup implementation.
...
Bochs emulation can be another 10-15% faster using technique described in paper
"Fast Microcode Interpretation with Transactional Commit/Abort"
http://amas-bt.cs.virginia.edu/2011proceedings/amasbt2011-p3.pdf
2011-07-06 20:01:18 +00:00
Stanislav Shwartsman
87953711b1
cleanup in mmx code
2011-06-26 19:31:42 +00:00
Stanislav Shwartsman
2f582db722
compile less stuff for cpu-level=5
2011-06-26 19:15:30 +00:00
Stanislav Shwartsman
7d80a6ebe0
Adding Id and Rev property to all files
2011-02-24 21:54:04 +00:00
Stanislav Shwartsman
5915d92775
very small optimizations + indent
2011-01-25 20:59:26 +00:00
Stanislav Shwartsman
5917eb29ab
sse + mmx optimizations
2011-01-16 21:01:28 +00:00
Stanislav Shwartsman
8c5c078b13
optimize sse and mmx code
2011-01-16 20:42:28 +00:00
Stanislav Shwartsman
c5aca5ac21
move function to inline
2011-01-08 19:50:22 +00:00
Stanislav Shwartsman
f705cbbc63
rename functions
2010-12-25 19:34:43 +00:00
Stanislav Shwartsman
1bd512e98d
split more SSE ops, optimizations in MMX code
2010-12-25 17:04:36 +00:00
Stanislav Shwartsman
471d33fc1d
fix BE issue
2010-09-26 20:20:27 +00:00
Stanislav Shwartsman
d36e6d78e7
typo fix
2010-03-02 06:49:41 +00:00
Stanislav Shwartsman
01cfbdccbc
Move MMX to be runtime option
2010-03-01 18:53:53 +00:00
Stanislav Shwartsman
927c3594d6
enable compilation with CPU_LEVEL <= 6
...
converted SEP to runtime option as well
2010-02-26 11:44:50 +00:00
Stanislav Shwartsman
033a20b3b2
allow to configure CPU features at runtime - implemened on example of SSE/AES/MOVBE/POPCNT
2010-02-25 22:04:31 +00:00
Stanislav Shwartsman
70dc124b3a
1st step of moving CPU options to runtime
2010-02-24 19:27:51 +00:00
Stanislav Shwartsman
5f89b554aa
split few more opcodes
2010-02-10 17:21:15 +00:00
Stanislav Shwartsman
edaf19f0a1
Split MOVQ_PqQq opcode
2009-12-14 11:55:42 +00:00
Stanislav Shwartsman
67e4f97e73
make maskmov fault order like in real HWQ
2009-11-13 09:55:22 +00:00
Stanislav Shwartsman
78e4b3d616
split SSE move instructions
2009-10-24 11:17:51 +00:00
Stanislav Shwartsman
7254ea36a1
copyright fixes + small optimization
2009-10-14 20:45:29 +00:00
Stanislav Shwartsman
f1eb1d00fd
do not produce fpu2mmx transition if mem write faults
2009-02-13 10:15:16 +00:00
Stanislav Shwartsman
9929e6ed78
- updated FSF address
2009-01-16 18:18:59 +00:00
Stanislav Shwartsman
ab716f62aa
inline prepareMMX method
2008-10-08 11:14:35 +00:00
Stanislav Shwartsman
489447ae57
Fixed FPU2MMX state transition - should be done only fater all memory faults already checked
2008-10-08 10:51:38 +00:00
Stanislav Shwartsman
5dd02b26e3
Make even more efficient RmAddr calculation - good optimizing compiler could make more efficient code than it was before
2008-08-08 09:22:49 +00:00
Stanislav Shwartsman
709d74728d
Call #UD exception directly instead of UndefinedOpcode function - for future use
2008-07-13 15:35:10 +00:00
Stanislav Shwartsman
ec1ff39a5f
Splitted memory access methods for 32 and 64-bit code.
...
The 64-bit code got >10% speedup, the 32-bit code also got about 2% because laddr cacluation optimization
2008-05-10 18:10:53 +00:00
Stanislav Shwartsman
b3167d1a8f
Docs for MASKMOVQ were also not correct :(
2008-04-16 05:45:45 +00:00
Stanislav Shwartsman
4f3f8608f7
Fixed MASKMOVDQU instruction decoding
2008-04-16 05:41:43 +00:00
Stanislav Shwartsman
420f30816d
inline integer saturation code - speedup for MMX/SSE integer
2008-04-06 13:56:22 +00:00
Stanislav Shwartsman
167c7075fb
Use fastcall gcc attribute for all cpu execution functions - this pure "compiler helper" optimization brings additional 2% speedup to Bochs code
2008-03-22 21:29:41 +00:00
Stanislav Shwartsman
457152334e
step2 in XSAVE implementation
2008-02-13 16:45:21 +00:00
Stanislav Shwartsman
a2897933a3
white space cleanup
2008-02-02 21:46:54 +00:00
Stanislav Shwartsman
37fbb82baa
Cleanups. Move bxInstruction_c definition to separate file instr.h
2008-01-29 17:13:10 +00:00
Stanislav Shwartsman
d9984bb3a1
Eliminate BxResolve call from the heart of cpu loop and move into instructions that really require this calculation. Yes, it blows the code of EVERY CPU method but it has >15% speedup !
2008-01-10 19:37:56 +00:00
Stanislav Shwartsman
79fc57dec8
Fixed more VCPP2008 warnings
2007-12-26 23:07:44 +00:00
Stanislav Shwartsman
838fb2a048
Fixing V2008 warnings - they found a bug in sse_pfp.cc !
2007-12-23 17:21:28 +00:00
Stanislav Shwartsman
5d4e32b8da
Avoid pointer params for every read_virtual_* except 16-byte SSE and 10-byte x87 reads
2007-12-20 20:58:38 +00:00
Stanislav Shwartsman
b516589e4e
Changes in write_virtual_* and pop_* functions -> avoid moving parameteres by pointer
2007-12-20 18:29:42 +00:00
Stanislav Shwartsman
6ac7fa7106
MMX - modify masked write to RMW - faster execution
...
CMPXCHG8B/16B - fixed possible problem. Instruction not allowed to fault after some part of it written to the memory
2007-12-19 23:21:11 +00:00
Stanislav Shwartsman
c9932e97eb
Fixes in resolve.cc -> reduce amount of resolve functions even more
2007-12-18 21:41:44 +00:00
Stanislav Shwartsman
8cfd17202a
some simple SSE code optimizations
2007-11-27 22:12:45 +00:00
Stanislav Shwartsman
d9e58bd598
split11b on opcode tables level - split almost eevery splittable instruction
...
will be continued
2007-11-17 12:44:10 +00:00