Stanislav Shwartsman
17c89d1c78
masked load-store optimization for avx-512
2015-01-26 20:52:03 +00:00
Stanislav Shwartsman
5e6955c5e7
Major rewrite of memory access methods to avoid massive code duplication and enable inlining of memory access methods
2015-01-25 20:55:10 +00:00
Stanislav Shwartsman
cb18f1e0a1
more use of the clearflagsOSZAPC
2014-10-22 18:24:33 +00:00
Stanislav Shwartsman
6ebbb886c4
implemented VPMULTISHIFTQB VBMI instruction
2014-09-26 13:19:45 +00:00
Stanislav Shwartsman
e2e6f5a62b
Update CPUID defines after recently published
...
Intel Architecture Instruction Set Extensions Programming Reference rev-021
Enable AVX-512 with all implemented extensions in generic CPUID when simd=AVX512 is supplied
implemented AVX512_IFMA532 instructions
implemented AVX512_VBMI instructions
still missing: VPMULTISHIFTQB - VBMI instruction (coming soon)
2014-09-26 12:14:53 +00:00
Stanislav Shwartsman
8e632c1bbe
fixed bug in vrsqt14* implementation
2014-08-16 18:15:02 +00:00
Stanislav Shwartsman
e1bcc8cb1e
bugfix with denormal arguments in avx-512 14-bit reciprocal
2014-08-15 19:00:12 +00:00
Stanislav Shwartsman
c064a09348
regen dependencies in makefile for cpu objects
2014-08-14 19:53:57 +00:00
Stanislav Shwartsman
128137b421
avx512 bugfixes
2014-08-13 18:34:42 +00:00
Stanislav Shwartsman
fb526a0670
implemented (not yet 100% correct) VREDUCE* AVX512 opcode
2014-08-08 19:12:18 +00:00
Stanislav Shwartsman
4b03966176
Implemented VDBPSADBW AVX512BW instruction
...
The only missing AVX512BW/AVX512DQ opcodes are now:
"NDS.512.66.0F3A.W0 56 VREDUCEPS
NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-08-05 20:18:42 +00:00
Stanislav Shwartsman
fefa61a7cb
Implemented VRANGE* AVX512DQ instructions
...
The only missing AVX512BW/AVX512DQ opcodes are now:
"NDS.66.0F3A.W0 42 VDBPSADBW"
"NDS.512.66.0F3A.W0 56 VREDUCEPS
NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-08-04 20:30:46 +00:00
Stanislav Shwartsman
b7f62cdf47
Implemented VPALIGNR AVX512BW instructions
...
The only missing AVX512BW/AVX512DQ opcodes are now:
"NDS.66.0F3A.W0 42 VDBPSADBW"
"NDS.512.66.0F3A.W0 50 VRANGEPS
NDS.512.66.0F3A.W1 50 VRANGEPD"
"NDS.512.66.0F3A.W0 51 VRANGESS
NDS.512.66.0F3A.W1 51 VRANGESD"
"NDS.512.66.0F3A.W0 56 VREDUCEPS
NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-07-26 18:59:01 +00:00
Stanislav Shwartsman
d8d4d2f0c1
Implemented VPSRLVW/VPSRAVW/VPSLLVW AVX512BW instructions
...
The only missing AVX512BW/AVX512DQ opcodes are now:
"512.66.0F3A.W1 0F VPALIGNR"
"NDS.66.0F3A.W0 42 VDBPSADBW"
"NDS.512.66.0F3A.W0 50 VRANGEPS
NDS.512.66.0F3A.W1 50 VRANGEPD"
"NDS.512.66.0F3A.W0 51 VRANGESS
NDS.512.66.0F3A.W1 51 VRANGESD"
"NDS.512.66.0F3A.W0 56 VREDUCEPS
NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-07-25 21:15:48 +00:00
Stanislav Shwartsman
7ad7383fd2
implement 256-wide SHUFF/SHUFI ops
2014-07-25 20:08:08 +00:00
Volker Ruppert
59eac1f196
Moved AVX/EVEX stuff to a new cpu subfolder and updated build system
...
TODO: update MVSC workspace files
2014-07-25 08:35:06 +00:00