Commit Graph

16 Commits

Author SHA1 Message Date
Stanislav Shwartsman
17c89d1c78 masked load-store optimization for avx-512 2015-01-26 20:52:03 +00:00
Stanislav Shwartsman
5e6955c5e7 Major rewrite of memory access methods to avoid massive code duplication and enable inlining of memory access methods 2015-01-25 20:55:10 +00:00
Stanislav Shwartsman
cb18f1e0a1 more use of the clearflagsOSZAPC 2014-10-22 18:24:33 +00:00
Stanislav Shwartsman
6ebbb886c4 implemented VPMULTISHIFTQB VBMI instruction 2014-09-26 13:19:45 +00:00
Stanislav Shwartsman
e2e6f5a62b Update CPUID defines after recently published Intel Architecture Instruction Set Extensions Programming Reference rev-021

Enable AVX-512 with all implemented extensions in generic CPUID when simd=AVX512 is supplied
implemented AVX512_IFMA52 instructions
implemented AVX512_VBMI instructions

still missing: VPMULTISHIFTQB - VBMI instruction (coming soon)
2014-09-26 12:14:53 +00:00
Stanislav Shwartsman
8e632c1bbe fixed bug in vrsqrt14* implementation 2014-08-16 18:15:02 +00:00
Stanislav Shwartsman
e1bcc8cb1e bugfix with denormal arguments in avx-512 14-bit reciprocal 2014-08-15 19:00:12 +00:00
Stanislav Shwartsman
c064a09348 regen dependencies in makefile for cpu objects 2014-08-14 19:53:57 +00:00
Stanislav Shwartsman
128137b421 avx512 bugfixes 2014-08-13 18:34:42 +00:00
Stanislav Shwartsman
fb526a0670 implemented (not yet 100% correct) VREDUCE* AVX512 opcode 2014-08-08 19:12:18 +00:00
Stanislav Shwartsman
4b03966176 Implemented VDBPSADBW AVX512BW instruction
The only missing AVX512BW/AVX512DQ opcodes are now:

"NDS.512.66.0F3A.W0 56 VREDUCEPS
 NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
 NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-08-05 20:18:42 +00:00
Stanislav Shwartsman
fefa61a7cb Implemented VRANGE* AVX512DQ instructions
The only missing AVX512BW/AVX512DQ opcodes are now:

"NDS.66.0F3A.W0 42 VDBPSADBW"

"NDS.512.66.0F3A.W0 56 VREDUCEPS
 NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
 NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-08-04 20:30:46 +00:00
Stanislav Shwartsman
b7f62cdf47 Implemented VPALIGNR AVX512BW instructions
The only missing AVX512BW/AVX512DQ opcodes are now:

"NDS.66.0F3A.W0 42 VDBPSADBW"

"NDS.512.66.0F3A.W0 50 VRANGEPS
 NDS.512.66.0F3A.W1 50 VRANGEPD"
"NDS.512.66.0F3A.W0 51 VRANGESS
 NDS.512.66.0F3A.W1 51 VRANGESD"

"NDS.512.66.0F3A.W0 56 VREDUCEPS
 NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
 NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-07-26 18:59:01 +00:00
Stanislav Shwartsman
d8d4d2f0c1 Implemented VPSRLVW/VPSRAVW/VPSLLVW AVX512BW instructions
The only missing AVX512BW/AVX512DQ opcodes are now:

"512.66.0F3A.W1 0F VPALIGNR"
"NDS.66.0F3A.W0 42 VDBPSADBW"

"NDS.512.66.0F3A.W0 50 VRANGEPS
 NDS.512.66.0F3A.W1 50 VRANGEPD"
"NDS.512.66.0F3A.W0 51 VRANGESS
 NDS.512.66.0F3A.W1 51 VRANGESD"

"NDS.512.66.0F3A.W0 56 VREDUCEPS
 NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
 NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-07-25 21:15:48 +00:00
Stanislav Shwartsman
7ad7383fd2 implement 256-wide SHUFF/SHUFI ops 2014-07-25 20:08:08 +00:00
Volker Ruppert
59eac1f196 Moved AVX/EVEX stuff to a new cpu subfolder and updated build system
TODO: update MSVC workspace files
2014-07-25 08:35:06 +00:00