Commit Graph

3115 Commits

Author SHA1 Message Date
Stanislav Shwartsman
24cb334304 fixed large code duplication in write_new_stack methods 2014-10-12 18:59:10 +00:00
Stanislav Shwartsman
3db00a7e52 fixed CMPXHG16B implementation 2014-10-02 18:53:41 +00:00
Stanislav Shwartsman
4ab9f62a20 bugfix: register supported cpu extensions for trinity cpu model 2014-09-26 19:24:11 +00:00
Stanislav Shwartsman
6ebbb886c4 implemented VPMULTISHIFTQB VBMI instruction 2014-09-26 13:19:45 +00:00
Stanislav Shwartsman
e2e6f5a62b Update CPUID defines after recently published
Intel Architecture Instruction Set Extensions Programming Reference rev-021

Enable AVX-512 with all implemented extensions in generic CPUID when simd=AVX512 is supplied
implemented AVX512_IFMA532 instructions
implemented AVX512_VBMI instructions

still missing: VPMULTISHIFTQB - VBMI instruction (coming soon)
2014-09-26 12:14:53 +00:00
Stanislav Shwartsman
86bb2f97cc fixed shoft128right macro is softfloat 2014-09-19 16:01:46 +00:00
Stanislav Shwartsman
694069bd36 fixed cmpxchg16b bug (SF titcket #523 in patches tracker) 2014-09-14 18:13:08 +00:00
Stanislav Shwartsman
29efae3be3 adjust (c) in several files 2014-08-31 20:05:25 +00:00
Stanislav Shwartsman
5413b5c31b don't forget to initialize (clear) cpu features bitmask in the beginning ... 2014-08-31 19:48:58 +00:00
Stanislav Shwartsman
5eb781e45f cleanup after cpu features interface rework 2014-08-31 19:22:41 +00:00
Stanislav Shwartsman
b6147d9de8 fixed debugger enabled code 2014-08-31 18:48:04 +00:00
Stanislav Shwartsman
9f57e70d5f Rewritten handling of supported CPUID features to be able to handle large amount of CPU extensions
Now enable support for up to 128 CPU extensions and could easily extend it more
Also reduce memory footprint for bx_ia_opcodes.h arrays
2014-08-31 18:39:18 +00:00
Stanislav Shwartsman
8e632c1bbe fixed bug in vrsqt14* implementation 2014-08-16 18:15:02 +00:00
Stanislav Shwartsman
e1bcc8cb1e bugfix with denormal arguments in avx-512 14-bit reciprocal 2014-08-15 19:00:12 +00:00
Stanislav Shwartsman
7e1a31af5e fixed denormal arg handling in VGENTMANT* 2014-08-15 10:27:56 +00:00
Stanislav Shwartsman
c064a09348 regen dependencies in makefile for cpu objects 2014-08-14 19:53:57 +00:00
Stanislav Shwartsman
7ae5a1c6b3 fixed NaN handling for VRANGE* instructions 2014-08-14 19:42:34 +00:00
Stanislav Shwartsman
128137b421 avx512 bugfixes 2014-08-13 18:34:42 +00:00
Stanislav Shwartsman
fb526a0670 implemented (not yet 100% correct) VREDUCE* AVX512 opcode 2014-08-08 19:12:18 +00:00
Stanislav Shwartsman
4b03966176 Implemented VDBPSADBW AVX512BW instruction
The only missing AVX512BW/AVX512DQ opcodes are now:

"NDS.512.66.0F3A.W0 56 VREDUCEPS
 NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
 NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-08-05 20:18:42 +00:00
Stanislav Shwartsman
4455d9100b simplify code a little more 2014-08-05 19:20:15 +00:00
Stanislav Shwartsman
5b0d0519d5 fixed typo in prev checkin 2014-08-05 19:03:17 +00:00
Stanislav Shwartsman
2231ffb242 simplify legacy (sse and avx) sad calculation in simd_int.h 2014-08-05 19:01:01 +00:00
Stanislav Shwartsman
5e80c1f419 fixed vrange* abs comparisons in softfloat 2014-08-04 21:24:38 +00:00
Stanislav Shwartsman
63a4130311 changed polarity of is_min bit in range operation to better match vrange* instructions immediate encoding 2014-08-04 21:08:00 +00:00
Stanislav Shwartsman
fefa61a7cb Implemented VRANGE* AVX512DQ instructions
The only missing AVX512BW/AVX512DQ opcodes are now:

"NDS.66.0F3A.W0 42 VDBPSADBW"

"NDS.512.66.0F3A.W0 56 VREDUCEPS
 NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
 NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-08-04 20:30:46 +00:00
Stanislav Shwartsman
524a73f48c prepare softfloat functions for vrange* instructions implementation 2014-08-04 19:44:25 +00:00
Stanislav Shwartsman
23a8601ab8 fixed code duplication in floating point compare functions 2014-08-03 19:53:02 +00:00
Stanislav Shwartsman
b7f62cdf47 Implemented VPALIGNR AVX512BW instructions
The only missing AVX512BW/AVX512DQ opcodes are now:

"NDS.66.0F3A.W0 42 VDBPSADBW"

"NDS.512.66.0F3A.W0 50 VRANGEPS
 NDS.512.66.0F3A.W1 50 VRANGEPD"
"NDS.512.66.0F3A.W0 51 VRANGESS
 NDS.512.66.0F3A.W1 51 VRANGESD"

"NDS.512.66.0F3A.W0 56 VREDUCEPS
 NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
 NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-07-26 18:59:01 +00:00
Stanislav Shwartsman
b70ed32ea5 support fault suppression for recently added avx512bw ops 2014-07-25 21:45:09 +00:00
Stanislav Shwartsman
d8d4d2f0c1 Implemented VPSRLVW/VPSRAVW/VPSLLVW AVX512BW instructions
The only missing AVX512BW/AVX512DQ opcodes are now:

"512.66.0F3A.W1 0F VPALIGNR"
"NDS.66.0F3A.W0 42 VDBPSADBW"

"NDS.512.66.0F3A.W0 50 VRANGEPS
 NDS.512.66.0F3A.W1 50 VRANGEPD"
"NDS.512.66.0F3A.W0 51 VRANGESS
 NDS.512.66.0F3A.W1 51 VRANGESD"

"NDS.512.66.0F3A.W0 56 VREDUCEPS
 NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
 NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-07-25 21:15:48 +00:00
Stanislav Shwartsman
7ad7383fd2 implement 256-wide SHUFF/SHUFI ops 2014-07-25 20:08:08 +00:00
Volker Ruppert
59eac1f196 Moved AVX/EVEX stuff to a new cpu subfolder and updated build system
TODO: update MVSC workspace files
2014-07-25 08:35:06 +00:00
Stanislav Shwartsman
9cc7b67369 bugfix 2014-07-22 20:49:58 +00:00
Stanislav Shwartsman
c4c8652a3b Implemented VPMOV?2? and VPMIN* AVX512 instructions
The only missing AVX512BW/AVX512DQ opcodes are now:

"512.66.0F38.W1 10 VPSRLVW"
"512.66.0F38.W1 11 VPSRAVW"
"512.66.0F38.W1 12 VPSLLVW"

"512.66.0F3A.W1 0F VPALIGNR"
"NDS.66.0F3A.W0 42 VDBPSADBW"

"NDS.512.66.0F3A.W0 50 VRANGEPS
 NDS.512.66.0F3A.W1 50 VRANGEPD"
"NDS.512.66.0F3A.W0 51 VRANGESS
 NDS.512.66.0F3A.W1 51 VRANGESD"

"NDS.512.66.0F3A.W0 56 VREDUCEPS
 NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
 NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-07-22 20:36:55 +00:00
Stanislav Shwartsman
ad7ef68876 Implemented VMULLQ AVX512DQ instruction
The only missing AVX512BW/AVX512DQ opcodes are now:

"512.66.0F38.W1 10 VPSRLVW"
"512.66.0F38.W1 11 VPSRAVW"
"512.66.0F38.W1 12 VPSLLVW"

"512.F3.0F38.W0 28 VPMOVM2B
 512.F3.0F38.W1 28 VPMOVM2W"
"512.F3.0F38.W0 29 VPMOVB2M
 512.F3.0F38.W1 29 VPMOVW2M"

"512.F3.0F38.W0 38 VPMOVM2D
 512.F3.0F38.W1 38 VPMOVM2Q"
"512.F3.0F38.W0 39 VPMOVD2M
 512.F3.0F38.W1 39 VPMOVQ2M"

"NDS.512.66.0F38.WIG 38 VPMINSB"
"NDS.512.66.0F38.WIG 3A VPMINUW"

"512.66.0F3A.W1 0F VPALIGNR"
"NDS.66.0F3A.W0 42 VDBPSADBW"

"NDS.512.66.0F3A.W0 50 VRANGEPS
 NDS.512.66.0F3A.W1 50 VRANGEPD"
"NDS.512.66.0F3A.W0 51 VRANGESS
 NDS.512.66.0F3A.W1 51 VRANGESD"

"NDS.512.66.0F3A.W0 56 VREDUCEPS
 NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
 NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-07-21 19:08:44 +00:00
Stanislav Shwartsman
009629f4d9 Implemented VEXTRACT* AVX512DQ instructions
The only missing AVX512BW/AVX512DQ opcodes are now:

"512.66.0F38.W1 10 VPSRLVW"
"512.66.0F38.W1 11 VPSRAVW"
"512.66.0F38.W1 12 VPSLLVW"

"512.F3.0F38.W0 28 VPMOVM2B
 512.F3.0F38.W1 28 VPMOVM2W"
"512.F3.0F38.W0 29 VPMOVB2M
 512.F3.0F38.W1 29 VPMOVW2M"

"512.F3.0F38.W0 38 VPMOVM2D
 512.F3.0F38.W1 38 VPMOVM2Q"
"512.F3.0F38.W0 39 VPMOVD2M
 512.F3.0F38.W1 39 VPMOVQ2M"

"NDS.512.66.0F38.WIG 38 VPMINSB"
"NDS.512.66.0F38.WIG 3A VPMINUW"

"W1.NDS.512.66.0F38 40 VPMULLQ"

"512.66.0F3A.W1 0F VPALIGNR"
"NDS.66.0F3A.W0 42 VDBPSADBW"

"NDS.512.66.0F3A.W0 50 VRANGEPS
 NDS.512.66.0F3A.W1 50 VRANGEPD"
"NDS.512.66.0F3A.W0 51 VRANGESS
 NDS.512.66.0F3A.W1 51 VRANGESD"

"NDS.512.66.0F3A.W0 56 VREDUCEPS
 NDS.512.66.0F3A.W1 56 VREDUCEPD"
"NDS.512.66.0F3A.W0 57 VREDUCESS
 NDS.512.66.0F3A.W1 57 VREDUCESD"
2014-07-20 20:46:48 +00:00
Stanislav Shwartsman
8108da227d bugfix in canonical violation detection 2014-07-20 18:19:02 +00:00
Stanislav Shwartsman
65ffcf5dc8 avoid using access_write_linear when not strickly needed 2014-07-19 20:01:44 +00:00
Stanislav Shwartsman
8ef5dcaca3 implement more avx512bw opcodes 2014-07-19 19:01:17 +00:00
Stanislav Shwartsman
7083e94077 implemented VBROADCAST*32x2 AVX512DQ opcodes 2014-07-19 18:36:38 +00:00
Stanislav Shwartsman
ca2859f449 implemented more AVX512BW opcodes 2014-07-19 13:30:54 +00:00
Stanislav Shwartsman
caad957384 bugfix in softfloat uint64_to_float64 conversion 2014-07-18 20:30:48 +00:00
Stanislav Shwartsman
3e8a5b99fa bugfix for vpsrldq opcode decoding 2014-07-18 20:08:10 +00:00
Stanislav Shwartsman
7a133766a8 implemented more AVX512BW opcodes 2014-07-18 19:40:27 +00:00
Stanislav Shwartsman
a556eb21fa implemented more AVX512BW opcodes 2014-07-18 16:28:44 +00:00
Stanislav Shwartsman
47e7e4adde bugfix in VFPCLASSPD implementation 2014-07-18 16:01:11 +00:00
Stanislav Shwartsman
48c508012f semantic fix 2014-07-18 12:35:17 +00:00
Stanislav Shwartsman
a06fba41ab commit new added files 2014-07-18 11:16:22 +00:00
Stanislav Shwartsman
94864fb9bc Implement AVX512BW and AVX512DQ extensions published in recently published Intel Archtecture Extensions manual rev20.
https://software.intel.com/sites/default/files/managed/c6/a9/319433-020.pdf

Most of the instructions are implemented, more on the way.

+ few bugfixes for legacy AVX-512 emulation

AVX-512: Fixed bug in VCMPPS masked instruction implementation with 512-bit data size
AVX-512: Fixed AVX-512 masked convert instructions with non-k0 mask (behaved as non masked versions)
AVX-512: Fixed missed #UD due to invalid EVEX prefix fields for several AVX-512 opcodes (VFIXUPIMMSS/SD, FMA)
2014-07-18 11:14:25 +00:00