Commit Graph

271 Commits

Author SHA1 Message Date
Matus Kysel
21381dcf0c softfp: Added hardfloat conversion from float32 to float64
Reintroduce float32_to_float64 that was removed here:
https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00455.html

 - nbench test it not actually calling this function at all
 - SPECS 2006 significat number of tests impoved their runtime, just
   few of them showed small slowdown

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matus Kysel <mkysel@tachyum.com>
Message-Id: <20191017142133.59439-1-mkysel@tachyum.com>
[rth: Add comment about impossible inexact exceptions.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2019-10-30 19:03:37 +01:00
Alex Bennée
00f43279a3 fpu: rename softfloat-specialize.h -> .inc.c
This is not a normal header and should only be included in the main
softfloat.c file to bring in the various target specific
specialisations. Indeed as it contains non-inlined C functions it is
not even a legal header. Rename it to match our included C convention.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-19 12:07:13 +01:00
Alex Bennée
e932112420 fpu: replace LIT64 with UINT64_C macros
In our quest to eliminate the home rolled LIT64 macro we fixup usage
inside the softfloat code. While we are at it we remove some of the
extraneous spaces to closer fit the house style.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-08-19 12:07:13 +01:00
Alex Bennée
2c217da0fc fpu: use min/max values from stdint.h for integral overflow
Remove some more use of LIT64 while making the meaning more clear. We
also avoid the need of casts as the results by definition fit into the
return type.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-08-19 12:07:13 +01:00
Alex Bennée
e6b405fe00 fpu: convert float[16/32/64]_squash_denormal to new modern style
This also allows us to remove the extractFloat16exp/frac helpers. We
avoid using the floatXX_pack_raw functions as they are slight overkill
for masking out all but the top bit of the number. The generated code
is almost exactly the same as makes no difference to the
pre-conversion code.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2019-08-19 12:07:13 +01:00
Alex Bennée
f7e81a9457 fpu: replace LIT64 usage with UINT64_C for specialize constants
We have a wrapper that does the right thing from stdint.h so lets use
it for our constants in softfloat-specialize.h

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2019-08-19 12:07:13 +01:00
Kito Cheng
896f51fbfa hardfloat: fix float32/64 fused multiply-add
Before falling back to softfloat FMA, we do not restore the original
values of inputs A and C. Fix it.

This bug was caught by running gcc's testsuite on RISC-V qemu.

Note that this change gives a small perf increase for fp-bench:

  Host: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
  Command: perf stat -r 3 taskset -c 0 ./fp-bench -o mulAdd -p $prec

- $prec = single:
  - before:
    101.71 MFlops
    102.18 MFlops
    100.96 MFlops
  - after:
    103.63 MFlops
    103.05 MFlops
    102.96 MFlops

- $prec = double:
  - before:
    173.10 MFlops
    173.93 MFlops
    172.11 MFlops
  - after:
    178.49 MFlops
    178.88 MFlops
    178.66 MFlops

Signed-off-by: Kito Cheng <kito.cheng@gmail.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20190322204320.17777-1-cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2019-03-25 10:35:32 +00:00
Mateja Marjanovic
7ca96e1a9c target/mips: Fix minor bug in FPU
Wrong type of NaN was generated for IEEE 754-2008 by MADDF.<D|S> and
MSUBF.<D|S> instructions when the arguments were (Inf, Zero, NaN) or
(Zero, Inf, NaN).

The if-else statement establishes if the system conforms to IEEE
754-1985 or IEEE 754-2008, and defines different behaviors depending
on that. In case of IEEE 754-2008, in mentioned cases of inputs,
<MADDF|MSUBF>.<D|S> returns the input value 'c' [2] (page 53) and
raises floating point exception 'Invalid Operation' [1] (pages 349,
350).

These scenarios were tested and the results in QEMU emulation match
the results obtained on the machine that has a MIPS64R6 CPU.

[1] MIPS Architecture for Programmers Volume II-a: The MIPS64
    Instruction Set Reference Manual, Revision 6.06
[2] MIPS Architecture for Programmers Volume IV-j: The MIPS64
    SIMD Architecture Module, Revision 1.12

Signed-off-by: Mateja Marjanovic <mateja.marjanovic@rt-rk.com>
Message-Id: <1553008916-15274-2-git-send-email-mateja.marjanovic@rt-rk.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[AJB: fixed up commit message]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2019-03-25 10:34:48 +00:00
Richard Henderson
5d64abb32f softfloat: Support float_round_to_odd more places
Previously this was only supported for roundAndPackFloat64.

New support in round_canonical, round_to_int, float128_round_to_int,
roundAndPackFloat32, roundAndPackInt32, roundAndPackInt64,
roundAndPackUint64.  This does not include any of the floatx80 routines,
as we do not have users for that rounding mode there.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190215170225.15537-1-richard.henderson@linaro.org>
Tested-by: David Hildenbrand <david@redhat.com>
[AJB: add missing break]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2019-02-26 14:08:03 +00:00
David Hildenbrand
e45de9922e softfloat: Implement float128_to_uint32
Handling it just like float128_to_uint32_round_to_zero, that hopefully
is free of bugs :)

Documentation basically copied from float128_to_uint64

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2019-02-26 14:05:19 +00:00
Emilio G. Cota
f6b3b108a8 softfloat: enforce softfloat if the host's FMA is broken
The added branch to the FMA ops is marked as unlikely and therefore
its impact on performance (measured with fp-bench) is within noise range
when measured on an Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz.

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2019-01-22 20:48:17 +00:00
Emilio G. Cota
d9fe9db943 hardfloat: implement float32/64 comparison
Performance results for fp-bench:

Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
cmp-single: 110.98 MFlops
cmp-double: 107.12 MFlops
- after:
cmp-single: 506.28 MFlops
cmp-double: 524.77 MFlops

Note that flattening both eq and eq_signaling versions
would give us extra performance (695v506, 615v524 Mflops
for single/double, respectively) but this would emit two
essentially identical functions for each eq/signaling pair,
which is a waste.

Aggregate performance improvement for the last few patches:
[ all charts in png: https://imgur.com/a/4yV8p ]

1. Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz

                   qemu-aarch64 NBench score; higher is better
                 Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz

  16 +-+-----------+-------------+----===-------+---===-------+-----------+-+
  14 +-+..........................@@@&&.=.......@@@&&.=...................+-+
  12 +-+..........................@.@.&.=.......@.@.&.=.....+befor===     +-+
  10 +-+..........................@.@.&.=.......@.@.&.=.....+ad@@&& =     +-+
   8 +-+.......................$$$%.@.&.=.......@.@.&.=.....+  @@u& =     +-+
   6 +-+............@@@&&=+***##.$%.@.&.=***##$$%+@.&.=..###$$%%@i& =     +-+
   4 +-+.......###$%%.@.&=.*.*.#.$%.@.&.=*.*.#.$%.@.&.=+**.#+$ +@m& =     +-+
   2 +-+.....***.#$.%.@.&=.*.*.#.$%.@.&.=*.*.#.$%.@.&.=.**.#+$+sqr& =     +-+
   0 +-+-----***##$%%@@&&=-***##$$%@@&&==***##$$%@@&&==-**##$$%+cmp==-----+-+
            FOURIER    NEURAL NELU DECOMPOSITION         gmean

                              qemu-aarch64 SPEC06fp (test set) speedup over QEMU 4c2c101590
                                      Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
                                            error bars: 95% confidence interval

  4.5 +-+---+-----+----+-----+-----+-&---+-----+----+-----+-----+-----+----+-----+-----+-----+-----+----+-----+---+-+
    4 +-+..........................+@@+...........................................................................+-+
  3.5 +-+..............%%@&.........@@..............%%@&............................................+++dsub       +-+
  2.5 +-+....&&+.......%%@&.......+%%@..+%%&+..@@&+.%%@&....................................+%%&+.+%@&++%%@&      +-+
    2 +-+..+%%&..+%@&+.%%@&...+++..%%@...%%&.+$$@&..%%@&..%%@&.......+%%&+.%%@&+......+%%@&.+%%&++$$@&++d%@&  %%@&+-+
  1.5 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&*+f%@&**$%@&+-+
  0.5 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&+sqr@&**$%@&+-+
    0 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&*+cmp&**$%@&+-+
  410.bw416.gam433.434.z435.436.cac437.lesli444.447.de450.so453454.ca459.GemsF465.tont470.lb4482.sphinxgeomean

2. Host: ARM Aarch64 A57 @ 2.4GHz

                    qemu-aarch64 NBench score; higher is better
                 Host: Applied Micro X-Gene, Aarch64 A57 @ 2.4 GHz

    5 +-+-----------+-------------+-------------+-------------+-----------+-+
  4.5 +-+........................................@@@&==...................+-+
  3 4 +-+..........................@@@&==........@.@&.=.....+before       +-+
    3 +-+..........................@.@&.=........@.@&.=.....+ad@@@&==     +-+
  2.5 +-+.....................##$$%%.@&.=........@.@&.=.....+  @m@& =     +-+
    2 +-+............@@@&==.***#.$.%.@&.=.***#$$%%.@&.=.***#$$%%d@& =     +-+
  1.5 +-+.....***#$$%%.@&.=.*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#+$ +f@& =     +-+
  0.5 +-+.....*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#+$+sqr& =     +-+
    0 +-+-----***#$$%%@@&==-***#$$%%@@&==-***#$$%%@@&==-***#$$%+cmp==-----+-+
             FOURIER    NEURAL NLU DECOMPOSITION         gmean

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2018-12-17 08:25:25 +00:00
Emilio G. Cota
f131bae8a7 hardfloat: implement float32/64 square root
Performance results for fp-bench:

Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
sqrt-single: 42.30 MFlops
sqrt-double: 22.97 MFlops
- after:
sqrt-single: 311.42 MFlops
sqrt-double: 311.08 MFlops

Here USE_FP makes a huge difference for f64's, with throughput
going from ~200 MFlops to ~300 MFlops.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2018-12-17 08:25:25 +00:00
Emilio G. Cota
ccf770ba73 hardfloat: implement float32/64 fused multiply-add
Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
fma-single: 74.73 MFlops
fma-double: 74.54 MFlops
- after:
fma-single: 203.37 MFlops
fma-double: 169.37 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
fma-single: 23.24 MFlops
fma-double: 23.70 MFlops
- after:
fma-single: 66.14 MFlops
fma-double: 63.10 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
fma-single: 37.26 MFlops
fma-double: 37.29 MFlops
- after:
fma-single: 48.90 MFlops
fma-double: 59.51 MFlops

Here having 3FP64 set to 1 pays off for x86_64:
[1] 170.15 vs [0] 153.12 MFlops

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2018-12-17 08:25:25 +00:00
Emilio G. Cota
4a6295613f hardfloat: implement float32/64 division
Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
div-single: 34.84 MFlops
div-double: 34.04 MFlops
- after:
div-single: 275.23 MFlops
div-double: 216.38 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
div-single: 9.33 MFlops
div-double: 9.30 MFlops
- after:
div-single: 51.55 MFlops
div-double: 15.09 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
div-single: 25.65 MFlops
div-double: 24.91 MFlops
- after:
div-single: 96.83 MFlops
div-double: 31.01 MFlops

Here setting 2FP64_USE_FP to 1 pays off for x86_64:
[1] 215.97 vs [0] 62.15 MFlops

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2018-12-17 08:25:25 +00:00
Emilio G. Cota
2dfabc86e6 hardfloat: implement float32/64 multiplication
Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
mul-single: 126.91 MFlops
mul-double: 118.28 MFlops
- after:
mul-single: 258.02 MFlops
mul-double: 197.96 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
mul-single: 37.42 MFlops
mul-double: 38.77 MFlops
- after:
mul-single: 73.41 MFlops
mul-double: 76.93 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
mul-single: 58.40 MFlops
mul-double: 59.33 MFlops
- after:
mul-single: 60.25 MFlops
mul-double: 94.79 MFlops

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2018-12-17 08:25:25 +00:00
Emilio G. Cota
1b615d4820 hardfloat: implement float32/64 addition and subtraction
Performance results (single and double precision) for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
add-single: 135.07 MFlops
add-double: 131.60 MFlops
sub-single: 130.04 MFlops
sub-double: 133.01 MFlops
- after:
add-single: 443.04 MFlops
add-double: 301.95 MFlops
sub-single: 411.36 MFlops
sub-double: 293.15 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
add-single: 44.79 MFlops
add-double: 49.20 MFlops
sub-single: 44.55 MFlops
sub-double: 49.06 MFlops
- after:
add-single: 93.28 MFlops
add-double: 88.27 MFlops
sub-single: 91.47 MFlops
sub-double: 88.27 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
add-single: 72.59 MFlops
add-double: 72.27 MFlops
sub-single: 75.33 MFlops
sub-double: 70.54 MFlops
- after:
add-single: 112.95 MFlops
add-double: 201.11 MFlops
sub-single: 116.80 MFlops
sub-double: 188.72 MFlops

Note that the IBM and ARM machines benefit from having
HARDFLOAT_2F{32,64}_USE_FP set to 0. Otherwise their performance
can suffer significantly:
- IBM Power8:
add-single: [1] 54.94 vs [0] 116.37 MFlops
add-double: [1] 58.92 vs [0] 201.44 MFlops
- Aarch64 A57:
add-single: [1] 80.72 vs [0] 93.24 MFlops
add-double: [1] 82.10 vs [0] 88.18 MFlops

On the Intel machine, having 2F64 set to 1 pays off, but it
doesn't for 2F32:
- Intel i7-6700K:
add-single: [1] 285.79 vs [0] 426.70 MFlops
add-double: [1] 302.15 vs [0] 278.82 MFlops

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2018-12-17 08:25:25 +00:00
Emilio G. Cota
a94b783952 fpu: introduce hardfloat
The appended paves the way for leveraging the host FPU for a subset
of guest FP operations. For most guest workloads (e.g. FP flags
aren't ever cleared, inexact occurs often and rounding is set to the
default [to nearest]) this will yield sizable performance speedups.

The approach followed here avoids checking the FP exception flags register.
See the added comment for details.

This assumes that QEMU is running on an IEEE754-compliant FPU and
that the rounding is set to the default (to nearest). The
implementation-dependent specifics of the FPU should not matter; things
like tininess detection and snan representation are still dealt with in
soft-fp. However, this approach will break on most hosts if we compile
QEMU with flags that break IEEE compatibility. There is no way to detect
all of these flags at compilation time, but at least we check for
-ffast-math (which defines __FAST_MATH__) and disable hardfloat
(plus emit a #warning) when it is set.

This patch just adds common code. Some operations will be migrated
to hardfloat in subsequent patches to ease bisection.

Note: some architectures (at least PPC, there might be others) clear
the status flags passed to softfloat before most FP operations. This
precludes the use of hardfloat, so to avoid introducing a performance
regression for those targets, we add a flag to disable hardfloat.
In the long run though it would be good to fix the targets so that
at least the inexact flag passed to softfloat is indeed sticky.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2018-12-17 08:25:25 +00:00
Emilio G. Cota
f9943c7f76 softfloat: rename canonicalize to sf_canonicalize
glibc >= 2.25 defines canonicalize in commit eaf5ad0
(Add canonicalize, canonicalizef, canonicalizel., 2016-10-26).

Given that we'll be including <math.h> soon, prepare
for this by prefixing our canonicalize() with sf_ to avoid
clashing with the libc's canonicalize().

Reported-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Tested-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2018-12-17 08:25:25 +00:00
Thomas Huth
97ff87c0ed qemu/compiler: Wrap __attribute__((flatten)) in a macro
Older versions of Clang (before 3.5) and GCC (before 4.1) do not
support the "__attribute__((flatten))" yet. We don't care about
such old versions of GCC anymore, but since Clang 3.4 is still
used in EPEL for RHEL7 / CentOS 7, we should not use this attribute
directly but with a wrapper macro instead.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2018-10-17 08:36:28 +02:00
Richard Henderson
5dfbc9e490 softfloat: Fix division
The __udiv_qrnnd primitive that we nicked from gmp requires its
inputs to be normalized.  We were not doing that.  Because the
inputs are nearly normalized already, finishing that is trivial.

Replace div128to64 with a "proper" udiv_qrnnd, so that this
remains a reusable primitive.

Fixes: cf07323d49
Fixes: https://bugs.launchpad.net/qemu/+bug/1793119
Tested-by: Emilio G. Cota <cota@braap.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-05 12:57:41 -05:00
Thomas Huth
0019d5c3a1 softfloat: Replace countLeadingZeros32/64 with clz32/64
Our minimum required compiler for compiling QEMU is GCC 4.1 these days,
so we can drop the support for compilers which do not provide the
__builtin_clz*() functions yet. Since the countLeadingZeros32/64 are
then identical to the clz32/64 functions, and we do not have to sync
the softloat 2 codebase with upstream anymore (softloat 3 is a complete
rewrite) we can simply replace the functions with our QEMU versions.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1538118095-7003-1-git-send-email-thuth@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-05 12:57:41 -05:00
Emilio G. Cota
c953da8f0b softfloat: remove float64_trunc_to_int
It has not had users since f83311e476 ("target-m68k: use floatx80
internally", 2017-06-21).

Note that no other bit-width has floatX_trunc_to_int.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-10-05 12:57:41 -05:00
Richard Henderson
2f6c74be59 softfloat: Add scaling float-to-int routines
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180814002653.12828-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-24 13:17:30 +01:00
Richard Henderson
2abdfe2440 softfloat: Add scaling int-to-float routines
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180814002653.12828-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-24 13:17:29 +01:00
Richard Henderson
64d450a0ea softfloat: Fix missing inexact for floating-point add
For 0x1.0000000000003p+0 + 0x1.ffffffep+14 = 0x1.0001fffp+15
we dropped the sticky bit and so failed to raise inexact.

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20180810193129.1556-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-16 14:29:58 +01:00
Richard Henderson
377ed92679 fpu/softfloat: Define floatN_silence_nan in terms of parts_silence_nan
Isolate the target-specific choice to 3 functions instead of 6.

The code in floatx80_default_nan tried to be over-general.  There are
only two targets that support this format: x86 and m68k.  Thus there
is no point in inventing a mechanism for snan_bit_is_one.

Move routines that no longer have ifdefs out of softfloat-specialize.h.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:27:15 -07:00
Richard Henderson
8fb3d90203 fpu/softfloat: Clean up parts_default_nan
Reduce the number of ifdefs.  Correct the result for OpenRISC
and TriCore (although TriCore fixed in target-specific code).

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:27:15 -07:00
Richard Henderson
0218a16e54 fpu/softfloat: Define floatN_default_nan in terms of parts_default_nan
Isolate the target-specific choice to 2 functions instead of 6.

The code in float16_default_nan was only correct for ARM, MIPS, and X86.
Though float16 support is rare among our targets.

The code in float128_default_nan was arguably wrong for Sparc.  While
QEMU supports the Sparc 128-bit insns, no real cpu enables it.

The code in floatx80_default_nan tried to be over-general.  There are
only two targets that support this format: x86 and m68k.  Thus there
is no point in inventing a value for snan_bit_is_one.

Move routines that no longer have ifdefs out of softfloat-specialize.h.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:27:15 -07:00
Richard Henderson
3bd2dec1a1 fpu/softfloat: Pass FloatClass to pickNaNMulAdd
For each operand, pass a single enumeration instead of a pair of booleans.
The commit also merges multiple different ifdef-selected implementations
of pickNaNMulAdd into a single function whose body is ifdef-selected.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:27:15 -07:00
Richard Henderson
4f251cfd52 fpu/softfloat: Pass FloatClass to pickNaN
For each operand, pass a single enumeration instead of a pair of booleans.
The commit also merges multiple different ifdef-selected implementations
of pickNaN into a single function whose body is ifdef-selected.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:27:15 -07:00
Richard Henderson
247d1f2190 fpu/softfloat: Make is_nan et al available to softfloat-specialize.h
We will need these helpers within softfloat-specialize.h, so move
the definitions above the include.  After specialization, they will
not always be used so mark them to avoid the Werror.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:27:15 -07:00
Richard Henderson
03385dfdaa fpu/softfloat: Specialize on snan_bit_is_one
Only MIPS requires snan_bit_is_one to be variable.  While we are
specializing softfloat behaviour, allow other targets to eliminate
this runtime check.

Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@mips.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Alexander Graf <agraf@suse.de>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:27:15 -07:00
Richard Henderson
5240a30dcc fpu/softfloat: Remove floatX_maybe_silence_nan
These functions are now unused.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:27:15 -07:00
Richard Henderson
4885312f47 fpu/softfloat: Use float*_silence_nan in propagateFloat*NaN
We have already checked the arguments for SNaN;
we don't need to do it again.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:27:15 -07:00
Alex Bennée
6fed16b265 fpu/softfloat: re-factor float to float conversions
This allows us to delete a lot of additional boilerplate
code which is no longer needed.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:27:15 -07:00
Alex Bennée
ca3a3d5a31 fpu/softfloat: Partial support for ARM Alternative half-precision
For float16 ARM supports an alternative half-precision format which
sacrifices the ability to represent NaN/Inf in return for a higher
dynamic range.  The new FloatFmt flag, arm_althp, is then used to
modify the behaviour of canonicalize and round_canonical with respect
to representation and exception raising.

Usage of this new flag waits until we re-factor float-to-float conversions.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:27:15 -07:00
Richard Henderson
0bcfbcbea5 fpu/softfloat: Replace float_class_msnan with parts_silence_nan
With a canonical representation of NaNs, we can silence an SNaN
immediately rather than delay until the final format is known.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:24:19 -07:00
Richard Henderson
f7e598e264 fpu/softfloat: Replace float_class_dnan with parts_default_nan
With a canonical representation of NaNs, we can return the
default nan directly rather than delay the expansion until
the final format is known.

Note one case where we uselessly assigned to a.sign, which was
overwritten/ignored later when expanding float_class_dnan.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:24:19 -07:00
Richard Henderson
298b468e43 fpu/softfloat: Introduce parts_is_snan_frac
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:24:19 -07:00
Richard Henderson
94933df0e5 fpu/softfloat: Canonicalize NaN fraction
Shift the NaN fraction to a canonical position, much like we
do for the fraction of normal numbers.  This will facilitate
manipulation of NaNs within the shared code paths.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:24:19 -07:00
Richard Henderson
0664335a6e fpu/softfloat: Move softfloat-specialize.h below FloatParts definition
We want to be able to specialize on the canonical representation.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:24:19 -07:00
Richard Henderson
d619bb98fd fpu/softfloat: Split floatXX_silence_nan from floatXX_maybe_silence_nan
The new function assumes that the input is an SNaN and
does not double-check.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:24:19 -07:00
Richard Henderson
bca52234d1 fpu/softfloat: Merge NO_SIGNALING_NANS definitions
Move the ifdef inside the relevant functions instead of
duplicating the function declarations.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:24:19 -07:00
Petr Tesarik
6603d50648 fpu/softfloat: Fix conversion from uint64 to float128
The significand is passed to normalizeRoundAndPackFloat128() as high
first, low second. The current code passes the integer first, so the
result is incorrectly shifted left by 64 bits.

This bug affects the emulation of s390x instruction CXLGBR (convert
from logical 64-bit binary-integer operand to extended BFP result).

Cc: qemu-stable@nongnu.org
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Message-Id: <20180511071052.1443-1-ptesarik@suse.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-05-17 15:24:19 -07:00
Peter Maydell
333583757c fpu/softfloat: Don't set Invalid for float-to-int(MAXINT)
In float-to-integer conversion, if the floating point input
converts exactly to the largest or smallest integer that
fits in to the result type, this is not an overflow.
In this situation we were producing the correct result value,
but were incorrectly setting the Invalid flag.
For example for Arm A64, "FCVTAS w0, d0" on an input of
0x41dfffffffc00000 should produce 0x7fffffff and set no flags.

Fix the boundary case to take the right half of the if()
statements.

This fixes a regression from 2.11 introduced by the softfloat
refactoring.

Cc: qemu-stable@nongnu.org
Fixes: ab52f973a5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180510140141.12120-1-peter.maydell@linaro.org
2018-05-15 14:58:42 +01:00
Alex Bennée
a5a5f5e2e4 fpu/softfloat: int_to_float ensure r fully initialised
Reported by Coverity (CID1390635). We ensure this for uint_to_float
later on so we might as well mirror that.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-05-15 14:58:42 +01:00
Peter Maydell
1839189bbf softfloat: Handle default NaN mode after pickNaNMulAdd, not before
It is implementation defined whether a multiply-add of
(0,inf,qnan) or (inf,0,qnan) raises InvalidaOperation or
not, so we let the target-specific pickNaNMulAdd function
handle this. This means that we must do the "return the
default NaN in default NaN mode" check after the call,
not before. Correct the ordering, and restore the comment
from the old propagateFloat64MulAddNaN() that warned about
this corner case.

This fixes a regression from 2.11 for Arm guests where we would
incorrectly fail to set the Invalid flag for these cases.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180504100547.14621-1-peter.maydell@linaro.org
2018-05-10 18:10:56 +01:00
Richard Henderson
ce8d408205 fpu: Bound increment for scalbn
Without bounding the increment, we can overflow exp either here
in scalbn_decomposed or when adding the bias in round_canonical.
This can result in e.g. underflowing to 0 instead of overflowing
to infinity.

The old softfloat code did bound the increment.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-04-17 14:52:38 +01:00
Alex Bennée
9cb4e398c2 fpu/softfloat: check for Inf / x or 0 / x before /0
The re-factoring of div_floats changed the order of checking meaning
an operation like -inf/0 erroneously raises the divbyzero flag.
IEEE-754 (2008) specifies this should only occur for operations on
finite operands.

We fix this by moving the check on the dividend being Inf/0 to before
the divisor is zero check.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180416135442.30606-1-alex.bennee@linaro.org
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Tested-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-04-16 18:40:48 +01:00