target/xtensa, the only user of NO_SIGNALING_NANS macro has FPU
implementations with and without the corresponding property. With
NO_SIGNALING_NANS being a macro they cannot be a part of the same QEMU
executable.
Replace macro with new property in float_status to allow cores with
different FPU implementations coexist.
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: "Alex Bennée" <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
With Makefiles that have automatically generated dependencies, you
generated includes are set as dependencies of the Makefile, so that they
are built before everything else and they are available when first
building the .c files.
Alternatively you can use a fine-grained dependency, e.g.
target/arm/translate.o: target/arm/decode-neon-shared.inc.c
With Meson you have only one choice and it is a third option, namely
"build at the beginning of the corresponding target"; the way you
express it is to list the includes in the sources of that target.
The problem is that Meson decides if something is a source vs. a
generated include by looking at the extension: '.c', '.cc', '.m', '.C'
are sources, while everything else is considered an include---including
'.inc.c'.
Use '.c.inc' to avoid this, as it is consistent with our other convention
of using '.rst.inc' for included reStructuredText files. The editorconfig
file is adjusted.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Both x87 and m68k need the low parts of the quotient for their
remainder operations. Arrange for floatx80_modrem to track those bits
and return them via a pointer.
The architectures using float32_rem and float64_rem do not appear to
need this information, so the *_rem interface is left unchanged and
the information returned only from floatx80_modrem. The logic used to
determine the low 7 bits of the quotient for m68k
(target/m68k/fpu_helper.c:make_quotient) appears completely bogus (it
looks at the result of converting the remainder to integer, the
quotient having been discarded by that point); this patch does not
change that, but the m68k maintainers may wish to do so.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <alpine.DEB.2.21.2006081656500.23637@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The floatx80 remainder implementation unnecessarily sets the high bit
of bSig explicitly. By that point in the function, arguments that are
invalid, zero, infinity or NaN have already been handled and
subnormals have been through normalizeFloatx80Subnormal, so the high
bit will already be set. Remove the unnecessary code.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <alpine.DEB.2.21.2006081656220.23637@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The floatx80 remainder implementation sometimes returns the numerator
unchanged when the denominator is sufficiently larger than the
numerator. But if the value to be returned unchanged is a
pseudo-denormal, that is incorrect. Fix it to normalize the numerator
in that case.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <alpine.DEB.2.21.2006081655520.23637@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The floatx80 remainder implementation ignores the high bit of the
significand when checking whether an operand (numerator) with zero
exponent is zero. This means it mishandles a pseudo-denormal
representation of 0x1p-16382L by treating it as zero. Fix this by
checking the whole significand instead.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <alpine.DEB.2.21.2006081655180.23637@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The m68k-specific softfloat code includes a function floatx80_mod that
is extremely similar to floatx80_rem, but computing the remainder
based on truncating the quotient toward zero rather than rounding it
to nearest integer. This is also useful for emulating the x87 fprem
and fprem1 instructions. Change the floatx80_rem implementation into
floatx80_modrem that can perform either operation, with both
floatx80_rem and floatx80_mod as thin wrappers available for all
targets.
There does not appear to be any use for the _mod operation for other
floating-point formats in QEMU (the only other architectures using
_rem at all are linux-user/arm/nwfpe, for FPA emulation, and openrisc,
for instructions that have been removed in the latest version of the
architecture), so no change is made to the code for other formats.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <alpine.DEB.2.21.2006081654280.23637@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When building with clang version 10.0.0-4ubuntu1, we get:
CC lm32-softmmu/fpu/softfloat.o
fpu/softfloat.c:3365:13: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
absZ &= ~ ( ( ( roundBits ^ 0x40 ) == 0 ) & roundNearestEven );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fpu/softfloat.c:3423:18: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
absZ0 &= ~ ( ( (uint64_t) ( absZ1<<1 ) == 0 ) & roundNearestEven );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
fpu/softfloat.c:4273:18: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
zSig1 &= ~ ( ( zSig2 + zSig2 == 0 ) & roundNearestEven );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix by rewriting the fishy bitwise AND of two bools as an int.
Suggested-by: Eric Blake <eblake@redhat.com>
Buglink: https://bugs.launchpad.net/bugs/1881004
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200617201309.1640952-2-richard.henderson@linaro.org
Message-Id: <20200528155420.9802-1-philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This includes *_is_any_nan, *_is_neg, *_is_inf, etc.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace the floatx80 compare specializations with inline functions
that call the standard floatx80_compare{,_quiet} functions.
Use bool as the return type.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace the float128 compare specializations with inline functions
that call the standard float128_compare{,_quiet} functions.
Use bool as the return type.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace the float64 compare specializations with inline functions
that call the standard float64_compare{,_quiet} functions.
Use bool as the return type.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Replace the float32 compare specializations with inline functions
that call the standard float32_compare{,_quiet} functions.
Use bool as the return type.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Give the previously unnamed enum a typedef name. Use it in the
prototypes of compare functions. Use it to hold the results
of the compare functions.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Give the previously unnamed enum a typedef name. Use the packed
attribute so that we do not affect the layout of the float_status
struct. Use it in the prototypes of relevant functions.
Adjust switch statements as necessary to avoid compiler warnings.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Slightly tidies the usage within softfloat.c and the
representation in float_status.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have had this on the to-do list for quite some time.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The existing f{32,64}_addsub_post test, which checks for zero
inputs, is identical to f{32,64}_mul_fast_test. Which means
we can eliminate the fast_test/fast_op hooks in favor of
reusing the same post hook.
This means we have one fewer test along the fast path for multiply.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The softfloat function floatx80_round_to_int incorrectly handles the
case of a pseudo-denormal where only the high bit of the significand
is set, ignoring that bit (treating the number as an exact zero)
rather than treating the number as an alternative representation of
+/- 2^-16382 (which may round to +/- 1 depending on the rounding mode)
as hardware does. Fix this check (simplifying the code in the
process).
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.21.2005042339420.22972@digraph.polyomino.org.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The softfloat floatx80 comparisons fail to allow for pseudo-denormals,
which should compare equal to corresponding values with biased
exponent 1 rather than 0. Add an adjustment for that case when
comparing numbers with the same sign.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.21.2005042338470.22972@digraph.polyomino.org.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The softfloat function addFloatx80Sigs, used for addition of values
with the same sign and subtraction of values with opposite sign, fails
to handle the case where the two values both have biased exponent zero
and there is a carry resulting from adding the significands, which can
occur if one or both values are pseudo-denormals (biased exponent
zero, explicit integer bit 1). Add a check for that case, so making
the results match those seen on x86 hardware for pseudo-denormals.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.21.2005042337570.22972@digraph.polyomino.org.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Conversions between IEEE floating-point formats should convert
signaling NaNs to quiet NaNs. Most of those in QEMU's softfloat code
do so, but those for floatx80 fail to. Fix those conversions to
silence signaling NaNs as well.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.21.2005042336170.22972@digraph.polyomino.org.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
All other calls to normalize*Subnormal detect zero input before
the call -- this is the only outlier. This case can happen with
+0.0 + +0.0 = +0.0 or -0.0 + -0.0 = -0.0, so return a zero of
the correct sign.
Reported-by: Coverity (CID 1421991)
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20200327232042.10008-1-richard.henderson@linaro.org>
Message-Id: <20200403191150.863-8-alex.bennee@linaro.org>
Reintroduce float32_to_float64 that was removed here:
https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00455.html
- nbench test it not actually calling this function at all
- SPECS 2006 significat number of tests impoved their runtime, just
few of them showed small slowdown
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matus Kysel <mkysel@tachyum.com>
Message-Id: <20191017142133.59439-1-mkysel@tachyum.com>
[rth: Add comment about impossible inexact exceptions.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is not a normal header and should only be included in the main
softfloat.c file to bring in the various target specific
specialisations. Indeed as it contains non-inlined C functions it is
not even a legal header. Rename it to match our included C convention.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
In our quest to eliminate the home rolled LIT64 macro we fixup usage
inside the softfloat code. While we are at it we remove some of the
extraneous spaces to closer fit the house style.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Remove some more use of LIT64 while making the meaning more clear. We
also avoid the need of casts as the results by definition fit into the
return type.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
This also allows us to remove the extractFloat16exp/frac helpers. We
avoid using the floatXX_pack_raw functions as they are slight overkill
for masking out all but the top bit of the number. The generated code
is almost exactly the same as makes no difference to the
pre-conversion code.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
We have a wrapper that does the right thing from stdint.h so lets use
it for our constants in softfloat-specialize.h
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Before falling back to softfloat FMA, we do not restore the original
values of inputs A and C. Fix it.
This bug was caught by running gcc's testsuite on RISC-V qemu.
Note that this change gives a small perf increase for fp-bench:
Host: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
Command: perf stat -r 3 taskset -c 0 ./fp-bench -o mulAdd -p $prec
- $prec = single:
- before:
101.71 MFlops
102.18 MFlops
100.96 MFlops
- after:
103.63 MFlops
103.05 MFlops
102.96 MFlops
- $prec = double:
- before:
173.10 MFlops
173.93 MFlops
172.11 MFlops
- after:
178.49 MFlops
178.88 MFlops
178.66 MFlops
Signed-off-by: Kito Cheng <kito.cheng@gmail.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20190322204320.17777-1-cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Wrong type of NaN was generated for IEEE 754-2008 by MADDF.<D|S> and
MSUBF.<D|S> instructions when the arguments were (Inf, Zero, NaN) or
(Zero, Inf, NaN).
The if-else statement establishes if the system conforms to IEEE
754-1985 or IEEE 754-2008, and defines different behaviors depending
on that. In case of IEEE 754-2008, in mentioned cases of inputs,
<MADDF|MSUBF>.<D|S> returns the input value 'c' [2] (page 53) and
raises floating point exception 'Invalid Operation' [1] (pages 349,
350).
These scenarios were tested and the results in QEMU emulation match
the results obtained on the machine that has a MIPS64R6 CPU.
[1] MIPS Architecture for Programmers Volume II-a: The MIPS64
Instruction Set Reference Manual, Revision 6.06
[2] MIPS Architecture for Programmers Volume IV-j: The MIPS64
SIMD Architecture Module, Revision 1.12
Signed-off-by: Mateja Marjanovic <mateja.marjanovic@rt-rk.com>
Message-Id: <1553008916-15274-2-git-send-email-mateja.marjanovic@rt-rk.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[AJB: fixed up commit message]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Previously this was only supported for roundAndPackFloat64.
New support in round_canonical, round_to_int, float128_round_to_int,
roundAndPackFloat32, roundAndPackInt32, roundAndPackInt64,
roundAndPackUint64. This does not include any of the floatx80 routines,
as we do not have users for that rounding mode there.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190215170225.15537-1-richard.henderson@linaro.org>
Tested-by: David Hildenbrand <david@redhat.com>
[AJB: add missing break]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Handling it just like float128_to_uint32_round_to_zero, that hopefully
is free of bugs :)
Documentation basically copied from float128_to_uint64
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
The added branch to the FMA ops is marked as unlikely and therefore
its impact on performance (measured with fp-bench) is within noise range
when measured on an Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz.
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Performance results for fp-bench:
Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
sqrt-single: 42.30 MFlops
sqrt-double: 22.97 MFlops
- after:
sqrt-single: 311.42 MFlops
sqrt-double: 311.08 MFlops
Here USE_FP makes a huge difference for f64's, with throughput
going from ~200 MFlops to ~300 MFlops.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
The appended paves the way for leveraging the host FPU for a subset
of guest FP operations. For most guest workloads (e.g. FP flags
aren't ever cleared, inexact occurs often and rounding is set to the
default [to nearest]) this will yield sizable performance speedups.
The approach followed here avoids checking the FP exception flags register.
See the added comment for details.
This assumes that QEMU is running on an IEEE754-compliant FPU and
that the rounding is set to the default (to nearest). The
implementation-dependent specifics of the FPU should not matter; things
like tininess detection and snan representation are still dealt with in
soft-fp. However, this approach will break on most hosts if we compile
QEMU with flags that break IEEE compatibility. There is no way to detect
all of these flags at compilation time, but at least we check for
-ffast-math (which defines __FAST_MATH__) and disable hardfloat
(plus emit a #warning) when it is set.
This patch just adds common code. Some operations will be migrated
to hardfloat in subsequent patches to ease bisection.
Note: some architectures (at least PPC, there might be others) clear
the status flags passed to softfloat before most FP operations. This
precludes the use of hardfloat, so to avoid introducing a performance
regression for those targets, we add a flag to disable hardfloat.
In the long run though it would be good to fix the targets so that
at least the inexact flag passed to softfloat is indeed sticky.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
glibc >= 2.25 defines canonicalize in commit eaf5ad0
(Add canonicalize, canonicalizef, canonicalizel., 2016-10-26).
Given that we'll be including <math.h> soon, prepare
for this by prefixing our canonicalize() with sf_ to avoid
clashing with the libc's canonicalize().
Reported-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Tested-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Older versions of Clang (before 3.5) and GCC (before 4.1) do not
support the "__attribute__((flatten))" yet. We don't care about
such old versions of GCC anymore, but since Clang 3.4 is still
used in EPEL for RHEL7 / CentOS 7, we should not use this attribute
directly but with a wrapper macro instead.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The __udiv_qrnnd primitive that we nicked from gmp requires its
inputs to be normalized. We were not doing that. Because the
inputs are nearly normalized already, finishing that is trivial.
Replace div128to64 with a "proper" udiv_qrnnd, so that this
remains a reusable primitive.
Fixes: cf07323d49
Fixes: https://bugs.launchpad.net/qemu/+bug/1793119
Tested-by: Emilio G. Cota <cota@braap.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Our minimum required compiler for compiling QEMU is GCC 4.1 these days,
so we can drop the support for compilers which do not provide the
__builtin_clz*() functions yet. Since the countLeadingZeros32/64 are
then identical to the clz32/64 functions, and we do not have to sync
the softloat 2 codebase with upstream anymore (softloat 3 is a complete
rewrite) we can simply replace the functions with our QEMU versions.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1538118095-7003-1-git-send-email-thuth@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
It has not had users since f83311e476 ("target-m68k: use floatx80
internally", 2017-06-21).
Note that no other bit-width has floatX_trunc_to_int.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180814002653.12828-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180814002653.12828-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For 0x1.0000000000003p+0 + 0x1.ffffffep+14 = 0x1.0001fffp+15
we dropped the sticky bit and so failed to raise inexact.
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20180810193129.1556-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Isolate the target-specific choice to 3 functions instead of 6.
The code in floatx80_default_nan tried to be over-general. There are
only two targets that support this format: x86 and m68k. Thus there
is no point in inventing a mechanism for snan_bit_is_one.
Move routines that no longer have ifdefs out of softfloat-specialize.h.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reduce the number of ifdefs. Correct the result for OpenRISC
and TriCore (although TriCore fixed in target-specific code).
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Isolate the target-specific choice to 2 functions instead of 6.
The code in float16_default_nan was only correct for ARM, MIPS, and X86.
Though float16 support is rare among our targets.
The code in float128_default_nan was arguably wrong for Sparc. While
QEMU supports the Sparc 128-bit insns, no real cpu enables it.
The code in floatx80_default_nan tried to be over-general. There are
only two targets that support this format: x86 and m68k. Thus there
is no point in inventing a value for snan_bit_is_one.
Move routines that no longer have ifdefs out of softfloat-specialize.h.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For each operand, pass a single enumeration instead of a pair of booleans.
The commit also merges multiple different ifdef-selected implementations
of pickNaNMulAdd into a single function whose body is ifdef-selected.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For each operand, pass a single enumeration instead of a pair of booleans.
The commit also merges multiple different ifdef-selected implementations
of pickNaN into a single function whose body is ifdef-selected.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need these helpers within softfloat-specialize.h, so move
the definitions above the include. After specialization, they will
not always be used so mark them to avoid the Werror.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Only MIPS requires snan_bit_is_one to be variable. While we are
specializing softfloat behaviour, allow other targets to eliminate
this runtime check.
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@mips.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Alexander Graf <agraf@suse.de>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These functions are now unused.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have already checked the arguments for SNaN;
we don't need to do it again.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This allows us to delete a lot of additional boilerplate
code which is no longer needed.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For float16 ARM supports an alternative half-precision format which
sacrifices the ability to represent NaN/Inf in return for a higher
dynamic range. The new FloatFmt flag, arm_althp, is then used to
modify the behaviour of canonicalize and round_canonical with respect
to representation and exception raising.
Usage of this new flag waits until we re-factor float-to-float conversions.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With a canonical representation of NaNs, we can silence an SNaN
immediately rather than delay until the final format is known.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With a canonical representation of NaNs, we can return the
default nan directly rather than delay the expansion until
the final format is known.
Note one case where we uselessly assigned to a.sign, which was
overwritten/ignored later when expanding float_class_dnan.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Shift the NaN fraction to a canonical position, much like we
do for the fraction of normal numbers. This will facilitate
manipulation of NaNs within the shared code paths.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We want to be able to specialize on the canonical representation.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The new function assumes that the input is an SNaN and
does not double-check.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move the ifdef inside the relevant functions instead of
duplicating the function declarations.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The significand is passed to normalizeRoundAndPackFloat128() as high
first, low second. The current code passes the integer first, so the
result is incorrectly shifted left by 64 bits.
This bug affects the emulation of s390x instruction CXLGBR (convert
from logical 64-bit binary-integer operand to extended BFP result).
Cc: qemu-stable@nongnu.org
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Message-Id: <20180511071052.1443-1-ptesarik@suse.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In float-to-integer conversion, if the floating point input
converts exactly to the largest or smallest integer that
fits in to the result type, this is not an overflow.
In this situation we were producing the correct result value,
but were incorrectly setting the Invalid flag.
For example for Arm A64, "FCVTAS w0, d0" on an input of
0x41dfffffffc00000 should produce 0x7fffffff and set no flags.
Fix the boundary case to take the right half of the if()
statements.
This fixes a regression from 2.11 introduced by the softfloat
refactoring.
Cc: qemu-stable@nongnu.org
Fixes: ab52f973a5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180510140141.12120-1-peter.maydell@linaro.org
Reported by Coverity (CID1390635). We ensure this for uint_to_float
later on so we might as well mirror that.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It is implementation defined whether a multiply-add of
(0,inf,qnan) or (inf,0,qnan) raises InvalidaOperation or
not, so we let the target-specific pickNaNMulAdd function
handle this. This means that we must do the "return the
default NaN in default NaN mode" check after the call,
not before. Correct the ordering, and restore the comment
from the old propagateFloat64MulAddNaN() that warned about
this corner case.
This fixes a regression from 2.11 for Arm guests where we would
incorrectly fail to set the Invalid flag for these cases.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180504100547.14621-1-peter.maydell@linaro.org
Without bounding the increment, we can overflow exp either here
in scalbn_decomposed or when adding the bias in round_canonical.
This can result in e.g. underflowing to 0 instead of overflowing
to infinity.
The old softfloat code did bound the increment.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The re-factoring of div_floats changed the order of checking meaning
an operation like -inf/0 erroneously raises the divbyzero flag.
IEEE-754 (2008) specifies this should only occur for operations on
finite operands.
We fix this by moving the check on the dividend being Inf/0 to before
the divisor is zero check.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180416135442.30606-1-alex.bennee@linaro.org
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Tested-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The re-factor broke the raising of INVALID when NaN/Inf is passed to
the float_to_int conversion functions. round_to_uint_and_pack got this
right for NaN but also missed out the Inf handling.
Fixes https://bugs.launchpad.net/qemu/+bug/1759264
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Tested-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180413140334.26622-3-alex.bennee@linaro.org
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Before 8936006 ("fpu/softfloat: re-factor minmax", 2018-02-21),
we used to return +Zero for maxnummag(-Zero,+Zero); after that
commit, we return -Zero.
Fix it by making {min,max}nummag consistent with {min,max}num,
deferring to the latter when the absolute value of the operands
is the same.
With this fix we now pass fp-test.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180413140334.26622-2-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We incorrectly passed in the current rounding mode
instead of float_round_to_zero.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180410055912.934-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Helper routines for FPU instructions and NaN definitions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
Since f3218a8 ("softfloat: add floatx80 constants")
floatx80_infinity is defined but never used.
This patch updates floatx80 functions to use
this definition.
This allows to define a different default Infinity
value on m68k: the m68k FPU defines infinity with
all bits set to zero in the mantissa.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180224201802.911-4-laurent@vivier.eu>
Move fpu/softfloat-macros.h to include/fpu/
Export floatx80 functions to be used by target floatx80
specific implementations.
Exports:
propagateFloatx80NaN(), extractFloatx80Frac(),
extractFloatx80Exp(), extractFloatx80Sign(),
normalizeFloatx80Subnormal(), packFloatx80(),
roundAndPackFloatx80(), normalizeRoundAndPackFloatx80()
Also exports packFloat32() that will be used to implement
m68k fsinh, fcos, fsin, ftan operations.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180224201802.911-2-laurent@vivier.eu>
This is a little bit of a departure from softfloat's original approach
as we skip the estimate step in favour of a straight iteration. There
is a minor optimisation to avoid calculating more bits of precision
than we need however this still brings a performance drop, especially
for float64 operations.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The compare function was already expanded from a macro. I keep the
macro expansion but move most of the logic into a compare_decomposed.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Let's do the same re-factor treatment for minmax functions. I still
use the MACRO trick to expand but now all the checking code is common.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
This is one of the simpler manipulations you could make to a floating
point number.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
These are considerably simpler as the lower order integers can just
use the higher order conversion function. As the decomposed fractional
part is a full 64 bit rounding and inexact handling comes from the
pack functions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
We share the common int64/uint64_pack_decomposed function across all
the helpers and simply limit the final result depending on the final
size.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
We can now add float16_round_to_int and use the common round_decomposed and
canonicalize functions to have a single implementation for
float16/32/64 round_to_int functions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
We can now add float16_muladd and use the common decompose and
canonicalize functions to have a single implementation for
float16/32/64 muladd functions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
We can now add float16_div and use the common decompose and
canonicalize functions to have a single implementation for
float16/32/64 versions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
We can now add float16_mul and use the common decompose and
canonicalize functions to have a single implementation for
float16/32/64 versions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
We can now add float16_add/sub and use the common decompose and
canonicalize functions to have a single implementation for
float16/32/64 add and sub functions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
These structures pave the way for generic softfloat helper routines
that will operate on fully decomposed numbers.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This is pure code-motion during re-factoring as the helpers will be
needed earlier.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Mention the pseudo-code fragment from which this is based.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
This will be required when expanding the MINMAX() macro for 16
bit/half-precision operations.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Add a function to round a floatx80 to the defined precision
(floatx80_rounding_precision)
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170628204241.32106-5-laurent@vivier.eu>
In float64_to_uint64_round_to_zero() a typo meant that we were
taking the uint64_t return value from float64_to_uint64() and
putting it into an int64_t variable before returning it as
uint64_t again. Use uint64_t instead of pointlessly casting it
back and forth to int64_t.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
float128_to_uint32_round_to_zero() is needed by xscvqpuwz instruction
of PowerPC ISA 3.0.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Implement float128_to_uint64() and use that to implement
float128_to_uint64_round_to_zero()
This is required by xscvqpudz instruction of PowerPC ISA 3.0.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Power ISA 3.0 introduces a few quadruple precision floating point
instructions that support round-to-odd rounding mode. The
round-to-odd mode is explained as under:
Let Z be the intermediate arithmetic result or the operand of a convert
operation. If Z can be represented exactly in the target format, the
result is Z. Otherwise the result is either Z1 or Z2 whichever is odd.
Here Z1 and Z2 are the next larger and smaller numbers representable
in the target format respectively.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Currently float128_default_nan() returns 0xFFFF800000000000 in the
higher double word, but it should return 0x7FFF800000000000 which
is the correct higher double word for default qNAN on PowerPC.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Like the original MIPS, HPPA has the MSB of an SNaN set.
However, it has different rules for silencing an SNaN:
(1) msb is cleared and (2) msb-1 must be set if the fraction
is now zero, and (implementation defined) may be set always.
I haven't checked real hardware but chose the set always
alternative because it's easy and within spec.
Signed-off-by: Richard Henderson <rth@twiddle.net>
All operations that take a floatx80 as an operand need to have their
inputs checked for malformed encodings. In all of these cases, use the
function floatx80_invalid_encoding to perform the check. If an invalid
operand is found, raise an invalid operation exception, and then return
either NaN (for fp-typed results) or the integer indefinite value (the
minimum representable signed integer value, for int-typed results).
For the non-quiet comparison operations, this touches adjacent code in
order to pass style checks.
Signed-off-by: Andrew Dutcher <andrew@andrewdutcher.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1471392895-17324-1-git-send-email-andrew@andrewdutcher.com
[PMM: changed "1 << 63" to "1ULL << 63" to fix compile errors]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Change the flag type to 'uint8_t' to fix the implicit conversion error.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 20160810185502.32015-1-bobby.prani@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Only for Mips platform, and only for cases when snan_bit_is_one is 0,
correct the order of argument comparisons in pickNaNMulAdd().
For more info, see [1], page 53, section "3.5.3 NaN Propagation".
[1] "MIPS Architecture for Programmers Volume IV-j:
The MIPS32 SIMD Architecture Module",
Imagination Technologies LTD, Revision 1.12, February 3, 2016
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[leon.alrae@imgtec.com:
* reworded the subject of the patch
* swapped if/else code blocks to match the commit description]
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Only for Mips platform, and only for cases when snan_bit_is_one is 0,
correct default NaN values (in their 16-, 32-, and 64-bit flavors).
For more info, see [1], page 84, Table 6.3 "Value Supplied When a New
Quiet NaN Is Created", and [2], page 52, Table 3.7 "Default NaN
Encodings".
[1] "MIPS Architecture For Programmers Volume II-A:
The MIPS64 Instruction Set Reference Manual",
Imagination Technologies LTD, Revision 6.04, November 13, 2015
[2] "MIPS Architecture for Programmers Volume IV-j:
The MIPS32 SIMD Architecture Module",
Imagination Technologies LTD, Revision 1.12, February 3, 2016
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
fpu/softfloat-specialize.h is the most critical file in SoftFloat
library, since it handles numerous differences between platforms in
relation to floating point arithmetics. This patch makes the code
in this file more consistent format-wise, and hopefully easier to
debug and maintain.
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
This patch modifies SoftFloat library so that it can be configured in
run-time in relation to the meaning of signaling NaN bit, while, at the
same time, strictly preserving its behavior on all existing platforms.
Background:
In floating-point calculations, there is a need for denoting undefined or
unrepresentable values. This is achieved by defining certain floating-point
numerical values to be NaNs (which stands for "not a number"). For additional
reasons, virtually all modern floating-point unit implementations use two
kinds of NaNs: quiet and signaling. The binary representations of these two
kinds of NaNs, as a rule, differ only in one bit (that bit is, traditionally,
the first bit of mantissa).
Up to 2008, standards for floating-point did not specify all details about
binary representation of NaNs. More specifically, the meaning of the bit
that is used for distinguishing between signaling and quiet NaNs was not
strictly prescribed. (IEEE 754-2008 was the first floating-point standard
that defined that meaning clearly, see [1], p. 35) As a result, different
platforms took different approaches, and that presented considerable
challenge for multi-platform emulators like QEMU.
Mips platform represents the most complex case among QEMU-supported
platforms regarding signaling NaN bit. Up to the Release 6 of Mips
architecture, "1" in signaling NaN bit denoted signaling NaN, which is
opposite to IEEE 754-2008 standard. From Release 6 on, Mips architecture
adopted IEEE standard prescription, and "0" denotes signaling NaN. On top of
that, Mips architecture for SIMD (also known as MSA, or vector instructions)
also specifies signaling bit in accordance to IEEE standard. MSA unit can be
implemented with both pre-Release 6 and Release 6 main processor units.
QEMU uses SoftFloat library to implement various floating-point-related
instructions on all platforms. The current QEMU implementation allows for
defining meaning of signaling NaN bit during build time, and is implemented
via preprocessor macro called SNAN_BIT_IS_ONE.
On the other hand, the change in this patch enables SoftFloat library to be
configured in run-time. This configuration is meant to occur during CPU
initialization, at the moment when it is definitely known what desired
behavior for particular CPU (or any additional FPUs) is.
The change is implemented so that it is consistent with existing
implementation of similar cases. This means that structure float_status is
used for passing the information about desired signaling NaN bit on each
invocation of SoftFloat functions. The additional field in float_status is
called snan_bit_is_one, which supersedes macro SNAN_BIT_IS_ONE.
IMPORTANT:
This change is not meant to create any change in emulator behavior or
functionality on any platform. It just provides the means for SoftFloat
library to be used in a more flexible way - in other words, it will just
prepare SoftFloat library for usage related to Mips platform and its
specifics regarding signaling bit meaning, which is done in some of
subsequent patches from this series.
Further break down of changes:
1) Added field snan_bit_is_one to the structure float_status, and
correspondent setter function set_snan_bit_is_one().
2) Constants <float16|float32|float64|floatx80|float128>_default_nan
(used both internally and externally) converted to functions
<float16|float32|float64|floatx80|float128>_default_nan(float_status*).
This is necessary since they are dependent on signaling bit meaning.
At the same time, for the sake of code cleanup and simplicity, constants
<floatx80|float128>_default_nan_<low|high> (used only internally within
SoftFloat library) are removed, as not needed.
3) Added a float_status* argument to SoftFloat library functions
XXX_is_quiet_nan(XXX a_), XXX_is_signaling_nan(XXX a_),
XXX_maybe_silence_nan(XXX a_). This argument must be present in
order to enable correct invocation of new version of functions
XXX_default_nan(). (XXX is <float16|float32|float64|floatx80|float128>
here)
4) Updated code for all platforms to reflect changes in SoftFloat library.
This change is twofolds: it includes modifications of SoftFloat library
functions invocations, and an addition of invocation of function
set_snan_bit_is_one() during CPU initialization, with arguments that
are appropriate for each particular platform. It was established that
all platforms zero their main CPU data structures, so snan_bit_is_one(0)
in appropriate places is not added, as it is not needed.
[1] "IEEE Standard for Floating-Point Arithmetic",
IEEE Computer Society, August 29, 2008.
Signed-off-by: Thomas Schwinge <thomas@codesourcery.com>
Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Tested-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Tested-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[leon.alrae@imgtec.com:
* cherry-picked 2 chunks from patch #2 to fix compilation warnings]
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
This patch adds a file for all the FPU related helpers with all the includes,
useful defines, and a function to update the status bits. Additionally it adds
a mask for the rounding mode bits of PSW as well as all the opcodes for the
FPU instructions.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <1457708597-3025-2-git-send-email-kbastian@mail.uni-paderborn.de>
Use the plain 'int' type rather than 'int_fast16_t' for handling
exponents. Exponents don't need to be exactly 16 bits, so using int16_t
for them would confuse more than it clarified.
This should be a safe change because int_fast16_t semantics
permit use of 'int' (and on 32-bit glibc that is what you get).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 1453807806-32698-4-git-send-email-peter.maydell@linaro.org
Use the plain 'int' type rather than 'int_fast16_t' for shift counts
in the various shift related functions, since we don't actually care
about the size of the integer at all here, and using int16_t would
be confusing.
This should be a safe change because int_fast16_t semantics
permit use of 'int' (and on 32-bit glibc that is what you get).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 1453807806-32698-3-git-send-email-peter.maydell@linaro.org
Make the functions which convert floating point to 16 bit integer
return int16_t rather than int_fast16_t, and correspondingly use
int_fast16_t in their internal implementations where appropriate.
(These functions are used only by the ARM target.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 1453807806-32698-2-git-send-email-peter.maydell@linaro.org
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-16-git-send-email-peter.maydell@linaro.org
The roundAndPackFloat16 function should return a float16 value, not a
float32 one. Fix that.
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1452700993-6570-1-git-send-email-aurelien@aurel32.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replace the int8 softfloat-specific typedef with int8_t.
This change was made with
find include hw fpu target-* -name '*.[ch]' | xargs sed -i -e 's/\bint8\b/int8_t/g'
together with manual removal of the typedef definition, and
manual undoing of various mis-hits.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: Leon Alrae <leon.alrae@imgtec.com>
Acked-by: James Hogan <james.hogan@imgtec.com>
Message-id: 1452603315-27030-6-git-send-email-peter.maydell@linaro.org
Replace the uint32 softfloat-specific typedef with uint32_t.
This change was made with
find include hw fpu target-* -name '*.[ch]' | xargs sed -i -e 's/\buint32\b/uint32_t/g'
together with manual removal of the typedef definition,
manual undoing of various mis-hits, and another couple of
fixes found via test compilation.
All the uses in hw/ were using the wrong type by mistake.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: Leon Alrae <leon.alrae@imgtec.com>
Acked-by: James Hogan <james.hogan@imgtec.com>
Message-id: 1452603315-27030-5-git-send-email-peter.maydell@linaro.org
Replace the int32 softfloat-specific typedef with int32_t.
This change was made with
find hw include fpu target-* -name '*.[ch]' | xargs sed -i -e 's/\bint32\b/int32_t/g'
together with manual removal of the typedef definition, and
manual undoing of some mis-hits where macro arguments were
being used for token pasting rather than as a type.
The uses in hw/ipmi/ should not have been using this type at all.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: Leon Alrae <leon.alrae@imgtec.com>
Acked-by: James Hogan <james.hogan@imgtec.com>
Message-id: 1452603315-27030-4-git-send-email-peter.maydell@linaro.org
Replace the uint64 softfloat-specific typedef with uint64_t.
This change was made with
find include fpu target-* -name '*.[ch]' | xargs sed -i -e 's/\buint64\b/uint64_t/g'
together with manual removal of the typedef definition, and
manual undoing of some mis-hits where macro arguments were
being used for token pasting rather than as a type.
Note that the target-mips/kvm.c and target-s390x/kvm.c changes are fixing
code that should not have been using the uint64 type in the first place.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: Leon Alrae <leon.alrae@imgtec.com>
Acked-by: James Hogan <james.hogan@imgtec.com>
Message-id: 1452603315-27030-3-git-send-email-peter.maydell@linaro.org
Replace the int64 softfloat-specific typedef with int64_t.
This change was made with
find include fpu target-* -name '*.[ch]' | xargs sed -i -e 's/\bint64\b/int64_t/g'
together with manual removal of the typedef definition, and
manual undoing of some mis-hits where macro arguments were
being used for token pasting rather than as a type.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: Leon Alrae <leon.alrae@imgtec.com>
Message-id: 1452603315-27030-2-git-send-email-peter.maydell@linaro.org
Expand out STATUS_PARAM wherever it is used and delete the definition.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The code in the softfloat source files is under a mixture of
licenses: the original code and many changes from QEMU contributors
are under the base SoftFloat-2a license; changes from Stefan Weil
and RedHat employees are GPLv2-or-later; changes from Fabrice Bellard
are under the BSD license. Clarify this in the comments at the
top of each affected source file, including a statement about
the assumed licensing for future contributions, so we don't need
to remember to ask patch submitters explicitly to pick a license.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Andreas Färber <afaerber@suse.de>
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: Avi Kivity <avi.kivity@gmail.com>
Acked-by: Ben Taylor <bentaylor.solx86@gmail.com>
Acked-by: Blue Swirl <blauwirbel@gmail.com>
Acked-by: Christophe Lyon <christophe.lyon@st.com>
Acked-by: Fabrice Bellard <fabrice@bellard.org>
Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
Acked-by: Juan Quintela <quintela@redhat.com>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Paul Brook <paul@codesourcery.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Richard Henderson <rth@twiddle.net>
Acked-by: Richard Sandiford <rdsandiford@googlemail.com>
Acked-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1421073508-23909-5-git-send-email-peter.maydell@linaro.org
Revert the parts of commits b645bb4885 and 5a6932d51d which are still
in the codebase and under a SoftFloat-2b license.
Reimplement support for architectures where the most significant bit
in the mantissa is 1 for a signaling NaN rather than a quiet NaN,
by adding handling for SNAN_BIT_IS_ONE being set to the functions
which test values for NaN-ness.
This includes restoring the bugfixes lost in the reversion where
some of the float*_is_quiet_nan() functions were returning true
for both signaling and quiet NaNs.
[This is a mechanical squashing together of two separate "revert"
and "reimplement" patches.]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1421073508-23909-4-git-send-email-peter.maydell@linaro.org
Revert the remaining portions of commits 75d62a5856 and 3430b0be36
which are under a SoftFloat-2b license, ie the functions
uint64_to_float32() and uint64_to_float64(). (The float64_to_uint64()
and float64_to_uint64_round_to_zero() functions were completely
rewritten in commits fb3ea83aa and 0a87a3107d so can stay.)
Reimplement from scratch the uint64_to_float64() and uint64_to_float32()
conversion functions.
[This is a mechanical squashing together of two separate "revert"
and "reimplement" patches.]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1421073508-23909-3-git-send-email-peter.maydell@linaro.org
This commit applies the changes to master which correspond to
replacing commit 158142c2c2 with a set of changes made by:
* taking the SoftFloat-2a release
* mechanically transforming the block comment style
* reapplying Fabrice's original changes from 158142c2c2
This commit was created by:
diff -u 158142c2c2 import-sf-2a
patch -p1 --fuzz 10 <../relicense-patch.txt
(where import-sf-2a is the branch resulting from the changes above).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1421073508-23909-2-git-send-email-peter.maydell@linaro.org
Add abs argument to the existing softfloat minmax() function and define
new float{32,64}_{min,max}nummag functions.
minnummag(x,y) returns x if |x| < |y|,
returns y if |y| < |x|,
otherwise minnum(x,y)
maxnummag(x,y) returns x if |x| > |y|,
returns y if |y| > |x|,
otherwise maxnum(x,y)
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
This commit expands all uses of the INLINE macro and drop it.
The reason for this is to avoid clashes with external libraries with
bad name conventions and also because renaming keywords is not a good
practice.
PS: I'm fine with this change to be licensed under softfloat-2a or
softfloat-2b.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
This change adds the float32_to_uint64_round_to_zero function to the softfloat
library. This function fills out the complement of float32 to INT round-to-zero
conversion rountines, where INT is {int32_t, uint32_t, int64_t, uint64_t}.
This contribution can be licensed under either the softfloat-2a or -2b
license.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Tested-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
I need these available outside of softfloat for some of the reciprocal
processing in aarch64 helper functions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1394822294-14837-20-git-send-email-peter.maydell@linaro.org
The ARMv8 instruction set includes a fused floating point
reciprocal square root step instruction which demands an
"(x * y + z) / 2" fused operation. Support this by adding
a flag to the softfloat muladd operations which requests
that the result is halved before rounding.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
IEEE754-2008 specifies a new rounding mode:
"roundTiesToAway: the floating-point number nearest to the infinitely
precise result shall be delivered; if the two nearest floating-point
numbers bracketing an unrepresentable infinitely precise result are
equally near, the one with larger magnitude shall be delivered."
Implement this new mode (it is needed for ARM). The general principle
is that the required code is exactly like the ties-to-even code,
except that we do not need to do the "in case of exact tie clear LSB
to round-to-even", because the rounding operation naturally causes
the exact tie to round up in magnitude.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Refactor the code in various functions which calculates rounding
increments given the current rounding mode, so that instead of a
set of nested if statements we have a simple switch statement.
This will give us a clean place to add the case for the new
tiesAway rounding mode.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Add the conversion functions float16_to_float64() and
float64_to_float16(), which will be needed for the ARM
A64 instruction set.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
In preparation for adding conversions between float16 and float64,
factor out code currently done inline in the float16<=>float32
conversion functions into functions RoundAndPackFloat16 and
NormalizeFloat16Subnormal along the lines of the existing versions
for the other float types.
Note that we change the handling of zExp from the inline code
to match the API of the other RoundAndPackFloat functions; however
we leave the positioning of the binary point between bits 22 and 23
rather than shifting it up to the high end of the word.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tidy up the get/set accessors for the fp state to add missing ones
and make them all inline in softfloat.h rather than some inline and
some not.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The float64_to_uint32_round_to_zero routine is incorrect.
For example, the following test pattern:
425F81378DC0CD1F / 0x1.f81378dc0cd1fp+38
will erroneously set the inexact flag.
This patch re-implements the routine to use the float64_to_uint64_round_to_zero
routine. If saturation occurs we ignore any flags set by the
conversion function and raise only Invalid.
This contribution can be licensed under either the softfloat-2a or -2b
license.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Message-id: 1387397961-4894-6-git-send-email-tommusta@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The float64_to_uint32 has several flaws:
- for numbers between 2**32 and 2**64, the inexact exception flag
may get incorrectly set. In this case, only the invalid flag
should be set.
test pattern: 425F81378DC0CD1F / 0x1.f81378dc0cd1fp+38
- for numbers between 2**63 and 2**64, incorrect results may
be produced:
test pattern: 43EAAF73F1F0B8BD / 0x1.aaf73f1f0b8bdp+63
This patch re-implements float64_to_uint32 to re-use the
float64_to_uint64 routine (instead of float64_to_int64). For the
saturation case, we ignore any flags which the conversion routine
has set and raise only the invalid flag.
This contribution can be licensed under either the softfloat-2a or -2b
license.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Message-id: 1387397961-4894-5-git-send-email-tommusta@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The float64_to_uint64_round_to_zero routine is incorrect.
For example, the following test pattern:
46697351FF4AEC29 / 0x1.97351ff4aec29p+103
currently produces 8000000000000000 instead of FFFFFFFFFFFFFFFF.
This patch re-implements the routine to temporarily force the
rounding mode and use the float64_to_uint64 routine.
This contribution can be licensed under either the softfloat-2a or -2b
license.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Message-id: 1387397961-4894-4-git-send-email-tommusta@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
This patch adds the float32_to_uint64() routine, which converts a
32-bit floating point number to an unsigned 64 bit number.
This contribution can be licensed under either the softfloat-2a or -2b
license.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: removed harmless but silly int64_t casts]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
If the input to float*_scalbn() is denormal then it represents
a number 0.[mantissabits] * 2^(1-exponentbias) (and the actual
exponent field is all zeroes). This means that when we convert
it to our unpacked encoding the unpacked exponent must be one
greater than for a normal number, which represents
1.[mantissabits] * 2^(e-exponentbias) for an exponent field e.
This meant we were giving answers too small by a factor of 2 for
all denormal inputs.
Note that the float-to-int routines also have this behaviour
of not adjusting the exponent for denormals; however there it is
harmless because denormals will all convert to integer zero anyway.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
We implement a number of float-to-integer conversions using conversion
to an integer type with a wider range and then a check against the
narrower range we are actually converting to. If we find the result to
be out of range we correctly raise the Invalid exception, but we must
also suppress other exceptions which might have been raised by the
conversion function we called.
This won't throw away exceptions we should have preserved, because for
the 'core' exception flags the IEEE spec mandates that the only valid
combinations of exception that can be raised by a single operation are
Inexact + Overflow and Inexact + Underflow. For the non-IEEE softfloat
flag for input denormals, we can guarantee that that flag won't have
been set for out of range float-to-int conversions because a squashed
denormal by definition goes to plus or minus zero, which is always in
range after conversion to integer zero.
This bug has been fixed for some of the float-to-int conversion routines
by previous patches; fix it for the remaining functions as well, so
that they all restore the pre-conversion status flags prior to raising
Invalid.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The comment preceding the float64_to_uint64 routine suggests that
the implementation is broken. And this is, indeed, the case.
This patch properly implements the conversion of a 64-bit floating
point number to an unsigned, 64 bit integer.
This contribution can be licensed under either the softfloat-2a or -2b
license.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Currently the int-to-float functions take types which are specified
as "at least X bits wide", rather than "exactly X bits wide". This is
confusing and unhelpful since it means that the callers have to include
an explicit cast to [u]intXX_t to ensure the correct behaviour. Fix
them all to take the exactly-X-bits-wide types instead.
Note that this doesn't change behaviour at all since at the moment
we happen to define the 'int32' and 'uint32' types as exactly 32 bits
wide, and the 'int64' and 'uint64' types as exactly 64 bits wide.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
ARMv8 requires support for converting 32 and 64bit floating point
values to signed and unsigned 16bit integers.
Signed-off-by: Will Newton <will.newton@linaro.org>
[PMM: updated not to incorrectly set Inexact for Invalid inputs]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Our float32 to float16 conversion routine was generating the correct
numerical answers, but not always setting the right set of exception
flags. Fix this, mostly by rearranging the code to more closely
resemble RoundAndPackFloat*, and in particular:
* non-IEEE halfprec always raises Invalid for input NaNs
* we need to check for the overflow case before underflow
* we weren't getting the tininess-detected-after-rounding
case correct (somewhat academic since only ARM uses halfprec
and it is always tininess-detected-before-rounding)
* non-IEEE halfprec overflow raises only Invalid, not
Invalid + Inexact
* we weren't setting Inexact when we should
Also add some clarifying comments about what the code is doing.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Add floatnn_minnum() and floatnn_maxnum() functions which are equivalent
to the minNum() and maxNum() functions from IEEE 754-2008. They are
similar to min() and max() but differ in the handling of QNaN arguments.
Signed-off-by: Will Newton <will.newton@linaro.org>
Message-id: 1386158099-9239-5-git-send-email-will.newton@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>