Revert the remaining portions of commits 75d62a5856 and 3430b0be36
which are under a SoftFloat-2b license, ie the functions
uint64_to_float32() and uint64_to_float64(). (The float64_to_uint64()
and float64_to_uint64_round_to_zero() functions were completely
rewritten in commits fb3ea83aa and 0a87a3107d so can stay.)
Reimplement from scratch the uint64_to_float64() and uint64_to_float32()
conversion functions.
[This is a mechanical squashing together of two separate "revert"
and "reimplement" patches.]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1421073508-23909-3-git-send-email-peter.maydell@linaro.org
This commit applies the changes to master which correspond to
replacing commit 158142c2c2 with a set of changes made by:
* taking the SoftFloat-2a release
* mechanically transforming the block comment style
* reapplying Fabrice's original changes from 158142c2c2
This commit was created by:
diff -u 158142c2c2 import-sf-2a
patch -p1 --fuzz 10 <../relicense-patch.txt
(where import-sf-2a is the branch resulting from the changes above).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1421073508-23909-2-git-send-email-peter.maydell@linaro.org
Add abs argument to the existing softfloat minmax() function and define
new float{32,64}_{min,max}nummag functions.
minnummag(x,y) returns x if |x| < |y|,
returns y if |y| < |x|,
otherwise minnum(x,y)
maxnummag(x,y) returns x if |x| > |y|,
returns y if |y| > |x|,
otherwise maxnum(x,y)
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
This commit expands all uses of the INLINE macro and drop it.
The reason for this is to avoid clashes with external libraries with
bad name conventions and also because renaming keywords is not a good
practice.
PS: I'm fine with this change to be licensed under softfloat-2a or
softfloat-2b.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
This change adds the float32_to_uint64_round_to_zero function to the softfloat
library. This function fills out the complement of float32 to INT round-to-zero
conversion rountines, where INT is {int32_t, uint32_t, int64_t, uint64_t}.
This contribution can be licensed under either the softfloat-2a or -2b
license.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Tested-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
I need these available outside of softfloat for some of the reciprocal
processing in aarch64 helper functions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1394822294-14837-20-git-send-email-peter.maydell@linaro.org
The ARMv8 instruction set includes a fused floating point
reciprocal square root step instruction which demands an
"(x * y + z) / 2" fused operation. Support this by adding
a flag to the softfloat muladd operations which requests
that the result is halved before rounding.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
IEEE754-2008 specifies a new rounding mode:
"roundTiesToAway: the floating-point number nearest to the infinitely
precise result shall be delivered; if the two nearest floating-point
numbers bracketing an unrepresentable infinitely precise result are
equally near, the one with larger magnitude shall be delivered."
Implement this new mode (it is needed for ARM). The general principle
is that the required code is exactly like the ties-to-even code,
except that we do not need to do the "in case of exact tie clear LSB
to round-to-even", because the rounding operation naturally causes
the exact tie to round up in magnitude.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Refactor the code in various functions which calculates rounding
increments given the current rounding mode, so that instead of a
set of nested if statements we have a simple switch statement.
This will give us a clean place to add the case for the new
tiesAway rounding mode.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Add the conversion functions float16_to_float64() and
float64_to_float16(), which will be needed for the ARM
A64 instruction set.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
In preparation for adding conversions between float16 and float64,
factor out code currently done inline in the float16<=>float32
conversion functions into functions RoundAndPackFloat16 and
NormalizeFloat16Subnormal along the lines of the existing versions
for the other float types.
Note that we change the handling of zExp from the inline code
to match the API of the other RoundAndPackFloat functions; however
we leave the positioning of the binary point between bits 22 and 23
rather than shifting it up to the high end of the word.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tidy up the get/set accessors for the fp state to add missing ones
and make them all inline in softfloat.h rather than some inline and
some not.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The float64_to_uint32_round_to_zero routine is incorrect.
For example, the following test pattern:
425F81378DC0CD1F / 0x1.f81378dc0cd1fp+38
will erroneously set the inexact flag.
This patch re-implements the routine to use the float64_to_uint64_round_to_zero
routine. If saturation occurs we ignore any flags set by the
conversion function and raise only Invalid.
This contribution can be licensed under either the softfloat-2a or -2b
license.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Message-id: 1387397961-4894-6-git-send-email-tommusta@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The float64_to_uint32 has several flaws:
- for numbers between 2**32 and 2**64, the inexact exception flag
may get incorrectly set. In this case, only the invalid flag
should be set.
test pattern: 425F81378DC0CD1F / 0x1.f81378dc0cd1fp+38
- for numbers between 2**63 and 2**64, incorrect results may
be produced:
test pattern: 43EAAF73F1F0B8BD / 0x1.aaf73f1f0b8bdp+63
This patch re-implements float64_to_uint32 to re-use the
float64_to_uint64 routine (instead of float64_to_int64). For the
saturation case, we ignore any flags which the conversion routine
has set and raise only the invalid flag.
This contribution can be licensed under either the softfloat-2a or -2b
license.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Message-id: 1387397961-4894-5-git-send-email-tommusta@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The float64_to_uint64_round_to_zero routine is incorrect.
For example, the following test pattern:
46697351FF4AEC29 / 0x1.97351ff4aec29p+103
currently produces 8000000000000000 instead of FFFFFFFFFFFFFFFF.
This patch re-implements the routine to temporarily force the
rounding mode and use the float64_to_uint64 routine.
This contribution can be licensed under either the softfloat-2a or -2b
license.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Message-id: 1387397961-4894-4-git-send-email-tommusta@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
This patch adds the float32_to_uint64() routine, which converts a
32-bit floating point number to an unsigned 64 bit number.
This contribution can be licensed under either the softfloat-2a or -2b
license.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: removed harmless but silly int64_t casts]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
If the input to float*_scalbn() is denormal then it represents
a number 0.[mantissabits] * 2^(1-exponentbias) (and the actual
exponent field is all zeroes). This means that when we convert
it to our unpacked encoding the unpacked exponent must be one
greater than for a normal number, which represents
1.[mantissabits] * 2^(e-exponentbias) for an exponent field e.
This meant we were giving answers too small by a factor of 2 for
all denormal inputs.
Note that the float-to-int routines also have this behaviour
of not adjusting the exponent for denormals; however there it is
harmless because denormals will all convert to integer zero anyway.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
We implement a number of float-to-integer conversions using conversion
to an integer type with a wider range and then a check against the
narrower range we are actually converting to. If we find the result to
be out of range we correctly raise the Invalid exception, but we must
also suppress other exceptions which might have been raised by the
conversion function we called.
This won't throw away exceptions we should have preserved, because for
the 'core' exception flags the IEEE spec mandates that the only valid
combinations of exception that can be raised by a single operation are
Inexact + Overflow and Inexact + Underflow. For the non-IEEE softfloat
flag for input denormals, we can guarantee that that flag won't have
been set for out of range float-to-int conversions because a squashed
denormal by definition goes to plus or minus zero, which is always in
range after conversion to integer zero.
This bug has been fixed for some of the float-to-int conversion routines
by previous patches; fix it for the remaining functions as well, so
that they all restore the pre-conversion status flags prior to raising
Invalid.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The comment preceding the float64_to_uint64 routine suggests that
the implementation is broken. And this is, indeed, the case.
This patch properly implements the conversion of a 64-bit floating
point number to an unsigned, 64 bit integer.
This contribution can be licensed under either the softfloat-2a or -2b
license.
Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Currently the int-to-float functions take types which are specified
as "at least X bits wide", rather than "exactly X bits wide". This is
confusing and unhelpful since it means that the callers have to include
an explicit cast to [u]intXX_t to ensure the correct behaviour. Fix
them all to take the exactly-X-bits-wide types instead.
Note that this doesn't change behaviour at all since at the moment
we happen to define the 'int32' and 'uint32' types as exactly 32 bits
wide, and the 'int64' and 'uint64' types as exactly 64 bits wide.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
ARMv8 requires support for converting 32 and 64bit floating point
values to signed and unsigned 16bit integers.
Signed-off-by: Will Newton <will.newton@linaro.org>
[PMM: updated not to incorrectly set Inexact for Invalid inputs]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Our float32 to float16 conversion routine was generating the correct
numerical answers, but not always setting the right set of exception
flags. Fix this, mostly by rearranging the code to more closely
resemble RoundAndPackFloat*, and in particular:
* non-IEEE halfprec always raises Invalid for input NaNs
* we need to check for the overflow case before underflow
* we weren't getting the tininess-detected-after-rounding
case correct (somewhat academic since only ARM uses halfprec
and it is always tininess-detected-before-rounding)
* non-IEEE halfprec overflow raises only Invalid, not
Invalid + Inexact
* we weren't setting Inexact when we should
Also add some clarifying comments about what the code is doing.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Add floatnn_minnum() and floatnn_maxnum() functions which are equivalent
to the minNum() and maxNum() functions from IEEE 754-2008. They are
similar to min() and max() but differ in the handling of QNaN arguments.
Signed-off-by: Will Newton <will.newton@linaro.org>
Message-id: 1386158099-9239-5-git-send-email-will.newton@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The nan_exp argument is not used, so remove it.
Signed-off-by: Will Newton <will.newton@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1386158099-9239-4-git-send-email-will.newton@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
shift128Right would give the wrong result for a shift count
between 64 and 127. This was never noticed because all of
our uses of this function are guaranteed not to use shift
counts in this range.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1370186269-24353-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In handling float64_muladd, if we end up doing a subtraction of the
product and c, and the 128 bit result of this subtraction happens to
have its most significant bit in bit 63, we weren't handling this
correctly when attempting to normalize to put the most significant
bit into bit 126. We would end up doing a right shift by a negative
number (undefined behaviour in C) so at best we would return an
incorrect result to the guest. MSB in bit 63 has to be handled as a
special case separately from MSB in 0..62 and MSB in 63..126. (MSB
in 127 is not possible.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Honour float_muladd_negate_c in the case where the product is zero and
c is nonzero. Previously we would fail to negate c.
Seen in (and tested against) the gfortran testsuite on MIPS.
Signed-off-by: Richard Sandiford <rdsandiford@googlemail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The interface to normalizeRoundAndPackFloat64 requires that the
high bit be clear. Perform one shift-right-and-jam if needed.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Add a pickNaNMulAdd function for MIPS, implementing NaN propagation
rules for MIPS fused multiply-add instructions.
Cc: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The uint64_to_float32() conversion function was incorrectly always
returning numbers with the sign bit set (ie negative numbers). Correct
this so we return positive numbers instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
In float16_to_float32, when returning an infinity, just pass zero
as the mantissa argument to packFloat32(), rather than shifting
a value which we know must be zero.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
NaN propagation rule: leftmost NaN in the expression gets propagated to
the result.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Architectures that don't have signaling NaNs can define
NO_SIGNALING_NANS, it will make float*_is_quiet_nan return 1 for any NaN
and float*_is_signaling_nan always return 0.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Flags passed into float{32,64}_muladd are treated as bits; assign
independent bits to float_muladd_negate_* to allow precise control over
what gets negated in float{32,64}_muladd.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Based on the following Coccinelle patch:
@@
typedef int16, int_fast16_t;
@@
-int16
+int_fast16_t
Avoids a workaround for AIX.
Add typedef for pre-10 Solaris.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Cc: malc <av1474@comtv.ru>
Cc: Ben Taylor <bentaylor.solx86@gmail.com>
Tested-by: Bernhard Walle <bernhard@bwalle.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Based on the following Coccinelle patch:
@@
typedef uint16, uint_fast16_t;
@@
-uint16
+uint_fast16_t
Fixes the build of the Cocoa frontend on Mac OS X and avoids a
workaround for AIX.
For pre-10 Solaris include osdep.h.
Reported-by: Pavel Borzenkov <pavel.borzenkov@gmail.com>
Reported-by: Rui Carmo <rui.carmo@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Cc: Juan Pineda <juan@logician.com>
Cc: malc <av1474@comtv.ru>
Cc: Ben Taylor <bentaylor.solx86@gmail.com>
Tested-by: Bernhard Walle <bernhard@bwalle.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
normalizeFloat{32,64}Subnormal() expect the exponent as int16, not int.
This went unnoticed since int16 and uint16 were both typedef'ed to int.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Bernhard Walle <bernhard@bwalle.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This change makes it compile and return the same value than the #undef one.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Fix code in roundAndPackInt32 that assumed that int32 was only
32 bits, by simply using int32_t instead. Fix the parallel bug
in roundAndPackInt64 as well, although that one is only theoretical
since it's unlikely that int64 will ever be more than 64 bits.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Code in the float64_to_int32_round_to_zero() function was assuming
that int32 would not be wider than 32 bits; this meant it might
not correctly detect the overflow case. We take the simple approach
of using int32_t. Also fix equivalent issues in the functions
for other float sizes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
C99 appears to consider compound literals as non-constants, and complains
when they are used in static initializers. Switch to ordinary initializer
syntax.
Signed-off-by: Avi Kivity <avi@redhat.com>
Acked-by: Andreas Färber <afaerber@suse.de>
Reported-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Implement fused multiply-add as a softfloat primitive. This implements
"a+b*c" as a single step without any intermediate rounding; it is
specified in IEEE 754-2008 and implemented in a number of CPUs.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Include config.h in softfloat.c, so that the target specific ifdefs in
softfloat-specialize.h are evaluated correctly. This was accidentally
broken in commit 789ec7ce2 when config-target.h was removed from
softfloat.h, and means that most targets will have been returning the
wrong results for calculations involving NaNs.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Prepares for uint32 replacement.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Prepares for uint16 replacement.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Most definitions in softfloat.h are really target-independent, but the
file is not because it includes definitions of the default NaN values.
Change those to variables to allow including softfloat.h from files that
are not compiled per-target. By making them const, the compiler is
allowed to optimize them into softfloat functions that use them.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
float*_is_zero_or_denormal() is available for float32, but not for
float64, floatx80 and float128. Fix that.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Now that softfloat-native is gone, there is no real point on not always
enabling floatx80 and float128 support.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Remove softfloat-native support, all targets are now using softfloat
instead.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add a new float_flag_output_denormal which is set when the result
of a floating point operation would be denormal but is flushed to
zero because we are in flush_to_zero mode. This is necessary because
some architectures signal this condition as an underflow and others
signal it as an inexact result.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add float*_is_any_nan() functions to match the softfloat API.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
float*_scalbn() should be able to take a status parameter. Fix that.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
float*_scalnb() were not taking into account all cases. This patch fixes
some corner cases:
- NaN values in input were not properly propagated and the invalid flag
not correctly raised. Use propagateFloat*NaN() for that.
- NaN or infinite values in input of floatx80_scalnb() were not correctly
detected due to a typo.
- The sum of exponent and n could overflow, leading to strange results.
Additionally having int16 defined to int make that happening for a very
small range of values. Fix that by saturating n to the maximum exponent
range, and using an explicit wider type if needed.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add floatx80_compare() and floatx80_compare_quiet() functions to match
the softfloat-native ones.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add a pi constant for float32, float64, floatx80. It will be used by
target-i386 and later by the trigonometric functions.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
With floatx80, the explicit bit is set for infinity.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The floatx80 format uses an explicit bit that should be taken into account
when converting to and from commonNaN format.
When converting to commonNaN, the explicit bit should be removed if it is
a 1, and a default NaN should be used if it is 0.
When converting from commonNan, the explicit bit should be added.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Make clear for all comparison functions which ones trigger an exception
for all NaNs, and which one only for sNaNs.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
I am not a big fan of code moving, but having the signaling version in
the middle of quiet versions and vice versa doesn't make the code easy
to read.
This patch is a simple code move, basically swapping locations of
float*_eq and float*_eq_quiet.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
float*_eq_signaling functions have a different semantics than other
comparison functions. Fix that by renaming float*_quiet_signaling() into
float*_eq().
Note that it is purely mechanical, and the behaviour should be unchanged.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
float*_eq functions have a different semantics than other comparison
functions. Fix that by first renaming float*_quiet() into float*_eq_quiet().
Note that it is purely mechanical, and the behaviour should be unchanged.
That said it clearly highlight problems due to this different semantics,
they are fixed later in this patch series.
Cc: Alexander Graf <agraf@suse.de>
Acked-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add float*_unordered_quiet() functions to march the softfloat versions.
As FPU status is not tracked with softfloat-native, they don't differ
from the signaling version.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add float*_unordered() functions to softfloat, matching the softfloat-native
ones. Also add float*_unordered_quiet() functions to match the others
comparison functions.
This allow target-i386/ops_sse.h to be compiled with softfloat.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Softfloat has its own implementation to count the leading zeros. However
a lot of architectures have either a dedicated instruction or an
optimized to do that. When using GCC >= 3.4, this patch uses GCC builtins
instead of the handcoded implementation.
Note that I amware that QEMU_GNUC_PREREQ is defined in osdep.h and that
clz32() and clz64() are defined in host-utils.h, but I think it is better
to keep the softfloat implementation self contained.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add a setter function for the underflow tininess detection mode,
in line with the similar functions for other parts of the float status
structure.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add min and max operations to softfloat. This allows us to implement
propagation of NaNs and handling of negative zero correctly (unlike
the approach of having target helper routines return one of the operands
based on the result of a comparison op).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
They are defined with the same semantics as the POSIX types,
so prefer those for consistency. Suggested by Peter Maydell.
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The original SoftFloat 2.0b library avoided the use of custom integer types
in its public headers. This requires the definitions of int{8,16,32,64} to
match the assumptions in the declarations. This breaks on BeOS R5 and Haiku/x86,
where int32 is defined in {be,os}/support/SupportDefs.h in terms of a long
rather than an int. Spotted by Michael Lotz.
Since QEMU already breaks this distinction by defining those types just above,
do use them for consistency and to allow #ifndef'ing them out as done for
[u]int16 on AIX.
Cc: Michael Lotz <mmlr@mlotz.ch>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The SoftFloat license requires "prominent notice that the work
is derivative". Having added features like improved 16-bit support
for arm already, add such a notice to the sources.
softfloat-native.[ch] are not under the SoftFloat license
and thus are not changed.
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
These constants and utility function are needed to implement some
helpers. Defining constants avoids the need to re-compute them at
runtime.
Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
These special values are needed to implement some helper functions,
which return/use these values in some cases.
Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Make softfloat compile with USE_SOFTFLOAT_STRUCT_TYPES defined, by
adding and using new macros const_float16(), const_float32() and
const_float64() so you can use array initializers in an array of
float16/float32/float64 whether the types are bare or wrapped in the
structs.
[aurelien@aurel32.net: do the same for float16]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Correctly handle NaNs in float16_to_float32(), by defining and
using a float16ToCommonNaN() function, as we do with the other formats.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Fix various bugs in the single-to-half-precision conversion code:
* input NaNs not correctly converted in IEEE mode
(fixed by defining and using a commonNaNToFloat16())
* wrong values returned when converting NaN/Inf into non-IEEE
half precision value
* wrong values returned for conversion of values which are
on the boundary between denormal and zero for the half
precision format
* zeroes not correctly identified
* excessively large results in non-IEEE mode should
generate InvalidOp, not Overflow
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Honour the default_nan_mode flag when doing conversions between
different floating point formats, as well as when returning a NaN from
a two-operand floating point function. This corrects the behaviour
of float<->double conversions on both ARM and SH4.
Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add a float16 type to softfloat, rather than using bits16 directly.
Also add the missing functions float16_is_quiet_nan(),
float16_is_signaling_nan() and float16_maybe_silence_nan(),
which are needed for the float16 conversion routines.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
floatx80_is_{quiet,signaling}_nan() functions are incorrectly detecting
the type of NaN, depending on SNAN_BIT_IS_ONE, one of the two is
returning the correct value, and the other true for any kind of NaN.
This patch fixes that by applying the same kind of comparison as for
other float formats, but taking into account the explicit bit.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add a utility function to softfloat to test whether a float32
is zero or denormal.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
When the default-NaN mode is enabled, it should return the default NaN
value, but it should anyway raise the invalid operation flag if one of
the operand is an sNaN.
I have checked that this behavior matches the ARM and SH4 manuals, as
well as real SH4 hardware.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implement versions of float*_is_any_nan() for the floatx80 and
float128 types.
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Add support to softfloat for flushing input denormal float32 and float64
to zero. softfloat's existing 'flush_to_zero' flag only flushes denormals
to zero on output. Some CPUs need input denormals to be flushed before
processing as well. Implement this, using a new status flag to enable it
and a new exception status bit to indicate when it has happened. Existing
CPUs should be unaffected as there is no behaviour change unless the
mode is enabled.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implement the correct NaN propagation rules for PowerPC targets by
providing an appropriate pickNaN function.
Also fix the #ifdef tests for default NaN definition, the correct name
is TARGET_PPC instead of TARGET_POWERPC.
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implement the correct NaN propagation rules for MIPS targets by
providing an appropriate pickNaN function.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use float{32,64,x80,128}_maybe_silence_nan() instead of toggling the
sNaN bit manually. This allow per target implementation of sNaN to qNaN
conversion.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Add float{x80,128}_maybe_silence_nan() functions, they will be need by
propagateFloat{x80,128}NaN().
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
On targets that define sNaN with the sNaN bit as one, simply clearing
this bit may correspond to an infinite value.
Convert it to a default NaN if SNAN_BIT_IS_ONE, as it corresponds to
the MIPS implementation, the only emulated CPU with SNAN_BIT_IS_ONE.
When other CPU of this type are added, this might be updated to include
more cases.
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Similarly to what has been done in commit
185698715d rename the misnamed *IsNaN
variables into *IsQuietNaN.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
We don't have any HPPA target, so let's remove HPPA specific code. It
can be re-added when someone adds an HPPA target.
This has been blessed by Stuart Brady <sdb@zubnet.me.uk>, author of the
target-hppa fork.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implement the correct NaN propagation rules for ARM targets by
providing an appropriate pickNaN function.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
IEEE754 doesn't specify precisely what NaN should be returned as
the result of an operation on two input NaNs. This is therefore
target-specific. Abstract out the code in propagateFloat*NaN()
which was implementing the x87 propagation rules, so that it
can be easily replaced on a per-target basis.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The softfloat functions float*_is_nan() were badly misnamed,
because they return true only for quiet NaNs, not for all NaNs.
Rename them to float*_is_quiet_nan() to more accurately reflect
what they do.
This change was produced by:
perl -p -i -e 's/_is_nan/_is_quiet_nan/g' $(git grep -l is_nan)
(with the results manually checked.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
Acked-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The ARM architecture needs float/double to 16 bit integer conversions.
(The 32 bit versions aren't sufficient because of the requirement
to saturate at 16 bit MAXINT/MININT and to get the exception bits right.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
Add functions float*_maybe_silence_nan() which ensure that a
value is not a signaling NaN by turning it into a quiet NaN.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>