Motorola treats denormals with explicit integer bit set as
having unbiased exponent 0, unlike Intel which treats it as
having unbiased exponent 1 (more like all other IEEE formats
that have no explicit integer bit).
Add a flag on FloatFmt to differentiate the behaviour.
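As a rough illustration of the difference (a sketch only: the helper
denormal_scale_exp and its m68k_denormal parameter are hypothetical names
chosen here, not necessarily the field added to FloatFmt), the exponent
applied to a floatx80 whose exponent field is zero might be selected as:

```c
#include <stdbool.h>

#define FLOATX80_BIAS 16383

/*
 * Hypothetical helper: return e such that a floatx80 with a zero
 * exponent field has magnitude frac * 2^(e - 63), where frac is the
 * 64-bit significand including the explicit integer bit.
 */
static int denormal_scale_exp(bool m68k_denormal)
{
    /* Motorola uses the exponent field 0 as-is; Intel treats it as 1. */
    int effective_exp_field = m68k_denormal ? 0 : 1;

    return effective_exp_field - FLOATX80_BIAS;
}
```

With the integer bit set and all other fraction bits clear, the Intel
interpretation gives 2^-16382 and the Motorola interpretation 2^-16383.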
Reported-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add versions of float64_to_int* which do not saturate the result.
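To illustrate what "do not saturate" means, here is a minimal sketch of a
truncating conversion that keeps the low 32 bits of the integer value
instead of clamping to INT32_MAX/INT32_MIN. The name f64_to_i32_modulo is
hypothetical, exception flags are omitted, and returning 0 for NaN and
infinity is an assumption of this sketch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: truncate toward zero, then wrap modulo 2^32. */
static int32_t f64_to_i32_modulo(double d)
{
    uint64_t bits, frac, mag;
    uint32_t low;
    int sign, exp, shift;

    memcpy(&bits, &d, sizeof(bits));
    sign = bits >> 63;
    exp = (bits >> 52) & 0x7ff;
    frac = bits & ((1ULL << 52) - 1);

    if (exp == 0x7ff) {
        return 0;               /* assumption: NaN and infinity give 0 */
    }
    if (exp < 1023) {
        return 0;               /* |d| < 1 truncates to 0 */
    }

    frac |= 1ULL << 52;         /* restore the implicit integer bit */
    shift = exp - 1023 - 52;    /* |d| == frac * 2^shift */

    if (shift >= 64) {
        mag = 0;                /* all set bits lie above bit 63 */
    } else if (shift >= 0) {
        mag = frac << shift;    /* bits above 63 are discarded */
    } else {
        mag = frac >> -shift;   /* drop the fractional bits */
    }

    low = (uint32_t)mag;        /* keep only the low 32 bits */
    if (sign) {
        low = 0u - low;         /* two's complement negate, modulo 2^32 */
    }
    return (int32_t)low;
}
```

For example, 4294967301.0 (2^32 + 5) yields 5 here, where a saturating
conversion would yield INT32_MAX.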
Reviewed-by: Christoph Muellner <christoph.muellner@vrull.eu>
Tested-by: Christoph Muellner <christoph.muellner@vrull.eu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230527141910.1885950-2-richard.henderson@linaro.org>
logB(0) should raise the divideByZero exception, per IEEE 754-2008 section 7.3.
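A minimal sketch of the expected behaviour for a binary64 input, using a
softfloat-style flags word. The names logb_sketch and FLAG_DIVBYZERO are
hypothetical and exist only for this illustration.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

enum { FLAG_DIVBYZERO = 1 };    /* hypothetical exception flag bit */

/* Hypothetical sketch of logB for binary64 with an explicit flags word. */
static double logb_sketch(double x, unsigned *flags)
{
    uint64_t bits, frac;
    int exp, e;

    memcpy(&bits, &x, sizeof(bits));
    exp = (bits >> 52) & 0x7ff;
    frac = bits & ((1ULL << 52) - 1);

    if (exp == 0x7ff) {
        return frac ? x : INFINITY;     /* NaN propagates; inf gives +inf */
    }
    if (exp == 0) {
        if (frac == 0) {
            *flags |= FLAG_DIVBYZERO;   /* logB(0): divideByZero, -inf */
            return -INFINITY;
        }
        /* Subnormal: shift until the implicit bit position is reached. */
        e = -1022;
        while (!(frac & (1ULL << 52))) {
            frac <<= 1;
            e--;
        }
        return e;
    }
    return exp - 1023;
}
```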
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220930024510.800005-4-gaosong@loongson.cn>
Add the possibility of recalculating a result if it overflows or
underflows. If the result overflows and the rebias bool is true, then the
intermediate result should have 3/4 of the total exponent range subtracted
from its exponent. The same applies to underflow, except that the value is
added to the exponent of the intermediate number instead.
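The adjustment can be sketched as follows. The helper names are
hypothetical; the only assumption is that "total range" means 2^exp_bits
for a format with exp_bits exponent bits, so 3/4 of it is
2^(exp_bits-1) + 2^(exp_bits-2).

```c
#include <stdbool.h>

/* 3/4 of the exponent range for a format with exp_bits exponent bits. */
static int exp_re_bias(int exp_bits)
{
    return (1 << (exp_bits - 1)) + (1 << (exp_bits - 2));
}

/*
 * Hypothetical sketch: rebias the exponent of an intermediate result.
 * On overflow the adjustment is subtracted; on underflow it is added.
 */
static int rebias_exp(int exp, int exp_bits, bool overflow)
{
    int adj = exp_re_bias(exp_bits);

    return overflow ? exp - adj : exp + adj;
}
```

For float64 (11 exponent bits) the adjustment is 1536, and for float32
(8 exponent bits) it is 192.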
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220805141522.412864-2-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
As the return type is FloatRelation, it's clearer to
use the type for 'cmp' within the function.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20220401132240.79730-3-richard.henderson@linaro.org>
PowerPC has this flag, and it's easier to compute it here
than after the fact.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-8-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
PowerPC has this flag, and it's easier to compute it here
than after the fact.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-7-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
PowerPC has this flag, and it's easier to compute it here
than after the fact.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-6-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
PowerPC has these flags, and it's easier to compute them here
than after the fact.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-5-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
PowerPC has this flag, and it's easier to compute it here
than after the fact.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-4-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
PowerPC has this flag, and it's easier to compute it here
than after the fact.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-3-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
For "fmax/fmin ft0, ft1, ft2", if one of the inputs is sNaN:
The original logic:
  Return NaN and set the invalid flag if ft1 == sNaN || ft2 == sNaN.
The alternative path:
  Set the invalid flag if ft1 == sNaN || ft2 == sNaN.
  Return NaN only if ft1 == NaN && ft2 == NaN.
The IEEE 754 spec allows both implementations, and some architectures such
as riscv chose different definitions in two spec versions
(riscv-spec-v2.2 uses the original version, riscv-spec-20191213 changes to
the alternative).
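The two behaviours can be sketched on binary64 bit patterns as below.
All names are hypothetical. Assumptions of this sketch: a set bit 51 marks
a quiet NaN, a lone quiet NaN is ignored in both variants, the default NaN
is returned whenever a NaN result is required, and the ordering of -0 and
+0 is not treated specially.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define F64_DEFAULT_NAN 0x7ff8000000000000ULL

static bool is_nan(uint64_t b)
{
    return (b & ~(1ULL << 63)) > 0x7ff0000000000000ULL;
}

static bool is_snan(uint64_t b)
{
    return is_nan(b) && !(b & (1ULL << 51));
}

/* Minimum of two non-NaN values, compared as doubles. */
static uint64_t num_min(uint64_t a, uint64_t b)
{
    double da, db;

    memcpy(&da, &a, sizeof(da));
    memcpy(&db, &b, sizeof(db));
    return da < db ? a : b;
}

/* Original logic: any sNaN input gives NaN and sets invalid. */
static uint64_t fmin_original(uint64_t a, uint64_t b, bool *invalid)
{
    if (is_snan(a) || is_snan(b)) {
        *invalid = true;
        return F64_DEFAULT_NAN;
    }
    if (is_nan(a)) {
        return is_nan(b) ? F64_DEFAULT_NAN : b;
    }
    return is_nan(b) ? a : num_min(a, b);
}

/* Alternative path: sNaN sets invalid, NaN only if both inputs are NaN. */
static uint64_t fmin_alternative(uint64_t a, uint64_t b, bool *invalid)
{
    if (is_snan(a) || is_snan(b)) {
        *invalid = true;
    }
    if (is_nan(a)) {
        return is_nan(b) ? F64_DEFAULT_NAN : b;
    }
    return is_nan(b) ? a : num_min(a, b);
}
```

With ft1 = sNaN and ft2 = 1.0, both variants set the invalid flag, but
only the alternative returns 1.0.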
Signed-off-by: Chih-Min Chao <chihmin.chao@sifive.com>
Signed-off-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211021160847.2748577-2-frank.chang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Rename to parts$N_modrem. This was the last use of a lot
of the legacy infrastructure, so remove it as required.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_log2. Though this is partly a ruse, since I do not
believe the code will succeed for float128 without work. Which is ok
for now, because we do not need this for more than float32 and float64.
Since berkeley-testfloat-3 doesn't support log2, compare float64_log2
vs the system log2. Fix the errors for inputs near 1.0:
    test: 3ff00000000000b0  +0x1.00000000000b0p+0
      sf: 3d2fa00000000000  +0x1.fa00000000000p-45
    libm: 3d2fbd422b1bd36f  +0x1.fbd422b1bd36fp-45
    Error in fraction: 32170028290927 ulp

    test: 3feec24f6770b100  +0x1.ec24f6770b100p-1
      sf: bfad3740d13c9ec0  -0x1.d3740d13c9ec0p-5
    libm: bfad3740d13c9e98  -0x1.d3740d13c9e98p-5
    Error in fraction: 40 ulp
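The ulp figures above are consistent with taking the difference of the two
raw bit patterns. A minimal sketch, assuming both values are finite and
have the same sign (ulp_distance is a hypothetical name):

```c
#include <stdint.h>

/*
 * Hypothetical sketch: distance in ulps between two binary64 values
 * given as raw bit patterns. Only meaningful when both are finite and
 * share the same sign bit.
 */
static uint64_t ulp_distance(uint64_t a, uint64_t b)
{
    return a > b ? a - b : b - a;
}
```

Applied to the sf and libm results quoted above, this gives 32170028290927
and 40 respectively.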
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With floatx80_precision_x, the rounding happens across
the break between words. Notice this case with
frac_lsb = round_mask + 1 -> 0
and check the bits in frac_hi as needed.
In addition, since frac_shift == 0, we won't implicitly clear
round_mask via the right-shift, so explicitly clear those bits.
This fixes rounding for floatx80_precision_[sd].
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Remove frac_lsb, frac_lsbm1, roundeven_mask. Compute
these from round_mask in parts$N_uncanon_normal.
With floatx80, round_mask will not be tied to frac_shift.
Everything else is easily computable.
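One way the removed fields could be derived from round_mask is sketched
below. The struct and function names are hypothetical, and the sketch
assumes round_mask is a contiguous run of low bits (2^k - 1); it is not
claimed to be the exact code in parts$N_uncanon_normal.

```c
#include <stdint.h>

typedef struct {
    uint64_t frac_lsb;          /* lowest bit kept in the result */
    uint64_t frac_lsbm1;        /* bit just below frac_lsb */
    uint64_t roundeven_mask;    /* frac_lsb plus all discarded bits */
} RoundMasks;

/* Hypothetical sketch: derive the three masks from round_mask alone. */
static RoundMasks masks_from_round_mask(uint64_t round_mask)
{
    RoundMasks m;

    m.frac_lsb = round_mask + 1;
    m.frac_lsbm1 = round_mask ^ (round_mask >> 1);
    m.roundeven_mask = round_mask | m.frac_lsb;
    return m;
}
```

For a round_mask of 0x3ff this yields 0x400, 0x200 and 0x7ff.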
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need to treat the non-normal cases of floatx80 specially,
so split out the normal case that we can reuse.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_sqrt.
Reimplement float128_sqrt with FloatParts128.
Reimplement with the inverse sqrt Newton-Raphson algorithm from musl.
This is significantly faster than even the Berkeley sqrt N-R algorithm,
because it does not use division instructions, only multiplication.
Ordinarily, changing algorithms at the same time as migrating code is
a bad idea, but this is the only way I found that didn't break one of
the routines at the same time.
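To show why no division is needed, here is a floating-point sketch of the
inverse square root Newton-Raphson iteration. It is an illustration of the
technique only, not the integer code from musl or from this patch; the
function name, the fixed initial estimate, and the restriction of the
input to [1, 4) are assumptions of the sketch.

```c
/*
 * Hypothetical sketch: approximate sqrt(x) for x in [1, 4) using only
 * multiplication. r converges to 1/sqrt(x) via r <- r * (3 - x*r*r) / 2,
 * and sqrt(x) is then recovered as x * r.
 */
static double sqrt_via_rsqrt(double x)
{
    double r = 0.5;     /* crude estimate of 1/sqrt(x), valid on [1, 4) */
    int i;

    for (i = 0; i < 10; i++) {
        r = r * (1.5 - 0.5 * x * r * r);
    }
    return x * r;
}
```

The convergence is quadratic, so a better initial estimate reduces the
number of iterations considerably.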
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_scalbn.
Reimplement float128_scalbn with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_compare. Rename all of the intermediate
functions to ftype_do_compare. Rename the hard-float functions
to ftype_hs_compare. Convert float128 to FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_minmax. Combine 3 bool arguments to a bitmask.
Introduce ftype_minmax functions as a common optimization point.
Fold bfloat16 expansions into the same macro as the other types.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_uint_to_float.
Reimplement uint64_to_float128 with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_sint_to_float.
Reimplement int{32,64}_to_float128 with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_float_to_uint. Reimplement
float128_to_uint{32,64}{_round_to_zero} with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For Arm BFDOT and BFMMLA, we need a version of round-to-odd
that overflows to infinity, instead of the max normal number.
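The difference between the two round-to-odd variants on overflow can be
sketched for float32 bit patterns as follows. The enum and function names
are hypothetical and do not refer to the actual rounding mode constants.

```c
#include <stdint.h>

/* Hypothetical names for the two round-to-odd variants. */
typedef enum {
    ROUND_TO_ODD,       /* overflow gives the max normal number */
    ROUND_TO_ODD_INF    /* overflow gives infinity */
} OddMode;

/* Hypothetical sketch: float32 bit pattern produced on overflow. */
static uint32_t f32_overflow_result(int sign, OddMode mode)
{
    uint32_t mag = (mode == ROUND_TO_ODD_INF) ? 0x7f800000u : 0x7f7fffffu;

    return ((uint32_t)(sign != 0) << 31) | mag;
}
```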
Cc: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525225817.400336-6-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Rename to parts$N_float_to_sint. Reimplement
float128_to_int{32,64}{_round_to_zero} with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, convert to pointers, split out
parts$N_round_to_int_normal, define a macro for
parts_round_to_int using QEMU_GENERIC.
This necessarily meant some rearrangement to the
round_to_{,u}int_and_pack routines, so go ahead and
convert to parts_round_to_int_normal, which in turn
allows cleaning up of the raised exception handling.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_div.
Implement float128_div with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_muladd.
Implement float128_muladd with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_mul.
Reimplement float128_mul with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In preparation for implementing multiple sizes. Rename to parts_addsub,
split out parts_add/sub_normal for future reuse with muladd.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, convert to pointers, renaming to parts$N_uncanon,
and define a macro for parts_uncanon using QEMU_GENERIC.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, convert to pointers, rename to parts$N_canonicalize
and define a macro for parts_canonicalize using QEMU_GENERIC.
Rearrange the cases to recognize float_class_normal as
early as possible.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, convert to pointers, rename to pick_nan_muladd$N
and define a macro for pick_nan_muladd using QEMU_GENERIC.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, convert to pointers, rename to parts$N_pick_nan
and define a macro for parts_pick_nan using QEMU_GENERIC.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, convert to pointers, rename to return_nan$N
and define a macro for return_nan using QEMU_GENERIC.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>