c869df6afb
symbols--even when the branches are local and would otherwise work with bxx.b. Compensate for this by shadowing the relevant labels with local labels.
948 lines
29 KiB
Plaintext
948 lines
29 KiB
Plaintext
* $NetBSD: bindec.sa,v 1.5 2001/12/09 01:43:13 briggs Exp $
|
|
|
|
* MOTOROLA MICROPROCESSOR & MEMORY TECHNOLOGY GROUP
|
|
* M68000 Hi-Performance Microprocessor Division
|
|
* M68040 Software Package
|
|
*
|
|
* M68040 Software Package Copyright (c) 1993, 1994 Motorola Inc.
|
|
* All rights reserved.
|
|
*
|
|
* THE SOFTWARE is provided on an "AS IS" basis and without warranty.
|
|
* To the maximum extent permitted by applicable law,
|
|
* MOTOROLA DISCLAIMS ALL WARRANTIES WHETHER EXPRESS OR IMPLIED,
|
|
* INCLUDING IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
|
|
* PARTICULAR PURPOSE and any warranty against infringement with
|
|
* regard to the SOFTWARE (INCLUDING ANY MODIFIED VERSIONS THEREOF)
|
|
* and any accompanying written materials.
|
|
*
|
|
* To the maximum extent permitted by applicable law,
|
|
* IN NO EVENT SHALL MOTOROLA BE LIABLE FOR ANY DAMAGES WHATSOEVER
|
|
* (INCLUDING WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS
|
|
* PROFITS, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR
|
|
* OTHER PECUNIARY LOSS) ARISING OF THE USE OR INABILITY TO USE THE
|
|
* SOFTWARE. Motorola assumes no responsibility for the maintenance
|
|
* and support of the SOFTWARE.
|
|
*
|
|
* You are hereby granted a copyright license to use, modify, and
|
|
* distribute the SOFTWARE so long as this entire notice is retained
|
|
* without alteration in any modified and/or redistributed versions,
|
|
* and that such modified versions are clearly identified as such.
|
|
* No licenses are granted by implication, estoppel or otherwise
|
|
* under any patents or trademarks of Motorola, Inc.
|
|
|
|
*
|
|
* bindec.sa 3.4 1/3/91
|
|
*
|
|
* bindec
|
|
*
|
|
* Description:
|
|
* Converts an input in extended precision format
|
|
* to bcd format.
|
|
*
|
|
* Input:
|
|
* a0 points to the input extended precision value
|
|
* value in memory; d0 contains the k-factor sign-extended
|
|
* to 32-bits. The input may be either normalized,
|
|
* unnormalized, or denormalized.
|
|
*
|
|
* Output: result in the FP_SCR1 space on the stack.
|
|
*
|
|
* Saves and Modifies: D2-D7,A2,FP2
|
|
*
|
|
* Algorithm:
|
|
*
|
|
* A1. Set RM and size ext; Set SIGMA = sign of input.
|
|
* The k-factor is saved for use in d7. Clear the
|
|
* BINDEC_FLG for separating normalized/denormalized
|
|
* input. If input is unnormalized or denormalized,
|
|
* normalize it.
|
|
*
|
|
* A2. Set X = abs(input).
|
|
*
|
|
* A3. Compute ILOG.
|
|
* ILOG is the log base 10 of the input value. It is
|
|
* approximated by adding e + 0.f when the original
|
|
* value is viewed as 2^^e * 1.f in extended precision.
|
|
* This value is stored in d6.
|
|
*
|
|
* A4. Clr INEX bit.
|
|
* The operation in A3 above may have set INEX2.
|
|
*
|
|
* A5. Set ICTR = 0;
|
|
* ICTR is a flag used in A13. It must be set before the
|
|
* loop entry A6.
|
|
*
|
|
* A6. Calculate LEN.
|
|
* LEN is the number of digits to be displayed. The
|
|
* k-factor can dictate either the total number of digits,
|
|
* if it is a positive number, or the number of digits
|
|
* after the decimal point which are to be included as
|
|
* significant. See the 68882 manual for examples.
|
|
* If LEN is computed to be greater than 17, set OPERR in
|
|
* USER_FPSR. LEN is stored in d4.
|
|
*
|
|
* A7. Calculate SCALE.
|
|
* SCALE is equal to 10^ISCALE, where ISCALE is the number
|
|
* of decimal places needed to insure LEN integer digits
|
|
* in the output before conversion to bcd. LAMBDA is the
|
|
* sign of ISCALE, used in A9. Fp1 contains
|
|
* 10^^(abs(ISCALE)) using a rounding mode which is a
|
|
* function of the original rounding mode and the signs
|
|
* of ISCALE and X. A table is given in the code.
|
|
*
|
|
* A8. Clr INEX; Force RZ.
|
|
* The operation in A3 above may have set INEX2.
|
|
* RZ mode is forced for the scaling operation to insure
|
|
* only one rounding error. The grs bits are collected in
|
|
* the INEX flag for use in A10.
|
|
*
|
|
* A9. Scale X -> Y.
|
|
* The mantissa is scaled to the desired number of
|
|
* significant digits. The excess digits are collected
|
|
* in INEX2.
|
|
*
|
|
* A10. Or in INEX.
|
|
* If INEX is set, round error occurred. This is
|
|
* compensated for by 'or-ing' in the INEX2 flag to
|
|
* the lsb of Y.
|
|
*
|
|
* A11. Restore original FPCR; set size ext.
|
|
* Perform FINT operation in the user's rounding mode.
|
|
* Keep the size to extended.
|
|
*
|
|
* A12. Calculate YINT = FINT(Y) according to user's rounding
|
|
* mode. The FPSP routine sintd0 is used. The output
|
|
* is in fp0.
|
|
*
|
|
* A13. Check for LEN digits.
|
|
* If the int operation results in more than LEN digits,
|
|
* or less than LEN -1 digits, adjust ILOG and repeat from
|
|
* A6. This test occurs only on the first pass. If the
|
|
* result is exactly 10^LEN, decrement ILOG and divide
|
|
* the mantissa by 10.
|
|
*
|
|
* A14. Convert the mantissa to bcd.
|
|
* The binstr routine is used to convert the LEN digit
|
|
* mantissa to bcd in memory. The input to binstr is
|
|
* to be a fraction; i.e. (mantissa)/10^LEN and adjusted
|
|
* such that the decimal point is to the left of bit 63.
|
|
* The bcd digits are stored in the correct position in
|
|
* the final string area in memory.
|
|
*
|
|
* A15. Convert the exponent to bcd.
|
|
* As in A14 above, the exp is converted to bcd and the
|
|
* digits are stored in the final string.
|
|
* Test the length of the final exponent string. If the
|
|
* length is 4, set operr.
|
|
*
|
|
* A16. Write sign bits to final string.
|
|
*
|
|
* Implementation Notes:
|
|
*
|
|
* The registers are used as follows:
|
|
*
|
|
* d0: scratch; LEN input to binstr
|
|
* d1: scratch
|
|
* d2: upper 32-bits of mantissa for binstr
|
|
* d3: scratch;lower 32-bits of mantissa for binstr
|
|
* d4: LEN
|
|
* d5: LAMBDA/ICTR
|
|
* d6: ILOG
|
|
* d7: k-factor
|
|
* a0: ptr for original operand/final result
|
|
* a1: scratch pointer
|
|
* a2: pointer to FP_X; abs(original value) in ext
|
|
* fp0: scratch
|
|
* fp1: scratch
|
|
* fp2: scratch
|
|
* F_SCR1:
|
|
* F_SCR2:
|
|
* L_SCR1:
|
|
* L_SCR2:
|
|
*
|
|
|
|
BINDEC IDNT 2,1 Motorola 040 Floating Point Software Package
|
|
|
|
include fpsp.h
|
|
|
|
section 8
|
|
|
|
* Constants in extended precision
|
|
LOG2 dc.l $3FFD0000,$9A209A84,$FBCFF798,$00000000
|
|
LOG2UP1 dc.l $3FFD0000,$9A209A84,$FBCFF799,$00000000
|
|
|
|
* Constants in single precision
|
|
FONE dc.l $3F800000,$00000000,$00000000,$00000000
|
|
FTWO dc.l $40000000,$00000000,$00000000,$00000000
|
|
FTEN dc.l $41200000,$00000000,$00000000,$00000000
|
|
F4933 dc.l $459A2800,$00000000,$00000000,$00000000
|
|
|
|
RBDTBL dc.b 0,0,0,0
|
|
dc.b 3,3,2,2
|
|
dc.b 3,2,2,3
|
|
dc.b 2,3,3,2
|
|
|
|
xref binstr
|
|
xref sintdo
|
|
xref ptenrn,ptenrm,ptenrp
|
|
|
|
xdef bindec
|
|
xdef sc_mul
|
|
bindec:
|
|
movem.l d2-d7/a2,-(a7)
|
|
fmovem.x fp0-fp2,-(a7)
|
|
|
|
* A1. Set RM and size ext. Set SIGMA = sign input;
|
|
* The k-factor is saved for use in d7. Clear BINDEC_FLG for
|
|
* separating normalized/denormalized input. If the input
|
|
* is a denormalized number, set the BINDEC_FLG memory word
|
|
* to signal denorm. If the input is unnormalized, normalize
|
|
* the input and test for denormalized result.
|
|
*
|
|
fmove.l #rm_mode,FPCR ;set RM and ext
|
|
move.l (a0),L_SCR2(a6) ;save exponent for sign check
|
|
move.l d0,d7 ;move k-factor to d7
|
|
clr.b BINDEC_FLG(a6) ;clr norm/denorm flag
|
|
move.w STAG(a6),d0 ;get stag
|
|
andi.w #$e000,d0 ;isolate stag bits
|
|
beq A2_str ;if zero, input is norm
|
|
*
|
|
* Normalize the denorm
|
|
*
|
|
un_de_norm:
|
|
move.w (a0),d0
|
|
andi.w #$7fff,d0 ;strip sign of normalized exp
|
|
move.l 4(a0),d1
|
|
move.l 8(a0),d2
|
|
norm_loop:
|
|
sub.w #1,d0
|
|
add.l d2,d2
|
|
addx.l d1,d1
|
|
tst.l d1
|
|
bge.b norm_loop
|
|
*
|
|
* Test if the normalized input is denormalized
|
|
*
|
|
tst.w d0
|
|
bgt.b pos_exp ;if greater than zero, it is a norm
|
|
st BINDEC_FLG(a6) ;set flag for denorm
|
|
pos_exp:
|
|
andi.w #$7fff,d0 ;strip sign of normalized exp
|
|
move.w d0,(a0)
|
|
move.l d1,4(a0)
|
|
move.l d2,8(a0)
|
|
|
|
* A2. Set X = abs(input).
|
|
*
|
|
A2_str:
|
|
move.l (a0),FP_SCR2(a6) ; move input to work space
|
|
move.l 4(a0),FP_SCR2+4(a6) ; move input to work space
|
|
move.l 8(a0),FP_SCR2+8(a6) ; move input to work space
|
|
andi.l #$7fffffff,FP_SCR2(a6) ;create abs(X)
|
|
|
|
* A3. Compute ILOG.
|
|
* ILOG is the log base 10 of the input value. It is approx-
|
|
* imated by adding e + 0.f when the original value is viewed
|
|
* as 2^^e * 1.f in extended precision. This value is stored
|
|
* in d6.
|
|
*
|
|
* Register usage:
|
|
* Input/Output
|
|
* d0: k-factor/exponent
|
|
* d2: x/x
|
|
* d3: x/x
|
|
* d4: x/x
|
|
* d5: x/x
|
|
* d6: x/ILOG
|
|
* d7: k-factor/Unchanged
|
|
* a0: ptr for original operand/final result
|
|
* a1: x/x
|
|
* a2: x/x
|
|
* fp0: x/float(ILOG)
|
|
* fp1: x/x
|
|
* fp2: x/x
|
|
* F_SCR1:x/x
|
|
* F_SCR2:Abs(X)/Abs(X) with $3fff exponent
|
|
* L_SCR1:x/x
|
|
* L_SCR2:first word of X packed/Unchanged
|
|
|
|
tst.b BINDEC_FLG(a6) ;check for denorm
|
|
beq.b A3_cont ;if clr, continue with norm
|
|
move.l #-4933,d6 ;force ILOG = -4933
|
|
bra.b A4_str
|
|
A3_cont:
|
|
move.w FP_SCR2(a6),d0 ;move exp to d0
|
|
move.w #$3fff,FP_SCR2(a6) ;replace exponent with 0x3fff
|
|
fmove.x FP_SCR2(a6),fp0 ;now fp0 has 1.f
|
|
sub.w #$3fff,d0 ;strip off bias
|
|
fadd.w d0,fp0 ;add in exp
|
|
fsub.s FONE,fp0 ;subtract off 1.0
|
|
fbge.w pos_res ;if pos, branch
|
|
fmul.x LOG2UP1,fp0 ;if neg, mul by LOG2UP1
|
|
fmove.l fp0,d6 ;put ILOG in d6 as a lword
|
|
bra.b A4_str ;go move out ILOG
|
|
pos_res:
|
|
fmul.x LOG2,fp0 ;if pos, mul by LOG2
|
|
fmove.l fp0,d6 ;put ILOG in d6 as a lword
|
|
|
|
|
|
* A4. Clr INEX bit.
|
|
* The operation in A3 above may have set INEX2.
|
|
|
|
A4_str:
|
|
fmove.l #0,FPSR ;zero all of fpsr - nothing needed
|
|
|
|
|
|
* A5. Set ICTR = 0;
|
|
* ICTR is a flag used in A13. It must be set before the
|
|
* loop entry A6. The lower word of d5 is used for ICTR.
|
|
|
|
clr.w d5 ;clear ICTR
|
|
|
|
|
|
* A6. Calculate LEN.
|
|
* LEN is the number of digits to be displayed. The k-factor
|
|
* can dictate either the total number of digits, if it is
|
|
* a positive number, or the number of digits after the
|
|
* original decimal point which are to be included as
|
|
* significant. See the 68882 manual for examples.
|
|
* If LEN is computed to be greater than 17, set OPERR in
|
|
* USER_FPSR. LEN is stored in d4.
|
|
*
|
|
* Register usage:
|
|
* Input/Output
|
|
* d0: exponent/Unchanged
|
|
* d2: x/x/scratch
|
|
* d3: x/x
|
|
* d4: exc picture/LEN
|
|
* d5: ICTR/Unchanged
|
|
* d6: ILOG/Unchanged
|
|
* d7: k-factor/Unchanged
|
|
* a0: ptr for original operand/final result
|
|
* a1: x/x
|
|
* a2: x/x
|
|
* fp0: float(ILOG)/Unchanged
|
|
* fp1: x/x
|
|
* fp2: x/x
|
|
* F_SCR1:x/x
|
|
* F_SCR2:Abs(X) with $3fff exponent/Unchanged
|
|
* L_SCR1:x/x
|
|
* L_SCR2:first word of X packed/Unchanged
|
|
|
|
A6_str:
|
|
tst.l d7 ;branch on sign of k
|
|
ble.b k_neg ;if k <= 0, LEN = ILOG + 1 - k
|
|
move.l d7,d4 ;if k > 0, LEN = k
|
|
bra.b len_ck ;skip to LEN check
|
|
k_neg:
|
|
move.l d6,d4 ;first load ILOG to d4
|
|
sub.l d7,d4 ;subtract off k
|
|
addq.l #1,d4 ;add in the 1
|
|
len_ck:
|
|
tst.l d4 ;LEN check: branch on sign of LEN
|
|
ble.b LEN_ng ;if neg, set LEN = 1
|
|
cmp.l #17,d4 ;test if LEN > 17
|
|
ble.b A7_str ;if not, forget it
|
|
move.l #17,d4 ;set max LEN = 17
|
|
tst.l d7 ;if negative, never set OPERR
|
|
ble.b A7_str ;if positive, continue
|
|
or.l #opaop_mask,USER_FPSR(a6) ;set OPERR & AIOP in USER_FPSR
|
|
bra.b A7_str ;finished here
|
|
LEN_ng:
|
|
moveq.l #1,d4 ;min LEN is 1
|
|
|
|
|
|
* A7. Calculate SCALE.
|
|
* SCALE is equal to 10^ISCALE, where ISCALE is the number
|
|
* of decimal places needed to insure LEN integer digits
|
|
* in the output before conversion to bcd. LAMBDA is the sign
|
|
* of ISCALE, used in A9. Fp1 contains 10^^(abs(ISCALE)) using
|
|
* the rounding mode as given in the following table (see
|
|
* Coonen, p. 7.23 as ref.; however, the SCALE variable is
|
|
* of opposite sign in bindec.sa from Coonen).
|
|
*
|
|
* Initial USE
|
|
* FPCR[6:5] LAMBDA SIGN(X) FPCR[6:5]
|
|
* ----------------------------------------------
|
|
* RN 00 0 0 00/0 RN
|
|
* RN 00 0 1 00/0 RN
|
|
* RN 00 1 0 00/0 RN
|
|
* RN 00 1 1 00/0 RN
|
|
* RZ 01 0 0 11/3 RP
|
|
* RZ 01 0 1 11/3 RP
|
|
* RZ 01 1 0 10/2 RM
|
|
* RZ 01 1 1 10/2 RM
|
|
* RM 10 0 0 11/3 RP
|
|
* RM 10 0 1 10/2 RM
|
|
* RM 10 1 0 10/2 RM
|
|
* RM 10 1 1 11/3 RP
|
|
* RP 11 0 0 10/2 RM
|
|
* RP 11 0 1 11/3 RP
|
|
* RP 11 1 0 11/3 RP
|
|
* RP 11 1 1 10/2 RM
|
|
*
|
|
* Register usage:
|
|
* Input/Output
|
|
* d0: exponent/scratch - final is 0
|
|
* d2: x/0 or 24 for A9
|
|
* d3: x/scratch - offset ptr into PTENRM array
|
|
* d4: LEN/Unchanged
|
|
* d5: 0/ICTR:LAMBDA
|
|
* d6: ILOG/ILOG or k if ((k<=0)&(ILOG<k))
|
|
* d7: k-factor/Unchanged
|
|
* a0: ptr for original operand/final result
|
|
* a1: x/ptr to PTENRM array
|
|
* a2: x/x
|
|
* fp0: float(ILOG)/Unchanged
|
|
* fp1: x/10^ISCALE
|
|
* fp2: x/x
|
|
* F_SCR1:x/x
|
|
* F_SCR2:Abs(X) with $3fff exponent/Unchanged
|
|
* L_SCR1:x/x
|
|
* L_SCR2:first word of X packed/Unchanged
|
|
|
|
A7_str:
|
|
tst.l d7 ;test sign of k
|
|
bgt.b k_pos ;if pos and > 0, skip this
|
|
cmp.l d6,d7 ;test k - ILOG
|
|
blt.b k_pos ;if ILOG >= k, skip this
|
|
move.l d7,d6 ;if ((k<0) & (ILOG < k)) ILOG = k
|
|
k_pos:
|
|
move.l d6,d0 ;calc ILOG + 1 - LEN in d0
|
|
addq.l #1,d0 ;add the 1
|
|
sub.l d4,d0 ;sub off LEN
|
|
swap d5 ;use upper word of d5 for LAMBDA
|
|
clr.w d5 ;set it zero initially
|
|
clr.w d2 ;set up d2 for very small case
|
|
tst.l d0 ;test sign of ISCALE
|
|
bge.b iscale ;if pos, skip next inst
|
|
addq.w #1,d5 ;if neg, set LAMBDA true
|
|
cmp.l #$ffffecd4,d0 ;test iscale <= -4908
|
|
bgt.b no_inf ;if false, skip rest
|
|
addi.l #24,d0 ;add in 24 to iscale
|
|
move.l #24,d2 ;put 24 in d2 for A9
|
|
no_inf:
|
|
neg.l d0 ;and take abs of ISCALE
|
|
iscale:
|
|
fmove.s FONE,fp1 ;init fp1 to 1
|
|
bfextu USER_FPCR(a6){26:2},d1 ;get initial rmode bits
|
|
add.w d1,d1 ;put them in bits 2:1
|
|
add.w d5,d1 ;add in LAMBDA
|
|
add.w d1,d1 ;put them in bits 3:1
|
|
tst.l L_SCR2(a6) ;test sign of original x
|
|
bge.b x_pos ;if pos, don't set bit 0
|
|
addq.l #1,d1 ;if neg, set bit 0
|
|
x_pos:
|
|
lea.l RBDTBL,a2 ;load rbdtbl base
|
|
move.b (a2,d1),d3 ;load d3 with new rmode
|
|
lsl.l #4,d3 ;put bits in proper position
|
|
fmove.l d3,fpcr ;load bits into fpu
|
|
lsr.l #4,d3 ;put bits in proper position
|
|
tst.b d3 ;decode new rmode for pten table
|
|
bne.b not_rn ;if zero, it is RN
|
|
lea.l PTENRN,a1 ;load a1 with RN table base
|
|
bra.b rmode ;exit decode
|
|
not_rn:
|
|
lsr.b #1,d3 ;get lsb in carry
|
|
bcc.b not_rp ;if carry clear, it is RM
|
|
lea.l PTENRP,a1 ;load a1 with RP table base
|
|
bra.b rmode ;exit decode
|
|
not_rp:
|
|
lea.l PTENRM,a1 ;load a1 with RM table base
|
|
rmode:
|
|
clr.l d3 ;clr table index
|
|
e_loop:
|
|
lsr.l #1,d0 ;shift next bit into carry
|
|
bcc.b e_next ;if zero, skip the mul
|
|
fmul.x (a1,d3),fp1 ;mul by 10**(d3_bit_no)
|
|
e_next:
|
|
add.l #12,d3 ;inc d3 to next pwrten table entry
|
|
tst.l d0 ;test if ISCALE is zero
|
|
bne.b e_loop ;if not, loop
|
|
|
|
|
|
* A8. Clr INEX; Force RZ.
|
|
* The operation in A3 above may have set INEX2.
|
|
* RZ mode is forced for the scaling operation to insure
|
|
* only one rounding error. The grs bits are collected in
|
|
* the INEX flag for use in A10.
|
|
*
|
|
* Register usage:
|
|
* Input/Output
|
|
|
|
fmove.l #0,FPSR ;clr INEX
|
|
fmove.l #rz_mode,FPCR ;set RZ rounding mode
|
|
|
|
|
|
* A9. Scale X -> Y.
|
|
* The mantissa is scaled to the desired number of significant
|
|
* digits. The excess digits are collected in INEX2. If mul,
|
|
* Check d2 for excess 10 exponential value. If not zero,
|
|
* the iscale value would have caused the pwrten calculation
|
|
* to overflow. Only a negative iscale can cause this, so
|
|
* multiply by 10^(d2), which is now only allowed to be 24,
|
|
* with a multiply by 10^8 and 10^16, which is exact since
|
|
* 10^24 is exact. If the input was denormalized, we must
|
|
* create a busy stack frame with the mul command and the
|
|
* two operands, and allow the fpu to complete the multiply.
|
|
*
|
|
* Register usage:
|
|
* Input/Output
|
|
* d0: FPCR with RZ mode/Unchanged
|
|
* d2: 0 or 24/unchanged
|
|
* d3: x/x
|
|
* d4: LEN/Unchanged
|
|
* d5: ICTR:LAMBDA
|
|
* d6: ILOG/Unchanged
|
|
* d7: k-factor/Unchanged
|
|
* a0: ptr for original operand/final result
|
|
* a1: ptr to PTENRM array/Unchanged
|
|
* a2: x/x
|
|
* fp0: float(ILOG)/X adjusted for SCALE (Y)
|
|
* fp1: 10^ISCALE/Unchanged
|
|
* fp2: x/x
|
|
* F_SCR1:x/x
|
|
* F_SCR2:Abs(X) with $3fff exponent/Unchanged
|
|
* L_SCR1:x/x
|
|
* L_SCR2:first word of X packed/Unchanged
|
|
|
|
A9_str:
|
|
fmove.x (a0),fp0 ;load X from memory
|
|
fabs.x fp0 ;use abs(X)
|
|
tst.w d5 ;LAMBDA is in lower word of d5
|
|
bne.b short_sc_mul ;if neg (LAMBDA = 1), scale by mul
|
|
fdiv.x fp1,fp0 ;calculate X / SCALE -> Y to fp0
|
|
bra.b A10_st ;branch to A10
|
|
|
|
sc_mul:
|
|
short_sc_mul:
|
|
tst.b BINDEC_FLG(a6) ;check for denorm
|
|
beq.b A9_norm ;if norm, continue with mul
|
|
fmovem.x fp1,-(a7) ;load ETEMP with 10^ISCALE
|
|
move.l 8(a0),-(a7) ;load FPTEMP with input arg
|
|
move.l 4(a0),-(a7)
|
|
move.l (a0),-(a7)
|
|
move.l #18,d3 ;load count for busy stack
|
|
A9_loop:
|
|
clr.l -(a7) ;clear lword on stack
|
|
dbf.w d3,A9_loop
|
|
move.b VER_TMP(a6),(a7) ;write current version number
|
|
move.b #BUSY_SIZE-4,1(a7) ;write current busy size
|
|
move.b #$10,$44(a7) ;set fcefpte[15] bit
|
|
move.w #$0023,$40(a7) ;load cmdreg1b with mul command
|
|
move.b #$fe,$8(a7) ;load all 1s to cu savepc
|
|
frestore (a7)+ ;restore frame to fpu for completion
|
|
fmul.x 36(a1),fp0 ;multiply fp0 by 10^8
|
|
fmul.x 48(a1),fp0 ;multiply fp0 by 10^16
|
|
bra.b A10_st
|
|
A9_norm:
|
|
tst.w d2 ;test for small exp case
|
|
beq.b A9_con ;if zero, continue as normal
|
|
fmul.x 36(a1),fp0 ;multiply fp0 by 10^8
|
|
fmul.x 48(a1),fp0 ;multiply fp0 by 10^16
|
|
A9_con:
|
|
fmul.x fp1,fp0 ;calculate X * SCALE -> Y to fp0
|
|
|
|
|
|
* A10. Or in INEX.
|
|
* If INEX is set, round error occurred. This is compensated
|
|
* for by 'or-ing' in the INEX2 flag to the lsb of Y.
|
|
*
|
|
* Register usage:
|
|
* Input/Output
|
|
* d0: FPCR with RZ mode/FPSR with INEX2 isolated
|
|
* d2: x/x
|
|
* d3: x/x
|
|
* d4: LEN/Unchanged
|
|
* d5: ICTR:LAMBDA
|
|
* d6: ILOG/Unchanged
|
|
* d7: k-factor/Unchanged
|
|
* a0: ptr for original operand/final result
|
|
* a1: ptr to PTENxx array/Unchanged
|
|
* a2: x/ptr to FP_SCR2(a6)
|
|
* fp0: Y/Y with lsb adjusted
|
|
* fp1: 10^ISCALE/Unchanged
|
|
* fp2: x/x
|
|
|
|
A10_st:
|
|
fmove.l FPSR,d0 ;get FPSR
|
|
fmove.x fp0,FP_SCR2(a6) ;move Y to memory
|
|
lea.l FP_SCR2(a6),a2 ;load a2 with ptr to FP_SCR2
|
|
btst.l #9,d0 ;check if INEX2 set
|
|
beq.b A11_st ;if clear, skip rest
|
|
ori.l #1,8(a2) ;or in 1 to lsb of mantissa
|
|
fmove.x FP_SCR2(a6),fp0 ;write adjusted Y back to fpu
|
|
|
|
|
|
* A11. Restore original FPCR; set size ext.
|
|
* Perform FINT operation in the user's rounding mode. Keep
|
|
* the size to extended. The sintdo entry point in the sint
|
|
* routine expects the FPCR value to be in USER_FPCR for
|
|
* mode and precision. The original FPCR is saved in L_SCR1.
|
|
|
|
A11_st:
|
|
move.l USER_FPCR(a6),L_SCR1(a6) ;save it for later
|
|
andi.l #$00000030,USER_FPCR(a6) ;set size to ext,
|
|
* ;block exceptions
|
|
|
|
|
|
* A12. Calculate YINT = FINT(Y) according to user's rounding mode.
|
|
* The FPSP routine sintd0 is used. The output is in fp0.
|
|
*
|
|
* Register usage:
|
|
* Input/Output
|
|
* d0: FPSR with AINEX cleared/FPCR with size set to ext
|
|
* d2: x/x/scratch
|
|
* d3: x/x
|
|
* d4: LEN/Unchanged
|
|
* d5: ICTR:LAMBDA/Unchanged
|
|
* d6: ILOG/Unchanged
|
|
* d7: k-factor/Unchanged
|
|
* a0: ptr for original operand/src ptr for sintdo
|
|
* a1: ptr to PTENxx array/Unchanged
|
|
* a2: ptr to FP_SCR2(a6)/Unchanged
|
|
* a6: temp pointer to FP_SCR2(a6) - orig value saved and restored
|
|
* fp0: Y/YINT
|
|
* fp1: 10^ISCALE/Unchanged
|
|
* fp2: x/x
|
|
* F_SCR1:x/x
|
|
* F_SCR2:Y adjusted for inex/Y with original exponent
|
|
* L_SCR1:x/original USER_FPCR
|
|
* L_SCR2:first word of X packed/Unchanged
|
|
|
|
A12_st:
|
|
movem.l d0-d1/a0-a1,-(a7) ;save regs used by sintd0
|
|
move.l L_SCR1(a6),-(a7)
|
|
move.l L_SCR2(a6),-(a7)
|
|
lea.l FP_SCR2(a6),a0 ;a0 is ptr to F_SCR2(a6)
|
|
fmove.x fp0,(a0) ;move Y to memory at FP_SCR2(a6)
|
|
tst.l L_SCR2(a6) ;test sign of original operand
|
|
bge.b do_fint ;if pos, use Y
|
|
or.l #$80000000,(a0) ;if neg, use -Y
|
|
do_fint:
|
|
move.l USER_FPSR(a6),-(a7)
|
|
bsr sintdo ;sint routine returns int in fp0
|
|
move.b (a7),USER_FPSR(a6)
|
|
add.l #4,a7
|
|
move.l (a7)+,L_SCR2(a6)
|
|
move.l (a7)+,L_SCR1(a6)
|
|
movem.l (a7)+,d0-d1/a0-a1 ;restore regs used by sint
|
|
move.l L_SCR2(a6),FP_SCR2(a6) ;restore original exponent
|
|
move.l L_SCR1(a6),USER_FPCR(a6) ;restore user's FPCR
|
|
|
|
|
|
* A13. Check for LEN digits.
|
|
* If the int operation results in more than LEN digits,
|
|
* or less than LEN -1 digits, adjust ILOG and repeat from
|
|
* A6. This test occurs only on the first pass. If the
|
|
* result is exactly 10^LEN, decrement ILOG and divide
|
|
* the mantissa by 10. The calculation of 10^LEN cannot
|
|
* be inexact, since all powers of ten upto 10^27 are exact
|
|
* in extended precision, so the use of a previous power-of-ten
|
|
* table will introduce no error.
|
|
*
|
|
*
|
|
* Register usage:
|
|
* Input/Output
|
|
* d0: FPCR with size set to ext/scratch final = 0
|
|
* d2: x/x
|
|
* d3: x/scratch final = x
|
|
* d4: LEN/LEN adjusted
|
|
* d5: ICTR:LAMBDA/LAMBDA:ICTR
|
|
* d6: ILOG/ILOG adjusted
|
|
* d7: k-factor/Unchanged
|
|
* a0: pointer into memory for packed bcd string formation
|
|
* a1: ptr to PTENxx array/Unchanged
|
|
* a2: ptr to FP_SCR2(a6)/Unchanged
|
|
* fp0: int portion of Y/abs(YINT) adjusted
|
|
* fp1: 10^ISCALE/Unchanged
|
|
* fp2: x/10^LEN
|
|
* F_SCR1:x/x
|
|
* F_SCR2:Y with original exponent/Unchanged
|
|
* L_SCR1:original USER_FPCR/Unchanged
|
|
* L_SCR2:first word of X packed/Unchanged
|
|
|
|
A13_st:
|
|
swap d5 ;put ICTR in lower word of d5
|
|
tst.w d5 ;check if ICTR = 0
|
|
bne not_zr ;if non-zero, go to second test
|
|
*
|
|
* Compute 10^(LEN-1)
|
|
*
|
|
fmove.s FONE,fp2 ;init fp2 to 1.0
|
|
move.l d4,d0 ;put LEN in d0
|
|
subq.l #1,d0 ;d0 = LEN -1
|
|
clr.l d3 ;clr table index
|
|
l_loop:
|
|
lsr.l #1,d0 ;shift next bit into carry
|
|
bcc.b l_next ;if zero, skip the mul
|
|
fmul.x (a1,d3),fp2 ;mul by 10**(d3_bit_no)
|
|
l_next:
|
|
add.l #12,d3 ;inc d3 to next pwrten table entry
|
|
tst.l d0 ;test if LEN is zero
|
|
bne.b l_loop ;if not, loop
|
|
*
|
|
* 10^LEN-1 is computed for this test and A14. If the input was
|
|
* denormalized, check only the case in which YINT > 10^LEN.
|
|
*
|
|
tst.b BINDEC_FLG(a6) ;check if input was norm
|
|
beq.b A13_con ;if norm, continue with checking
|
|
fabs.x fp0 ;take abs of YINT
|
|
bra test_2
|
|
*
|
|
* Compare abs(YINT) to 10^(LEN-1) and 10^LEN
|
|
*
|
|
A13_con:
|
|
fabs.x fp0 ;take abs of YINT
|
|
fcmp.x fp2,fp0 ;compare abs(YINT) with 10^(LEN-1)
|
|
fbge.w test_2 ;if greater, do next test
|
|
subq.l #1,d6 ;subtract 1 from ILOG
|
|
move.w #1,d5 ;set ICTR
|
|
fmove.l #rm_mode,FPCR ;set rmode to RM
|
|
fmul.s FTEN,fp2 ;compute 10^LEN
|
|
bra.w A6_str ;return to A6 and recompute YINT
|
|
test_2:
|
|
fmul.s FTEN,fp2 ;compute 10^LEN
|
|
fcmp.x fp2,fp0 ;compare abs(YINT) with 10^LEN
|
|
fblt.w A14_st ;if less, all is ok, go to A14
|
|
fbgt.w fix_ex ;if greater, fix and redo
|
|
fdiv.s FTEN,fp0 ;if equal, divide by 10
|
|
addq.l #1,d6 ; and inc ILOG
|
|
bra.b A14_st ; and continue elsewhere
|
|
fix_ex:
|
|
addq.l #1,d6 ;increment ILOG by 1
|
|
move.w #1,d5 ;set ICTR
|
|
fmove.l #rm_mode,FPCR ;set rmode to RM
|
|
bra.w A6_str ;return to A6 and recompute YINT
|
|
*
|
|
* Since ICTR <> 0, we have already been through one adjustment,
|
|
* and shouldn't have another; this is to check if abs(YINT) = 10^LEN
|
|
* 10^LEN is again computed using whatever table is in a1 since the
|
|
* value calculated cannot be inexact.
|
|
*
|
|
not_zr:
|
|
fmove.s FONE,fp2 ;init fp2 to 1.0
|
|
move.l d4,d0 ;put LEN in d0
|
|
clr.l d3 ;clr table index
|
|
z_loop:
|
|
lsr.l #1,d0 ;shift next bit into carry
|
|
bcc.b z_next ;if zero, skip the mul
|
|
fmul.x (a1,d3),fp2 ;mul by 10**(d3_bit_no)
|
|
z_next:
|
|
add.l #12,d3 ;inc d3 to next pwrten table entry
|
|
tst.l d0 ;test if LEN is zero
|
|
bne.b z_loop ;if not, loop
|
|
fabs.x fp0 ;get abs(YINT)
|
|
fcmp.x fp2,fp0 ;check if abs(YINT) = 10^LEN
|
|
fbne.w A14_st ;if not, skip this
|
|
fdiv.s FTEN,fp0 ;divide abs(YINT) by 10
|
|
addq.l #1,d6 ;and inc ILOG by 1
|
|
addq.l #1,d4 ; and inc LEN
|
|
fmul.s FTEN,fp2 ; if LEN++, the get 10^^LEN
|
|
|
|
|
|
* A14. Convert the mantissa to bcd.
|
|
* The binstr routine is used to convert the LEN digit
|
|
* mantissa to bcd in memory. The input to binstr is
|
|
* to be a fraction; i.e. (mantissa)/10^LEN and adjusted
|
|
* such that the decimal point is to the left of bit 63.
|
|
* The bcd digits are stored in the correct position in
|
|
* the final string area in memory.
|
|
*
|
|
*
|
|
* Register usage:
|
|
* Input/Output
|
|
* d0: x/LEN call to binstr - final is 0
|
|
* d1: x/0
|
|
* d2: x/ms 32-bits of mant of abs(YINT)
|
|
* d3: x/ls 32-bits of mant of abs(YINT)
|
|
* d4: LEN/Unchanged
|
|
* d5: ICTR:LAMBDA/LAMBDA:ICTR
|
|
* d6: ILOG
|
|
* d7: k-factor/Unchanged
|
|
* a0: pointer into memory for packed bcd string formation
|
|
* /ptr to first mantissa byte in result string
|
|
* a1: ptr to PTENxx array/Unchanged
|
|
* a2: ptr to FP_SCR2(a6)/Unchanged
|
|
* fp0: int portion of Y/abs(YINT) adjusted
|
|
* fp1: 10^ISCALE/Unchanged
|
|
* fp2: 10^LEN/Unchanged
|
|
* F_SCR1:x/Work area for final result
|
|
* F_SCR2:Y with original exponent/Unchanged
|
|
* L_SCR1:original USER_FPCR/Unchanged
|
|
* L_SCR2:first word of X packed/Unchanged
|
|
|
|
A14_st:
|
|
fmove.l #rz_mode,FPCR ;force rz for conversion
|
|
fdiv.x fp2,fp0 ;divide abs(YINT) by 10^LEN
|
|
lea.l FP_SCR1(a6),a0
|
|
fmove.x fp0,(a0) ;move abs(YINT)/10^LEN to memory
|
|
move.l 4(a0),d2 ;move 2nd word of FP_RES to d2
|
|
move.l 8(a0),d3 ;move 3rd word of FP_RES to d3
|
|
clr.l 4(a0) ;zero word 2 of FP_RES
|
|
clr.l 8(a0) ;zero word 3 of FP_RES
|
|
move.l (a0),d0 ;move exponent to d0
|
|
swap d0 ;put exponent in lower word
|
|
beq.b no_sft ;if zero, don't shift
|
|
subi.l #$3ffd,d0 ;sub bias less 2 to make fract
|
|
tst.l d0 ;check if > 1
|
|
bgt.b no_sft ;if so, don't shift
|
|
neg.l d0 ;make exp positive
|
|
m_loop:
|
|
lsr.l #1,d2 ;shift d2:d3 right, add 0s
|
|
roxr.l #1,d3 ;the number of places
|
|
dbf.w d0,m_loop ;given in d0
|
|
no_sft:
|
|
tst.l d2 ;check for mantissa of zero
|
|
bne.b no_zr ;if not, go on
|
|
tst.l d3 ;continue zero check
|
|
beq.b zer_m ;if zero, go directly to binstr
|
|
no_zr:
|
|
clr.l d1 ;put zero in d1 for addx
|
|
addi.l #$00000080,d3 ;inc at bit 7
|
|
addx.l d1,d2 ;continue inc
|
|
andi.l #$ffffff80,d3 ;strip off lsb not used by 882
|
|
zer_m:
|
|
move.l d4,d0 ;put LEN in d0 for binstr call
|
|
addq.l #3,a0 ;a0 points to M16 byte in result
|
|
bsr binstr ;call binstr to convert mant
|
|
|
|
|
|
* A15. Convert the exponent to bcd.
|
|
* As in A14 above, the exp is converted to bcd and the
|
|
* digits are stored in the final string.
|
|
*
|
|
* Digits are stored in L_SCR1(a6) on return from BINDEC as:
|
|
*
|
|
* 32 16 15 0
|
|
* -----------------------------------------
|
|
* | 0 | e3 | e2 | e1 | e4 | X | X | X |
|
|
* -----------------------------------------
|
|
*
|
|
* And are moved into their proper places in FP_SCR1. If digit e4
|
|
* is non-zero, OPERR is signaled. In all cases, all 4 digits are
|
|
* written as specified in the 881/882 manual for packed decimal.
|
|
*
|
|
* Register usage:
|
|
* Input/Output
|
|
* d0: x/LEN call to binstr - final is 0
|
|
* d1: x/scratch (0);shift count for final exponent packing
|
|
* d2: x/ms 32-bits of exp fraction/scratch
|
|
* d3: x/ls 32-bits of exp fraction
|
|
* d4: LEN/Unchanged
|
|
* d5: ICTR:LAMBDA/LAMBDA:ICTR
|
|
* d6: ILOG
|
|
* d7: k-factor/Unchanged
|
|
* a0: ptr to result string/ptr to L_SCR1(a6)
|
|
* a1: ptr to PTENxx array/Unchanged
|
|
* a2: ptr to FP_SCR2(a6)/Unchanged
|
|
* fp0: abs(YINT) adjusted/float(ILOG)
|
|
* fp1: 10^ISCALE/Unchanged
|
|
* fp2: 10^LEN/Unchanged
|
|
* F_SCR1:Work area for final result/BCD result
|
|
* F_SCR2:Y with original exponent/ILOG/10^4
|
|
* L_SCR1:original USER_FPCR/Exponent digits on return from binstr
|
|
* L_SCR2:first word of X packed/Unchanged
|
|
|
|
A15_st:
|
|
tst.b BINDEC_FLG(a6) ;check for denorm
|
|
beq.b not_denorm
|
|
ftst.x fp0 ;test for zero
|
|
fbeq.w den_zero ;if zero, use k-factor or 4933
|
|
fmove.l d6,fp0 ;float ILOG
|
|
fabs.x fp0 ;get abs of ILOG
|
|
bra.b convrt
|
|
den_zero:
|
|
tst.l d7 ;check sign of the k-factor
|
|
blt.b use_ilog ;if negative, use ILOG
|
|
fmove.s F4933,fp0 ;force exponent to 4933
|
|
bra.b convrt ;do it
|
|
use_ilog:
|
|
fmove.l d6,fp0 ;float ILOG
|
|
fabs.x fp0 ;get abs of ILOG
|
|
bra.b convrt
|
|
not_denorm:
|
|
ftst.x fp0 ;test for zero
|
|
fbne.w not_zero ;if zero, force exponent
|
|
fmove.s FONE,fp0 ;force exponent to 1
|
|
bra.b convrt ;do it
|
|
not_zero:
|
|
fmove.l d6,fp0 ;float ILOG
|
|
fabs.x fp0 ;get abs of ILOG
|
|
convrt:
|
|
fdiv.x 24(a1),fp0 ;compute ILOG/10^4
|
|
fmove.x fp0,FP_SCR2(a6) ;store fp0 in memory
|
|
move.l 4(a2),d2 ;move word 2 to d2
|
|
move.l 8(a2),d3 ;move word 3 to d3
|
|
move.w (a2),d0 ;move exp to d0
|
|
beq.b x_loop_fin ;if zero, skip the shift
|
|
subi.w #$3ffd,d0 ;subtract off bias
|
|
neg.w d0 ;make exp positive
|
|
x_loop:
|
|
lsr.l #1,d2 ;shift d2:d3 right
|
|
roxr.l #1,d3 ;the number of places
|
|
dbf.w d0,x_loop ;given in d0
|
|
x_loop_fin:
|
|
clr.l d1 ;put zero in d1 for addx
|
|
addi.l #$00000080,d3 ;inc at bit 6
|
|
addx.l d1,d2 ;continue inc
|
|
andi.l #$ffffff80,d3 ;strip off lsb not used by 882
|
|
move.l #4,d0 ;put 4 in d0 for binstr call
|
|
lea.l L_SCR1(a6),a0 ;a0 is ptr to L_SCR1 for exp digits
|
|
bsr binstr ;call binstr to convert exp
|
|
move.l L_SCR1(a6),d0 ;load L_SCR1 lword to d0
|
|
move.l #12,d1 ;use d1 for shift count
|
|
lsr.l d1,d0 ;shift d0 right by 12
|
|
bfins d0,FP_SCR1(a6){4:12} ;put e3:e2:e1 in FP_SCR1
|
|
lsr.l d1,d0 ;shift d0 right by 12
|
|
bfins d0,FP_SCR1(a6){16:4} ;put e4 in FP_SCR1
|
|
tst.b d0 ;check if e4 is zero
|
|
beq.b A16_st ;if zero, skip rest
|
|
or.l #opaop_mask,USER_FPSR(a6) ;set OPERR & AIOP in USER_FPSR
|
|
|
|
|
|
* A16. Write sign bits to final string.
|
|
* Sigma is bit 31 of initial value; RHO is bit 31 of d6 (ILOG).
|
|
*
|
|
* Register usage:
|
|
* Input/Output
|
|
* d0: x/scratch - final is x
|
|
* d2: x/x
|
|
* d3: x/x
|
|
* d4: LEN/Unchanged
|
|
* d5: ICTR:LAMBDA/LAMBDA:ICTR
|
|
* d6: ILOG/ILOG adjusted
|
|
* d7: k-factor/Unchanged
|
|
* a0: ptr to L_SCR1(a6)/Unchanged
|
|
* a1: ptr to PTENxx array/Unchanged
|
|
* a2: ptr to FP_SCR2(a6)/Unchanged
|
|
* fp0: float(ILOG)/Unchanged
|
|
* fp1: 10^ISCALE/Unchanged
|
|
* fp2: 10^LEN/Unchanged
|
|
* F_SCR1:BCD result with correct signs
|
|
* F_SCR2:ILOG/10^4
|
|
* L_SCR1:Exponent digits on return from binstr
|
|
* L_SCR2:first word of X packed/Unchanged
|
|
|
|
A16_st:
|
|
clr.l d0 ;clr d0 for collection of signs
|
|
andi.b #$0f,FP_SCR1(a6) ;clear first nibble of FP_SCR1
|
|
tst.l L_SCR2(a6) ;check sign of original mantissa
|
|
bge.b mant_p ;if pos, don't set SM
|
|
moveq.l #2,d0 ;move 2 in to d0 for SM
|
|
mant_p:
|
|
tst.l d6 ;check sign of ILOG
|
|
bge.b wr_sgn ;if pos, don't set SE
|
|
addq.l #1,d0 ;set bit 0 in d0 for SE
|
|
wr_sgn:
|
|
bfins d0,FP_SCR1(a6){0:2} ;insert SM and SE into FP_SCR1
|
|
|
|
* Clean up and restore all registers used.
|
|
|
|
fmove.l #0,FPSR ;clear possible inex2/ainex bits
|
|
fmovem.x (a7)+,fp0-fp2
|
|
movem.l (a7)+,d2-d7/a2
|
|
rts
|
|
|
|
end
|