NetBSD/sys/arch/m68k/fpsp/README
1994-01-26 21:24:05 +00:00

SECTION 1 FLOATING POINT SOFTWARE PACKAGE TERMS.
From: Microprocessor & Memory Technologies Group
Semiconductor Products Sector
6501 William Cannon Drive West,
Mail Station OE33, Austin, Texas 78735-8598
To: FLOATING POINT SOFTWARE PACKAGE USERS
Date: August 27, 1993
1.1 TITLE TO FLOATING POINT SOFTWARE PACKAGE FPSP
Title to the 68040 Floating Point Software Package, all copies
thereof (in whole or in part and in any form), and all rights
therein, including all rights in patents, and copyrights,
applicable thereto, shall remain vested in MOTOROLA. All
rights, title and interest in the resulting modifications belong
to MOTOROLA except where such modifications (a) are made
solely for use with computer systems manufactured or
distributed by user; (b) are themselves copyrightable; and (c)
would not constitute a copyright infringement if not licensed
hereunder.
1.2 DISCLAIMER OF WARRANTY.
THE 68040 FLOATING POINT SOFTWARE PACKAGE is provided on an
"AS IS" basis and without other warranty except as stated herein.
IN NO EVENT SHALL MOTOROLA BE LIABLE FOR INCIDENTAL OR
CONSEQUENTIAL DAMAGES ARISING FROM USE OF THE 68040
FLOATING POINT SOFTWARE PACKAGE. THIS DISCLAIMER OF
WARRANTY EXTENDS TO ALL USERS OF THE 68040 FLOATING
POINT SOFTWARE PACKAGE AND IS IN LIEU OF ALL WARRANTIES
WHETHER EXPRESS, IMPLIED, OR STATUTORY, INCLUDING
IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR
PARTICULAR PURPOSE.
SECTION 2 Release 2.3 Errata
As of this release, the following are known errata of the
040 (Mask 20D43B, Mask 4D50D, and Mask 5D98D) FPSP:
1. INEX1 reported by inexact conversion of packed source
operand for a dyadic instruction will not be reported
by the 040 upon completion of that instruction. This
erratum corresponds to erratum "F5" on the 68040 Errata Sheet.
Fixed in D98D.
2. FREM and FMOD with packed operands will occasionally
differ from the 881/2 results by one ulp in the conversion
of the packed source operand.
3. INEX2/AINEX are not calculated in the same manner as in the
881/882 for some cases in which the result is overflowed.
Currently, if the operation was an integer move-out, INEX2/AINEX
is not set for any case.
In some cases of FSCALE with integer input, the INEX2 bit will not
be set on an inexact calculation.
Under extended rounding precision, FSCALE results which underflow
and are inexact may be incorrectly rounded. In addition, INEX2
is not signaled in these cases.
4. If an FMOVE FPn,FPm (this also applies to FNEG and FABS) is
preceded by any floating-point operation with a denorm source
operand, the FMOVE destination (FPm) is incorrectly tagged, which
may result in silent data corruption. A software fix was added in
release 2.2.
SECTION 3 Software Specification for an MC68040 Floating-
Point Software Package
The purpose of this section is to provide an overview of
the floating-point software package (FPSP) for the
MC68040. The FPSP emulates the floating-point
instructions of the MC68881/MC68882 which are not provided
by the MC68040.
3.1 DEFINITIONS, ACRONYMS, AND ABBREVIATIONS
FPn - Floating-Point Data Register Source
FPSP - Floating-Point Software Package
FPU - Floating-Point Unit
FPx - Floating-Point Data Register
See the Glossary of Reference 1 for additional
definitions.
3.2 PRODUCT OVERVIEW
The FPSP adds floating-point capabilities to
the MC68040. A subset of the MC6888x instruction set is
executed by the MC68040 on-chip FPU. The remaining
floating-point instructions are emulated in software by
the FPSP (see Reference 2). There are two types of FPSP:
one for applications compiled for the MC68881/MC68882 and
another for applications compiled for the MC68040 (see
3.8.2 Packaging).
The FPSP provides:
* Arithmetic and Transcendental Instructions
* Decimal Conversions
* Exception Handlers
* MC68040 Unimplemented Data Type and Data Format Handlers
There are two types of users: 1) end users who are running
applications and 2) system integrators who will install
the package (see 3.8.3 Site Adaptations).
3.3 GENERAL CONSTRAINTS
The FPSP satisfies the requirements of ANSI/IEEE Std 754,
the Standard for Binary Floating-Point Arithmetic. The
FPSP runs existing user code unchanged and is transparent
to it. The FPSP is easy to modify and install. The
performance of the transcendental function routines is
equivalent or superior to that of a 33-MHz
MC68881/MC68882. The error bound is equivalent or
superior to the MC68881/MC68882 (see 3.7.2 Accuracy).
3.4 ASSUMPTIONS AND DEPENDENCIES
The FPSP can be installed into any operating system. The
MC68040 FPU shall be implemented as described in Reference
2. Table 3-1 lists the functions provided by the MC68040.
3.5.2 Exceptions
The main goal of the FPSP exception handlers is to provide
the user with an easy path to port existing MC68882
exception handlers for use with the MC68040. To that end,
the FPSP provides an entry point for each exception, so
that reaching this point indicates that an IEEE-defined
trap condition exists.
3.5.2.1 BSUN - BRANCH/SET ON UNORDERED.
On a trap-enabled condition, the FPSP updates the floating-point
instruction address register (FPIAR) by copying the PC
value in the pre-instruction stack frame to the FPIAR.
Once this is done, the exceptional frame is restored
without clearing the exception, and the program flow goes
to the FPSP provided entry point. At the entry point the
MC68040 is in an exceptional state, ready to execute the
user-supplied exception handler.
3.5.2.2 SNAN - SIGNALING NOT-A-NUMBER.
On a trap-disabled condition, if the destination format is B,
W, or L, the FPSP stores the most significant 8, 16, or
32 bits, respectively, of the SNAN mantissa, with the SNAN
bit set, to the destination. The FPSP discards the
exceptional frame, then returns to the main program flow
without entering the FPSP provided entry point, hence the
user-provided exception handler is not executed.
On a trap-enabled condition, the FPSP checks if the
destination format is B, W, or L. If so, the FPSP stores
the most significant 8, 16, or 32 bits, respectively, of
the SNAN mantissa, with the SNAN bit set, to the
destination. The FPSP then restores the exceptional frame
without clearing the exception, and branches to the FPSP
provided entry point. At the entry point, the MC68040 is
in an exceptional state, ready to execute the user-
supplied exception handler.
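
The integer store described above can be sketched as
follows. This is only an illustration, not the FPSP code:
it assumes the NAN mantissa is the 64-bit extended-precision
mantissa and that the "SNAN bit" is bit 62 (the most
significant fraction bit). The function name and the WIDTH
table are hypothetical.

```python
# Sketch only: models the value the text says is stored for a
# B, W, or L destination. Bit 62 as the SNAN bit is an assumption
# taken from the extended-precision format, not from this README.
WIDTH = {"B": 8, "W": 16, "L": 32}
SNAN_BIT = 1 << 62


def snan_integer_store(mantissa64: int, fmt: str) -> int:
    """Return the top 8/16/32 mantissa bits with the SNAN bit set."""
    n = WIDTH[fmt]
    quieted = (mantissa64 | SNAN_BIT) & 0xFFFFFFFFFFFFFFFF
    return quieted >> (64 - n)


print(hex(snan_integer_store(0x8000000000000001, "W")))
```

The low-order mantissa bits are simply dropped; only the
destination width decides how many high-order bits survive.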
3.5.2.3 OPERR - OPERAND ERROR.
This exception traps through vector number 52.
Table 3-3 shows the operand errors generated by the MC68040.
Table 3-4 shows the operand errors generated by the FPSP.
Note that the FPSP Unimplemented Instruction Handler
detects and adds to the cases in which OPERR exceptions
occur. Refer to Table 3-4 for these specific exception-
causing conditions.
On a trap-disabled condition, the FPSP checks if the
operand error is caused by an FMOVE to a B, W, or L memory
or integer data register destination. If it is caused by
an integer overflow or if the floating-point data register
to be stored contains infinity, the FPSP stores the
largest positive or negative integer that can fit in the
specified destination format size. If the destination is
integer and the floating-point number to be stored is a
NAN, then the 8, 16, or 32 most significant bits of the
NAN significand are stored as the result.
Next the FPSP checks for a false OPERR condition for an
FMOVE to memory or integer data register. This condition
occurs if the operand is equal to the largest negative
integer representable in its format. The FPSP then stores
the proper result, discards the exceptional frame, and
returns to the main program flow without executing the
user-supplied exception handler.
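
As a rough model of the integer results described above, the
sketch below saturates an out-of-range or infinite value to
the destination format and passes the "false OPERR" operand
(the largest negative integer) through unchanged. It assumes
the value has already been rounded to an integer; the
function name is hypothetical, and NAN handling is left out
because it follows the separate rule given above.

```python
import math

# Sketch only: destination widths in bits for B, W, and L.
WIDTH = {"B": 8, "W": 16, "L": 32}


def operr_integer_result(value: float, fmt: str) -> int:
    """Model the integer stored on an OPERR for FMOVE to B, W, or L."""
    if math.isnan(value):
        raise ValueError("NAN sources follow the NAN rule, not this one")
    n = WIDTH[fmt]
    lowest = -(1 << (n - 1))
    highest = (1 << (n - 1)) - 1
    if value > highest:          # integer overflow or +infinity
        return highest
    if value < lowest:           # integer overflow or -infinity
        return lowest
    return int(value)            # includes the false OPERR case


print(operr_integer_result(math.inf, "B"), operr_integer_result(-128.0, "B"))
```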
On a trap-enabled condition, the FPSP does the same
functions as the above trap-disabled condition, with the
exception that in the end, the FPSP restores the
exceptional frame without clearing the exception and
branches to the FPSP supplied entry point instead of
returning to the main program flow. At the FPSP supplied
entry point, the MC68040 is in an exceptional state, ready
to execute the user-supplied exception handler.
3.5.2.4 OVFL - OVERFLOW.
This exception traps through vector number 53.
On a trap-disabled case, the FPSP stores the result in the
destination as determined by the rounding mode at the
destination as follows:
Rounding Mode   Result
RN              Infinity, with the sign of the intermediate result
RZ              Largest magnitude number, with the sign of the
                intermediate result
RM              For positive overflow, largest positive number
                For negative overflow, -infinity
RP              For positive overflow, +infinity
                For negative overflow, largest negative number
The FPSP then clears the appropriate exception bit in
the frame and restores the non-exceptional frame into
the MC68040, and then returns to the main program flow.
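
The trap-disabled overflow results listed above can be
written as a small lookup. This is a sketch for
illustration, not FPSP code: the function name is
hypothetical, and Python's largest finite double stands in
for the largest number of whatever destination format is in
use.

```python
import math
import sys


def overflow_default(mode: str, negative: bool,
                     largest: float = sys.float_info.max) -> float:
    """Return the trap-disabled OVFL result for a rounding mode."""
    sign = -1.0 if negative else 1.0
    if mode == "RN":
        return sign * math.inf
    if mode == "RZ":
        return sign * largest
    if mode == "RM":             # round toward minus infinity
        return -math.inf if negative else largest
    if mode == "RP":             # round toward plus infinity
        return -largest if negative else math.inf
    raise ValueError("unknown rounding mode: " + mode)


print(overflow_default("RM", negative=True))
```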
On a trap-enabled case, the FPSP actions are identical to
those found in the trap-disabled case, with the exception
that instead of restoring a non-exceptional frame, the
original exceptional frame is restored to the MC68040 and
the FPSP branches to the FPSP supplied entry point. At
this entry point, the MC68040 is in an exceptional state,
ready to execute the user-supplied exception handler.
3.5.2.5 UNFL - UNDERFLOW.
This exception traps through vector number 51.
On a trap-disabled case, the FPSP stores the result in the
destination as determined by the rounding mode at the
destination as follows:
Rounding Mode   Result
RN              Zero, with the sign of the intermediate result
RZ              Zero, with the sign of the intermediate result
RM              For positive underflow, +zero
                For negative underflow, the smallest denormalized
                negative number
RP              For positive underflow, the smallest denormalized
                positive number
                For negative underflow, -zero
The FPSP then clears the appropriate exception bit in the
frame and restores the non-exceptional frame into the
MC68040, and then returns to the main program flow.
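
The trap-disabled underflow results listed above can be
sketched the same way. The function name is hypothetical,
and the smallest denormalized double (2**-1074) is used here
only as a stand-in for the smallest denormalized number of
the actual destination format.

```python
import math

# Stand-in value: smallest positive denormalized IEEE double.
SMALLEST_DENORM = 2.0 ** -1074


def underflow_default(mode: str, negative: bool,
                      smallest: float = SMALLEST_DENORM) -> float:
    """Return the trap-disabled UNFL result for a rounding mode."""
    signed_zero = -0.0 if negative else 0.0
    if mode in ("RN", "RZ"):
        return signed_zero
    if mode == "RM":             # round toward minus infinity
        return -smallest if negative else 0.0
    if mode == "RP":             # round toward plus infinity
        return -0.0 if negative else smallest
    raise ValueError("unknown rounding mode: " + mode)


print(underflow_default("RP", negative=False))
```

Note that the sign of a zero result is significant, so a
check of the result should compare signs (for example with
math.copysign) and not only values.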
On a trap-enabled case, the FPSP actions are identical to
those found in the trap-disabled case, with the exception
that instead of restoring a non-exceptional frame, the
original exceptional frame is restored to the MC68040 and
the FPSP branches to the FPSP supplied entry point. At
this entry point, the MC68040 is in an exceptional state,
ready to execute the user-supplied exception handler.
3.5.2.6 DZ - DIVIDE BY ZERO.
Note that the FPSP Unimplemented Instruction Handler detects
and adds to the cases in which DZ exceptions occur. Refer to
Table 3-5 for these specific exception-causing conditions.
Table 3-6 lists the DZ exceptions generated by the MC68040.
The FPSP is not needed for this exception. The user-
supplied exception handler is always entered. A system
call is provided by the FPSP to calculate the exceptional
operand.
3.5.2.7 INEX1/INEX2 - INEXACT RESULT 1/2.
Note that the FPSP Unimplemented Instruction Handler detects
and allows INEX1 exceptions to occur. Furthermore, many new
cases of INEX2 exceptions may be generated by the FPSP
Unimplemented Instruction Handler as well. INEX1
exceptions trap into the same handler as INEX2
exceptions.
The FPSP is not needed for this exception. The user-
supplied exception handler is always entered.
3.5.3 Instructions
The following paragraphs describe the arithmetic and
transcendental instructions supported by the FPSP.
3.5.3.1 ARITHMETIC.
Table 3-7 shows the arithmetic instructions supported by the FPSP.
3.5.3.2 TRANSCENDENTAL.
Table 3-8 shows the transcendental instructions supported by the FPSP.
3.6 EXTERNAL INTERFACE REQUIREMENTS
For end users the FPSP is transparent; system integrators
will integrate the FPSP into their system (see 3.8.3
Site Adaptations).
For applications compiled for the MC68881/MC68882 the FPSP
provides kernel routines to support the MC68040
unimplemented instructions. The MC68040 uses vector
number 11 for the unimplemented instructions. The MC68040
stack frames are different for unimplemented
MC68881/MC68882 instructions and other F-line traps. For
applications compiled for the MC68040 the unimplemented
instructions are contained in a library (to avoid the
F-line trap overhead at runtime).
For both applications the FPSP provides kernel routines to
support exceptions (vectors 48-54) and unsupported data
types (vector 55).
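
For orientation, the vectors mentioned above can be turned
into vector table offsets. The sketch below assumes the
usual M68000 family convention that the offset is four times
the vector number. Vectors 11, 51, 52, 53, and 55 are named
in this document; the assignments shown for 48, 49, 50, and
54 are recalled from the M68000 family manuals and should be
checked against Reference 2.

```python
# Sketch only: vector numbers relevant to the FPSP.
FP_VECTORS = {
    11: "F-line (unimplemented instruction)",
    48: "BSUN",      # assumption, not stated in this document
    49: "INEX",      # assumption, not stated in this document
    50: "DZ",        # assumption, not stated in this document
    51: "UNFL",
    52: "OPERR",
    53: "OVFL",
    54: "SNAN",      # assumption, not stated in this document
    55: "Unsupported data type",
}


def vector_offset(vector_number: int) -> int:
    """Byte offset of a vector in the table (4 bytes per entry)."""
    return vector_number * 4


for number, name in sorted(FP_VECTORS.items()):
    print(f"vector {number:2d}  offset ${vector_offset(number):03X}  {name}")
```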
3.7 PERFORMANCE REQUIREMENTS
The following paragraphs describe the speed, accuracy, and
compatibility requirements for the FPSP.
3.7.1 Speed
The performance of the transcendental function routines is
equivalent or superior to that of a 33-MHz
MC68881/MC68882.
3.7.2 Accuracy
The following paragraphs describe the arithmetic
instructions, transcendental instructions, and decimal
conversions for the FPSP.
3.7.2.1 ARITHMETIC INSTRUCTIONS.
The error bound is one-half unit in the last place of the
destination format in the round-to-nearest mode, and one
unit in the last place in the other rounding modes.
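
To make the half-ulp bound concrete, the sketch below
measures the error of a result in units in the last place
using exact rational arithmetic. It is an illustration only:
it uses Python's IEEE double as the destination format and
math.ulp (Python 3.9 or later), and the function name is
hypothetical.

```python
from fractions import Fraction
import math


def error_in_ulps(computed: float, exact: Fraction) -> Fraction:
    """Return |computed - exact| measured in ulps of computed."""
    return abs(Fraction(computed) - exact) / Fraction(math.ulp(computed))


# Division is an arithmetic operation, so in round-to-nearest
# the quotient should be within one-half ulp of the true value.
third = 1.0 / 3.0
print(error_in_ulps(third, Fraction(1, 3)) <= Fraction(1, 2))
```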
3.7.2.2 TRANSCENDENTAL INSTRUCTIONS.
The error bound is less than 0.502 ulp of double precision.
3.7.2.3 DECIMAL CONVERSIONS.
The error bound is 0.97 unit in the last digit of the
destination precision for the round-to-nearest mode; and
1.47 units in the last digit of the destination precision for
the other rounding modes.
3.7.3 Compatibility
The FPSP transcendental calculation results are not the
same as for the MC68881/MC68882. This is because the
algorithms used by the MC68881/MC68882 (CORDIC) cannot be
effectively implemented in software. All other
calculations are identical. The error bound is equivalent
or superior to the MC68881/MC68882.
3.8 OTHER REQUIREMENTS
The following paragraphs describe other requirements for
the FPSP, such as maintainability, packaging, and site
adaptations.
3.8.1 Maintainability
The speed requirements have forced writing most of the
package in assembly language.
3.8.2 Packaging
There are two versions of the FPSP. The FPSP Kernel
Version is used to execute pre-existing user object code
written for the MC68882. This is installed as part of the
operating system. User applications need not be recompiled
or modified in any way once the FPSP Kernel Version is
installed.
The FPSP Library Version is used to compile code that uses
only the MC68040-implemented floating point instructions.
The library version provides less overhead than the FPSP
Kernel Version. Other features of this library include
ABI compliance and IEEE exception-reporting compliance.
It is not, however, UNIX exception-reporting
compliant. The FPSP is not yet available in library
format.
3.8.3 Site Adaptations
Some of the entries in the vector table need to point to
entry points within the FPSP Kernel Version. For those
vectors the FPSP displaces, an entry point is provided to
replace that which it takes. Note that former MC68882
floating-point exception handlers need to go through minor
modifications to account for the differences between the
MC68040 and MC68882 floating point exceptional state
frames. The FPSP provides skeleton code for each floating-
point exception handler to aid in porting the MC68882
floating-point exception handlers.
For systems and applications that never set any of the
exception bits in the FPCR, or if the former MC68882
floating-point exception handlers only contain minimum
code needed to clear the exception and return, no work is
needed and the FPSP is a drop-in package.
The FPSP Library Version needs to "intercept" the
appropriate math library calls which use MC68882
transcendental instructions. Since each site has different
naming conventions, the FPSP subroutines need to be
renamed accordingly and recompiled. The resident compiler
also needs to provide a library path search pattern such
that the FPSP is given a chance to resolve those
transcendental instructions.
3.8.4 Stack Area Usage
To achieve code re-entrancy, the FPSP allocates context-
sensitive variables on the stack. The FPSP does not
require more than 512 bytes on the stack per context. This
may be an installation concern for UNIX applications in
which the UBLOCK area is limited and the system
stack resides there.
3.8.5 ROM-based applications
One of the goals of the FPSP Kernel Version is to be able
to fit in a read-only space of no more than 64 KBytes.
Two main sections need to reside in ROM: the text section
and the initialized data section. The
text section accounts for 65% while the initialized data
section accounts for 35%.
3.9 FPSP KERNEL VERSION INSTALLATION NOTES
The following paragraphs provide the MC68882 users with an
understanding of the issues involved in porting over the
FPSP into existing MC68030/MC68882 systems. Once these
issues are understood, then the actual installation is
explained.
3.9.1 Differences between the MC68040 and MC68882
Floating-point Exception Handling
The main reason for providing the FPSP is to provide
MC68882 compatibility. If the installer understands the
main differences between the MC68882 and MC68040 in the
area of floating-point exception handlers, skip this
section and go to the next section.
There are three areas that differ between the MC68040 and
MC68882.
The first difference is that of unimplemented
instructions. The FPSP handles this by means of the F-line
exception handling. This means that if there is an
existing F-line handler, the FPSP replaces the existing F-
line exception handler, but provides an alternate entry
point for the existing F-line handler.
The second difference is unsupported data types. The
MC68040 provides a new entry point in the vector table,
therefore no existing handler is replaced by the FPSP.
There are no installation issues here.
The third difference is in floating-point exception
handling. This issue is more involved and requires
further explanation.
The IEEE standard allows the user to enable or disable
each floating point exception individually. If an
exceptional condition occurs, the IEEE defines a specific
action for the trap-disabled condition, and it also
defines certain specific actions for a trap-enabled
condition. The IEEE standard however, does not constrain
the implementation of exception handling; both software
and hardware can be used.
The MC68882 supports the IEEE exception handling
compliance totally in hardware. For example, a user-
disabled (trap disabled) exception will cause the
specified IEEE defined actions for user-disabled exception
handling to occur. Similarly, user-enabled exceptions
will cause the MC68882 to take the exception as defined by
the IEEE trap enabled case.
The MC68040 provides full IEEE trap-disabled exception
handling compliance for the INEX and DZ exceptions. Just
like the MC68882, the MC68040 takes these exceptions only
for an IEEE trap-enabled condition. Existing MC68882
handlers have a minimum code requirement as defined by the
MC68882 User's Manual. Like the MC68882 handlers, the
MC68040 handlers have a minimum code requirement as well.
The FPSP provides this minimum code requirement.
The MC68040 does not provide full IEEE exception
compliance on IEEE defined trap-disabled conditions for
the following exceptions: OVFL, UNFL, OPERR, SNAN. For
these exceptions, the MC68040 may take an exception even
on an IEEE-defined trap-disabled condition. Each FPSP
provided exception handler decides whether its job is to
implement IEEE trap-disabled exception compliance, (and
therefore not execute the user supplied exception handler)
or to implement IEEE trap-enabled exception compliance,
(hence executing the user supplied exception handler). The
FPSP provides a user entry point so that when this entry
point is taken an IEEE-defined trap-enabled condition has
definitely occurred. At this specified entry point, an
exception handler written for the MC68882 needs to be
modified to account for MC68040 stack differences, and
then placed at the user entry point.
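
The decision described above can be modeled roughly as
follows. This is a sketch under stated assumptions, not the
FPSP implementation: the FPCR exception enable bit positions
are recalled from the MC68881/MC68882 and M68040 manuals,
not from this document, and the function name is
hypothetical. The returned labels follow the "real_" naming
convention this document describes for the user entry
points.

```python
# Assumed FPCR exception enable bit positions (see References 1, 2).
FPCR_ENABLE_BIT = {
    "BSUN": 15, "SNAN": 14, "OPERR": 13, "OVFL": 12,
    "UNFL": 11, "DZ": 10, "INEX2": 9, "INEX1": 8,
}

# Exceptions the MC68040 may take even when the trap is disabled.
FPSP_DECIDES = ("OVFL", "UNFL", "OPERR", "SNAN")


def user_entry_point(exception: str, fpcr: int):
    """Return the user entry label, or None if the FPSP handles it."""
    if exception not in FPSP_DECIDES:
        raise ValueError("not an exception the FPSP must arbitrate")
    if fpcr & (1 << FPCR_ENABLE_BIT[exception]):
        # Trap enabled: the user-supplied handler is executed.
        return "real_" + exception.lower()
    # Trap disabled: the FPSP supplies the IEEE default result
    # and returns to the main program flow.
    return None


print(user_entry_point("OVFL", 1 << 12))
```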
As with the MC68882, there is a minimum code requirement
for the MC68040 handler, but this minimum code is provided
by the FPSP.
From an installation perspective, the OVFL, UNFL, OPERR,
SNAN exception handlers are replaced by the FPSP handlers,
but the FPSP provides an entry point so that MC68882-like
exception handlers may be written. Furthermore, minimum
code is provided by the FPSP and can be used as a
template.
The BSUN exception is different in that unlike the
previous exception handlers, the difference between the
MC68882 and MC68040 resides in the IEEE-defined trap
enabled case. The FPSP handles this by performing the
patch needed for MC68882 compatibility, and then restoring
the exception to the MC68040 without performing the
necessary steps to clear the BSUN exception. The
exceptional frame is restored into the MC68040 and the
FPSP branches to the user entry point provided. At this
entry point, an MC68882-like exception handler written for
the MC68040 is executed without having to worry about the
built-in incompatibility. Although this method incurs a
performance hit, it frees the user-defined exception
handler from having to write the code needed to implement
MC68882 code compatibility. As with the other exception
handlers, the FPSP provides the minimum code needed.
In summary, the FPSP replaces the following exception
handlers and provides an entry-point for MC68882-like
exception handlers for these exceptions: OVFL, UNFL,
OPERR, SNAN, BSUN, F-line.
The FPSP is not needed for the INEX and DZ exception
handlers, and these exception handlers just need to be
MC68882-like.
3.9.2 Vector Table
The entry point into the FPSP is achieved by having the
appropriate vector table offset point to a specified entry
point within the FPSP. For simplicity, all of the FPSP
main entry points are found in the file skeleton.sa.
Table 3-9 shows the vector table offset and the
appropriate labels within the file skeleton.sa that it
needs to point to. Figure 3-1 shows a flowchart of the
entry points.
Once the entry point is reached, the user may add some
user-specific code prior to jumping to the FPSP routines
(FPSP routines are prefixed by "fpsp_"). After the jump to
the FPSP routines, the FPSP performs its function and then
jumps to the FPSP supplied entry points (if needed) found
in the file skeleton.sa.
3.9.3 FPSP Supplied Entry Points
To replace the vector table entries it displaces, the FPSP
provides an alternate entry point. For simplicity, all of
the FPSP supplied entry points are found in the file
skeleton.sa. The FPSP supplied F-line exception entry
point is straightforward. An F-line exception handler
written for an MC68030 can be placed here without
modifications. The Unsupported Data Type exception handler
is newly defined and does not displace any MC68030/MC68882
exception handler. Therefore, the FPSP does not provide an
alternate entry point for this exception.
The alternate entry points follow a naming convention in
which the name of the exception handler is prefixed by
"real_". For instance, the entry point for the user-supplied
BSUN exception handler is named "real_bsun".
For the floating-point exception handlers (BSUN, OPERR,
SNAN, DZ, OVFL, and UNFL) previously written for an
MC68882 based system, these handlers need to be modified
slightly for use with the MC68040. Once these handlers are
modified, they are then placed in the FPSP provided entry
points.
3.9.4 Extract the Hardware Independent portion of
the MC68882 handlers
To modify the existing MC68882 handlers, all of the code
used in accessing the MC68882 generated frame needs to be
stripped off. The code used in clearing an MC68882
exception (setting bit 27 of the BIU Flag) needs to be
stripped off as well. Only the hardware independent
portions of the MC68882 handlers may be used.
To aid the installer in rewriting the MC68882 exception
handlers, the file skeleton.sa provides the minimum code
necessary to clear the exception once the specific handler
is entered.
Once the hardware-independent portion is written, the
modified MC68882 handlers need to be integrated into the
portion of the code which is hardware dependent. The
minimum code needed by each exception handler is already
provided by the FPSP within the file skeleton.sa. The
following section describes the mechanics behind the
written code.
3.9.5 MC68040 Minimum Exception Code
This section describes the minimum requirements for the
user-supplied exception handlers. As mentioned in the
previous sections, these minimum handlers are provided as
part of the package, and this section is strictly for the
user's information only.
As with the MC68882, if all exceptions are always
disabled, no minimum code is necessary since the FPSP
guarantees that these FPSP provided entry points are never
entered on a trap-disabled condition. Therefore, for
existing systems that do not provide exception handlers
for the MC68882, it is likely that the assumption that all
exceptions are always disabled is valid, and therefore no
user-defined MC68040 exception handlers are needed either.
The above paragraph should not be interpreted to mean that
the FPSP provided exception handlers are unnecessary. On
the contrary, the FPSP provided exception handlers are
needed, and these FPSP exception handlers provide the
entry points for user-defined exception handlers. Whether
or not the user-defined MC68040 exception handlers are
needed is the issue being discussed.
Assuming that it is possible that the exceptions are
enabled at some point, the minimum exception handler is
similar to that defined for an MC68882. As with the
MC68882, the MC68040 requires that the first floating
point instruction be an FSAVE. Unlike the
MC68882, the MC68040 does not always require an equivalent
FRESTORE. For an E1 exception, only the FSAVE is
required; the state frame may be discarded. The E3
exception is more similar to that found in an MC68882. As
with the MC68882, the E3 exception requires an FSAVE, an
instruction that clears the exception in the resulting
FSAVE stack, followed by an FRESTORE.
If both E3 and E1 exceptions exist at the same time, then
the exception is handled as though it were an E3
exception. Afterward, the MC68040 re-traps to handle the
E1 exception.
The E3 exception can only be reported by the following
exception handlers: OVFL, UNFL, INEX. For these exception
handlers, this is the minimum code requirement:
1) FSAVE
2) if E3 bit set, goto (4), else goto (3)
3) E1 exception, throw away stack and RTE
4) Clear E3 bit, FRESTORE, RTE
The E3 exception cannot be reported by the following
exception handlers: SNAN, OPERR and DZ. Since only an E1
exception needs to be handled here, this is the minimum
code requirement:
1) FSAVE
2) throw away stack and RTE
For the BSUN exception handler, the minimum code
requirement is:
1) FSAVE
2) Do one of 4 methods described in MC68040 User's
Manual
3) throw away stack and RTE
If the above minimum code requirements are not met,
an infinitely looping exception sequence occurs.
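
The three minimum code sequences above can be summarized in
one place. The sketch below only returns the ordered steps
as text so the branching is easy to see; it is not handler
code, and the function name and step strings are
hypothetical.

```python
# Handlers that can report an E3 exception, per the text above.
E3_CAPABLE = ("OVFL", "UNFL", "INEX")
# Handlers that only ever see an E1 exception.
E1_ONLY = ("SNAN", "OPERR", "DZ")


def minimum_handler_steps(exception: str, e3_set: bool = False) -> list:
    """Return the minimum step sequence for a user-supplied handler."""
    steps = ["FSAVE"]            # always the first FP instruction
    if exception == "BSUN":
        steps.append("apply one of the 4 methods in the User's Manual")
        steps += ["discard frame", "RTE"]
    elif exception in E3_CAPABLE and e3_set:
        steps += ["clear E3 bit", "FRESTORE", "RTE"]
    elif exception in E3_CAPABLE or exception in E1_ONLY:
        steps += ["discard frame", "RTE"]
    else:
        raise ValueError("unknown exception: " + exception)
    return steps


print(minimum_handler_steps("OVFL", e3_set=True))
```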
3.9.6 Mem_read and Mem_write
The mem_write and mem_read subroutines are used by the
FPSP to read from and write to user space. These routines
perform a UNIX system call to lcopyin and lcopyout. The
FPSP provides a simple version of lcopyin and lcopyout for
non-UNIX applications. Installation to UNIX-based systems
requires that the FPSP provided lcopyin and lcopyout be
deleted or commented out. For simplicity, these
subroutines are found in the file skeleton.sa.
The production version of the FPSP is fully re-entrant. If
a page fault occurs on either a mem_read or mem_write, the
operating system may perform a page-in operation and still
allow other processes to use the FPSP.
3.9.7 Increasing F-line Handler Performance
The FPSP was written to handle all possible cases of
MC68040 vs MC68030/MC68882 problem areas. Any performance
improvement in this handler increases floating-point
performance. The F-line handling may be made quicker by
pointing the vector table entry directly into the label
"fpsp_unimp" found in the file x_unimp.sa, if these
conditions are met:
1) That the system never has to execute an FMOVECR
instruction in which bits 0 to 5 of the F-line word are
non-zero.
2) An alternate F-line entry point is unnecessary.
This optimization saves a total of three instructions
(1 bra, 1 cmpi, 1 beq).
3.10 REFERENCES
1. MC68881UM/AD, MC68881/MC68882 Motorola Floating-Point
Coprocessor User's Manual, Motorola, Inc., 1989.
2. M68040UM/AD, M68040 32-Bit Microprocessor User's
Manual, Motorola, Inc., 1992.
3. MC68020UM/AD, MC68020 32-Bit Microprocessor User's
Manual, Motorola, Inc., 1990.
4. MC68030UM/AD, MC68030 Enhanced 32-Bit Microprocessor
User's Manual, Motorola, Inc., 1990.
5. ANSI/IEEE Std 754-1985, Standard for Binary Floating-
Point Arithmetic.
6. M68000PM/AD REV. 1, Programmer's Reference Manual,
Motorola, Inc., 1992.
3.11 Tables and Figures
Table 3-1. Functions Provided by MC68040
------------------------------------------------------------------
Name | Description
------------------------------------------------------------------
FMOVE Move to FPU
FMOVEM Move Multiple Registers
FSMOVE Single-Precision Move
FDMOVE Double-Precision Move
FCMP Compare
FABS Absolute Value
FSABS Single-Precision Absolute Value
FDABS Double-Precision Absolute Value
FTST Test
FNEG Negate
FSNEG Single-Precision Negate
FDNEG Double-Precision Negate
FADD Add
FSUB Subtract
FDIV Divide
FMUL Multiply
FBcc Branch Conditionally
FScc Set According to Condition
FDBcc Test Cond, Dec and Branch
FTRAPcc Trap Conditionally
FSADD Single-Precision Add
FSSUB Single-Precision Subtract
FSMUL Single-Precision Multiply
FSDIV Single-Precision Divide
FDADD Double-Precision Add
FDSUB Double-Precision Subtract
FDMUL Double-Precision Multiply
FDDIV Double-Precision Divide
FSQRT Square Root
FSSQRT Single-Precision Square Root
FDSQRT Double-Precision Square Root
FNOP No Operation
FSAVE Save Internal State
FRESTORE Restore Internal State
FSGLDIV Single-Precision Divide (68882 compatible)
FSGLMUL Single-Precision Multiply (68882 compatible)
------------------------------------------------------------------
Table 3-2. Support for Data Types and Data Formats
------------------------------------------------------------------
           |                   Data Formats
           |------------------------------------------------------
Data Types | SGL | DBL | EXT | Dec | Byte | Word | Long
------------------------------------------------------------------
Norm       |  *  |  *  |  *  |  @  |  *   |  *   |  *
Zero       |  *  |  *  |  *  |  @  |  *   |  *   |  *
Infinity   |  *  |  *  |  *  |  @  |      |      |
NaN        |  *  |  *  |  *  |  @  |      |      |
Denorm     |  #  |  #  |  @  |  @  |      |      |
Unnorm     |     |     |  @  |  @  |      |      |
------------------------------------------------------------------
Notes:
@ = supported by FPSP
* = supported by the MC68040 FPU
# = supported by FPSP after being converted to extended precision by
MC68040
Table 3-3. Operand Errors Handled by the MC68040
------------------------------------------------------------------
Instruction | Conditions Causing Operand Error
------------------------------------------------------------------
FADD ( + inf )+( - inf ) or (- inf )+( + inf )
FSUB ( + inf )-( + inf ) or (- inf )-(- inf )
FMUL ( 0 ) x ( inf ) or ( inf ) x ( 0 )
FDIV 0 / 0 or inf / inf
FMOVE.BWL Integer overflow, Source is NaN, or Source is inf
FSQRT Source < 0, Source = - inf
------------------------------------------------------------------
Table 3-4. Operand Errors Generated by the FPSP
------------------------------------------------------------------
Instruction | Condition Causing Operand Error
------------------------------------------------------------------
FSADD ( + inf )+( - inf ) or ( - inf )+( + inf )
FDADD ( + inf )+( - inf ) or ( - inf )+( + inf )
FSSUB ( + inf )-( + inf ) or ( - inf )-( - inf )
FDSUB ( + inf )-( + inf ) or ( - inf )-( - inf )
FSMUL ( 0 ) x ( inf ) or ( inf ) x ( 0 )
FDMUL ( 0 ) x ( inf ) or ( inf ) x ( 0 )
FSDIV 0 / 0 or inf / inf
FDDIV 0 / 0 or inf / inf
FCOS Source is +/- inf
FSIN Source is +/- inf
FTAN Source is +/- inf
FACOS Source is +/- inf, > +1, or < -1
FASIN Source is +/- inf, > +1, or < -1
FATANH Source is > +1, or < -1, Source = +/- inf
FSINCOS Source is +/- inf
FGETEXP Source is +/- inf
FGETMAN Source is +/- inf
FLOG10 Source is < 0, Source = - inf
FLOG2 Source is < 0, Source = - inf
FLOGN Source is < 0, Source = - inf
FLOGNP1 Source is < -1, Source is = - inf
FMOD FPx is +/- inf or Source is 0, Other Operand is not a
NaN
FMOVE to P Result Exponent > 999 (Decimal) or k-Factor > +17
FREM FPx is +/- inf or Source is 0, Other Operand is not a NaN
FSCALE Source is +/- inf, Other Operand is not a NaN
------------------------------------------------------------------
Table 3-5. DZ Exceptions Generated by the FPSP
------------------------------------------------------------------
Instruction | Condition Causing DZ Exception
------------------------------------------------------------------
FATANH Source Operand = +/- 1
FLOG10 Source Operand = 0
FLOG2 Source Operand = 0
FLOGN Source Operand = 0
FLOGNP1 Source Operand = -1
FSGLDIV Source Operand = 0 and FPn is not a NaN, Infinity,
or 0
------------------------------------------------------------------
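The FATANH rows of Tables 3-4 and 3-5 can be read together as one
classification of the source operand. The following is a minimal sketch
of that reading, not code from the FPSP; the names classify_fatanh and
fatanh_class are hypothetical, and the DZ condition is taken to be
source = +/- 1.

```c
/* Hypothetical helper, not part of the FPSP sources: classify an FATANH
 * source operand according to Tables 3-4 and 3-5. */
enum fatanh_class { FATANH_OK, FATANH_OPERR, FATANH_DZ };

static enum fatanh_class classify_fatanh(double src)
{
    if (src != src)                 /* NaN: neither table applies */
        return FATANH_OK;
    if (src == 1.0 || src == -1.0)  /* Table 3-5: source = +/- 1 */
        return FATANH_DZ;
    if (src > 1.0 || src < -1.0)    /* Table 3-4: |source| > 1, incl. +/- inf */
        return FATANH_OPERR;
    return FATANH_OK;               /* |source| < 1: ordinary result */
}
```

The comparisons against 1.0 also cover the infinities, so no separate
infinity test is needed in this sketch.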
Table 3-6. DZ Exceptions Generated by the MC68040
------------------------------------------------------------------
Instruction | Condition Causing DZ Exception
------------------------------------------------------------------
FDIV Source Operand = 0 and FPn is not a NaN, Infinity,
or 0
FSDIV Source Operand = 0 and FPn is not a NaN, Infinity,
or 0
FDDIV Source Operand = 0 and FPn is not a NaN, Infinity,
or 0
------------------------------------------------------------------
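For divide, the operand error conditions of Table 3-3 and the DZ
condition of Table 3-6 are mutually exclusive. The sketch below shows
one way to express that in C. It is an illustration only; the names
classify_fdiv, fdiv_class, is_nan and is_inf are hypothetical, and
"dst" stands for the FPn destination operand.

```c
#include <float.h>

/* Hypothetical classification of an FDIV (dst / src); not FPSP code. */
enum fdiv_class { FDIV_OK, FDIV_OPERR, FDIV_DZ };

static int is_nan(double x) { return x != x; }
static int is_inf(double x) { return x > DBL_MAX || x < -DBL_MAX; }

static enum fdiv_class classify_fdiv(double dst, double src)
{
    if (is_nan(dst) || is_nan(src))
        return FDIV_OK;             /* NaN operands are outside both tables */
    if ((dst == 0.0 && src == 0.0) || (is_inf(dst) && is_inf(src)))
        return FDIV_OPERR;          /* Table 3-3: 0 / 0 or inf / inf */
    if (src == 0.0 && !is_inf(dst))
        return FDIV_DZ;             /* Table 3-6: src = 0, dst finite and nonzero */
    return FDIV_OK;
}
```

The 0 / 0 case is tested first, so by the time the DZ test runs a zero
source implies a nonzero destination.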
Table 3-7. Arithmetic Instructions
------------------------------------------------------------------
Name | Description
------------------------------------------------------------------
FADD* Add
FSUB* Subtract
FSADD*+ Single-Precision Add
FSSUB*+ Single-Precision Subtract
FDADD*+ Double-Precision Add
FDSUB*+ Double-Precision Subtract
FMUL* Multiply
FDIV* Divide
FSMUL*+ Single-Precision Multiply
FSDIV*+ Single-Precision Divide
FDMUL*+ Double-Precision Multiply
FDDIV*+ Double-Precision Divide
FINT Integer Part
FINTRZ Integer Part (Truncated)
FABS* Absolute Value
FNEG* Negate
FGETEXP Get Exponent
FGETMAN Get Mantissa
FTST* Test Operand
FCMP* Compare
FREM IEEE Remainder
FSCALE Scale Exponent
FMOVE* Move FP data register
FSMOVE* Single-Precision Move
FDMOVE* Double-Precision Move
FSQRT* Square Root
FSSQRT* Single-Precision Square Root
FTWOTOX 2 to the X Power
FMOD Modulo Remainder
FDSQRT* Double-Precision Square Root
FDMOD Double-Precision Modulo Remainder
FSMOD Single-Precision Modulo Remainder
------------------------------------------------------------------
Notes:
* The FPSP provides these functions for all decimal data formats,
single, double, and extended denormalized data types, and extended
unnormalized data types. The MC68040 provides these functions
for the remaining formats and types (See page 11 of Reference 2).
+ Additional functions which are not provided by the MC68881/MC68882.
Table 3-8. Transcendental Instructions
------------------------------------------------------------------
Name | Description
------------------------------------------------------------------
FCOS Cosine
FSIN Sine
FACOS Arc Cosine
FASIN Arc Sine
FCOSH Hyperbolic Cosine
FSINH Hyperbolic Sine
FSINCOS Simultaneous Sine & Cosine
FATAN Arc Tangent
FTAN Tangent
FATANH Hyperbolic Arc Tan
FTANH Hyperbolic Tangent
FLOG10 Log Base 10
FLOG2 Log Base 2
FLOGNP1 Log Base e of (x+1)
FLOGN Log Base e
FETOXM1 (e to the x Power) -1
FETOX e to the x Power
FTWOTOX 2 to the x Power
FTENTOX 10 to the x Power
------------------------------------------------------------------
Table 3-9. FPSP Provided Entry Points
------------------------------------------------------------------
Exception Type         | Vector Table    | FPSP entry | User entry
                       | (offset)        | point      | point
------------------------------------------------------------------
F-line unimplemented     vector 11 ($2C)   fline        real_fline
float instruction
------------------------------------------------------------------
Branch or set on         vector 48 ($C0)   bsun         real_bsun
unordered
------------------------------------------------------------------
Inexact                  vector 49 ($C4)   inex         real_inex
------------------------------------------------------------------
Divide-by-zero           vector 50 ($C8)   dz           real_dz
------------------------------------------------------------------
Underflow                vector 51 ($CC)   unfl         real_unfl
------------------------------------------------------------------
Operand error            vector 52 ($D0)   operr        real_operr
------------------------------------------------------------------
Overflow                 vector 53 ($D4)   ovfl         real_ovfl
------------------------------------------------------------------
Signalling Not-A-        vector 54 ($D8)   snan         real_snan
Number
------------------------------------------------------------------
Unsupported data type    vector 55 ($DC)   unsupp
------------------------------------------------------------------
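Each entry in the exception vector table is a 4-byte address, so the
offset shown for a vector is its number multiplied by 4; vector 48
therefore sits at $C0 and vector 51 at $CC. A small sketch of that
arithmetic follows. The names vector_offset, fpsp_vector and
fpsp_vectors are hypothetical and exist only for illustration.

```c
/* Offset of an exception vector from the vector base register:
 * every vector table entry is 4 bytes wide. */
static unsigned vector_offset(unsigned vector_number)
{
    return vector_number * 4u;
}

/* Vector numbers and FPSP entry point names as listed in Table 3-9. */
struct fpsp_vector {
    unsigned    number;
    const char *fpsp_entry;
};

static const struct fpsp_vector fpsp_vectors[] = {
    { 11, "fline" }, { 48, "bsun" },  { 49, "inex" },
    { 50, "dz" },    { 51, "unfl" },  { 52, "operr" },
    { 53, "ovfl" },  { 54, "snan" },  { 55, "unsupp" },
};
```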
                            File: skeleton.sa
                          |---------------------|
                          |                     |       File: x_unfl.sa
                          |                     |    |----------------------|
            VECTOR        |                     |    |                      |
            TABLE         |                     | /->|fpsp_unfl:            |
        |------------| /->|unfl:                | |  |          .           |
        |            | |  |   jmp fpsp_unfl ----|-/  |          .           |
        |            | |  |                     |    |          .           |
vbr+$cc |addr of unfl|-/  |                     |    | HANDLE NON-MASKABLE  |
        |            |    |real_unfl:  <--------|--\ | EXCEPTION CONDITION  |
        |            |    |          .          |  | |          .           |
        |            |    |          .          |  | |          .           |
        |            |    |          .          |  | |          .           |
        |            |    |  USER TRAP HANDLER  |  | |                      |
        |------------|    |          .          |  | | if FPCR Exception    |
                          |          .          |  | | Byte UNFL bit set,   |
                          |          .          |  \-|-- jmp real_unfl      |
                          |rte       .          |    |   else rte           |
                          |                     |    |----------------------|
                          |---------------------|
Figure 3-1 FPSP Entry Points
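The decision at the bottom of the x_unfl.sa box in Figure 3-1 can be
modelled in a few lines of C. This is only a sketch of the control flow,
not the FPSP assembly; the names fpsp_unfl_exit and unfl_exit are
hypothetical. It relies on the UNFL enable bit being bit 11 of the FPCR
(the exception enable byte occupies bits 15-8).

```c
#include <stdint.h>

/* UNFL bit of the FPCR exception enable byte (FPCR bit 11). */
#define FPCR_ENABLE_UNFL (1u << 11)

/* The two ways fpsp_unfl can finish in Figure 3-1. */
enum unfl_exit { UNFL_RTE, UNFL_JMP_REAL_UNFL };

/* Hypothetical model of the final branch in fpsp_unfl: hand control to
 * the user's real_unfl handler only if the underflow trap is enabled. */
static enum unfl_exit fpsp_unfl_exit(uint32_t fpcr)
{
    return (fpcr & FPCR_ENABLE_UNFL) ? UNFL_JMP_REAL_UNFL : UNFL_RTE;
}
```

With the trap disabled the FPSP completes the non-maskable handling and
returns directly; with it enabled, the user trap handler at real_unfl
runs and issues the final rte itself.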
SECTION 4 FPSP Library Version
4.1 When to use the FPSP Library Version.
The FPSP Library Version is intended to provide better performance for
transcendental instructions. It gets its performance by avoiding the
overhead involved in F-line trap emulation, as used by the Unimplemented
Instruction Handler. The FPSP Library Version is optional, and user code
needs to be recompiled to make use of it.
4.2 Installation Notes
The library version of the FPSP can be built by running 'make libFPSP.a'
using either Makefile (for asm syntax) or fpsp.mk (for as syntax).
The 'make convert' step in Makefile will build both kernel and library .s
files from the .sa sources. Change the SYS= and PREFIX= variables
in Makefile BEFORE running 'make convert'. Three templates are supplied
for building the library version: GEN, CI5 and R3V6. The GEN templates
generate entry points for single, double and extended precision routines and
provide the closest emulation of the kernel FPSP. The CI5 and R3V6
templates are faster, but discard most of the condition code and control
register handling, and only provide the double precision entry points.
The entry point names are contained in L_LIST. Change the first 3
entries of each line to suit your system.
4.3 Differences in the library version:
1.) Single and Double precision SNAN's will not generate an SNAN
exception because they are converted to extended precision
and doing so causes them to turn into non-signalling NAN's.
Example: facos.d 7ff7_ffff_ffff_ffff snan
2.) An enabled Inexact exception may not be taken in all cases.
Example: facos.x 000000000000000000000001 d inex2
fint.x 403d_0000_aaaa_aaaa_aaaa_ffff inex2
3.) The return value in fp0 is undefined when an enabled OPERR or DZ
exception occurs. In the kernel FPSP, the destination register
is unchanged.
4.) fscale does not return the right result when an underflow occurs.
The problem is that the t_unfl code in the l_support.sa file
cannot exactly mimic the kernel FPSP version because the incoming
FPCR is not in the same place every time.
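Item 1 above turns on the difference between signalling and quiet NaNs.
In the double-precision format a NaN has an all-ones exponent and a
nonzero fraction, and it is signalling when the most significant
fraction bit (bit 51) is clear. The sketch below checks a raw 64-bit
pattern such as the 7ff7_ffff_ffff_ffff operand from the example; the
function name is_double_snan is hypothetical.

```c
#include <stdint.h>

/* Returns 1 if the 64-bit pattern encodes a double-precision
 * signalling NaN, 0 otherwise. Illustration only, not FPSP code. */
static int is_double_snan(uint64_t bits)
{
    uint64_t exponent  = (bits >> 52) & 0x7ffu;
    uint64_t fraction  = bits & 0x000fffffffffffffULL;
    int      quiet_bit = (int)((bits >> 51) & 1u);

    return exponent == 0x7ff && fraction != 0 && !quiet_bit;
}
```

Converting such an operand to extended precision sets the quiet bit,
which is why the library version never sees the signalling form.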
4.4 Changes in the library version of FPSP rel 2.3:
A floating point exception occurred when a transcendental function was
called twice.
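example: double facosd(double);   /* library entry point; prototype assumed */

         int main(void)
         {
             double d = 0.0, x, y;
             x = facosd(0.0);     /* first call */
             y = facosd(d);       /* second call: raised the exception */
             return 0;
         }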
This is fixed in release 2.2 of FPSP.
A follow-up to the above bug was to restore the FPCR before it unlinks.
This is fixed in release 2.3 of FPSP.
4.5 Performance
Overall, the library version is twice as fast as the kernel code.
APPENDIX A BUG TEMPLATE
Use the template below when reporting bugs.
Any fields designated with an asterisk (*) can be left blank.
When complete, please fax the report to the following
phone number :
(800) 248-8567
To assist you in filling out this form, a description of each field follows
the template.
Should you have any questions, please fax us at the above number.
----------------------------------------------------------------------------
Problem# (0-0000)
Key Words
Severity (1,2,3,4,5)
Customer Description
Long Description
System Description
Date Reported
Reported By
Phone
*Resolved?
*Date Resolved
*Who fixed
*Correction Description
*Modules Affected
Problem Release/Load
*Test suite passed
*sccs version control
---------------------------------------------------------------------------
The following is a brief description of how to use the bug report template.
---------------------------------------------------------------------------
Problem# (0-0000) You may include a number that you will use internally
to track this bug. We will log it but assign our own
number to track your bug repair. Please choose one person
as the individual to send in all bug reports for your
firm. This should help avoid confusion and ensure a
smooth working relationship.
Key Words Indicate the key terms associated with this bug
report. For example: fpcr, denorm
Severity (1,2,3,4,5) This indicates the severity of the bug.
The following descriptions are taken from an AT&T test
document:
1: an error that causes the FPSP to crash and no
further work can be done. An error that causes gross
deviations of results. Non-waiverable compliance
violations are also severity 1.
2: an error that represents a substantial deviation in
the functionality of the FPSP or deviation from
IEEE 754 standard.
3: an error that represents a deviation in the
functionality. However, the customer is able to
implement a workaround to this problem.
4: an error that represents a minor deviation or
incorrect documentation.
5: a request for product enhancement.
Customer Description This is a brief description of the problem.
Long Description This is a more detailed description of the problem.
This could contain a short code fragment, a
suggested fix for the bug or reference a longer
file with this type of information in it.
System Description This is a brief description of your system.
Date Reported MM/DD/YY
Reported By Your name
Phone Enter your telephone number including the
area code.
*Resolved? Enter "yes" or "no" only.
*Date Resolved When the bug is fixed the date it was fixed
will be entered by us in this field.
*Who fixed Name of the person who fixed the bug.
*Correction Description When the bug is fixed we will enter a
description of the fix in this field.
*Modules Affected When the bug is fixed we will enter the
module name.
Problem Release/Load Enter the release or load information in this
field. This information should be on the label
for the tape that was sent to you.
*Test suite passed You do not need to fill out this field. This is
the test suite file that we use to verify bugs
and/or fixes.
*sccs version control You do not need to fill out this field. This is
the sccs version that contains the fix.