NetBSD/sys/arch/amiga/fpsp/README

SECTION 1  	FLOATING POINT SOFTWARE PACKAGE TERMS.

From:		Microprocessor & Memory Technologies Group
		Semiconductor Products Sector
		6501 William Cannon Drive West, 
		Mail Station OE33, Austin, Texas 78735-8598

To:  		FLOATING POINT SOFTWARE PACKAGE USERS                 
	 
Date:		August 27, 1993    	 


1.1	TITLE TO FLOATING POINT SOFTWARE PACKAGE FPSP

Title to the 68040 Floating Point Software Package, all copies 
thereof (in whole or in part and in any form), and all rights 
therein, including all rights in patents, and copyrights, 
applicable thereto, shall remain vested in MOTOROLA.  All 
rights, title and interest in the resulting modifications belong 
to MOTOROLA except where such modifications (a) are made 
solely for use with computer systems manufactured or 
distributed by user; (b) are themselves copyrightable; and  (c) 
would not constitute a copyright infringement if not licensed 
hereunder.

 
1.2	DISCLAIMER OF  WARRANTY.

THE 68040 FLOATING POINT SOFTWARE PACKAGE is provided on an 
"AS IS" basis and without other warranty except as stated herein.  

IN NO EVENT SHALL MOTOROLA BE LIABLE FOR INCIDENTAL OR 
CONSEQUENTIAL DAMAGES ARISING FROM USE OF THE 68040 
FLOATING POINT SOFTWARE PACKAGE.  THIS DISCLAIMER OF 
WARRANTY EXTENDS TO ALL USERS OF THE 68040 FLOATING 
POINT SOFTWARE PACKAGE AND IS IN LIEU OF ALL WARRANTIES 
WHETHER EXPRESS, IMPLIED, OR STATUTORY, INCLUDING 
IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR 
PARTICULAR PURPOSE.


SECTION 2	Release 2.3 Errata

As of this release, the following may be considered
errata of the 040 (Mask 20D43B, Mask 4D50D, and Mask 5D98D) FPSP:

    1. INEX1 reported by inexact conversion of packed source
       operand for a dyadic instruction will not be reported
       by the 040 upon completion of that instruction.  This 
       erratum corresponds to erratum "F5" on the 68040 Errata Sheet.
       Fixed in D98D.

    2. FREM and FMOD with packed operands will occasionally
       differ from the 881/2 results by one ulp in the conversion
       of the packed source operand.  

    3. INEX2/AINEX are not calculated in the same manner as in the 
       881/882 for some cases in which the result is overflowed.
       Currently, if the operation was an integer move-out, INEX2/AINEX
       is not set for any case.
       In some cases of FSCALE with integer input, the INEX2 bit will not
       be set on an inexact calculation.
       Under extended rounding precision, FSCALE results which underflow 
       and are inexact may be incorrectly rounded.  In addition, INEX2
       is not signaled in these cases.

    4. If an FMOVE FPn,FPm (this also applies to FNEG and FABS) is preceded
       by any floating-point operation with a denorm source operand, the FMOVE
       destination (FPm) is incorrectly tagged and may result in silent data
       corruption.  A software fix is provided in release 2.2.


SECTION 3  	Software Specification for an MC68040 Floating-
		Point Software Package
 
The purpose of this section is to provide an overview of 
the floating-point software package (FPSP) for the 
MC68040.  The FPSP emulates the floating-point 
instructions of the MC68881/MC68882 which are not provided 
by the MC68040.


3.1  DEFINITIONS, ACRONYMS, AND ABBREVIATIONS

FPn	-	Floating-Point Data Register Source
FPSP	-	Floating-Point Software Package
FPU	-	Floating-Point Unit
FPx	-	Floating-Point Data Register
See the Glossary of Reference 1 for additional 
definitions.


3.2  PRODUCT OVERVIEW

The FPSP adds additional floating-point capabilities to 
the MC68040.  A subset of the MC6888x instruction set is 
executed by the MC68040 on-chip FPU.  The remaining 
floating-point instructions are emulated in software by 
the FPSP (see Reference 2).  There are two types of FPSP: 
one for applications compiled for the MC68881/MC68882 and 
another for applications compiled for the MC68040 (see 
3.8.2 Packaging). 

The FPSP provides:
* Arithmetic and Transcendental Instructions
* Decimal Conversions
* Exception Handlers
* MC68040 Unimplemented Data Type and Data Format Handlers
There are two types of users: 1) end users who are running 
applications and 2) system integrators who will install 
the package (see 3.8.3 Site Adaptations).


3.3  GENERAL CONSTRAINTS

The FPSP satisfies the requirements of the ANSI IEEE 
Standard for Binary Floating-Point Arithmetic 754. The 
FPSP runs old user code unchanged and is transparent for 
old code.  The FPSP is easy to modify and install.  The 
performance of the transcendental function routines is 
equivalent or superior to that of a 33-MHz 
MC68881/MC68882.  The error bound is equivalent or 
superior to the MC68881/MC68882  (see 3.7.2 Accuracy).


3.4  ASSUMPTIONS AND DEPENDENCIES

The FPSP can be installed into any operating system.  The 
MC68040 FPU shall be implemented as described in Reference 
2.  Table 3-1 lists the functions provided by the MC68040.


3.5.2  Exceptions

The main goal of the FPSP exception handlers is to provide 
the user with an easy path for porting existing MC68882 
exception handlers to the MC68040. To that end, the FPSP 
provides an entry point; once this point is reached, an 
IEEE-defined trap condition is known to exist.


3.5.2.1  BSUN - BRANCH/SET ON UNORDERED.  

On a trap-enabled condition, the FPSP updates the floating-point 
instruction address register (FPIAR) by copying the PC 
value in the pre-instruction stack frame to the FPIAR. 
Once this is done, the exceptional frame is restored 
without clearing the exception, and the program flow goes 
to the FPSP provided entry point. At the entry point the 
MC68040 is in an exceptional state, ready to execute the 
user-supplied exception handler.  


3.5.2.2  SNAN - SIGNALING NOT-A-NUMBER.  

On a trap-disabled condition, if the destination format is B, 
W, or L, then the FPSP stores the most significant 8, 16, or 
32 bits, respectively, of the SNAN mantissa, with the SNAN 
bit set, to the destination. The FPSP discards the 
exceptional frame, then returns to the main program flow 
without entering the FPSP provided entry point, hence the 
user-provided exception handler is not executed.

On a trap-enabled condition, the FPSP checks if the 
destination format is B, W, or L. If so, the FPSP stores 
the most significant 8, 16, or 32 bits, respectively, of 
the SNAN mantissa, with the SNAN bit set, to the 
destination. The FPSP then restores the exceptional frame 
without clearing the exception, and branches to  the FPSP 
provided entry point.  At the entry point, the MC68040 is 
in an exceptional state, ready to execute the user-
supplied exception handler.
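
The integer-destination SNAN result described above can be 
sketched as follows. This is an illustration only, not FPSP 
code: it assumes the 64-bit extended-precision mantissa is 
held as a Python integer and that the SNAN bit is bit 62 
(the most significant fraction bit). The names SNAN_BIT, 
DEST_BITS, and snan_integer_result are hypothetical.

```python
# Sketch only: models the value stored to a B, W, or L destination.
# Assumption: the SNAN bit is bit 62 of the 64-bit mantissa.
SNAN_BIT = 1 << 62

# Destination format -> number of bits stored.
DEST_BITS = {"B": 8, "W": 16, "L": 32}


def snan_integer_result(mantissa, dest_format):
    """Return the most significant 8/16/32 mantissa bits, SNAN bit set."""
    width = DEST_BITS[dest_format]
    with_snan_bit = (mantissa | SNAN_BIT) & 0xFFFFFFFFFFFFFFFF
    return with_snan_bit >> (64 - width)
```

For example, snan_integer_result(mantissa, "W") keeps mantissa 
bits 63 through 48 after the SNAN bit has been set.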


3.5.2.3  OPERR - OPERAND ERROR. 
 
This exception traps through vector number 52.  
Table 3-3 shows the operand errors generated by the MC68040.  
Table 3-4 shows the operand errors generated by the FPSP.
Note that the FPSP Unimplemented Instruction Handler 
detects and adds to the cases in which OPERR exceptions 
occur. Refer to Table 3-4 for these specific exception-
causing conditions.

On a trap-disabled condition, the FPSP checks if the 
operand error is caused by an FMOVE to a B, W, or L memory 
or integer data register destination. If it is caused by 
an integer overflow or if the floating-point data register 
to be stored contains infinity, the FPSP stores the 
largest positive or negative integer that can fit in the 
specified destination format size. If the destination is 
integer and the floating-point number to be stored is a 
NAN, then the 8, 16, or 32 most significant bits of the 
NAN significand are stored as a result.

Next the FPSP checks for a false OPERR condition for an 
FMOVE to memory or integer data register. This condition 
occurs if the operand is equal to the largest negative 
integer representable in its format. The FPSP then stores 
the proper result, discards the exceptional frame, and 
returns to the main program flow without executing the 
user-supplied exception handler.
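
The saturation rule above for integer overflow and infinite 
sources can be sketched as follows. This is an illustration 
only, not FPSP code; it assumes the caller has already 
determined that an operand error occurred, and the names 
DEST_BITS and operr_integer_default are hypothetical.

```python
import math

# Destination format -> size in bits.
DEST_BITS = {"B": 8, "W": 16, "L": 32}


def operr_integer_default(value, dest_format):
    """Largest positive or negative integer that fits the destination."""
    if math.isnan(value):
        # NaN sources store significand bits instead (see the text above).
        raise ValueError("NaN sources are handled separately")
    width = DEST_BITS[dest_format]
    largest_positive = (1 << (width - 1)) - 1
    largest_negative = -(1 << (width - 1))
    return largest_positive if value > 0 else largest_negative
```

A positive overflow or +infinity yields the largest positive 
integer of the destination size; a negative overflow or 
-infinity yields the largest negative integer.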

On a trap-enabled condition, the FPSP does the same 
functions as the above trap-disabled condition, with the 
exception that in the end, the FPSP restores the 
exceptional frame without clearing the exception and 
branches to the FPSP supplied entry point instead of 
returning to the main program flow. At the FPSP supplied  
entry point, the MC68040 is in an exceptional state, ready 
to execute the user-supplied exception handler. 


3.5.2.4  OVFL - OVERFLOW.  

This exception traps through vector number 53.
 
On a trap-disabled case, the FPSP stores the result in the 
destination as determined by the rounding mode at the 
destination as follows:

Rounding Mode			Result
	RN		Infinity, with the sign of the intermediate result
	RZ		Largest magnitude number, with the sign of the 
			intermediate result
	RM		For positive overflow, largest positive 
			number
			For negative overflow, -infinity
	RP		For positive overflow, +infinity
			For negative overflow, largest negative 
			number

The FPSP then clears the appropriate exception bit in 
the frame and restores the non-exceptional frame into 
the MC68040, and  then returns to the main program flow.
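
The overflow table above can be sketched as a small lookup. 
This is an illustration only, not FPSP code. As an 
assumption, the largest finite Python float (a double) 
stands in for the "largest magnitude number" of the 
destination precision; the names LARGEST and 
overflow_default are hypothetical.

```python
import math
import sys

# Assumption: double precision stands in for the destination format.
LARGEST = sys.float_info.max


def overflow_default(rounding_mode, negative):
    """Trap-disabled overflow result for a rounding mode and result sign."""
    if rounding_mode == "RN":
        return -math.inf if negative else math.inf
    if rounding_mode == "RZ":
        return -LARGEST if negative else LARGEST
    if rounding_mode == "RM":
        # Round toward minus infinity.
        return -math.inf if negative else LARGEST
    if rounding_mode == "RP":
        # Round toward plus infinity.
        return -LARGEST if negative else math.inf
    raise ValueError("unknown rounding mode: %r" % rounding_mode)
```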

On a trap-enabled case, the FPSP actions are identical to 
those found in the trap-disabled case, with the exception 
that instead of restoring a non-exceptional frame, the 
original exceptional frame is restored to the MC68040 and 
the FPSP branches to the FPSP supplied entry point. At 
this entry point, the MC68040 is in an exceptional state, 
ready to execute the user-supplied exception handler.


3.5.2.5  UNFL - UNDERFLOW.  

This exception traps through vector number 51.

On a trap-disabled case, the FPSP stores the result in the 
destination as determined by the rounding mode at the 
destination as follows:

	RN		Zero with the sign of the intermediate result.
	RZ		Zero with the sign of the intermediate result.
	RM 		For positive underflow, +zero.  For negative 
			underflow, the smallest denormalized 
			negative number.
	RP		For positive underflow, the smallest denormalized 
			positive number.  For negative underflow, -zero.

The FPSP then clears the appropriate exception bit in the 
frame and restores the non-exceptional frame into the 
MC68040, and  then returns to the main program flow.
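
The underflow table above can be sketched the same way. This 
is an illustration only, not FPSP code. As an assumption, 
the smallest denormalized double (2**-1074) stands in for 
the smallest denormalized number of the destination 
precision; the names SMALLEST_DENORM and underflow_default 
are hypothetical.

```python
import math

# Assumption: double precision stands in for the destination format.
SMALLEST_DENORM = 2.0 ** -1074


def underflow_default(rounding_mode, negative):
    """Trap-disabled underflow result for a rounding mode and result sign."""
    signed_zero = math.copysign(0.0, -1.0 if negative else 1.0)
    if rounding_mode in ("RN", "RZ"):
        return signed_zero
    if rounding_mode == "RM":
        # Round toward minus infinity.
        return -SMALLEST_DENORM if negative else signed_zero
    if rounding_mode == "RP":
        # Round toward plus infinity.
        return signed_zero if negative else SMALLEST_DENORM
    raise ValueError("unknown rounding mode: %r" % rounding_mode)
```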

On a trap-enabled case, the FPSP actions are identical to 
those found in the trap-disabled case, with the exception 
that instead of restoring a non-exceptional frame, the 
original exceptional frame is restored to the MC68040 and 
the FPSP branches to the FPSP supplied entry point. At 
this entry point, the MC68040 is in an exceptional state, 
ready to execute the user-supplied exception handler.


3.5.2.6  DZ - DIVIDE BY ZERO.
 
Note that the FPSP Unimplemented Instruction Handler detects 
and adds to the cases in which DZ exceptions occur. Refer to 
Table 3-5 for these specific exception-causing conditions.  
Table 3-6 lists the DZ exceptions generated by the MC68040.

The FPSP is not needed for this exception. The user-
supplied exception handler is always entered. A system 
call is provided by the FPSP to calculate the exceptional 
operand.


3.5.2.7  INEX1/INEX2 - INEXACT RESULT 1/2.  

Note that the FPSP Unimplemented Instruction Handler detects 
and allows INEX1 exceptions to occur. Furthermore, many new 
cases of INEX2 exceptions may be generated by the FPSP 
Unimplemented Instruction Handler as well. INEX1 and INEX2 
exceptions trap into the same handler.

The FPSP is not needed for this exception. The user-
supplied exception handler is always entered.


3.5.3  Instructions

The following paragraphs describe the arithmetic and 
transcendental instructions supported by the FPSP.


3.5.3.1  ARITHMETIC.  

Table 3-7 shows the arithmetic instructions supported by the FPSP.


3.5.3.2  TRANSCENDENTAL.  

Table 3-8 shows the transcendental instructions supported by the FPSP.


3.6  EXTERNAL INTERFACE REQUIREMENTS

For end users the FPSP is transparent; system integrators 
will integrate the FPSP into their system (see 3.8.3 
Site Adaptations).

For applications compiled for the MC68881/MC68882 the FPSP 
provides kernel routines to support the MC68040 
unimplemented instructions.  The MC68040 uses vector 
number 11 for the unimplemented instructions. The MC68040 
stack frames are different for unimplemented 
MC68881/MC68882  instructions and other F-line traps. For 
applications compiled for the MC68040 the unimplemented 
instructions are contained in a library (to avoid the 
F-line trap overhead at runtime).

For both applications the FPSP provides kernel routines to 
support exceptions (vectors 48-54) and unsupported data 
types (vector 55).
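
As a rough aid for integrators, the vector numbers mentioned 
in this document can be turned into vector table offsets as 
sketched below. This is an illustration only. The text gives 
vectors 11, 51, 52, 53, and 55 and the range 48-54; the 
assignments of BSUN, INEX, DZ, and SNAN within that range 
follow the standard M68000 family vector assignments, and 
the 4-byte entry size is likewise taken from the M68000 
family architecture. The names FPSP_VECTORS and 
vector_offset are hypothetical.

```python
# Exception name -> vector number.
FPSP_VECTORS = {
    "F-line": 11,
    "BSUN": 48,
    "INEX": 49,
    "DZ": 50,
    "UNFL": 51,
    "OPERR": 52,
    "OVFL": 53,
    "SNAN": 54,
    "Unsupported data type": 55,
}


def vector_offset(vector_number):
    """Byte offset of a vector table entry (4 bytes per entry)."""
    return vector_number * 4


if __name__ == "__main__":
    for name, number in sorted(FPSP_VECTORS.items(), key=lambda kv: kv[1]):
        print("%-22s vector %2d  offset $%03X"
              % (name, number, vector_offset(number)))
```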


3.7  PERFORMANCE REQUIREMENTS

The following paragraphs describe the speed, accuracy, and 
compatibility  requirements for the FPSP.


3.7.1  Speed

The performance of the transcendental function routines is 
equivalent or superior to that of a 33-MHz 
MC68881/MC68882.


3.7.2  Accuracy

The following paragraphs describe the arithmetic 
instructions, transcendental instructions, and decimal 
conversions for the FPSP.


3.7.2.1  ARITHMETIC INSTRUCTIONS.  

The error bound is one-half unit in the last place of the 
destination format in the round-to-nearest mode, and one 
unit in the last place in the other rounding modes.


3.7.2.2  TRANSCENDENTAL INSTRUCTIONS.  

The error bound is less than 0.502 ulp of double precision.


3.7.2.3  DECIMAL CONVERSIONS.  

The error bound is 0.97 unit in the last digit of the 
destination precision for the round-to-nearest mode; and 
1.47 units in the last digit of the destination precision for 
the other rounding modes.


3.7.3  Compatibility

The FPSP transcendental calculation results are not the 
same as for the MC68881/MC68882.  This is because the 
algorithms used by the MC68881/MC68882 (CORDIC) cannot be 
effectively implemented in software.  All other 
calculations are identical. The error bound is equivalent 
or superior to the MC68881/MC68882.


3.8  OTHER REQUIREMENTS

The following paragraphs describe other requirements for 
the FPSP, such as maintainability, packaging, and site 
adaptations.


3.8.1  Maintainability

The speed requirements have forced writing most of the 
package in assembly language.


3.8.2  Packaging

There are two versions of the FPSP. The FPSP Kernel 
Version is used to execute pre-existing user object code 
written for the MC68882. This is installed as part of the 
operating system. User applications need not be recompiled 
or modified in any way once the FPSP Kernel Version is 
installed.

The FPSP Library Version is used to compile code that uses 
only the MC68040-implemented floating point instructions. 
The library version provides less overhead than the FPSP 
Kernel Version. Other features of this library include 
ABI compliance as well as IEEE exception-reporting 
compliance. It is not, however, UNIX exception-reporting 
compliant.  The FPSP is not yet available in library 
format.


3.8.3  Site Adaptations

Some of the entries in the vector table need to point to 
entry points within the FPSP Kernel Version.  For those 
vectors the FPSP displaces, an entry point is provided to 
replace that which it takes. Note that former MC68882 
floating-point exception handlers need to go through minor 
modifications to account for the differences between the 
MC68040 and MC68882 floating point exceptional state 
frames. The FPSP provides skeleton code for each floating-
point exception handler to aid in porting the MC68882 
floating-point exception handlers. 

For systems and applications that never set any of the 
exception bits in the FPCR, or if the former MC68882 
floating-point exception handlers only contain minimum 
code needed to clear the exception and return, no work is 
needed and the FPSP is a drop-in package.

The FPSP Library Version needs to "intercept" the 
appropriate math library calls which use MC68882 
transcendental instructions. Since each site has different 
naming conventions,  the FPSP subroutines need to be 
renamed accordingly and recompiled. The resident compiler 
also needs to provide a library path search pattern such 
that the FPSP is given a chance to resolve those 
transcendental instructions.


3.8.4  Stack Area Usage

To achieve code re-entrancy, the FPSP allocates context-
sensitive variables on the stack. The FPSP does not 
require more than 512 bytes on the stack per context. This 
may be an installation concern for UNIX applications in 
which the UBLOCK area is limited and the system stack 
resides there. 


3.8.5  ROM-based applications

One of the goals of the FPSP Kernel Version is to be able 
to fit in a read-only space of no more than 64 KBytes. 
Two main sections need to reside in ROM: the text section 
and the initialized data section. The text section 
accounts for 65% while the initialized data section 
accounts for 35%.


3.9  FPSP KERNEL VERSION INSTALLATION NOTES

The following paragraphs provide the MC68882 users with an 
understanding of the issues involved in porting over the 
FPSP into existing MC68030/MC68882 systems. Once these 
issues are understood, then the actual installation is 
explained.


3.9.1  Differences between the MC68040 and MC68882 
	Floating-point Exception Handling

The main reason for providing the FPSP is to provide 
MC68882 compatibility. An installer who already understands 
the main differences between the MC68882 and MC68040 in the 
area of floating-point exception handlers may skip this 
section and go to the next section.

There are three areas that differ between the MC68040 and 
MC68882. 

The first difference is that of unimplemented 
instructions. The FPSP handles this by means of the F-line 
exception handling. This means that if there is an 
existing F-line handler, the FPSP replaces the existing F-
line exception handler, but provides an alternate entry 
point for the existing F-line handler.

The second difference is unsupported data types. The 
MC68040 provides a new entry point in the vector table, 
therefore no existing handler is replaced by the FPSP. 
There are no installation issues here.

The third difference is that of floating point exception 
differences. This issue is more involved and requires 
further explanations. 

The IEEE standard allows the user to enable or disable 
each floating point exception individually. If an 
exceptional condition occurs, the IEEE defines a specific 
action for the trap-disabled condition, and it also 
defines certain specific actions for a trap-enabled 
condition. The IEEE standard however, does not constrain 
the implementation of exception handling; both software 
and hardware can be used.

The MC68882 supports the IEEE exception handling 
compliance totally in hardware. For example, a user-
disabled (trap disabled) exception will cause the 
specified IEEE defined actions for user-disabled exception 
handling to occur.  Similarly, user-enabled exceptions 
will cause the MC68882 to take the exception as defined by 
the IEEE trap enabled case.

The MC68040 provides full IEEE trap-disabled exception 
handling compliance for the INEX and DZ exceptions. Like 
the MC68882, the MC68040 takes these exceptions only 
for an IEEE trap-enabled condition. Existing MC68882 
handlers have a minimum code requirement as defined by the 
MC68882 User's Manual. Like the MC68882 handlers, the 
MC68040 handlers have a minimum code requirement as well. 
The FPSP provides this minimum code requirement. 

The MC68040 does not provide full IEEE exception 
compliance on IEEE defined trap-disabled conditions for 
the following exceptions: OVFL, UNFL, OPERR, SNAN. For 
these exceptions, the MC68040 may take an exception even 
on an IEEE-defined trap-disabled condition. The FPSP 
provided exception handlers decide whether their job is to 
implement IEEE trap-disabled exception compliance (and 
therefore not execute the user supplied exception handler) 
or to implement IEEE trap-enabled exception compliance, 
(hence executing the user supplied exception handler). The 
FPSP provides a user entry point so that when this entry 
point is taken an IEEE-defined trap-enabled condition has 
definitely occurred. At this specified entry point, an 
exception handler  written for the MC68882 needs to be 
modified to account for MC68040 stack differences, and 
then placed at the user entry point. 

As with the MC68882, there is a minimum code requirement 
for the MC68040 handler, but this minimum code is provided 
by the FPSP. 

From an installation perspective, the OVFL, UNFL, OPERR, 
SNAN exception handlers are replaced by the FPSP handlers, 
but the FPSP provides an entry point so that MC68882-like 
exception handlers may be written. Furthermore, minimum 
code is provided by the FPSP and can be used as a 
template.

The BSUN exception is different in that unlike the 
previous exception handlers, the difference between the 
MC68882 and MC68040 resides in the IEEE-defined trap 
enabled case. The FPSP handles this by performing the 
patch needed for MC68882 compatibility, and then restoring 
the exception to the MC68040 without performing the 
necessary steps to clear the BSUN exception. The 
exceptional frame is restored into the MC68040 and the 
FPSP branches to the user entry point provided. At this 
entry point, an MC68882-like exception handler written for 
the MC68040 is executed without having to worry about the 
built-in incompatibility. Although this method incurs a 
performance hit, it frees the user-defined exception 
handler from having to write the code needed to implement 
MC68882 code compatibility. As with the other exception 
handlers, the FPSP provides the minimum code needed.

In summary, the FPSP replaces the following exception 
handlers and provides an entry-point for MC68882-like 
exception handlers for these exceptions: OVFL, UNFL, 
OPERR, SNAN, BSUN, F-line. 

The FPSP is not needed for the INEX and DZ exception 
handlers, and these exception handlers just need to be 
MC68882-like.


3.9.2  Vector Table

The entry point into the FPSP is achieved by having the 
appropriate vector table offset point to a specified entry 
point within the FPSP. For simplicity, all of the FPSP 
main entry points are found in the file skeleton.sa. 
Table 3-9 shows the vector table offset and the 
appropriate labels within the file skeleton.sa that it 
needs to point to.  Figure 3-1 shows a flowchart of the 
entry points.

Once the entry point is reached, the user may add some 
user-specific code prior to jumping to the FPSP routines 
(FPSP routines are prefixed by "fpsp_").  After the jump to 
the FPSP routines, the FPSP performs its function and then 
jumps to the FPSP supplied entry points (if needed) found 
in the file skeleton.sa. 


3.9.3   FPSP Supplied Entry Points

To replace the vector table entries it displaces, the FPSP 
provides an alternate entry point. For simplicity, all of 
the FPSP supplied entry points are found in the file 
skeleton.sa. The FPSP supplied F-line exception entry 
point is straightforward. An F-line exception handler 
written for an MC68030 can be placed here without 
modifications. The Unsupported Data Type exception handler 
is newly defined; it does not displace any MC68030/MC68882 
exception handler. Therefore, the FPSP does not provide an 
alternate entry point for this exception.

The alternate entry points follow a naming convention in 
which the name of the exception handler is prefixed by 
"real_". For instance, the entry point for the user-supplied 
BSUN exception handler is named "real_bsun". 

For the floating-point exception handlers (BSUN, OPERR, 
SNAN, DZ, OVFL, and UNFL) previously written for an 
MC68882 based system, these handlers need to be modified 
slightly for use with the MC68040. Once these handlers are 
modified, they are then placed in the FPSP provided entry 
points. 


3.9.4   Extract the Hardware Independent portion of 
the MC68882 handlers

To modify the existing MC68882 handlers, all of the code 
used in accessing the MC68882 generated frame needs to be 
stripped off. The code used in clearing an MC68882 
exception (setting bit 27 of the BIU Flag) needs to be 
stripped off as well. Only the hardware independent 
portions of the MC68882 handlers may be used. 

To aid the installer in rewriting the MC68882 exception 
handlers, the file skeleton.sa provides the minimum code 
necessary to clear the exception once the specific handler 
is entered. 

Once the hardware-independent portion is written, the 
modified MC68882 handlers need to be integrated into the 
portion of the code which is hardware dependent. The 
minimum code needed by each exception handler is already 
provided by the FPSP within the file skeleton.sa. The 
following section describes the mechanics behind the 
written code.  


3.9.5  MC68040 Minimum Exception Code

This section describes the minimum requirements for the 
user-supplied exception handlers. As mentioned in the 
previous sections, these minimum handlers are provided as 
part of the package, and this section is strictly for the 
user's information only. 

As with the MC68882, if all exceptions are always 
disabled, no minimum code is necessary since the FPSP 
guarantees that these FPSP provided entry points are never 
entered on trap-disabled condition. Therefore, for 
existing systems that do not provide exception handlers 
for the MC68882, it is likely that the assumption that all 
exceptions are always disabled is valid, and therefore no 
user-defined MC68040 exception handlers are needed either. 

The above paragraph should not be interpreted to mean that 
the FPSP provided exception handlers are unnecessary. On 
the contrary, the FPSP provided exception handlers are 
needed, and these FPSP exception handlers provide the 
entry points for user-defined exception handlers.  Whether 
or not the user-defined MC68040 exception handlers are 
needed is the issue being discussed.

Assuming that it is possible that the exceptions are 
enabled at some point, the minimum exception handler is 
similar to that defined for an MC68882. As with the 
MC68882, the MC68040 requires that the first floating 
point instruction be an FSAVE. Unlike the 
MC68882, the MC68040 does not always require an equivalent 
FRESTORE. For an E1 exception, only the FSAVE is 
required; the state frame may be discarded. The E3 
exception is more similar to that found in an MC68882. As 
with the MC68882, the E3 exception requires an FSAVE, an 
instruction that clears the exception in the resulting 
FSAVE stack, followed by an FRESTORE. 

If both E3 and E1 exceptions exist at the same time, then 
the exception is handled as though it were an E3 
exception, after which the MC68040 re-traps to handle the 
E1 exception.

The E3 exception can only be reported by the following 
exception handlers: OVFL, UNFL, INEX. For these exception 
handlers, this is the minimum code requirement:
	1) FSAVE
	2) if E3 bit set,  goto (4), else goto (3)
	3) E1 exception, throw away stack and RTE
	4) Clear E3 bit, FRESTORE, RTE

The E3 exception cannot be reported by the following 
exception handlers: SNAN, OPERR and DZ. Since only an E1 
exception needs to be handled here, this is the minimum 
code requirement:
	1) FSAVE
	2) throw away stack and RTE

For the BSUN exception handler, the minimum code 
requirement is:
	1) FSAVE
	2) Do one of 4 methods described in MC68040 User's 
	Manual
	3) throw away stack and RTE

If the above minimum code requirements are not met, an 
infinitely looping exception sequence occurs.
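
The three minimum code sequences above can be summarized in 
the following sketch. This is an illustration only and is 
not a substitute for the assembly code in skeleton.sa; it 
merely returns the ordered steps as strings. The names 
E3_HANDLERS, E1_ONLY_HANDLERS, and minimum_handler_steps are 
hypothetical.

```python
# Handlers that may see an E3 exception, per the text above.
E3_HANDLERS = {"OVFL", "UNFL", "INEX"}
# Handlers that only ever see an E1 exception.
E1_ONLY_HANDLERS = {"SNAN", "OPERR", "DZ"}


def minimum_handler_steps(exception, e3_set=False):
    """Return the minimum step sequence for a user-supplied handler."""
    steps = ["FSAVE"]
    if exception in E3_HANDLERS and e3_set:
        steps += ["clear E3 bit", "FRESTORE", "RTE"]
    elif exception in E3_HANDLERS or exception in E1_ONLY_HANDLERS:
        # E1 exception: the saved state frame may be discarded.
        steps += ["throw away stack", "RTE"]
    elif exception == "BSUN":
        steps += ["BSUN method from MC68040 User's Manual",
                  "throw away stack", "RTE"]
    else:
        raise ValueError("unknown exception: %r" % exception)
    return steps
```

In every case the first step is an FSAVE, matching the 
requirement stated earlier in this section.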


3.9.6  Mem_read and Mem_write

The mem_write and mem_read subroutines are used by the 
FPSP to read and write from user space. These routines 
perform a UNIX system call to lcopyin and lcopyout. The 
FPSP provides a simple version of lcopyin and lcopyout for 
non-UNIX applications. Installation to UNIX-based systems 
requires that the FPSP provided  lcopyin and lcopyout be 
deleted or commented out. For simplicity, these 
subroutines are found in the file skeleton.sa.

The production version of the FPSP is fully re-entrant. If 
a page fault occurs on either a mem_read or mem_write, the 
operating system may perform a page-in operation and still 
allow other processes to use the FPSP.


3.9.7  Increasing F-line Handler Performance

The FPSP was written to handle all possible cases of 
MC68040 vs MC68030/MC68882 problem areas. Any performance 
improvement in this handler increases floating-point 
performance. The F-line handling may be made quicker by 
pointing the vector table entry directly into the label 
"fpsp_unimp" found in the file x_unimp.sa, if these 
conditions are met: 

	1) That the system never has to execute an FMOVECR 
	instruction in which bits 0 to 5 of the F-line word are 
	non-zero. 

	2) An alternate F-line entry point is unnecessary.

This optimization saves a total of three instructions (1 
bra, 1 cmpi, 1 beq).


3.10  REFERENCES

1.	MC68881UM/AD, MC68881/MC68882 Motorola Floating-Point 
	Coprocessor User's Manual, Motorola Inc., 1989.
2.	M68040UM/AD, M68040 32-Bit Microprocessor User's 
	Manual, Motorola Inc., 1992.
3.	MC68020UM/AD, MC68020 32-Bit Microprocessor User's 
	Manual, Motorola Inc., 1990.
4.	MC68030UM/AD, MC68030 Enhanced 32-Bit Microprocessor 
	User's Manual, Motorola Inc., 1990.
5.	ANSI/IEEE Std. 754-1985, Standard for Binary Floating-
	Point Arithmetic.
6.	M68000PM/AD REV. 1, Programmer's Reference Manual, 
	Motorola Inc., 1992.


3.11	Tables and Figures

		Table 3-1. Functions Provided by MC68040
	------------------------------------------------------------------
	Name         |	Description	
	------------------------------------------------------------------
	FMOVE		Move to FPU	
	FMOVEM		Move Multiple Registers
	FSMOVE		Single-Precision Move	
	FDMOVE		Double-Precision Move
	FCMP		Compare	
	FABS		Absolute Value
	FSABS		Single-Precision Absolute Value	
	FDABS		Double-Precision Absolute Value
	FTST		Test	
	FNEG		Negate
	FSNEG		Single-Precision Negate	
	FDNEG		Double-Precision Negate
	FADD		Add	
	FSUB		Subtract
	FDIV		Divide	
	FMUL		Multiply
	FBcc		Branch Conditionally	
	FScc		Set According to Condition
	FDBcc		Test Cond, Dec and Branch	
	FTRAPcc		Trap Conditionally
	FSADD		Single-Precision Add	
	FSSUB		Single-Precision Subtract
	FSMUL		Single-Precision Multiply	
	FSDIV		Single-Precision Divide
	FDADD		Double-Precision Add	
	FDSUB		Double-Precision Subtract
	FDMUL		Double-Precision Multiply	
	FDDIV		Double-Precision Divide
	FSQRT		Square Root	
	FSSQRT		Single-Precision Square Root
	FDSQRT		Double-Precision Square Root	
	FNOP		No Operation
	FSAVE		Save Internal State	
	FRESTORE	Restore Internal State
	FSGLDIV 	Single-Precision Divide (68882 compatible) 
	FSGLMUL 	Single-Precision Multiply (68882 compatible)
	------------------------------------------------------------------


		Table 3-2. Support for Data Types and Data Formats
	------------------------------------------------------------------
        	     |			Data Formats
	             |----------------------------------------------------
	Data Types   |	SGL  |  DBL  |  EXT  |  Dec  |  Byte  |  Word | Long
	------------------------------------------------------------------
	Norm		 *	 *	 *	 @	 *	 *	 *
	Zero		 *	 *	 *	 @	 *	 *	 *
	Infinity	 *	 *	 *	 @			
	NaN		 *	 *	 *	 @			
	Denorm		 #	 #	 @	 @			
	Unnorm				 @	 @			
	------------------------------------------------------------------
Notes:
 @  = 	supported by FPSP
 *  = 	supported by the MC68040 FPU
 #  = 	supported by FPSP after being converted to extended precision by 	
	MC68040


		Table 3-3. Operand Errors Handled by the MC68040
	------------------------------------------------------------------
   Instruction   |	Conditions Causing Operand Error
	------------------------------------------------------------------
	FADD		( + inf )+( - inf ) or (- inf )+( + inf )
	FSUB		( + inf )-( + inf ) or (- inf )-(- inf )
	FMUL		( 0 ) x ( inf ) or ( inf ) x ( 0 ) 
	FDIV	 	0 / 0 or inf / inf 
	FMOVE.BWL	Integer overflow, Source is NaN, or Source is inf
	FSQRT		Source < 0, Source = - inf
	------------------------------------------------------------------
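
The FADD, FSUB, FMUL and FDIV rows of Table 3-3 can be restated as
simple predicates.  The C sketch below is only a model of the
conditions; the types kind_t and operand_t and the operr_* functions
are hypothetical names, and NaN operands are left out because they are
not operand error cases.

```c
/* Coarse operand classes; K_FINITE means finite and nonzero. */
typedef enum { K_ZERO, K_FINITE, K_INF, K_NAN } kind_t;

typedef struct {
    kind_t kind;
    int    negative;    /* nonzero for a negative sign */
} operand_t;

/* (+inf)+(-inf) or (-inf)+(+inf): infinities of opposite sign. */
int operr_fadd(operand_t a, operand_t b)
{
    return a.kind == K_INF && b.kind == K_INF
        && !a.negative != !b.negative;
}

/* (+inf)-(+inf) or (-inf)-(-inf): infinities of the same sign. */
int operr_fsub(operand_t a, operand_t b)
{
    return a.kind == K_INF && b.kind == K_INF
        && !a.negative == !b.negative;
}

/* 0 x inf or inf x 0, regardless of sign. */
int operr_fmul(operand_t a, operand_t b)
{
    return (a.kind == K_ZERO && b.kind == K_INF)
        || (a.kind == K_INF && b.kind == K_ZERO);
}

/* 0 / 0 or inf / inf, regardless of sign. */
int operr_fdiv(operand_t a, operand_t b)
{
    return (a.kind == K_ZERO && b.kind == K_ZERO)
        || (a.kind == K_INF && b.kind == K_INF);
}
```

The same four predicates describe the single and double precision
variants (FSADD, FDADD and so on), since their conditions are identical.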


		Table 3-4. Operand Errors Generated by the FPSP
	------------------------------------------------------------------
    Instruction	  |  	Condition Causing Operand Error
	------------------------------------------------------------------
	FSADD		( + inf )+( - inf ) or ( - inf )+( + inf )
	FDADD		( + inf )+( - inf ) or ( - inf )+( + inf )
	FSSUB		( + inf )-( + inf ) or ( - inf )-( - inf )
	FDSUB		( + inf )-( + inf ) or ( - inf )-( - inf )
	FSMUL		( 0 ) x ( inf ) or ( inf ) x ( 0 )
	FDMUL		( 0 ) x ( inf ) or ( inf ) x ( 0 )
	FSDIV		0 / 0 or inf / inf 
	FDDIV		0 / 0 or inf / inf 
	FCOS		Source is +/- inf
	FSIN		Source is +/- inf
	FTAN		Source is +/- inf
	FACOS		Source is +/- inf, > +1, or < -1
	FASIN		Source is +/- inf, > +1, or < -1
	FATANH		Source is > +1, or < -1, Source = +/- inf
	FSINCOS		Source is +/- inf
	FGETEXP		Source is +/- inf
	FGETMAN		Source is +/- inf
	FLOG10		Source is < 0, Source = - inf
	FLOG2		Source is < 0, Source = - inf
	FLOGN		Source is < 0, Source = - inf
	FLOGNP1		Source is < -1, Source is = - inf
	FMOD		FPx is +/- inf or Source is 0, Other Operand is not a
 			NaN
	FMOVE to P	Result Exponent > 999 (Decimal) or k-Factor > +17
	FREM		FPx is +/- inf or Source is 0, Other Operand is not a
 			NaN
	FSCALE		Source is +/- inf, Other Operand is not a NaN
	------------------------------------------------------------------

		Table 3-5.  DZ Exceptions Generated by the FPSP
	------------------------------------------------------------------
    Instruction	  |  	Condition Causing DZ Exception
	------------------------------------------------------------------
	FATANH		Source Operand  = +/- 1
	FLOG10		Source Operand  = 0
	FLOG2		Source Operand  = 0
	FLOGN		Source Operand  = 0
	FLOGNP1		Source Operand  = -1
	FSGLDIV		Source Operand  = 0 and FPn is not a NaN, Infinity, 
			or 0
	------------------------------------------------------------------


		Table 3-6.  DZ Exceptions Generated by the MC68040
	------------------------------------------------------------------
    Instruction	  |	Condition Causing DZ Exception
	------------------------------------------------------------------
	FDIV		Source Operand  = 0 and FPn is not a NaN, Infinity, 
			or 0
	FSDIV		Source Operand  = 0 and FPn is not a NaN, Infinity, 
			or 0
	FDDIV		Source Operand  = 0 and FPn is not a NaN, Infinity, 
			or 0
	------------------------------------------------------------------
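
The divide condition in Table 3-6 is easy to misread, so here it is as
a C predicate.  This is a sketch only; op_class_t and dz_on_divide are
hypothetical names and do not appear in the FPSP sources.

```c
/* Operand classes relevant to the DZ condition. */
typedef enum { OP_ZERO, OP_NORMAL, OP_DENORM, OP_INFINITY, OP_NAN } op_class_t;

/* DZ is signalled when the source (the divisor) is zero and the
   destination FPn (the dividend) is not a NaN, an infinity, or zero. */
int dz_on_divide(op_class_t source, op_class_t fpn)
{
    return source == OP_ZERO
        && fpn != OP_NAN
        && fpn != OP_INFINITY
        && fpn != OP_ZERO;
}
```

Note that 0 / 0 is excluded here: it is reported as an operand error
instead, as listed for FDIV in Table 3-3.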


		Table 3-7. Arithmetic Instructions
	------------------------------------------------------------------
	Name	|	Description
	------------------------------------------------------------------
	FADD*		Add	
	FSUB*		Subtract
	FSADD*+		Single-Precision Add	
	FSSUB*+		Single-Precision Subtract
	FDADD*+		Double-Precision Add	
	FDSUB*+		Double-Precision Subtract
	FMUL*		Multiply	
	FDIV*		Divide
	FSMUL*+		Single-Precision Multiply	
	FSDIV*+		Single-Precision Divide
	FDMUL*+		Double-Precision Multiply	
	FDDIV*+		Double-Precision Divide
	FINT		Integer Part	
	FINTRZ		Integer Part (Truncated)
	FABS*		Absolute Value	
	FNEG*		Negate
	FGETEXP		Get Exponent	
	FGETMAN		Get Mantissa
	FTST*		Test Operand	
	FCMP*		Compare 
	FREM		IEEE Remainder	
	FSCALE		Scale Exponent
	FMOVE*		Move FP data register	
	FSMOVE*		Single-Precision Move
	FDMOVE*		Double-Precision Move	
	FSQRT*		Square Root
	FSSQRT*		Single-Precision Square Root	
	FTWOTOX		2 to the X Power 
	FMOD		Modulo Remainder	
	FDSQRT*		Double-Precision Square Root
	FDMOD		Double-Precision Modulo Remainder	
	FSMOD		Single-Precision Modulo Remainder
	------------------------------------------------------------------
Notes:
 *  The FPSP provides these functions for all decimal data formats, 
    single, double, and extended denormalized data types, and extended 
    unnormalized data types. The MC68040 provides these functions 
    for the remaining formats and types (See page 11 of Reference 2).  
 +  Additional functions which are not provided by the MC68881/MC68882.


   		Table 3-8. Transcendental Instructions
	------------------------------------------------------------------
	Name	|	Description	
	------------------------------------------------------------------
	FCOS		Cosine	
	FSIN		Sine         
	FACOS		Arc Cosine	
	FASIN		Arc Sine
	FCOSH		Hyperbolic Cosine	
	FSINH		Hyperbolic Sine
	FSINCOS		Simultaneous Sine & Cosine	
	FATAN		Arc Tangent
	FTAN		Tangent	
	FATANH		Hyperbolic Arc Tan
	FTANH		Hyperbolic Tangent	
	FLOG10		Log Base 10
	FLOG2		Log Base 2	
	FLOGNP1		Log Base e of (x+1)
	FLOGN		Log Base e	
	FETOXM1		(e to the x Power) -1
	FETOX		e to the x Power	
	FTWOTOX		2 to the x Power
	FTENTOX		10 to the x Power		
	------------------------------------------------------------------
	

			Table 3-9.  FPSP Provided Entry Points
	------------------------------------------------------------------
	Exception Type       |	Vector Table  |  FPSP entry  |  User entry
			     |    (offset)    |   point      |   point
	------------------------------------------------------------------
	F-line unimplemented	vector 11 ($2C)     fline	real_fline
	float instruction
	------------------------------------------------------------------
	Branch or set on	vector 48 ($C0)     bsun	real_bsun
	unordered
	------------------------------------------------------------------
	Inexact			vector 49 ($C4)     inex	real_inex
	------------------------------------------------------------------
	Divide-by-zero		vector 50 ($C8)	    dz		real_dz
	------------------------------------------------------------------
	Underflow		vector 51 ($CC)	    unfl	real_unfl
	------------------------------------------------------------------
	Operand error		vector 52 ($D0)	    operr	real_operr
	------------------------------------------------------------------
	Overflow		vector 53 ($D4)	    ovfl	real_ovfl
	------------------------------------------------------------------
	Signalling Not-A-	vector 54 ($D8)	    snan	real_snan
	Number
	------------------------------------------------------------------
	Unsupported data type	vector 55 ($DC)	    unsupp	
	------------------------------------------------------------------
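
The offsets in Table 3-9 follow from the vector numbers: each vector
table entry is a 32-bit address, so the offset from the vector base
register is four times the vector number.  A minimal C sketch, with
vector_offset as a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of an exception vector from the vector base register.
   Entries are 32-bit handler addresses, hence 4 bytes apart. */
uint32_t vector_offset(unsigned vector)
{
    assert(vector < 256);   /* the vector table holds 256 entries */
    return (uint32_t)vector * 4u;
}
```

With this, vector 11 gives $2C, vector 48 gives $C0, and vectors 49
through 55 give $C4 through $DC.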


                            File: skeleton.sa
                          |---------------------|
                          |                     |       File: x_unfl.sa
                          |                     |    |----------------------|
          VECTOR	  |                     |    |                      |
          TABLE           |                     | /->|fpsp_unfl:            |
        |------------| /->|unfl:                | |  |      .               |
        |            | |  |   jmp fpsp_unfl ----|-/  |      .               |
        |            | |  |                     |    |      .               |
vbr+$cc |addr of unfl|-/  |                     |    | HANDLE NON-MASKABLE  |
        |            |    |real_unfl:  <--------|--\ | EXCEPTION CONDITION  |
        |            |    |          .          |  | |      .               |
        |            |    |          .          |  | |      .               |
        |            |    |          .          |  | |      .               |
        |            |    |   USER TRAP HANDLER |  | |                      |
        |------------|    |          .          |  | | if FPCR Exception    |
			  |          .          |  | |    Byte UNFL bit set,|
                          |          .          |  \-|--  jmp real_unfl     |
                          |rte       .          |    | else rte             |
                          |                     |    |----------------------|
                          |---------------------|
			Figure 3-1 FPSP Entry Points
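
The exit decision at the bottom of fpsp_unfl in Figure 3-1 can be
modelled in C as follows.  This is an illustration and not the FPSP
code, which is written in assembly; the names are hypothetical, and the
sketch assumes the usual FPCR layout in which the exception enable byte
occupies bits 15-8 with the UNFL enable at bit 11.

```c
#include <stdint.h>

/* Assumed position of the UNFL enable bit in the FPCR. */
#define FPCR_ENABLE_UNFL (1u << 11)

/* The two ways out of the handler shown in the figure. */
typedef enum { EXIT_RTE, EXIT_JMP_REAL_UNFL } unfl_exit_t;

/* If the user enabled the underflow exception, control passes to the
   user trap handler at real_unfl; otherwise the FPSP returns directly
   from the exception. */
unfl_exit_t fpsp_unfl_exit(uint32_t fpcr)
{
    return (fpcr & FPCR_ENABLE_UNFL) ? EXIT_JMP_REAL_UNFL : EXIT_RTE;
}
```

The other entry points in Table 3-9 that have a user entry point follow
the same pattern with their own enable bit.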


SECTION 4	FPSP Library Version 

4.1	When to use the FPSP Library Version.

The FPSP Library Version is intended to provide better performance for
transcendental instructions. It gets its performance by avoiding the
overhead involved in F-line trap emulation, as used by the Unimplemented
Instruction Handler. The FPSP Library Version is optional, and user code
needs to be recompiled to make use of it.


4.2	Installation Notes

The library version of the FPSP can be built by running 'make libFPSP.a'
from either the Makefile (for asm syntax) or fpsp.mk (for as syntax).
The 'make convert' step in Makefile will build both kernel and library .s
files from the .sa sources.  Change the SYS= and PREFIX= variables
in Makefile BEFORE running 'make convert'.  Three templates are supplied
for building the library version: GEN, CI5 and R3V6.  The GEN templates
generate entry points for single, double and extended precision routines and
provide the closest emulation of the kernel FPSP.  The CI5 and R3V6 
templates are faster, but discard most of the condition code and control
register handling, and only provide the double precision entry points.

The entry point names are contained in L_LIST.  Change the first 3
entries of each line to suit your system. 


4.3	Differences in the library version:

	1.) Single and Double precision SNAN's will not generate an SNAN
	exception because they are converted to extended precision
	and doing so causes them to turn into non-signalling NAN's.

	Example: facos.d 7ff7_ffff_ffff_ffff snan

	2.) An enabled Inexact exception may not be taken in all cases.

	Example: facos.x 000000000000000000000001 d inex2
	         fint.x 403d_0000_aaaa_aaaa_aaaa_ffff inex2

	3.) The return value in fp0 is undefined when an enabled OPERR or DZ
	exception occurs.  In the kernel FPSP, the destination register
	is unchanged.

	4.) fscale does not return the right result when an underflow occurs.
	The problem is that the t_unfl code in the l_support.sa file
	cannot exactly mimic the kernel FPSP version because the incoming
	FPCR is not in the same place every time.
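
The operand in the example for item 1, 7ff7_ffff_ffff_ffff, is a
signalling NaN because its exponent field is all ones, its fraction is
nonzero, and the most significant fraction bit is clear.  The C sketch
below checks this; is_nan_dbl and is_snan_dbl are hypothetical names.

```c
#include <stdint.h>

/* NaN: exponent field all ones and a nonzero fraction. */
int is_nan_dbl(uint64_t bits)
{
    uint64_t exponent = (bits >> 52) & 0x7ffu;
    uint64_t fraction = bits & 0x000fffffffffffffULL;
    return exponent == 0x7ff && fraction != 0;
}

/* Signalling NaN: a NaN whose top fraction bit (bit 51) is clear. */
int is_snan_dbl(uint64_t bits)
{
    return is_nan_dbl(bits) && (bits & (1ULL << 51)) == 0;
}
```

Setting bit 51 gives the corresponding non-signalling NaN, which is
what the conversion described in item 1 produces.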


4.4	Changes in the library version of FPSP rel 2.3:

  A floating point exception occurred when a transcendental was called twice.
      example: main()
                   { double d, x, y;
                     d = 0.0;
                     x = facosd(0.0);
                     y = facosd(d); 
                    }
        This is fixed in release 2.2 of FPSP.
  A follow-up to the above bug was to restore the fpcr before it unlinks.
        This is fixed in release 2.3 of FPSP.


4.5	Performance

Overall, the library version is twice as fast as the kernel code.


APPENDIX A	BUG TEMPLATE

Use the template below when reporting bugs.  

Any fields designated with an asterisk (*) can be left blank.
When complete, please fax the report to the following 
phone number:
	
	(800) 248-8567

To assist you in filling out this form, a description of each field follows 
the template.
Should you have any questions, please fax us at the above number.
----------------------------------------------------------------------------

Problem# (0-0000)

Key Words

Severity (1,2,3,4,5)

Customer Description

Long Description

System Description

Date Reported

Reported By

Phone

*Resolved?

*Date Resolved

*Who fixed

*Correction Description

*Modules Affected

Problem Release/Load

*Test suite passed

*sccs version control

---------------------------------------------------------------------------

The following is a brief description of how to use the bug report template.

---------------------------------------------------------------------------

Problem# (0-0000) 	You may include a number that you will use internally
			to track this bug.  We will log it but assign our own
			# to track your bug repair.  Please choose one person
			as the individual to send in all bug reports for your
			firm.  This should help avoid confusion and ensure a 
			smooth working relationship.

Key Words 		Indicate the key terms associated with this bug 
			report.  For example: fpcr, denorm

Severity (1,2,3,4,5) 	This indicates the severity of the bug.
			The following descriptions are taken from an AT&T test
			document:
			1:  an error that causes the FPSP to crash and no
			further work can be done.  An error that causes gross
			deviations of results.  Non-waiverable compliance
			violations are also severity 1. 
			2:  an error that represents a substantial deviation in
			the functionality of the FPSP or deviation from 
			IEEE 754 standard.
			3:  an error that represents a deviation in the 
			functionality.  However, the customer is able to 
			implement a workaround to this problem.
			4:  an error that represents a minor deviation or 
			incorrect documentation.
			5:  a request for product enhancement.


Customer Description 	This is a brief description of the problem. 

Long Description 	This is a more detailed description of the problem. 
			This could contain a short code fragment, a 
			suggested fix for the bug or reference a longer 
			file with this type of information in it.              

System Description 	This is a brief description of your system.

Date Reported 		MM/DD/YY                     

Reported By 		Your name                      

Phone 			Enter your telephone number including the 
			area code. 

*Resolved? 		Enter "yes" or "no" only.                          

*Date Resolved 		When the bug is fixed the date it was fixed 
			will be entered by us in this field.

*Who fixed              Name of the person who fixed the bug.

*Correction Description	When the bug is fixed we will enter a 
			description of the fix in this field.

*Modules Affected 	When the bug is fixed we will enter the
			module name. 


Problem Release/Load 	Enter the release or load information in this 
			field. This information should be on the label 
			for the tape that was sent to you. 

*Test suite passed      You do not need to fill out this field.  This is
                        the test suite file that we use to verify bugs
                        and/or fixes.

*sccs version control   You do not need to fill out this field.  This is
                        the sccs version that contains the fix.