NetBSD/sys/arch/m68k/fpsp/README

SECTION 1  	FLOATING POINT SOFTWARE PACKAGE TERMS.

From:		Microprocessor & Memory Technologies Group
		Semiconductor Products Sector
		6501 William Cannon Drive West,
		Mail Station OE33, Austin, Texas 78735-8598

To:  		FLOATING POINT SOFTWARE PACKAGE USERS

Date:		August 27, 1993


1.1	TITLE TO FLOATING POINT SOFTWARE PACKAGE FPSP

Title to the 68040 Floating Point Software Package, all copies
thereof (in whole or in part and in any form), and all rights
therein, including all rights in patents, and copyrights,
applicable thereto, shall remain vested in MOTOROLA.  All
rights, title and interest in the resulting modifications belong
to MOTOROLA except where such modifications (a) are made
solely for use with computer systems manufactured or
distributed by user; (b) are themselves copyrightable; and  (c)
would not constitute a copyright infringement if not licensed
hereunder.


1.2	DISCLAIMER OF  WARRANTY.

THE 68040 FLOATING POINT SOFTWARE PACKAGE is provided on an
"AS IS" basis and without other warranty except as stated herein.

IN NO EVENT SHALL MOTOROLA BE LIABLE FOR INCIDENTAL OR
CONSEQUENTIAL DAMAGES ARISING FROM USE OF THE 68040
FLOATING POINT SOFTWARE PACKAGE.  THIS DISCLAIMER OF
WARRANTY EXTENDS TO ALL USERS OF THE 68040 FLOATING
POINT SOFTWARE PACKAGE AND IS IN LIEU OF ALL WARRANTIES
WHETHER EXPRESS, IMPLIED, OR STATUTORY, INCLUDING
IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR
PARTICULAR PURPOSE.


SECTION 2	Release 2.3 Errata

As of this release, the following may be considered
errata of the 040 (Mask 20D43B, Mask 4D50D, and Mask 5D98D) FPSP:

    1. INEX1 reported by inexact conversion of packed source
       operand for a dyadic instruction will not be reported
       by the 040 upon completion of that instruction.  This
       erratum corresponds to erratum "F5" on the 68040 Errata Sheet.
       Fixed in D98D.

    2. FREM and FMOD with packed operands will occasionally
       differ from the 881/2 results by one ulp in the conversion
       of the packed source operand.

    3. INEX2/AINEX are not calculated in the same manner as in the
       881/882 for some cases in which the result is overflowed.
       Currently, if the operation was an integer move-out, INEX2/AINEX
       is not set for any case.
       In some cases of FSCALE with integer input, the INEX2 bit will not
       be set on an inexact calculation.
       Under extended rounding precision, FSCALE results which underflow
       and are inexact may be incorrectly rounded.  In addition, INEX2
       is not signaled in these cases.

    4. If an FMOVE FPn,FPm (this also applies to FNEG and FABS) is preceded
       by any floating-point operation with a denorm source operand, the FMOVE
       destination (FPm) is incorrectly tagged and may result in silent data
       corruption.  A software fix is included in release 2.2.


SECTION 3  	Software Specification for an MC68040 Floating-
		Point Software Package

The purpose of this section is to provide an overview of
the floating-point software package (FPSP) for the
MC68040.  The FPSP emulates the floating-point
instructions of the MC68881/MC68882 which are not provided
by the MC68040.


3.1  DEFINITIONS, ACRONYMS, AND ABBREVIATIONS

FPn	-	Floating-Point Data Register Source
FPSP	-	Floating-Point Software Package
FPU	-	Floating-Point Unit
FPx	-	Floating-Point Data Register
See the Glossary of Reference 1 for additional
definitions.


3.2  PRODUCT OVERVIEW

The FPSP adds additional floating-point capabilities to
the MC68040.  A subset of the MC6888x instruction set is
executed by the MC68040 on-chip FPU.  The remaining
floating-point instructions are emulated in software by
the FPSP (see Reference 2).  There are two types of FPSP:
one for applications compiled for the MC68881/MC68882 and
another for applications compiled for the MC68040 (see
3.8.2 Packaging).

The FPSP provides:
* Arithmetic and Transcendental Instructions
* Decimal Conversions
* Exception Handlers
* MC68040 Unimplemented Data Type and Data Format Handlers

There are two types of users: (1) end users who run
applications and (2) system integrators who install
the package (see 3.8.3 Site Adaptations).


3.3  GENERAL CONSTRAINTS

The FPSP satisfies the requirements of ANSI/IEEE Std. 754,
the Standard for Binary Floating-Point Arithmetic.  The FPSP
runs existing user code unchanged and is transparent to that
code.  The FPSP is easy to modify and install.  The
performance of the transcendental function routines is
equivalent or superior to that of a 33-MHz
MC68881/MC68882.  The error bound is equivalent or
superior to that of the MC68881/MC68882 (see 3.7.2 Accuracy).


3.4  ASSUMPTIONS AND DEPENDENCIES

The FPSP can be installed into any operating system.  The
MC68040 FPU shall be implemented as described in Reference
2.  Table 3-1 lists the functions provided by the MC68040.


3.5.2  Exceptions

The main goal of the FPSP exception handlers is to provide
the user with an easy path for porting existing MC68882
exception handlers to the MC68040. To that end, the FPSP
provides an entry point such that, once this point is
reached, an IEEE-defined trap condition is known to exist.


3.5.2.1  BSUN - BRANCH/SET ON UNORDERED.

On a trap-enabled condition, the FPSP updates the floating-point
instruction address register (FPIAR) by copying the PC
value in the pre-instruction stack frame to the FPIAR.
Once this is done, the exceptional frame is restored
without clearing the exception, and the program flow goes
to the FPSP provided entry point. At the entry point the
MC68040 is in an exceptional state, ready to execute the
user-supplied exception handler.


3.5.2.2  SNAN - SIGNALING NOT-A-NUMBER.

On a trap-disabled condition, if the destination format is B,
W, or L, the FPSP stores the most significant 8, 16, or
32 bits, respectively, of the SNAN mantissa, with the SNAN
bit set, to the destination. The FPSP discards the
exceptional frame, then returns to the main program flow
without entering the FPSP-provided entry point; hence the
user-provided exception handler is not executed.

On a trap-enabled condition, the FPSP checks whether the
destination format is B, W, or L. If so, the FPSP stores
the most significant 8, 16, or 32 bits, respectively, of
the SNAN mantissa, with the SNAN bit set, to the
destination. The FPSP then restores the exceptional frame
without clearing the exception, and branches to the
FPSP-provided entry point.  At the entry point, the MC68040
is in an exceptional state, ready to execute the
user-supplied exception handler.
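
The integer-destination store described above can be sketched as
follows. This is a minimal illustration in Python, not FPSP code. It
assumes the 64-bit extended-precision mantissa field with bit 62
acting as the SNAN bit; the name "snan_integer_result" is
hypothetical.

```python
SNAN_BIT = 1 << 62  # assumed position of the SNAN bit in the 64-bit mantissa


def snan_integer_result(mantissa: int, size: str) -> int:
    """Return the top 8, 16, or 32 mantissa bits with the SNAN bit set."""
    width = {"B": 8, "W": 16, "L": 32}[size]
    # Set the SNAN bit first, then keep only the most significant bits.
    quieted = (mantissa | SNAN_BIT) & 0xFFFF_FFFF_FFFF_FFFF
    return quieted >> (64 - width)
```

For a byte destination only the top 8 bits survive, so any payload
held in the lower mantissa bits is lost.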


3.5.2.3  OPERR - OPERAND ERROR.

This exception traps through vector number 52.
Table 3-3 shows the operand errors generated by the MC68040.
Table 3-4 shows the operand errors generated by the FPSP.
Note that the FPSP Unimplemented Instruction Handler
detects and adds to the cases in which OPERR exceptions
occur. Refer to Table 3-4 for these specific exception-
causing conditions.

On a trap-disabled condition, the FPSP checks if the
operand error is caused by an FMOVE to a B, W, or L memory
or integer data register destination. If it is caused by
an integer overflow or if the floating-point data register
to be stored contains infinity, the FPSP stores the
largest positive or negative integer that can fit in the
specified destination format size. If the destination is
integer and the floating-point number to be stored is a
NAN, then the 8, 16, or 32 most significant bits of the
NAN significand are stored as the result.

Next, the FPSP checks for a false OPERR condition for an
FMOVE to memory or integer data register. This condition
occurs if the operand is equal to the largest negative
integer representable in its format. The FPSP then stores
the proper result, discards the exceptional frame, and
returns to the main program flow without executing the
user-supplied exception handler.
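
The trap-disabled integer results described above can be sketched as
follows. This is an illustrative Python model under stated
assumptions, not FPSP code: the name "fmove_integer_result" is
hypothetical, in-range values are rounded to nearest only (the other
rounding modes are not modelled), and the NaN case is left out.

```python
import math


def fmove_integer_result(value: float, bits: int) -> int:
    """Model the value stored by an FMOVE to a B, W, or L destination."""
    lo = -(1 << (bits - 1))      # largest negative integer for this size
    hi = (1 << (bits - 1)) - 1   # largest positive integer for this size
    if math.isnan(value):
        raise ValueError("NaN sources store significand bits instead")
    if value > hi:               # integer overflow or +infinity
        return hi
    if value < lo:               # integer overflow or -infinity
        return lo
    # In range, including the false-OPERR case value == lo.
    return round(value)
```

The last branch covers the false OPERR condition: the largest
negative integer is representable, so it is stored unchanged.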

On a trap-enabled condition, the FPSP does the same
functions as the above trap-disabled condition, with the
exception that in the end, the FPSP restores the
exceptional frame without clearing the exception and
branches to the FPSP supplied entry point instead of
returning to the main program flow. At the FPSP supplied
entry point, the MC68040 is in an exceptional state, ready
to execute the user-supplied exception handler.


3.5.2.4  OVFL - OVERFLOW.

This exception traps through vector number 53.

In the trap-disabled case, the FPSP stores a result in the
destination that is determined by the rounding mode, as
follows:

Rounding Mode			Result
	RN		Infinity, with the sign of the intermediate result
	RZ		Largest magnitude number, with the sign of the
			intermediate result
	RM		For positive overflow, largest positive
			number
			For negative overflow, -infinity
	RP		For positive overflow, +infinity
			For negative overflow, largest-magnitude
			negative number

The FPSP then clears the appropriate exception bit in
the frame, restores the non-exceptional frame into
the MC68040, and then returns to the main program flow.
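
The trap-disabled overflow results can be sketched as a small lookup.
This is an illustrative Python model, not FPSP code. It assumes a
double-precision destination, so "sys.float_info.max" stands in for
the largest magnitude number; the name "overflow_result" is
hypothetical.

```python
import math
import sys

LARGEST = sys.float_info.max  # stand-in for the largest finite number


def overflow_result(rounding_mode: str, negative: bool) -> float:
    """Return the default overflow result for RN, RZ, RM, or RP."""
    sign = -1.0 if negative else 1.0
    if rounding_mode == "RN":
        return sign * math.inf
    if rounding_mode == "RZ":
        return sign * LARGEST
    if rounding_mode == "RM":   # round toward minus infinity
        return -math.inf if negative else LARGEST
    if rounding_mode == "RP":   # round toward plus infinity
        return -LARGEST if negative else math.inf
    raise ValueError(f"unknown rounding mode: {rounding_mode}")
```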

In the trap-enabled case, the FPSP actions are identical to
those found in the trap-disabled case, with the exception
that instead of restoring a non-exceptional frame, the
original exceptional frame is restored to the MC68040 and
the FPSP branches to the FPSP supplied entry point. At
this entry point, the MC68040 is in an exceptional state,
ready to execute the user-supplied exception handler.


3.5.2.5  UNFL - UNDERFLOW.

This exception traps through vector number 51.

In the trap-disabled case, the FPSP stores a result in the
destination that is determined by the rounding mode, as
follows:

Rounding Mode			Result
	RN		Zero, with the sign of the intermediate result
	RZ		Zero, with the sign of the intermediate result
	RM		For positive underflow, +zero
			For negative underflow, the smallest
			denormalized negative number
	RP		For positive underflow, the smallest
			denormalized positive number
			For negative underflow, -zero

The FPSP then clears the appropriate exception bit in the
frame, restores the non-exceptional frame into the
MC68040, and then returns to the main program flow.
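
The trap-disabled underflow results can be sketched the same way.
This is an illustrative Python model, not FPSP code. It assumes a
double-precision destination, whose smallest denormalized number is
2**-1074; the name "underflow_result" is hypothetical.

```python
import math

SMALLEST_DENORM = math.ldexp(1.0, -1074)  # smallest denormalized double


def underflow_result(rounding_mode: str, negative: bool) -> float:
    """Return the default underflow result for RN, RZ, RM, or RP."""
    if rounding_mode in ("RN", "RZ"):
        return -0.0 if negative else 0.0
    if rounding_mode == "RM":   # round toward minus infinity
        return -SMALLEST_DENORM if negative else 0.0
    if rounding_mode == "RP":   # round toward plus infinity
        return -0.0 if negative else SMALLEST_DENORM
    raise ValueError(f"unknown rounding mode: {rounding_mode}")
```

The sign of a zero result matters here, so a caller comparing results
should check the sign bit rather than rely on equality with 0.0.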

In the trap-enabled case, the FPSP actions are identical to
those found in the trap-disabled case, with the exception
that instead of restoring a non-exceptional frame, the
original exceptional frame is restored to the MC68040 and
the FPSP branches to the FPSP supplied entry point. At
this entry point, the MC68040 is in an exceptional state,
ready to execute the user-supplied exception handler.


3.5.2.6  DZ - DIVIDE BY ZERO.

Note that the FPSP Unimplemented Instruction Handler detects
and adds to the cases in which DZ exceptions occur. Refer to
Table 3-5 for these specific exception-causing conditions.
Table 3-6 lists the DZ exceptions generated by the MC68040.

The FPSP is not needed for this exception. The user-
supplied exception handler is always entered. A system
call is provided by the FPSP to calculate the exceptional
operand.


3.5.2.7  INEX1/INEX2 - INEXACT RESULT 1/2.

Note that the FPSP Unimplemented Instruction Handler detects
and allows INEX1 exceptions to occur. Furthermore, many new
cases of INEX2 exceptions may be generated by the FPSP
Unimplemented Instruction Handler as well. INEX1 exceptions
trap into the same handler as INEX2 exceptions.

The FPSP is not needed for this exception. The user-
supplied exception handler is always entered.


3.5.3  Instructions

The following paragraphs describe the arithmetic and
transcendental instructions supported by the FPSP.


3.5.3.1  ARITHMETIC.

Table 3-7 shows the arithmetic instructions supported by the FPSP.


3.5.3.2  TRANSCENDENTAL.

Table 3-8 shows the transcendental instructions supported by the FPSP.


3.6  EXTERNAL INTERFACE REQUIREMENTS

For end users the FPSP is transparent; system integrators
integrate the FPSP into their systems (see 3.8.3 Site
Adaptations).

For applications compiled for the MC68881/MC68882 the FPSP
provides kernel routines to support the MC68040
unimplemented instructions.  The MC68040 uses vector
number 11 for the unimplemented instructions. The MC68040
stack frames are different for unimplemented
MC68881/MC68882  instructions and other F-line traps. For
applications compiled for the MC68040 the unimplemented
instructions are contained in a library (to avoid the
F-line trap overhead at runtime).

For both kinds of application, the FPSP provides kernel
routines to support exceptions (vectors 48-54) and
unsupported data types (vector 55).
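
The vector numbers above map to vector table offsets as sketched
below. This is an illustrative Python fragment, not FPSP code. Vectors
11, 51, 52, 53, and 55 are stated in this document; the remaining
assignments and the four-bytes-per-vector offset are assumptions taken
from the standard M68000-family vector table layout. The dictionary
keys are illustrative labels, not labels from skeleton.sa.

```python
FP_VECTORS = {
    "FLINE": 11,   # unimplemented (F-line) instructions
    "BSUN": 48,    # assumed from the M68000-family layout
    "INEX": 49,    # assumed from the M68000-family layout
    "DZ": 50,      # assumed from the M68000-family layout
    "UNFL": 51,
    "OPERR": 52,
    "OVFL": 53,
    "SNAN": 54,    # assumed from the M68000-family layout
    "UNSUPP": 55,  # unsupported data types
}


def vector_offset(name: str) -> int:
    """Return the byte offset of a vector, assuming 4 bytes per entry."""
    return 4 * FP_VECTORS[name]
```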


3.7  PERFORMANCE REQUIREMENTS

The following paragraphs describe the speed, accuracy, and
compatibility  requirements for the FPSP.


3.7.1  Speed

The performance of the transcendental function routines is
equivalent or superior to that of a 33-MHz
MC68881/MC68882.


3.7.2  Accuracy

The following paragraphs describe the accuracy of the
arithmetic instructions, transcendental instructions, and
decimal conversions for the FPSP.


3.7.2.1  ARITHMETIC INSTRUCTIONS.

The error bound is one-half unit in the last place of the
destination format in the round-to-nearest mode, and one
unit in the last place in the other rounding modes.
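
The half-ulp bound for round-to-nearest can be checked with exact
rational arithmetic. The sketch below is an analogy only: it measures
the host's IEEE double-precision addition in Python, not the FPSP, and
the name "error_in_ulps" is hypothetical. It requires Python 3.9 or
later for "math.ulp".

```python
import math
from fractions import Fraction


def error_in_ulps(exact: Fraction, computed: float) -> Fraction:
    """Return |exact - computed| measured in ulps of the computed value."""
    return abs(exact - Fraction(computed)) / Fraction(math.ulp(computed))


a, b = 0.1, 0.2
# Fraction(float) is exact, so this is the true sum of the two doubles.
exact_sum = Fraction(a) + Fraction(b)
err = error_in_ulps(exact_sum, a + b)
```

On an IEEE 754 host in round-to-nearest mode, "err" does not exceed
one half.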


3.7.2.2  TRANSCENDENTAL INSTRUCTIONS.

The error bound is less than 0.502 ulp of double precision.


3.7.2.3  DECIMAL CONVERSIONS.

The error bound is 0.97 unit in the last digit of the
destination precision for the round-to-nearest mode; and
1.47 units in the last digit of the destination precision for
the other rounding modes.


3.7.3  Compatibility

The FPSP transcendental calculation results are not
identical to those of the MC68881/MC68882.  This is because
the algorithms used by the MC68881/MC68882 (CORDIC) cannot
be effectively implemented in software.  All other
calculations are identical. The error bound is equivalent
or superior to that of the MC68881/MC68882.


3.8  OTHER REQUIREMENTS

The following paragraphs describe other requirements for
the FPSP, such as maintainability, packaging, and site
adaptations.


3.8.1  Maintainability

The speed requirements have forced writing most of the
package in assembly language.


3.8.2  Packaging

There are two versions of the FPSP. The FPSP Kernel
Version is used to execute pre-existing user object code
written for the MC68882. This is installed as part of the
operating system. User applications need not be recompiled
or modified in any way once the FPSP Kernel Version is
installed.

The FPSP Library Version is used to compile code that uses
only the MC68040-implemented floating-point instructions.
The library version incurs less overhead than the FPSP
Kernel Version. Other features of this library include
ABI compliance and IEEE exception-reporting compliance.
It is not, however, UNIX exception-reporting compliant.
The FPSP is not yet available in library format.


3.8.3  Site Adaptations

Some of the entries in the vector table need to point to
entry points within the FPSP Kernel Version.  For those
vectors the FPSP displaces, an entry point is provided to
replace that which it takes. Note that former MC68882
floating-point exception handlers need to go through minor
modifications to account for the differences between the
MC68040 and MC68882 floating point exceptional state
frames. The FPSP provides skeleton code for each floating-
point exception handler to aid in porting the MC68882
floating-point exception handlers.

For systems and applications that never set any of the
exception bits in the FPCR, or if the former MC68882
floating-point exception handlers only contain minimum
code needed to clear the exception and return, no work is
needed and the FPSP is a drop-in package.

The FPSP Library Version needs to "intercept" the
appropriate math library calls which use MC68882
transcendental instructions. Since each site has different
naming conventions, the FPSP subroutines need to be
renamed accordingly and recompiled. The resident compiler
also needs to provide a library path search pattern such
that the FPSP is given a chance to resolve those
transcendental instructions.


3.8.4  Stack Area Usage

To achieve code re-entrancy, the FPSP allocates
context-sensitive variables on the stack. The FPSP does not
require more than 512 bytes on the stack per context. This
may be an installation concern for UNIX applications in
which the UBLOCK area is limited and the system stack
resides there.


3.8.5  ROM-based applications

One of the goals of the FPSP Kernel Version is to be able
to fit in a read-only space of no more than 64 KBytes.
Two main sections need to reside in ROM: the text section
and the initialized data section. The text section accounts
for 65% while the initialized data section accounts for
35%.


3.9  FPSP KERNEL VERSION INSTALLATION NOTES

The following paragraphs provide MC68882 users with an
understanding of the issues involved in porting the
FPSP into existing MC68030/MC68882 systems. The actual
installation is then explained.


3.9.1  Differences between the MC68040 and MC68882
	Floating-point Exception Handling

The main reason for providing the FPSP is to provide
MC68882 compatibility. An installer who already understands
the main differences between the MC68882 and MC68040 in the
area of floating-point exception handlers may skip this
section and go to the next section.

There are three areas that differ between the MC68040 and
MC68882.

The first difference is that of unimplemented
instructions. The FPSP handles this by means of the F-line
exception handling. This means that if there is an
existing F-line handler, the FPSP replaces the existing F-
line exception handler, but provides an alternate entry
point for the existing F-line handler.

The second difference is unsupported data types. The
MC68040 provides a new entry point in the vector table,
therefore no existing handler is replaced by the FPSP.
There are no installation issues here.

The third difference is in floating-point exception
handling. This issue is more involved and requires
further explanation.

The IEEE standard allows the user to enable or disable
each floating-point exception individually. If an
exceptional condition occurs, the IEEE standard defines a
specific action for the trap-disabled condition, and it also
defines certain specific actions for a trap-enabled
condition. The IEEE standard, however, does not constrain
the implementation of exception handling; both software
and hardware can be used.

The MC68882 implements IEEE-compliant exception handling
entirely in hardware. For example, a user-disabled
(trap-disabled) exception causes the IEEE-defined actions
for user-disabled exception handling to occur.  Similarly,
user-enabled exceptions cause the MC68882 to take the
exception as defined by the IEEE trap-enabled case.

The MC68040 provides full IEEE trap-disabled exception
handling compliance for the INEX and DZ exceptions. Just
like the MC68882, the MC68040 takes these exceptions only
for an IEEE trap-enabled condition. Existing MC68882
handlers have a minimum code requirement as defined by the
MC68882 User's Manual. Like the MC68882 handlers, the
MC68040 handlers have a minimum code requirement as well.
The FPSP provides this minimum code.

The MC68040 does not provide full IEEE exception
compliance on IEEE-defined trap-disabled conditions for
the following exceptions: OVFL, UNFL, OPERR, SNAN. For
these exceptions, the MC68040 may take an exception even
on an IEEE-defined trap-disabled condition. Each
FPSP-provided exception handler decides whether its job is
to implement IEEE trap-disabled exception compliance (and
therefore not execute the user-supplied exception handler)
or to implement IEEE trap-enabled exception compliance
(hence executing the user-supplied exception handler). The
FPSP provides a user entry point so that when this entry
point is taken, an IEEE-defined trap-enabled condition has
definitely occurred. An exception handler written for the
MC68882 needs to be modified to account for MC68040 stack
differences, and then placed at this user entry point.

As with the MC68882, there is a minimum code requirement
for the MC68040 handler, but this minimum code is provided
by the FPSP.
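
The trap-disabled versus trap-enabled decision described above can be
sketched as a test of the FPCR exception enable bits. This is an
illustrative Python model, not the FPSP's actual logic. The bit
positions are an assumption taken from the MC68881/MC68882 and MC68040
FPCR exception enable byte, and the names "is_trap_enabled" and
"fpsp_dispatch" are hypothetical.

```python
# Assumed FPCR exception enable bit positions (bits 15-8).
FPCR_ENABLE_BIT = {
    "BSUN": 15, "SNAN": 14, "OPERR": 13, "OVFL": 12,
    "UNFL": 11, "DZ": 10, "INEX2": 9, "INEX1": 8,
}


def is_trap_enabled(fpcr: int, exception: str) -> bool:
    """Return True if the user has enabled the trap for this exception."""
    return bool((fpcr >> FPCR_ENABLE_BIT[exception]) & 1)


def fpsp_dispatch(fpcr: int, exception: str) -> str:
    """Pick the path an FPSP handler takes for OVFL, UNFL, OPERR, or SNAN."""
    if is_trap_enabled(fpcr, exception):
        return "user entry point"   # user-supplied handler runs
    return "default result"         # FPSP stores the IEEE default and returns
```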

From an installation perspective, the OVFL, UNFL, OPERR,
SNAN exception handlers are replaced by the FPSP handlers,
but the FPSP provides an entry point so that MC68882-like
exception handlers may be written. Furthermore, minimum
code is provided by the FPSP and can be used as a
template.

The BSUN exception is different in that unlike the
previous exception handlers, the difference between the
MC68882 and MC68040 resides in the IEEE-defined trap
enabled case. The FPSP handles this by performing the
patch needed for MC68882 compatibility, and then restoring
the exception to the MC68040 without performing the
necessary steps to clear the BSUN exception. The
exceptional frame is restored into the MC68040 and the
FPSP branches to the user entry point provided. At this
entry point, an MC68882-like exception handler written for
the MC68040 is executed without having to worry about the
built-in incompatibility. Although this method incurs a
performance hit, it frees the author of the user-defined
exception handler from having to write the code needed to
implement MC68882 code compatibility. As with the other
exception handlers, the FPSP provides the minimum code needed.

In summary, the FPSP replaces the following exception
handlers and provides an entry-point for MC68882-like
exception handlers for these exceptions: OVFL, UNFL,
OPERR, SNAN, BSUN, F-line.

The FPSP is not needed for the INEX and DZ exception
handlers, and these exception handlers just need to be
MC68882-like.


3.9.2  Vector Table

The entry point into the FPSP is achieved by having the
appropriate vector table offset point to a specified entry
point within the FPSP. For simplicity, all of the FPSP
main entry points are found in the file skeleton.sa.
Table 3-9 shows the vector table offset and the
appropriate labels within the file skeleton.sa that it
needs to point to.  Figure 3-1 shows a flowchart of the
entry points.

Once the entry point is reached, the user may add some
user-specific code prior to jumping to the FPSP routines
(FPSP routines are prefixed by "fpsp_").  After the jump to
the FPSP routines, the FPSP performs its function and then
jumps to the FPSP supplied entry points (if needed) found
in the file skeleton.sa.


3.9.3   FPSP Supplied Entry Points

To replace the vector table entries it displaces, the FPSP
provides an alternate entry point. For simplicity, all of
the FPSP supplied entry points are found in the file
skeleton.sa. The FPSP-supplied F-line exception entry
point is straightforward. An F-line exception handler
written for an MC68030 can be placed here without
modification. The Unsupported Data Type exception handler
is newly defined; it does not displace any MC68030/MC68882
exception handler. Therefore, the FPSP does not provide an
alternate entry point for this exception.

The alternate entry points follow a naming convention in
which the name of the exception handler is prefixed by
"real_". For instance, the entry point for the user-supplied
BSUN exception handler is named "real_bsun".

Floating-point exception handlers (BSUN, OPERR, SNAN, DZ,
OVFL, and UNFL) previously written for an MC68882-based
system need to be modified slightly for use with the
MC68040. Once modified, they are placed at the
FPSP-provided entry points.


3.9.4   Extract the Hardware Independent portion of
the MC68882 handlers

To modify the existing MC68882 handlers, all of the code
used in accessing the MC68882 generated frame needs to be
stripped off. The code used in clearing an MC68882
exception (setting bit 27 of the BIU Flag) needs to be
stripped off as well. Only the hardware independent
portions of the MC68882 handlers may be used.

To aid the installer in rewriting the MC68882 exception
handlers, the file skeleton.sa provides the minimum code
necessary to clear the exception once the specific handler
is entered.

Once the hardware-independent portion is written, the
modified MC68882 handlers need to be integrated into the
portion of the code which is hardware dependent. The
minimum code needed by each exception handler is already
provided by the FPSP within the file skeleton.sa. The
following section describes the mechanics behind the
written code.


3.9.5  MC68040 Minimum Exception Code

This section describes the minimum requirements for the
user-supplied exception handlers. As mentioned in the
previous sections, these minimum handlers are provided as
part of the package, and this section is strictly for the
user's information.

As with the MC68882, if all exceptions are always
disabled, no minimum code is necessary since the FPSP
guarantees that these FPSP provided entry points are never
entered on a trap-disabled condition. Therefore, for
existing systems that do not provide exception handlers
for the MC68882, it is likely that the assumption that all
exceptions are always disabled is valid, and therefore no
user-defined MC68040 exception handlers are needed either.

The above paragraph should not be interpreted to mean that
the FPSP-provided exception handlers are unnecessary. On
the contrary, the FPSP-provided exception handlers are
needed, and they provide the entry points for user-defined
exception handlers.  Whether or not the user-defined
MC68040 exception handlers are needed is the issue being
discussed.

Assuming that the exceptions may be enabled at some point,
the minimum exception handler is similar to that defined
for an MC68882. As with the MC68882, the MC68040 requires
that the first floating-point instruction be an FSAVE.
Unlike the MC68882, the MC68040 does not always require an
equivalent FRESTORE. For an E1 exception, only the FSAVE
is required; the state frame may be discarded. The E3
exception is more similar to that found in an MC68882. As
with the MC68882, the E3 exception requires an FSAVE, an
instruction that clears the exception in the resulting
FSAVE stack, followed by an FRESTORE.

If both E3 and E1 exceptions exist at the same time, then
the exception is handled as though it were an E3
exception. Afterward, the MC68040 re-traps to handle the
E1 exception.

The E3 exception can only be reported by the following
exception handlers: OVFL, UNFL, INEX. For these exception
handlers, this is the minimum code requirement:
	1) FSAVE
	2) if E3 bit set,  goto (4), else goto (3)
	3) E1 exception, throw away stack and RTE
	4) Clear E3 bit, FRESTORE, RTE

The E3 exception cannot be reported by the following
exception handlers: SNAN, OPERR and DZ. Since only an E1
exception needs to be handled here, this is the minimum
code requirement:
	1) FSAVE
	2) throw away stack and RTE

For the BSUN exception handler, the minimum code
requirement is:
	1) FSAVE
	2) Perform one of the four methods described in the
	MC68040 User's Manual
	3) throw away stack and RTE

If the above minimum code requirements are not met, an
infinitely looping exception sequence occurs.
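
The three minimum code sequences above can be summarized in one
sketch. This is an illustrative Python model that only lists the
steps; it is not the code in skeleton.sa, and the name
"minimum_handler_steps" is hypothetical.

```python
E3_CAPABLE = {"OVFL", "UNFL", "INEX"}   # handlers that may see an E3 exception
E1_ONLY = {"SNAN", "OPERR", "DZ"}       # handlers that only see E1 exceptions


def minimum_handler_steps(exception: str, e3_set: bool = False) -> list:
    """Return the minimum step sequence for a user-supplied handler."""
    steps = ["FSAVE"]
    if exception in E3_CAPABLE:
        if e3_set:
            steps += ["clear E3 bit in saved frame", "FRESTORE"]
        else:
            steps.append("discard saved frame")   # E1 exception
    elif exception in E1_ONLY:
        steps.append("discard saved frame")
    elif exception == "BSUN":
        steps += ["apply one BSUN method from the MC68040 User's Manual",
                  "discard saved frame"]
    else:
        raise ValueError(f"unknown exception: {exception}")
    steps.append("RTE")
    return steps
```

Every path begins with FSAVE and ends with RTE; only the E3 path
restores the saved frame.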


3.9.6  Mem_read and Mem_write

The mem_write and mem_read subroutines are used by the
FPSP to read and write from user space. These routines
perform a UNIX system call to lcopyin and lcopyout. The
FPSP provides a simple version of lcopyin and lcopyout for
non-UNIX applications. Installation on UNIX-based systems
requires that the FPSP-provided lcopyin and lcopyout be
deleted or commented out. For simplicity, these
subroutines are found in the file skeleton.sa.

The production version of the FPSP is fully re-entrant. If
a page fault occurs on either a mem_read or mem_write, the
operating system may perform a page-in operation and still
allow other processes to use the FPSP.


3.9.7  Increasing F-line Handler Performance

The FPSP was written to handle all possible cases of
MC68040 vs MC68030/MC68882 problem areas. Any performance
improvement in this handler increases floating-point
performance. The F-line handling may be made quicker by
pointing the vector table entry directly into the label
"fpsp_unimp" found in the file x_unimp.sa, if these
conditions are met:

	1) The system never has to execute an FMOVECR
	instruction in which bits 0 to 5 of the F-line word are
	non-zero.

	2) An alternate F-line entry point is unnecessary.

This optimization saves a total of three instructions (1
bra, 1 cmpi, 1 beq).
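
Condition 1 can be expressed as a test on the instruction words. The
sketch below is an illustrative Python model, not the FPSP's check. It
assumes the standard FMOVECR encoding, in which the top six bits of
the extension word are 010111; the name "needs_slow_fline_path" is
hypothetical.

```python
FMOVECR_EXT_MASK = 0xFC00   # top six bits of the extension word
FMOVECR_EXT_BITS = 0x5C00   # assumed FMOVECR pattern (binary 010111)


def needs_slow_fline_path(opword: int, extword: int) -> bool:
    """Return True for an FMOVECR whose F-line word has bits 0-5 non-zero."""
    is_fmovecr = (extword & FMOVECR_EXT_MASK) == FMOVECR_EXT_BITS
    return is_fmovecr and (opword & 0x3F) != 0
```

A system that can guarantee this function never returns True for its
code meets condition 1.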


3.10  REFERENCES

1.	MC68881UM/AD, MC68881/MC68882 Motorola Floating-Point
	Coprocessor User's Manual, Motorola Inc., 1989.
2.	M68040UM/AD, M68040 32-Bit Microprocessor User's
	Manual, Motorola Inc., 1992.
3.	MC68020UM/AD, MC68020 32-Bit Microprocessor User's
	Manual, Motorola Inc., 1990.
4.	MC68030UM/AD, MC68030 Enhanced 32-Bit Microprocessor
	User's Manual, Motorola Inc., 1990.
5.	ANSI/IEEE Std. 754-1985, Standard for Binary
	Floating-Point Arithmetic.
6.	M68000PM/AD REV. 1, Programmer's Reference Manual,
	Motorola Inc., 1992.


3.11	Tables and Figures

		Table 3-1. Functions Provided by MC68040
	------------------------------------------------------------------
	Name         |	Description
	------------------------------------------------------------------
	FMOVE		Move to FPU
	FMOVEM		Move Multiple Registers
	FSMOVE		Single-Precision Move
	FDMOVE		Double-Precision Move
	FCMP		Compare
	FABS		Absolute Value
	FSABS		Single-Precision Absolute Value
	FDABS		Double-Precision Absolute Value
	FTST		Test
	FNEG		Negate
	FSNEG		Single-Precision Negate
	FDNEG		Double-Precision Negate
	FADD		Add
	FSUB		Subtract
	FDIV		Divide
	FMUL		Multiply
	FBcc		Branch Conditionally
	FScc		Set According to Condition
	FDBcc		Test Cond, Dec and Branch
	FTRAPcc		Trap Conditionally
	FSADD		Single-Precision Add
	FSSUB		Single-Precision Subtract
	FSMUL		Single-Precision Multiply
	FSDIV		Single-Precision Divide
	FDADD		Double-Precision Add
	FDSUB		Double-Precision Subtract
	FDMUL		Double-Precision Multiply
	FDDIV		Double-Precision Divide
	FSQRT		Square Root
	FSSQRT		Single-Precision Square Root
	FDSQRT		Double-Precision Square Root
	FNOP		No Operation
	FSAVE		Save Internal State
	FRESTORE	Restore Internal State
	FSGLDIV 	Single-Precision Divide (68882 compatible)
	FSGLMUL 	Single-Precision Multiply (68882 compatible)
	------------------------------------------------------------------


		Table 3-2. Support for Data Types and Data Formats
	------------------------------------------------------------------
        	     |			Data Formats
	             |----------------------------------------------------
	Data Types   |	SGL  |  DBL  |  EXT  |  Dec  |  Byte  |  Word | Long
	------------------------------------------------------------------
	Norm		 *	 *	 *	 @	 *	 *	 *
	Zero		 *	 *	 *	 @	 *	 *	 *
	Infinity	 *	 *	 *	 @
	NaN		 *	 *	 *	 @
	Denorm		 #	 #	 @	 @
	Unnorm				 @	 @
	------------------------------------------------------------------
Notes:
 @  = 	supported by FPSP
 *  = 	supported by the MC68040 FPU
 #  = 	supported by FPSP after being converted to extended precision by
	MC68040


		Table 3-3. Operand Errors Handled by the MC68040
	------------------------------------------------------------------
   Instruction   |	Conditions Causing Operand Error
	------------------------------------------------------------------
	FADD		( + inf )+( - inf ) or (- inf )+( + inf )
	FSUB		( + inf )-( + inf ) or (- inf )-(- inf )
	FMUL		( 0 ) x ( inf ) or ( inf ) x ( 0 )
	FDIV	 	0 / 0 or inf / inf
	FMOVE.BWL	Integer overflow, Source is NaN, or Source is inf
	FSQRT		Source < 0, Source = - inf
	------------------------------------------------------------------


		Table 3-4. Operand Errors Generated by the FPSP
	------------------------------------------------------------------
    Instruction	  |  	Condition Causing Operand Error
	------------------------------------------------------------------
	FSADD		( + inf )+( - inf ) or ( - inf )+( + inf )
	FDADD		( + inf )+( - inf ) or ( - inf )+( + inf )
	FSSUB		( + inf )-( + inf ) or ( - inf )-( - inf )
	FDSUB		( + inf )-( + inf ) or ( - inf )-( - inf )
	FSMUL		( 0 ) x ( inf ) or ( inf ) x ( 0 )
	FDMUL		( 0 ) x ( inf ) or ( inf ) x ( 0 )
	FSDIV		0 / 0 or inf / inf
	FDDIV		0 / 0 or inf / inf
	FCOS		Source is +/- inf
	FSIN		Source is +/- inf
	FTAN		Source is +/- inf
	FACOS		Source is +/- inf, > +1, or < -1
	FASIN		Source is +/- inf, > +1, or < -1
	FATANH		Source is > +1, or < -1, Source = +/- inf
	FSINCOS		Source is +/- inf
	FGETEXP		Source is +/- inf
	FGETMAN		Source is +/- inf
	FLOG10		Source is < 0, Source = - inf
	FLOG2		Source is < 0, Source = - inf
	FLOGN		Source is < 0, Source = - inf
	FLOGNP1		Source is < -1, Source is = - inf
	FMOD		FPx is +/- inf or Source is 0, Other Operand is not a
 			NaN
	FMOVE to P	Result Exponent > 999 (Decimal) or k-Factor > +17
	FREM		FPx is +/- inf or Source is 0, Other Operand is not a
 			NaN
	FSCALE		Source is +/- inf, Other Operand is not a NaN
	------------------------------------------------------------------

		Table 3-5.  DZ Exceptions Generated by the FPSP
	------------------------------------------------------------------
    Instruction	  |  	Condition Causing DZ Exception
	------------------------------------------------------------------
	FATANH		Source Operand  = +/- 1
	FLOG10		Source Operand  = 0
	FLOG2		Source Operand  = 0
	FLOGN		Source Operand  = 0
	FLOGNP1		Source Operand  = -1
	FSGLDIV		Source Operand  = 0 and FPn is not a NaN, Infinity,
			or 0
	------------------------------------------------------------------


		Table 3-6.  DZ Exceptions Generated by the MC68040
	------------------------------------------------------------------
    Instruction	  |	Condition Causing DZ Exception
	------------------------------------------------------------------
	FDIV		Source Operand  = 0 and FPn is not a NaN, Infinity,
			or 0
	FSDIV		Source Operand  = 0 and FPn is not a NaN, Infinity,
			or 0
	FDDIV		Source Operand  = 0 and FPn is not a NaN, Infinity,
			or 0
	------------------------------------------------------------------
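
The three divide instructions in Table 3-6 share one DZ condition.  As
a rough illustration (not FPSP code; the function name is hypothetical
and host doubles stand in for FPU operands), it can be written as:

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: true when a divide meets the DZ condition of
 * Table 3-6.  dst is the FPn operand (dividend), src the divisor. */
static bool fdiv_dz(double dst, double src)
{
    return src == 0.0          /* divisor is +0 or -0             */
        && !isnan(dst)         /* FPn is not a NaN                */
        && !isinf(dst)         /* FPn is not an infinity          */
        && dst != 0.0;         /* FPn is not zero (0 / 0 is OPERR) */
}
```

The exclusions matter: 0 / 0 is reported as an operand error (Table
3-3), so fdiv_dz(0.0, 0.0) is false while fdiv_dz(1.0, 0.0) is true.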


		Table 3-7. Arithmetic Instructions
	------------------------------------------------------------------
	Name	|	Description
	------------------------------------------------------------------
	FADD*		Add
	FSUB*		Subtract
	FSADD*+		Single-Precision Add
	FSSUB*+		Single-Precision Subtract
	FDADD*+		Double-Precision Add
	FDSUB*+		Double-Precision Subtract
	FMUL*		Multiply
	FDIV*		Divide
	FSMUL*+		Single-Precision Multiply
	FSDIV*+		Single-Precision Divide
	FDMUL*+		Double-Precision Multiply
	FDDIV*+		Double-Precision Divide
	FINT		Integer Part
	FINTRZ		Integer Part (Truncated)
	FABS*		Absolute Value
	FNEG*		Negate
	FGETEXP		Get Exponent
	FGETMAN		Get Mantissa
	FTST*		Test Operand
	FCMP*		Compare
	FREM		IEEE Remainder
	FSCALE		Scale Exponent
	FMOVE*		Move FP data register
	FSMOVE*		Single-Precision Move
	FDMOVE*		Double-Precision Move
	FSQRT*		Square Root
	FSSQRT*		Single-Precision Square Root
	FTWOTOX		2 to the X Power
	FMOD		Modulo Remainder
	FDSQRT*		Double-Precision Square Root
	FDMOD		Double-Precision Modulo Remainder
	FSMOD		Single-Precision Modulo Remainder
	------------------------------------------------------------------
Notes:
 *  The FPSP provides these functions for all decimal data formats,
    single, double, and extended denormalized data types, and extended
    unnormalized data types. The MC68040 provides these functions
    for the remaining formats and types (See page 11 of Reference 2).
 +  Additional functions which are not provided by the MC68881/MC68882.


   		Table 3-8. Transcendental Instructions
	------------------------------------------------------------------
	Name	|	Description
	------------------------------------------------------------------
	FCOS		Cosine
	FSIN		Sine
	FACOS		Arc Cosine
	FASIN		Arc Sine
	FCOSH		Hyperbolic Cosine
	FSINH		Hyperbolic Sine
	FSINCOS		Simultaneous Sine & Cosine
	FATAN		Arc Tangent
	FTAN		Tangent
	FATANH		Hyperbolic Arc Tan
	FTANH		Hyperbolic Tangent
	FLOG10		Log Base 10
	FLOG2		Log Base 2
	FLOGNP1		Log Base e of (x+1)
	FLOGN		Log Base e
	FETOXM1		(e to the x Power) -1
	FETOX		e to the x Power
	FTWOTOX		2 to the x Power
	FTENTOX		10 to the x Power
	------------------------------------------------------------------


			Table 3-9.  FPSP Provided Entry Points
	------------------------------------------------------------------
	Exception Type       |	Vector Table  |  FPSP entry  |  User entry
			     |    (offset)    |   point      |   point
	------------------------------------------------------------------
	F-line unimplemented	vector 11 ($2C)     fline	real_fline
	float instruction
	------------------------------------------------------------------
	Branch or set on	vector 48 ($C0)     bsun	real_bsun
	unordered
	------------------------------------------------------------------
	Inexact			vector 49 ($C4)     inex	real_inex
	------------------------------------------------------------------
	Divide-by-zero		vector 50 ($C8)	    dz		real_dz
	------------------------------------------------------------------
	Underflow		vector 51 ($CC)	    unfl	real_unfl
	------------------------------------------------------------------
	Operand error		vector 52 ($D0)	    operr	real_operr
	------------------------------------------------------------------
	Overflow		vector 53 ($D4)	    ovfl	real_ovfl
	------------------------------------------------------------------
	Signalling Not-A-	vector 54 ($D8)	    snan	real_snan
	Number
	------------------------------------------------------------------
	Unsupported data type	vector 55 ($DC)	    unsupp
	------------------------------------------------------------------
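
The offsets in Table 3-9 follow from the M68000 family vector table
layout, in which each vector holds one four-byte address, so the offset
from the vector base register is four times the vector number.  A
minimal C sketch (the function name is hypothetical):

```c
#include <stdint.h>

/* Hypothetical helper: byte offset of an exception vector from the
 * vector base register (VBR).  Each vector is one 32-bit address. */
static uint32_t vector_offset(unsigned vector)
{
    return (uint32_t)vector * 4u;
}
```

For example, vector_offset(51) yields $CC, the underflow entry that
Figure 3-1 shows as vbr+$cc.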


                            File: skeleton.sa
                          |---------------------|
                          |                     |       File: x_unfl.sa
                          |                     |    |----------------------|
          VECTOR	  |                     |    |                      |
          TABLE           |                     | /->|fpsp_unfl:            |
        |------------| /->|unfl:                | |  |      .               |
        |            | |  |   jmp fpsp_unfl ----|-/  |      .               |
        |            | |  |                     |    |      .               |
vbr+$cc |addr of unfl|-/  |                     |    | HANDLE NON-MASKABLE  |
        |            |    |real_unfl:  <--------|--\ | EXCEPTION CONDITION  |
        |            |    |          .          |  | |      .               |
        |            |    |          .          |  | |      .               |
        |            |    |          .          |  | |      .               |
        |            |    |   USER TRAP HANDLER |  | |                      |
        |------------|    |          .          |  | | if FPCR Exception    |
			  |          .          |  | |    Byte UNFL bit set,|
                          |          .          |  \-|--  jmp real_unfl     |
                          |rte       .          |    | else rte             |
                          |                     |    |----------------------|
                          |---------------------|
			Figure 3-1 FPSP Entry Points
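
The final decision in the fpsp_unfl box of Figure 3-1 depends on the
UNFL bit of the FPCR exception enable byte (FPCR bits 15-8).  The C
sketch below models only that decision, not the handler work that
precedes it; the macro, enum and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* FPCR exception enable byte, bit 15 down to bit 8:
 * BSUN, SNAN, OPERR, OVFL, UNFL, DZ, INEX2, INEX1. */
#define FPCR_EN_UNFL (1u << 11)

/* Hypothetical names for the two exits shown in Figure 3-1. */
enum unfl_exit { UNFL_EXIT_RTE, UNFL_EXIT_REAL_UNFL };

/* Choose the exit: jump to the user handler only when the user has
 * enabled the underflow exception, otherwise return directly. */
static enum unfl_exit unfl_dispatch(uint32_t fpcr)
{
    return (fpcr & FPCR_EN_UNFL) ? UNFL_EXIT_REAL_UNFL : UNFL_EXIT_RTE;
}
```

With only DZ enabled (FPCR = $0400) the sketch returns UNFL_EXIT_RTE;
with UNFL enabled (FPCR = $0800) it returns UNFL_EXIT_REAL_UNFL.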


SECTION 4	FPSP Library Version

4.1	When to use the FPSP Library Version.

The FPSP Library Version is intended to provide better performance for
transcendental instructions. It gets its performance by avoiding the
overhead involved in F-line trap emulation, as used by the Unimplemented
Instruction Handler. The FPSP Library Version is optional, and user code
needs to be recompiled to make use of it.


4.2	Installation Notes

The library version of the FPSP can be built by running 'make libFPSP.a'
using either Makefile (for asm syntax) or fpsp.mk (for as syntax).
The 'make convert' step in Makefile will build both kernel and library .s
files from the .sa sources.  Change the SYS= and PREFIX= variables
in Makefile BEFORE running 'make convert'.  Three templates are supplied
for building the library version: GEN, CI5 and R3V6.  The GEN templates
generate entry points for single, double and extended precision routines and
provide the closest emulation of the kernel FPSP.  The CI5 and R3V6
templates are faster, but discard most of the condition code and control
register handling, and only provide the double precision entry points.

The entry point names are contained in L_LIST.  Change the first 3
entries of each line to suit your system.


4.3	Differences in the library version:

	1.) Single and Double precision SNANs will not generate an SNAN
	exception because they are converted to extended precision,
	and doing so turns them into non-signalling NaNs.

	Example: facos.d 7ff7_ffff_ffff_ffff snan

	2.) An enabled Inexact exception may not be taken in all cases.

	Example: facos.x 000000000000000000000001 d inex2
	         fint.x 403d_0000_aaaa_aaaa_aaaa_ffff inex2

	3.) The return value in fp0 is undefined when an enabled OPERR or DZ
	exception occurs.  In the kernel FPSP, the destination register
	is unchanged.

	4.) fscale does not return the correct result when an underflow occurs.
	The problem is that the t_unfl code in the l_support.sa file
	cannot exactly mimic the kernel FPSP version because the incoming
	FPCR is not in the same place every time.
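
Item 1 above turns on the quiet bit, the most significant fraction bit
of a NaN.  The operand in the facos.d example, 7ff7_ffff_ffff_ffff, has
an all-ones exponent, a nonzero fraction and a clear quiet bit, which
makes it signalling.  The C sketch below works on raw 64-bit double
patterns only and does not model the conversion to extended precision
itself; the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define DBL_EXP_MASK   0x7ff0000000000000ULL  /* 11 exponent bits       */
#define DBL_FRAC_MASK  0x000fffffffffffffULL  /* 52 fraction bits       */
#define DBL_QUIET_BIT  0x0008000000000000ULL  /* top fraction bit (51)  */

/* True when the pattern encodes a signalling NaN. */
static bool dbl_is_snan(uint64_t bits)
{
    return (bits & DBL_EXP_MASK) == DBL_EXP_MASK   /* exponent all ones */
        && (bits & DBL_FRAC_MASK) != 0             /* not an infinity   */
        && (bits & DBL_QUIET_BIT) == 0;            /* quiet bit clear   */
}

/* Set the quiet bit, giving the non-signalling form of the NaN. */
static uint64_t dbl_quieten(uint64_t bits)
{
    return bits | DBL_QUIET_BIT;
}
```

Applied to the example operand, dbl_is_snan() is true, and
dbl_quieten() gives 7fff_ffff_ffff_ffff, which is no longer signalling.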


4.4	Changes in the library version of FPSP rel 2.3:

  A floating point exception occurred when a transcendental was called
  twice.
      example: main()
                   { double d, x, y;
                     d = 0.0;
                     x = facosd(0.0);
                     y = facosd(d);
                    }
        This is fixed in release 2.2 of FPSP.
  A follow-up to the above fix restores the FPCR before the routine unlinks.
        This is fixed in release 2.3 of FPSP.


4.5	Performance

Overall, the library version is twice as fast as the kernel code.


APPENDIX A	BUG TEMPLATE

Use the template below when reporting bugs.

Any fields designated with an asterisk (*) can be left blank.
When complete, please fax the report to the following
phone number:

	(800) 248-8567

To assist you in filling out this form, a description of each field follows
the template.
Should you have any questions, please fax us at the above number.
----------------------------------------------------------------------------

Problem# (0-0000)

Key Words

Severity (1,2,3,4,5)

Customer Description

Long Description

System Description

Date Reported

Reported By

Phone

*Resolved?

*Date Resolved

*Who fixed

*Correction Description

*Modules Affected

Problem Release/Load

*Test suite passed

*sccs version control

---------------------------------------------------------------------------

The following is a brief description of how to use the bug report template.

---------------------------------------------------------------------------

Problem# (0-0000) 	You may include a number that you will use internally
			to track this bug.  We will log it but assign our own
			# to track your bug repair.  Please choose one person
			as the individual to send in all bug reports for your
			firm.  This should help avoid confusion and ensure a
			smooth working relationship.

Key Words 		Indicate the key terms associated with this bug
			report.  For example: fpcr, denorm

Severity (1,2,3,4,5) 	This indicates the severity of the bug.
			The following descriptions are taken from an AT&T test
			document:
			1:  an error that causes the FPSP to crash so that no
			further work can be done, or an error that causes gross
			deviations of results.  Non-waivable compliance
			violations are also severity 1.
			2:  an error that represents a substantial deviation in
			the functionality of the FPSP or deviation from
			the IEEE 754 standard.
			3:  an error that represents a deviation in the
			functionality.  However, the customer is able to
			implement a workaround to this problem.
			4:  an error that represents a minor deviation or
			incorrect documentation.
			5:  a request for product enhancement.


Customer Description 	This is a brief description of the problem.

Long Description 	This is a more detailed description of the problem.
			This could contain a short code fragment, a
			suggested fix for the bug or reference a longer
			file with this type of information in it.

System Description 	This is a brief description of your system.

Date Reported 		MM/DD/YY

Reported By 		Your name

Phone 			Enter your telephone number including the
			area code.

*Resolved? 		Enter "yes" or "no" only.

*Date Resolved 		When the bug is fixed the date it was fixed
			will be entered by us in this field.

*Who fixed              Name of the person who fixed the bug.

*Correction Description	When the bug is fixed we will enter a
			description of the fix in this field.

*Modules Affected 	When the bug is fixed we will enter the
			module name.


Problem Release/Load 	Enter the release or load information in this
			field. This information should be on the label
			for the tape that was sent to you.

*Test suite passed      You do not need to fill out this field.  This is
                        the test suite file that we use to verify bugs
                        and/or fixes.

*sccs version control   You do not need to fill out this field.  This is
                        the sccs version that contains the fix.