* Add the README file from Ian Dall's original distribution.
IEEE handler README
-------------------

Ian Dall <ian.dall@dsto.defence.gov.au>

December 1995
||||
|
1. Introduction

The ns32081 and the ns32381 floating point units implement a subset of
the IEEE floating point standard. In cases where the correct operation
is to generate a special value (+-Infinity, NaN or a denormalized
number), or when one of the operands is a special value, the FPU
generates a trap. It is intended that the missing functionality be
implemented in software. This package provides that missing
functionality.

The trap handling code can run either in the kernel, or in user code
as the signal handler in a Unix process. The latter has the
disadvantage that code changes are required to set up the signal
handler, so it is intended that this mode be used primarily
for debugging. So far the in-kernel implementation has been done for
Mach. There should be no large obstacles to incorporating this package
in the NetBSD kernel.
||||
|
2. Building

This section describes how to build the Unix signal handler version.
When it is being built into the kernel, it is assumed that the kernel's
build environment will be employed. The code requires gcc. Also the
supplied Makefile assumes GNU make.

Typing "make" or "gmake" will build the IEEE handler library, and the
test programs div0test, optest and ovtest. Take care to preserve the
optimization settings in the Makefile for the test code. These have been
set to cause gcc to produce the appropriate test code. In some cases,
compiling with the wrong optimization may cause the compiler to crash
with a floating point exception as it attempts to pre-calculate some
result which is intended to cause a run time floating point exception.
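The pre-calculation hazard can also be avoided in the test source
itself. The fragment below is only a sketch and is not taken from
div0test, optest or ovtest; the function name is made up. Reading the
operands through volatile objects stops the compiler from folding the
quotient at compile time, so the divide is executed at run time
whatever the optimization level.

```c
/* Hypothetical helper, not part of the package. */
double runtime_divide(double num, double den)
{
    /* volatile forces real loads, so gcc cannot fold n / d */
    volatile double n = num;
    volatile double d = den;

    return n / d;
}
```

Calling runtime_divide(1.0, 0.0) would then raise the divide by zero
trap on the target at run time instead of inside the compiler.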
||||
|
3. Implementation

Assume that the CPU and FPU registers are already saved in some
structure passed as an argument to ieee_handle_exception. In the case
of the signal handler version, ieee_sig is the appropriate entry
point, which takes sigcontext as an argument. ieee_sig fetches the fpu
state, calls ieee_handle_exception and restores the fpu state. For the
rest of this section we assume in-kernel operation.

The trap processing proceeds as follows:

  o decode the instruction found at the PC, including its addressing
    modes.
  o fetch the operands.
  o get the operands into an internal canonical form. Floating
    operands become 8 byte doubles and integral operands become 4 byte
    integers.
  o get the trap type, e.g. overflow, underflow or reserved operand, out
    of the FSR, check whether the trap is enabled and if not switch to
    functions to handle that trap.
  o if the trap has been successfully handled, convert the result from
    the canonical form to the destination form, write the operand,
    increment the pc so that when the thread which took the exception
    restarts, it is at the next instruction, and return FPC_TT_NONE.
  o if the user elected to handle the exception, or if some problem
    occurred, ieee_handle_exception will return a trap type not equal
    to FPC_TT_NONE.
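The canonical form step can be pictured with the following sketch. It
is not the package's code: the type and function names are invented
for illustration, and it assumes integral operands have already been
widened to 4 bytes. It also widens a float with an ordinary C
conversion, which the real handler could not rely on for reserved
operands on the target FPU.

```c
#include <string.h>

/* Hypothetical operand description; the real package has its own. */
enum op_kind { OP_FLOAT, OP_DOUBLE, OP_INT };

struct canon {
    int is_int;      /* nonzero when the operand is integral */
    double d;        /* 8 byte canonical floating value */
    int i;           /* 4 byte canonical integer value */
};

/* Convert the raw bytes fetched for an operand to canonical form. */
struct canon to_canonical(enum op_kind kind, const void *raw)
{
    struct canon c = { 0, 0.0, 0 };
    float f;

    switch (kind) {
    case OP_FLOAT:
        memcpy(&f, raw, sizeof f);
        c.d = f;                 /* widen 4 byte float to double */
        break;
    case OP_DOUBLE:
        memcpy(&c.d, raw, sizeof c.d);
        break;
    case OP_INT:
        c.is_int = 1;
        memcpy(&c.i, raw, sizeof c.i);
        break;
    }
    return c;
}
```

Keeping a single canonical type means the per-trap functions only
need to be written once, for doubles and 4 byte integers.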
||||
|
3.1 Status.

The IEEE floating point standard says that special operands (Infinity,
NaN etc.) should be handled by default, but there is provision for the
user to specify that a trap should occur. The ns32381 always traps, so
it is up to the kernel trap handler to either handle the trap
transparently or pass it on to the user as required. To control this
functionality, there need to be flags which the user can
set. Fortunately the floating status register (FSR) has 7 bits
reserved for this purpose (the FPC_SWF field). The following flags are
defined:

  FPC_OVE   0x200    /* Overflow enable */
  FPC_OVF   0x400    /* Overflow flag */
  FPC_IVE   0x800    /* Invalid enable */
  FPC_IVF   0x1000   /* Invalid flag */
  FPC_DZE   0x2000   /* Divide by zero enable */
  FPC_DZF   0x4000   /* Divide by zero flag */
  FPC_UNDE  0x8000   /* Soft Underflow enable, requires FPC_UEN */

In addition there are the hardware defined flags:

  FPC_IF    0x00000040  /* inexact result flag */
  FPC_IEN   0x00000020  /* inexact result trap enable */
  FPC_UF    0x00000010  /* underflow flag */
  FPC_UEN   0x00000008  /* underflow trap enable */
||||
|
If the corresponding enable flag is set when a trap occurs, then
ieee_handle_exception simply returns the trap type. The calling code
can then send the appropriate signal. Underflow is a little different
since there are three possible desired behaviours: produce a result of
zero, generate denormalized numbers, or generate a signal. To provide
this level of control, there are two underflow bits:

  FPC_UEN   FPC_UNDE
     0         X       Produce zero
     1         0       Produce denormalized numbers
     1         1       Pass trap on to user
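The underflow table can be read as a small decision function. The
sketch below uses the two bit values listed in this section; the enum
and the function name are made up for illustration and do not appear
in the package.

```c
#define FPC_UEN  0x00000008  /* underflow trap enable (hardware) */
#define FPC_UNDE 0x00008000  /* soft underflow enable */

/* Hypothetical names for the three behaviours in the table. */
enum uflow_action { UFLOW_ZERO, UFLOW_DENORM, UFLOW_SIGNAL };

enum uflow_action underflow_action(unsigned int fsr)
{
    if (!(fsr & FPC_UEN))
        return UFLOW_ZERO;      /* FPC_UNDE is a don't-care */
    if (fsr & FPC_UNDE)
        return UFLOW_SIGNAL;    /* pass the trap on to the user */
    return UFLOW_DENORM;        /* handler produces a denormal */
}
```

With FPC_UEN clear the hardware does not trap on underflow at all, so
only the last two rows are ever seen by the handler.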
||||
|
Whenever a trap occurs, the corresponding flag bit is set. Flags are
never cleared except by the user.
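A minimal sketch of this sticky flag behaviour, combined with the
enable test, is given below. The bit values are those listed in this
section; note_trap is an invented name and this is not how the
package itself is organised.

```c
#define FPC_OVE 0x200   /* Overflow enable */
#define FPC_OVF 0x400   /* Overflow flag */
#define FPC_IVE 0x800   /* Invalid enable */
#define FPC_IVF 0x1000  /* Invalid flag */
#define FPC_DZE 0x2000  /* Divide by zero enable */
#define FPC_DZF 0x4000  /* Divide by zero flag */

/*
 * Hypothetical helper: record that a trap happened by setting its
 * flag bit (never clearing any other bit), and report whether the
 * user has enabled that trap and so wants it passed on.
 */
int note_trap(unsigned int *fsr, unsigned int enable, unsigned int flag)
{
    *fsr |= flag;
    return (*fsr & enable) != 0;
}
```

For example, note_trap(&fsr, FPC_DZE, FPC_DZF) would be the check for
a divide by zero trap; the user clears FPC_DZF later by writing the
FSR.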
||||
|
3.2 Subnormal numbers.

On an underflow trap, we need to be able to generate denormalized
numbers. Also, having generated the denormalized numbers, they will
cause a reserved operand trap if they are operands to any subsequent
operations. So we need to be able to generate and perform operations
on denormalized numbers.

Rather than produce a complete IEEE floating point emulation, the
approach to doing arithmetic on denorms is to first scale the operands
so that the operation can't possibly overflow or underflow, perform
the operation and then normalize. Care is taken to use the same
rounding mode as the thread which got the exception.
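The scale, operate, normalize idea can be illustrated for
multiplication with the standard C functions frexp and ldexp. This is
a sketch on a host with IEEE doubles, not the package's code: the
function name is invented, only finite operands are considered, and
no attempt is made to honour the trapping thread's rounding mode or
to avoid double rounding.

```c
#include <math.h>

/* Hypothetical helper: multiply two finite doubles via scaling. */
double scaled_mul(double a, double b)
{
    int ea, eb;
    double ma = frexp(a, &ea);   /* a == ma * 2^ea, 0.5 <= |ma| < 1 */
    double mb = frexp(b, &eb);   /* b == mb * 2^eb, 0.5 <= |mb| < 1 */
    double p = ma * mb;          /* 0.25 <= |p| < 1: cannot overflow */
                                 /* or underflow                     */

    /* Put the exponents back; the result may be a denormal. */
    return ldexp(p, ea + eb);
}
```

Because the fractions are of moderate size, the multiply itself never
meets a special value; the extreme exponents are only dealt with in
the final ldexp step.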
||||
|
3.3 Error handling within the IEEE handler package.

If an instruction can't be decoded, or copyin or copyout fails
(presumably because an address is outside the task's address space),
then ieee_handle_exception returns FPC_TT_ILL. It would be possible to
invent some new codes if more information is required. FPC_TT_ILL is
also returned if the external addressing mode is encountered. No one
uses this addressing mode.
||||
|
4. Usage

4.1 Getting and setting the FSR contents.

The FSR fields are defined in fpu_status.h. There is a macro
GET_SET_FSR which gets the old value of the FSR and sets the new value
of the FSR. There are also separate GET_FSR and SET_FSR macros.
E.g.:

    #include <fpu_status.h>

    int fsr = GET_SET_FSR(FPC_UEN);
        .
        .   /* Code requiring FPC_UEN to be set */
        .
    SET_FSR(fsr);
||||
|
4.2 Signal Handler

The simplest way to use the package as a signal handler is as follows:

    #include <signal.h>
    #include <ieee_handler.h>
    ...
    signal(SIGFPE, ieee_sig);
    ...

The ieee_sig function returns a code as for ieee_handle_exception. To
make use of this return code it would be necessary to write a wrapper
function for ieee_sig which did the right thing, possibly calling
kill(2) to send a (different) signal. ieee_sig could be made more
sophisticated in this respect, but hasn't since this mode of operation
is intended primarily as a debugging aid.
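The decision such a wrapper would have to make can be sketched as
below. Everything here is an assumption made for illustration: the
helper name is invented, the numeric values given to FPC_TT_NONE and
FPC_TT_ILL are placeholders used only when fpu_status.h has not
defined them, and the caller chooses which signal stands in for
SIGFPE, since SIGFPE itself is the signal being handled.

```c
#include <signal.h>

#ifndef FPC_TT_NONE
#define FPC_TT_NONE 0   /* placeholder value, see fpu_status.h */
#define FPC_TT_ILL  4   /* placeholder value, see fpu_status.h */
#endif

/*
 * Hypothetical helper: map the status returned by ieee_sig to the
 * signal a wrapper might send, or 0 if nothing needs to be sent.
 * fpe_alias is the (different) signal used for arithmetic traps.
 */
int signal_for_status(int status, int fpe_alias)
{
    switch (status) {
    case FPC_TT_NONE:
        return 0;           /* handled transparently */
    case FPC_TT_ILL:
        return SIGILL;      /* undecodable or bad address */
    default:
        return fpe_alias;   /* trap the user elected to see */
    }
}
```

A wrapper would call ieee_sig, pass its result to this function and,
if the answer is nonzero, use kill(2) on its own process with that
signal.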
||||
|
4.3 Mach Kernel

The fpintr() function is duplicated here to illustrate how the interface
is implemented in the Mach kernel.

  /*
   * FPU error.
   */
  void fpintr(struct ns532_saved_state *regs)
  {
      int ss;
      int status;
      state state;
      state.regs = regs;
      state.fps = current_thread()->pcb->fps;
      ss = splsched();                              /* Note 1 */
      fp_save();                                    /* Note 2 */
  #if MACH_KDB
      if (ieee_handler_enable)                      /* Note 3 */
  #endif
      {
          _enable_fpu();                            /* Note 4 */
          status = ieee_handle_exception(&state);   /* Note 5 */
          _disable_fpu();                           /* Note 4 */
      }
      splx(ss);
      switch(status) {
      case FPC_TT_ILL:
          exception(EXC_BAD_INSTRUCTION, EXC_NS532_ILL, 0);  /* Note 6 */
          /* NOT REACHED */
      case FPC_TT_NONE:
          break;
      default:
          exception(EXC_ARITHMETIC, EXC_NS532_SLAVE,         /* Note 7 */
                    (int)current_thread()->pcb->fps->fsr);
          /* NOT REACHED */
      }
  }
||||
|
Notes:

1  The ieee_handle_exception function is not re-entrant. This is for
   two reasons. Mainly the code uses the fpu and the kernel fpu state
   is not saved on interrupts. (User fpu state is managed by
   allocating the fpu on demand to a thread. The fpu state is saved
   only when the fpu is allocated to a new thread.) A second reason
   for the code not being re-entrant is that a static structure is
   used to keep track of data which has been copyin'd. This latter
   case could be eliminated fairly easily, but there seems no point
   since the code can't be re-entrant for the first reason.

   To prevent other threads running and maybe using the fpu, we simply
   run ieee_handle_exception at splsched.

2  The fp_save() also disables the fpu by clearing the corresponding
   bit in the cfg register. We could make a slight saving by deferring
   the disable.

3  Having the ieee_handler_enable flag allows the in-kernel
   processing to be turned off. This is useful when trying to debug
   the signal handler version.

4  Ensure the fpu is enabled since ieee_handle_exception uses floating
   point operations.

5  Call the handler and save its return status. The state argument
   is constructed and passed. At this point in the processing, the
   general registers have not yet been saved in the pcb, although the
   floating point registers have been. Otherwise we could have arranged
   to pass the pcb.

6  If the handler returned FPC_TT_ILL generate an illegal instruction
   trap. This will ultimately cause a SIGILL.

7  Default to generating an arithmetic trap which will ultimately
   cause a SIGFPE.
||||
|
4.4 NetBSD Kernel

In-kernel support for NetBSD has not yet been
implemented. However, care has been taken to ensure that the code is
flexible. For example, each implementation is free to specify pretty
much whatever structure is most convenient to contain the CPU and FPU
state. Using the Mach kernel implementation (section 4.3) as an
example, and following the hints in section 4.5, it should be easy to
implement the NetBSD in-kernel support.
||||
|
4.5 Porting to other Environments

The package is ns32k specific and assumes gcc. Otherwise it is intended
to be portable to other environments as easily and efficiently as
possible.

The calling code needs to get the CPU and FPU status into some
data structure. The choice of data structure is pretty flexible. In
particular, it is OK for a structure to contain pointers to
other structures.

There must be a "typedef struct x state" statement in ieee_handler.h
since the type "state" is used in the prototype for
ieee_handle_exception(). Access to long (double precision floating
point), float and general purpose registers in the "state" type is
facilitated by the macros:

    LREGBASE(s)
    LREGOFFSET(n)
    FREGBASE(s)
    FREGOFFSET(n)
    REGBASE(s)
    REGOFFSET(n)
||||
|
The OFFSET macros and the BASE macros must be defined so that, for
example, REGOFFSET(3) + REGBASE(state) is the address of register
3. The OFFSET macros must be constant expressions since they are used
to initialize a table of offsets with the "const" and "static"
attributes. Other macros, FSR, FP, SP, SB, PC and PSR are defined
such that, for example, ((state *)s)->FP accesses the fp (frame
pointer) register.

This should be quite flexible and examples for the Mach and signal
handler implementations can be found in ieee_handler.h.
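As a further illustration, one possible set of definitions for the
general purpose registers is sketched here. The structure layout
(struct my_regs, struct my_state) and the helper reg_addr are
hypothetical and are not the definitions used for Mach or for the
signal handler; only the macro names REGBASE and REGOFFSET come from
the package.

```c
#include <stddef.h>

/* Hypothetical state layout, for illustration only. */
struct my_regs { int r[8]; };
typedef struct my_state { struct my_regs *regs; } state;

#define REGBASE(s)   ((char *)(s)->regs)
#define REGOFFSET(n) (offsetof(struct my_regs, r) + (n) * sizeof(int))

/* The offsets are constant expressions, so the table can be static. */
static const size_t regoffsets[] = {
    REGOFFSET(0), REGOFFSET(1), REGOFFSET(2), REGOFFSET(3),
    REGOFFSET(4), REGOFFSET(5), REGOFFSET(6), REGOFFSET(7)
};

/* Address of general register n inside *s. */
int *reg_addr(state *s, int n)
{
    return (int *)(REGBASE(s) + regoffsets[n]);
}
```

Splitting the address into a run time base and a compile time offset
is what lets the state structure hold pointers to other structures.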
||||
|
There should be only one other place where customization may be required.
That is in ieee_handler.c. The functions setjmp, longjmp and get_dword
are required. The first two may not be available in the kernel. In the
case of Mach, setjmp and longjmp are defined in terms of _setjmp and
_longjmp. The get_dword macro uses copyin to get a long int from
user space.

It is assumed that copyin and copyout are available. In the signal
handler case, these are defined in terms of memcpy.
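A user mode definition along those lines might look like the sketch
below. The argument order (source, destination, length) and the zero
return on success follow the usual kernel convention; the get_dword
shown is a simplified stand-in written as a function, not the macro
from ieee_handler.c.

```c
#include <string.h>

/* User mode stand-ins; like the kernel routines they yield 0 on success. */
#define copyin(from, to, len)  (memcpy((to), (from), (len)), 0)
#define copyout(from, to, len) (memcpy((to), (from), (len)), 0)

/* Hypothetical get_dword: fetch a 4 byte value from address addr. */
int get_dword(const void *addr, int *value)
{
    return copyin(addr, value, sizeof *value);
}
```

In a signal handler the faulting process and the handler share one
address space, so a plain memory copy is all that is needed.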
||||
|
5. To Do

The testing has been cursory at best. A more sophisticated test suite
is needed. The conformance to the standard is probably patchy since
the author has never actually seen the IEEE floating point standard!
||||
|
6. BUGS

Please report any bugs, and especially any improvements, to the
author, Ian Dall <ian.dall@dsto.defence.gov.au>.
||||
|
7. Copyright

This code is Copyright Ian Dall. Please respect it.

Permission to use, copy, modify and distribute this software and its
documentation is hereby granted, provided that both the copyright
notice and this permission notice appear in all copies of the
software, derivative works or modified versions, and any portions
thereof, and that both notices appear in supporting documentation.

If you have a good reason to want to vary the permission notice,
I am open to negotiation.