* Add the README file from Ian Dall's original distribution.

1996-10-09 07:40:45 +00:00 · 1996-10-09 07:40:45 +00:00 · 483cff53cd
parent 84b238b8eb
commit 483cff53cd
1 changed files with 321 additions and 0 deletions
--- a/sys/arch/pc532/fpu/README
+++ b/sys/arch/pc532/fpu/README
@ -0,0 +1,321 @@
+
+			 IEEE handler README
+			 -------------------
+
+	       Ian Dall <ian.dall@dsto.defence.gov.au>
+
+			    December 1995
+
+1. Introduction
+
+The ns32081 and the ns32381 floating point units implement a subset of
+the IEEE floating point standard. In cases where the correct operation
+is to generate special values, +-Infinity, NaN or a denormalized
+number, or when one of the operands is a special value, the FPU
+generates a trap. It is intended that the missing functionality be
+implemented in software. This packages provides that missing
+functionality.
+
+The trap handling code can run either in the kernel, or in user code
+as the signal handler in a unix process. The latter has the
+disadvantage that code changes are required to set up the signal
+handler, so it is intended that this mode way of use be used primarily
+for debugging. So far the in-kernel implementation has been done for
+mach. There should be no large obstacles to incorporating this package
+in the NetBSD kernel.
+
+2. Building
+
+This section describes how to build the unix signal handler version.
+When it is being built into the kernel, it is assumed that the kernels
+build environment will be employed.  The code requires gcc. Also the
+supplied Makefile assumes GNU make.
+
+Typing "make" or "gmake" will build the IEEE handler library, and the
+test programs div0test, optest and ovtest. Take care to preserve the
+optimization settings in Makefile for the test code. These have been
+set to cause gcc to produce the appropriate test code. In some cases,
+compiling with the wrong optimization may cause the compiler to crash
+with a floating point exception as it attempts to pre-calculate some
+result which is intended to cause a run time floating point exception.
+
+3. Implementation
+
+Assume that the CPU and FPU registers are already saved in some
+structure passed as an argument to ieee_handle_exception. In the case
+of the signal handler version, ieee_sig is the appropriate entry
+point, which takes sigcontext as an argument. ieee_sig fetches the fpu
+state, calls ieee_handle_exception and restores the fpu state. For the
+rest of this section we assume in-kernel operation.
+
+The trap processing proceeds as follows:
+
+  o decode the instruction, including addressing modes found at the
+    PC.
+  o fetch the operands.
+  o get the operands into an internal canonical form. Floating
+    operands become 8 byte doubles and integral operands become 4 byte
+    integers.
+  o get the trap type, eg overflow, underflow or reserved operand out of
+    the FSR, check whether the trap is enabled and if not switch to
+    functions to handle that trap.
+  o if the trap has been successfully handled, convert the result from
+    the canonical form to the destination form, write the operand and
+    increment the pc so that when the thread which took the exception
+    restarts, it is at the next instruction and return FPC_TT_NONE.
+  o if the user elected to handle the exception, or if some problem occurred,
+    ieee_handle_exception will return a trap type not equal to FPC_TT_NONE.
+
+3.1 Status.
+
+IEEE floating point standard says that special operands, Infinity, Nan
+etc should be handled by default, but there is provision for the user
+to specify that a trap should occur. The ns32381 always traps, so it
+is up to the kernel trap handler to either handle the trap
+transparently or pass it on to the user as required. To control this
+functionality, there needs to be flags which the user can
+set. Fortunately the floating status register (FSR) has 7 bits
+reserved for this purpose (the FPC_SWF field). The following flags are
+defined:
+
+    FPC_OVE  0x200		/* Overflow enable */
+    FPC_OVF  0x400		/* Overflow flag */
+    FPC_IVE  0x800		/* Invalid enable */
+    FPC_IVF  0x1000		/* Invalid flag */
+    FPC_DZE  0x2000		/* Divide by zero enable */
+    FPC_DZF  0x4000		/* Divide by zero flag */
+    FPC_UNDE 0x8000		/* Soft Underflow enable, requires FPC_UEN */
+
+In addition there are the hardware defined flags:
+
+    FPC_IF		0x00000040	/* inexact result flag */
+    FPC_IEN		0x00000020	/* inexact result trap enable */
+    FPC_UF	        0x00000010	/* underflow flag      	   */
+    FPC_UEN	        0x00000008	/* underflow trap enable */
+
+If the corresponding enable flag is set when a trap occurs, then
+ieee_handle_exception simply returns the trap type. The calling code
+can then send the appropriate signal. Underflow is a little different
+since there are three possible desired behaviours; produce a result of
+zero, generate denormalized numbers and generate a signal. To provide
+this level of control, there are two underflow bits:
+
+	FPC_UEN	FPC_UNDE
+	0	X		Produce zero
+	1	0		Produce denormalized numbers
+	1	1		Pass trap on to user
+
+Whenever a trap occurs, the corresponding flag bit is set. Flags are
+never cleared except by the user.
+
+3.2 Subnormal numbers.
+
+On an underflow trap, we need to be able to generate denormalized
+numbers. Also, having generated the denormalized numbers, they will
+cause a reserved operand trap if they are operands to any subsequent
+operations. So we need to be able to generate and perform operations
+on denormalized numbers.
+
+Rather than produce a complete IEEE floating point emulation, the
+approach to doing arithmetic on denorms is to first scale the operands
+so that the operation can't possibly overflow or underflow, perform
+the operation and then normalize. Care is taken to use the same
+rounding mode as the thread which got the exception.
+
+3.3 Error handling within the IEEE handler package.
+
+If an instruction can't be decoded, or copyin or copyout fails
+(presumably because an address is outside the tasks address space),
+then ieee_handle_exception returns FPC_TT_ILL. It would be possible to
+invent some new codes if more information is required. FPC_TT_ILL is
+also returned if the external addressing mode is encountered. No one
+uses this addressing mode.
+
+4. Usage
+
+4.1 Getting and setting the FSR contents.
+
+The FSR fields are defined in fpu_status.h. There is a macro
+GET_SET_FSR which gets the old value of the FSR and sets the new value
+of the FSR. There are also seperate GET_FSR and SET_FSR macros.
+Eg:
+
+    #include <fpu_status.h>
+
+       int fsr = GET_SET_FSR(FPC_UEN);
+       .
+       . /* Code requiring FPC_UEN to be set */
+       .
+       SET_FSR(fsr);
+
+4.2 Signal Handler
+
+The simplest way to use the package as a signal handler is as follows:
+
+    #include <signal.h>
+    #include <ieee_handler.h>
+       ...
+       signal(SIGFPE, ieee_sig);
+       ...
+
+The ieee_sig function returns a code as for ieee_handle_exception.  To
+make use up this return code it would be necessary to write a wrapper
+function for ieee_sig which did the right thing, possibly calling
+kill(2) to send a (different) signal. ieee_sig could be made more
+sophisticated in this respect, but hasn't since this mode of operation
+is intended primarily as a debugging aid.
+
+4.3 Mach Kernel
+
+The fpintr() function is duplicated here to illustrate how the interface
+is implemented in the mach kernel.
+
+    /*
+     * FPU error.
+     */
+    void fpintr(struct ns532_saved_state *regs)
+    {
+	    int ss;
+	    int status;
+	    state state;
+	    state.regs = regs;
+	    state.fps = current_thread()->pcb->fps;
+	    ss = splsched();	/* Note 1 */
+	    fp_save();		/* Note 2 */
+    #if MACH_KDB
+	    if (ieee_handler_enable) /* Note 3 */
+    #endif
+	      {
+		_enable_fpu();	/* Note 4 */
+		status = ieee_handle_exception(&state);	/* Note 5 */
+		_disable_fpu();	/* Note 4 */
+	      }
+	    splx(ss);
+	    switch(status) {
+	    case FPC_TT_ILL:
+	      exception(EXC_BAD_INSTRUCTION, EXC_NS532_ILL, 0);	/* Note 6 */
+	      /* NOT REACHED */
+	    case FPC_TT_NONE:
+	      break;
+	    default:
+	      exception(EXC_ARITHMETIC, EXC_NS532_SLAVE, /* Note 7 */
+			(int)current_thread()->pcb->fps->fsr);
+	      /* NOT REACHED */
+	    }
+    }
+
+
+Notes:
+
+  1 The ieee_handle_exception function is not re-entrant. This is for
+    two reasons. Mainly the code uses the fpu and the kernel fpu state
+    is not saved on interrupts. (User fpu state is managed by
+    allocating the fpu on demand to a thread. The fpu state is saved
+    only when the fpu is allocated to a new thread.) A second reason
+    for the code not being reentrant is that a static structure is
+    used to keep track of data which has been copyin'd. This latter
+    case could be eliminated fairly easily, but there seems no point
+    since the code can't be reentrant for the first reason.
+
+    To prevent other threads running and maybe using the fpu, we simply
+    run ieee_handle_exception at splsched.
+
+  2 The fp_save() also disables the fpu bit by clearing the bit in the cfg
+    register. We could make a slight saving by deferring the disable.
+
+  3 Having the ieee_handler_enable flag allows the in-kernel
+    processing to be turned off. This is useful when trying to debug
+    the signal handler version.
+
+  4 Ensure the fpu is enabled since ieee_handle_exception uses floating
+    point operations.
+
+  5 Call the handler and save its return status. The state argument
+    is constructed and passed. At this point in the processing, the
+    general registers have not yet been saved in the pcb, although the
+    floating point registers have been. Otherwise we could have arranged to
+    pass the pcb.
+    
+  6 If the handler returned FPC_TT_ILL generate an illegal instruction
+    trap. This will ultimately cause a SIGILL.
+
+  7 Default to generating an arithmetic trap which will ultimately
+    cause a SIGFPE.
+
+4.4 NetBSD Kernel 
+
+Currently support for NetBSD in-kernel has not been
+implemented. However, care has been taken to ensure that the code is
+flexible. For example, each implementation is free to specify pretty
+much whatever structure is most convenient to contain the CPU and FPU
+state. Using the mach kernel implementation (section 4.3) as an
+example, and following the hints in section 4.5, it should be easy to
+implement the NetBSD in-kernel support.
+
+4.5 Porting to other Environments
+
+The package is ns32k specific and assumes gcc. Otherwise it is intended
+to be efficiently portable to other environments as easily as possible.
+
+The calling code needs to get the CPU and FPU status into some
+data structure. It is pretty flexible what data structure that might
+be. In particular, it is OK for a structure to contain pointers to
+other structures.
+
+There must be a "typedef struct x state" statement in ieee_handler.h
+since the type "state" is used in the prototype for
+ieee_handle_exception().  Access to long (double precision floating
+point), float and general purpose registers in the "state" type is
+facilitated by the macros:
+
+   LREGBASE(s)
+   LREGOFFSET(n)
+   FREGBASE(s)
+   FREGOFFSET(n)
+   REGBASE(s)
+   REGOFFSET(n)
+
+The OFFSET macros and the BASE macros must be defined so that, for
+example, REGOFFSET(3) + REGBASE(state) is the address of register
+3. The BASE macros must be constant expressions since they are used to
+initialize a table of offsets with the "const" and "static"
+attributes.  Other macros, FSR, FP, SP, SB, PC and PSR are defined
+such that, for example, (state *)s->FP accesses the fp (frame pointer)
+register.
+
+This should be quite flexible and examples for the mach and signal
+handler implementations can be found in ieee_handler.h.
+
+There should be only one other place where customization may be required.
+That is in ieee_handler.c. The functions setjmp, longjmp and get_dword
+are required. The first two may not be available in the kernel. In the
+case of mach, setjmp and longjmp are defined in terms of _setjmp and
+_longjmp. The get_dword macro uses copyin to get a long int from
+user space using copyin.
+
+It is assumed that copyin and copyout are available. In the signal
+handler case, these are defined in terms of memcpy.
+
+5. To Do
+
+The testing has been cursory at best. A more sophisticated test suite
+is needed. The conformance to the standard is probably patchy since
+the author has never actually seen the IEEE floating point standard!
+
+6. BUGS
+
+Please report any bugs, and especially any improvements to the
+author, Ian Dall <ian.dall@dsto.defence.gov.au>.
+
+6. Copyright
+
+This code is Copyright Ian Dall. Please respect it.
+
+Permission to use, copy, modify and distribute this software and its
+documentation is hereby granted, provided that both the copyright
+notice and this permission notice appear in all copies of the
+software, derivative works or modified versions, and any portions
+thereof, and that both notices appear in supporting documentation.
+
+If you have a good reason to want to vary the permission notice,
+I am open to negotiation.