core entry points. i.e. the linker does the work rather than the stub code.
Optimised the transfer of the trapframe between the UND32 and SVC32 mode
stacks in the fpe_post_proc handler.
Added experimental code to handle most of userret in UND32 mode. This means
that the copy of the trapframe and the switch to SVC32 mode is only needed
if mi_switch() has to be called, which saves a vast number of pointless
trapframe copies.