efp_assist.h File Reference

#include <ppc/context.h>


Functions

int _math_emulator (int sigcode, void **pdata, PPC_CPU_REGISTERS *regs)
 IEEE 754 compliance handler.


Function Documentation

int _math_emulator (const int sigcode, void **const pdata, PPC_CPU_REGISTERS *const regs)

IEEE 754 compliance handler.

This routine serves two major, somewhat independent purposes. It is primarily used as a user-mode handler for embedded floating point data and rounding exceptions raised by the hardware; it is also used as a means for the fp_status and fenv implementations in libm to make use of the SPEFSCR register in cooperation with the exception handling functionality.

Although the source code is quite lengthy, much of the bulk is made up of switch statements which sieve the opcode in various ways. The execution path through the routine for any given invocation is intended to be as short as possible. Because a large amount of state is shared from step to step in what is essentially a sequential algorithm, the routine was not split into pieces: doing so might substantially degrade the compiler's ability to optimize the code.

Exception Handling API and Protocol
When this routine is called as part of the handling of an embedded floating point exception, the code byte of the SIGFPE sigcode describes the exception (FPE_NOFPU for data, FPE_FLTRES for rounding), the IAR points at the instruction that caused the exception, and the remainder of the register context is filled in appropriately. The high words of the GPRs and the accumulator are assumed to be in the state they were in at the time of the exception (since they can only be modified by embedded floating point instructions).
The first action is to allocate a thread context, if one has not already been allocated. The thread context is required (by kerext_debug) to begin with a (true, non-embedded) floating point register save area, which is not otherwise used by the fpassist library and in which all registers are always set to zero. Following this is a softfloat environment, which is also unused by the fpassist library. (It is reserved for future extension and preserves an identical layout with the context used by the fpemu library. Although there is currently no functional need for compatibility, it may prove to be worthwhile if it is ever necessary to support both fpemu and fpassist simultaneously.) Next is the private context used by the fpassist library. This contains the current value of the application visible exception flags and mask.
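The context layout described above can be pictured with a minimal sketch. Every name and size below is a hypothetical placeholder (the real definitions live in the fpassist sources and are not reproduced here), and calloc() merely stands in for whatever allocator the library actually uses; only the ordering of the three parts and the allocate-on-first-use behaviour are taken from the text.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical placeholder: the real save-area size is not given in the text. */
#define SKETCH_FPU_SAVE_BYTES 272u

/* Stand-in for the softfloat environment (reserved, unused by fpassist). */
typedef struct {
    uint8_t rounding_mode;
    uint8_t exception_flags;
} sketch_sfenv;

/* Stand-in for the private fpassist context. */
typedef struct {
    uint32_t spefscr;   /* application-visible exception flags and mask */
} sketch_appenv;

/* The three parts, in the order the text describes. */
typedef struct {
    uint8_t       fpu_save[SKETCH_FPU_SAVE_BYTES]; /* must be first; stays zero */
    sketch_sfenv  sfenv;                           /* reserved for extension */
    sketch_appenv appenv;                          /* used by fpassist */
} sketch_context;

/* Allocate the context on first use; later calls return the same block. */
static sketch_context *sketch_get_context(void **pdata)
{
    if (*pdata == NULL) {
        /* calloc zero-fills, so the FPU save area starts out all zero. */
        *pdata = calloc(1, sizeof(sketch_context));
    }
    return (sketch_context *)*pdata;
}
```

Here pdata plays the role of the pdata parameter of _math_emulator(): a slot that is empty on the first exception and holds the context afterwards.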
The general processing of the exception then follows three main phases: operand extraction, operation emulation, and result storage.
Broadly speaking, operand extraction falls into one of three categories: vector, single precision, or double precision. Operations having single precision operands operate on only the low 32 bit word of those operands; those with double precision operands operate on the full 64 bits. Operands for vector operations are always fetched from the full 64 bits as well, but the operation itself is performed as two single precision SIMD operations. For vector operations, either (or both) the low and high word operands may be needed; if only the low or high computation faulted, only that computation is emulated.
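As an illustration of the three extraction categories, the following sketch pulls operands out of a 64-bit GPR image. The type and function names are hypothetical and are not the load32(), load64(), or loadh32() helpers referenced by this file; the sketch assumes the low word is the least significant 32 bits of the register, and it always extracts both vector lanes, leaving the decision about which lane to emulate to the caller.

```c
#include <stdint.h>

/* Hypothetical operand categories, mirroring the three described above. */
typedef enum { SKETCH_SINGLE, SKETCH_DOUBLE, SKETCH_VECTOR } sketch_class;

/* Extracted operand: only the fields relevant to the category are filled. */
typedef struct {
    uint64_t full;  /* double precision: all 64 bits */
    uint32_t lo;    /* single precision, or low vector lane */
    uint32_t hi;    /* high vector lane */
} sketch_operand;

static sketch_operand sketch_extract(sketch_class cls, uint64_t gpr)
{
    sketch_operand op = { 0, 0, 0 };

    switch (cls) {
    case SKETCH_SINGLE:
        op.lo = (uint32_t)gpr;           /* low 32-bit word only */
        break;
    case SKETCH_DOUBLE:
        op.full = gpr;                   /* full 64 bits */
        break;
    case SKETCH_VECTOR:
        op.lo = (uint32_t)gpr;           /* two single precision lanes */
        op.hi = (uint32_t)(gpr >> 32);
        break;
    }
    return op;
}
```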
The emulation step consists of computing the IEEE compliant result of the mathematical, logical, or conversion operations. A separate softfloat environment is maintained for each operation which is performed (i.e. up to two, when a vector operation is performed). Single precision computations always involve only the low word(s) of GPRs; double precision computations always involve the full 64 bits of their operand(s). Vector operations perform simultaneous operations on pairs of operands, one value each in the low and high words.
Result storage consists of determining which, if any, destination registers need to be updated. In particular, the SPEFSCR sticky flags are updated from the softfloat environment(s) which were used. The application's exception mask is consulted to determine if a SIGFPE should still be delivered; typically the result is not stored if a signal will be delivered (just as it would not be had the fpassist library not been present).
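The flag and mask handling in the result storage phase might look roughly like the sketch below. The bit layout is invented for illustration (it is not the real SPEFSCR layout, nor the SPEFSCR_STICKY_EXC_MASK or SPEFSCR_ENABLE_EXC_MASK values used by the source), and the function name is hypothetical. It merges the exceptions raised in up to two softfloat environments into the application's sticky flags, then reports whether any of them is enabled by the application.

```c
#include <stdint.h>

/* Hypothetical exception bits; the enable bits occupy bits 0..4 and the
 * sticky flags sit SKETCH_STICKY_SHIFT bits above them. */
#define SKETCH_EXC_INEXACT   0x01u
#define SKETCH_EXC_INVALID   0x02u
#define SKETCH_EXC_DIVBYZERO 0x04u
#define SKETCH_EXC_UNDERFLOW 0x08u
#define SKETCH_EXC_OVERFLOW  0x10u
#define SKETCH_EXC_ALL       0x1Fu
#define SKETCH_STICKY_SHIFT  8

/* Merge the flags raised by the low and high computations into the
 * application's sticky flags. Returns nonzero when an enabled exception
 * was raised, i.e. when SIGFPE should still be delivered and the result
 * would typically not be stored. */
static int sketch_store_phase(uint32_t *app_spefscr,
                              uint32_t raised_lo, uint32_t raised_hi)
{
    uint32_t raised  = (raised_lo | raised_hi) & SKETCH_EXC_ALL;
    uint32_t enabled = *app_spefscr & SKETCH_EXC_ALL;

    *app_spefscr |= raised << SKETCH_STICKY_SHIFT;  /* flags only accumulate */
    return (raised & enabled) != 0;
}
```

For a scalar operation the caller would pass 0 for raised_hi, since only one softfloat environment is in use.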
fp_status API and Protocol
The fp_setenv() implementation in libm for SPE PPC uses a simple protocol to communicate with the fpassist library in order to cooperatively use the SPEFSCR register. It calls the _emulator_callout() function directly, passing SIGFPE; an IAR of ~0U; and a USPRG0 of either O_RDONLY, for a read, or O_WRONLY, for a write of the SPEFSCR. If the fpassist library is missing, this will return non-zero (SIGFPE) and fp_setenv() will fall back to accessing the hardware directly. (In this case, unmasked exceptions will result in SIGFPE delivery.) Otherwise, the value manipulated will in fact be the one from the application context allocated above.
The saved value is used only for its exception flags and mask; the remaining bits are always read directly from the hardware.
When the SPEFSCR is written by the application, the value supplied is stored in the application context. The hardware SPEFSCR is updated unconditionally to enable all of the exceptions required for proper IEEE 754 emulation by the fpassist library (FINVE, FDBZE, FOVFE, and FUNFE; FINXE is set only as the application requests).
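The read and write behaviour described above can be sketched as follows. The structure, the function names, and the bit values are assumptions made for illustration and should be checked against the processor reference manual and the SPEFSCR_* masks in the source; a plain variable stands in for the hardware register. A write stores the application's value and forces the four required enables on in the hardware copy; a read takes flags and mask from the saved value and everything else from the hardware.

```c
#include <stdint.h>

/* Assumed enable bits (illustrative values, not verified here). */
#define SKETCH_FINXE 0x40u
#define SKETCH_FINVE 0x20u
#define SKETCH_FDBZE 0x10u
#define SKETCH_FUNFE 0x08u
#define SKETCH_FOVFE 0x04u

#define SKETCH_ENABLE_MASK \
    (SKETCH_FINXE | SKETCH_FINVE | SKETCH_FDBZE | SKETCH_FUNFE | SKETCH_FOVFE)
/* The four exceptions the emulator needs enabled at all times. */
#define SKETCH_NEEDED_MASK \
    (SKETCH_FINVE | SKETCH_FDBZE | SKETCH_FOVFE | SKETCH_FUNFE)
/* Assumed position of the sticky flags relative to the enable bits. */
#define SKETCH_STICKY_SHIFT 7
#define SKETCH_STICKY_MASK  (SKETCH_ENABLE_MASK << SKETCH_STICKY_SHIFT)
#define SKETCH_APP_BITS     (SKETCH_ENABLE_MASK | SKETCH_STICKY_MASK)

typedef struct {
    uint32_t app;  /* value saved in the application context */
    uint32_t hw;   /* stand-in for the hardware SPEFSCR */
} sketch_spefscr;

/* Application write: remember the value, force the needed enables on. */
static void sketch_write(sketch_spefscr *s, uint32_t value)
{
    s->app = value;
    s->hw  = value | SKETCH_NEEDED_MASK;  /* FINXE stays as requested */
}

/* Application read: flags and mask from the saved value, rest from hardware. */
static uint32_t sketch_read(const sketch_spefscr *s)
{
    return (s->app & SKETCH_APP_BITS) | (s->hw & ~SKETCH_APP_BITS);
}
```

With this split, a sticky flag raised by the hardware alone does not become visible to the application until the emulator decides to record it in the saved value.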
This procedure allows the actual hardware sticky exception flags and the hardware exception mask to vary independently from the application exception flags and mask. This is necessary because four of the five exceptions must be enabled at all times to provide IEEE 754 compliant results, regardless of whether the application has masked those exceptions to avoid SIGFPE delivery. Likewise, the application exception flags must never be cleared by the emulator.
This latter situation would be extremely rare anyway in practice. It could only arise if the hardware sticky flags were used directly to represent the application exception flags and the application had directly set one or more of them. In that case, if the corresponding exception were actually to arise during hardware execution, but be negated during emulation, it would not be possible to know if it were correct to clear the flag (which would be the proper course of action had the application not set the flag). A practical example of this would be the "invalid" flag raised by the hardware for a subnormal operand. Subsequent emulation will probably result in the operand no longer being considered "invalid", so although the hardware sticky flag is raised, the application should see it as clear - unless it had earlier set the flag itself directly. This situation will be exceedingly rare because most applications will never set a sticky flag; they will only clear them.
Note:
No special effort has been expended to ensure that the underlying softfloat library returns results identical to those of the hardware (where such a correspondence is meaningful). This is in the same spirit as the fpemu library.
Parameters:
[in] sigcode The sigcode suggested by the caller.
[in,out] pdata Pointer to pointer of TLS context save area.
[in,out] regs Processor context.
Returns:
sigcode to be delivered to the application thread (if any)
Return values:
==0 No signal required.
!=0 sigcode to deliver.

Definition at line 642 of file efp_assist.c.

References DPRINTF, float32_add(), float32_div(), float32_eq(), float32_eq_signaling(), float32_gt(), float32_gt_quiet(), float32_lt(), float32_lt_quiet(), float32_mul(), float32_sub(), float32_to_float64(), float32_to_int32(), float32_to_int32_round_to_zero(), float32_to_q31(), float32_to_uint32(), float32_to_uint32_round_to_zero(), float32_to_uq32(), float64_add(), float64_div(), float64_eq(), float64_eq_signaling(), float64_gt(), float64_gt_quiet(), float64_lt(), float64_lt_quiet(), float64_mul(), float64_sub(), float64_to_float32(), float64_to_int32(), float64_to_int32_round_to_zero(), float64_to_int64_round_to_zero(), float64_to_q31(), float64_to_uint32(), float64_to_uint32_round_to_zero(), float64_to_uint64_round_to_zero(), float64_to_uq32(), _run_options::float_exception_flags, float_flag_divbyzero, float_flag_inexact, float_flag_invalid, float_flag_overflow, float_flag_underflow, float_round_down, float_round_nearest_even, float_round_to_zero, float_round_up, _run_options::float_rounding_mode, getappenv(), getsfenv(), int32_to_float32(), int32_to_float64(), int64_to_float64(), load32(), load64(), loadh32(), NELEM, PPC_XO_efdabs, PPC_XO_efdadd, PPC_XO_efdcfs, PPC_XO_efdcfsf, PPC_XO_efdcfsi, PPC_XO_efdcfsid, PPC_XO_efdcfuf, PPC_XO_efdcfui, PPC_XO_efdcfuid, PPC_XO_efdcmpeq, PPC_XO_efdcmpgt, PPC_XO_efdcmplt, PPC_XO_efdctsf, PPC_XO_efdctsi, PPC_XO_efdctsidz, PPC_XO_efdctsiz, PPC_XO_efdctuf, PPC_XO_efdctui, PPC_XO_efdctuidz, PPC_XO_efdctuiz, PPC_XO_efddiv, PPC_XO_efdmul, PPC_XO_efdnabs, PPC_XO_efdneg, PPC_XO_efdsub, PPC_XO_efdtsteq, PPC_XO_efdtstgt, PPC_XO_efdtstlt, PPC_XO_efsabs, PPC_XO_efsadd, PPC_XO_efscfd, PPC_XO_efscfsf, PPC_XO_efscfsi, PPC_XO_efscfuf, PPC_XO_efscfui, PPC_XO_efscmpeq, PPC_XO_efscmpgt, PPC_XO_efscmplt, PPC_XO_efsctsf, PPC_XO_efsctsi, PPC_XO_efsctsiz, PPC_XO_efsctuf, PPC_XO_efsctui, PPC_XO_efsctuiz, PPC_XO_efsdiv, PPC_XO_efsmul, PPC_XO_efsnabs, PPC_XO_efsneg, PPC_XO_efssub, PPC_XO_efststeq, PPC_XO_efststgt, PPC_XO_efststlt, PPC_XO_evfsabs, 
PPC_XO_evfsadd, PPC_XO_evfscfsf, PPC_XO_evfscfsi, PPC_XO_evfscfuf, PPC_XO_evfscfui, PPC_XO_evfscmpeq, PPC_XO_evfscmpgt, PPC_XO_evfscmplt, PPC_XO_evfsctsf, PPC_XO_evfsctsi, PPC_XO_evfsctsiz, PPC_XO_evfsctuf, PPC_XO_evfsctui, PPC_XO_evfsctuiz, PPC_XO_evfsdiv, PPC_XO_evfsmul, PPC_XO_evfsnabs, PPC_XO_evfsneg, PPC_XO_evfssub, PPC_XO_evfststeq, PPC_XO_evfststgt, PPC_XO_evfststlt, q31_to_float32(), q31_to_float64(), softfloat_env_init(), appenv_type::spefscr, SPEFSCR_ENABLE_EXC_MASK, SPEFSCR_NEEDED_EXC_MASK, SPEFSCR_STICKY_EXC_MASK, SPEFSCR_STICKY_EXC_SHIFT, SPEFSCR_VECT_HI_EXC_MASK, SPEFSCR_VECT_LO_EXC_MASK, store32(), store64(), storeh32(), uint32_to_float32(), uint32_to_float64(), uint64_to_float64(), uq32_to_float32(), and uq32_to_float64().


Generated on Fri Nov 13 15:38:37 2009 for fpassist for PPC SPE by  doxygen 1.5.9