AFP(7) — UNIX Programmer’s Manual

NAME

afp − ARM floating point format

SYNOPSIS

#include <arm/fp.h>
struct fp_regs {
        int fp_status;
        struct fp_reg {
                int first, second, third;
        } fp_reg[8];
};

DESCRIPTION

The ARM system defines the format of floating point numbers and a floating point instruction set conforming to ANSI/IEEE standard 754-1985. This man page describes the layout of floating point numbers when they are stored in memory and the specification of the floating point instruction set. This instruction set is implemented on UNIX systems either by a floating point emulator or by a plug in mathematical coprocessor − the software need not know which implementation is in use.

There are four supported floating point formats − float, double, struct fp_reg and a packed format. The format defined by the fp_reg structure corresponds to the extended format of the emulator and coprocessor. The packed format is a packed BCD (binary coded decimal) format which facilitates input and output of floating point numbers.

The ARM IEEE floating point system has 8 "HIGH PRECISION" floating point registers, F0 to F7. All basic floating point operations operate as if the result is computed to infinite precision and then rounded to the length and in the way specified by the instruction (the rounding is selectable from "Round to nearest", "Round to +infinity", "Round to -infinity", "Round to zero"). The working precision of the system is 80 bits, comprising a 64 bit mantissa, a 15 bit exponent and a sign bit. Specific "fast, but may be low precision" instructions provide higher performance in some implementations, particularly the fully software ones.

The floating point system architecture is, like ARM, "Load/Store" - the data processing operations only refer to floating point registers. Values may be stored into ARM memory in one of four formats: IEEE Single Precision (S), IEEE Double Precision (D), Double Extended Precision (E) and Packed Decimal (P). Storing a floating point register in "E" format (which may vary from implementation to implementation) is guaranteed to maintain precision when loaded back into the floating point system in this format. There is a Floating Point Status Register (FPSR) which, like ARM’s combined PC and PSR, has all the necessary status for the floating point system. The FPSR contains the IEEE flags. Bits in the FPSR allow a determined client to distinguish different implementations of the floating point system. There are privileged instructions to turn the floating point system on and off to permit efficient context changes − these instructions are usable only by the UNIX kernel.

Floating point systems may be built from software only, hardware only, or some combination of software and hardware, and the results look the same to the programmer. The manner in which Exceptions are signalled is at the discretion of the surrounding operating system − in UNIX the standard signal handling mechanism described in sigvec(2) is used. However, due to the nature of the ARM CoProcessor Interface, the exception for floating point data operations will be asynchronous: it will arrive at some time after the instruction has started, while the ARM processor is dealing with instructions (well) after the failed floating point one.

To get the most out of this document, it is a good idea to be familiar with the IEEE 754 standard (or an existing implementation of it, like the M68881 or WE32206).

Co-Processor Data Transfer

31..28 27.24   23     22   21    20 19..16 15..12 11..8 7........0
-------------------------------------------------------------------
|Cond | 110P | U/D | CLn | Wb | L/S | Rn | x Fd |0001 | offset | CPDT
-------------------------------------------------------------------
   {LDF|STF}<cond>{S|D|E|P} Fd,[Rn] {,<offset>}
                              [Rn {,<offset>}]
                              [Rn, <offset>]!

Load or Store the high precision value into one of the four memory formats. On store the value is rounded using the Round to Nearest rounding method to the destination precision, or is precise if the destination has sufficient precision. Thus other rounding methods may be used by having applied a suitable floating point data operation at some time before the store - this does not compromise the requirement of "rounding once only" since the rounded value of another rounding mode to the specified precision is now exact and not altered by further attempts at rounding.

Exceptions could occur if a Trapping NAN is used or the value cannot be converted to the format due to exponent overflow. If the Trapping NAN exception is not enabled trapping NANs will be converted to nontrapping NANs.

The length field is encoded into CLn and bit 15 (top bit of Crd):

                      CLn   bit 15
         Single   S    0      0             One Memory Word
         Double   D    0      1             Two Memory Words
         Extended E    1      0             Three Memory Words
         Packed   P    1      1             Three Memory Words

The ARM bits all function like ARM’s Load/Store Multiple instructions:

Wb behaves as Wb bit in LDM/STM (Wb SHOULD be set with post index)

U/D allows the offset to be added or subtracted

L/S specifies load or save

P (not the P character in the instruction) is Pre or Post index
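
The field layout above can be sketched in C as a routine that packs the CPDT fields into a 32 bit instruction word. The names afp_encode_cpdt and enum afp_length are hypothetical and are not part of <arm/fp.h>. The sketch assumes, by analogy with ARM's own load/store instructions, that P = 1 means pre-index, U/D = 1 means add the offset and L/S = 1 means load; this page does not state those polarities. The offset is passed through as the raw 8 bit field.

```c
#include <stdint.h>

/* Memory formats, encoded as (CLn << 1) | bit 15, following the length table. */
enum afp_length { AFP_S = 0, AFP_D = 1, AFP_E = 2, AFP_P = 3 };

/*
 * Hypothetical helper: build an LDF/STF instruction word.
 * Assumed polarities (by analogy with ARM load/store, not stated here):
 *   pre = 1 -> pre-index, up = 1 -> add offset, load = 1 -> LDF.
 */
uint32_t afp_encode_cpdt(unsigned cond, int pre, int up, int writeback,
                         int load, enum afp_length len,
                         unsigned rn, unsigned fd, unsigned offset)
{
    uint32_t w = 0;

    w |= (uint32_t)(cond & 0xF) << 28;        /* bits 31..28: condition  */
    w |= (uint32_t)0x6 << 25;                 /* bits 27..25: 110        */
    w |= (uint32_t)(pre ? 1 : 0) << 24;       /* bit 24: P               */
    w |= (uint32_t)(up ? 1 : 0) << 23;        /* bit 23: U/D             */
    w |= (uint32_t)((len >> 1) & 1) << 22;    /* bit 22: CLn             */
    w |= (uint32_t)(writeback ? 1 : 0) << 21; /* bit 21: Wb              */
    w |= (uint32_t)(load ? 1 : 0) << 20;      /* bit 20: L/S             */
    w |= (uint32_t)(rn & 0xF) << 16;          /* bits 19..16: Rn         */
    w |= (uint32_t)(len & 1) << 15;           /* bit 15: low length bit  */
    w |= (uint32_t)(fd & 0x7) << 12;          /* bits 14..12: Fd         */
    w |= (uint32_t)0x1 << 8;                  /* bits 11..8: 0001        */
    w |= (uint32_t)(offset & 0xFF);           /* bits 7..0: offset       */
    return w;
}
```

Under these assumptions, and taking the "always" condition to be 0xE as on ARM, a pre-indexed double load of F0 from [R1] with zero offset would be built as afp_encode_cpdt(0xE, 1, 1, 0, 1, AFP_D, 1, 0, 0).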

Data Formats in ARM Memory:

Integer:             31.....................................0
                     -------------------------------------------
                     |msb      2’s complement               lsb|
                     -------------------------------------------
   Single:              31 30....23 22........................0
                     -------------------------------------------
                     |sign|Exponent|msb   Fraction          lsb|
                     -------------------------------------------
   Double:              31 30....20 19........................0
                     -------------------------------------------
    First Word       |sign|Exponent|msb   Fraction          lsb|
                     -------------------------------------------
    Second Word      |msb           Fraction                lsb|
                     -------------------------------------------

Single and Double Exceptional values:

                       sign     exponent     fraction
      Nontrapping NAN   x       maximum      1xxxxxxxxx
      Trapping NAN      x       maximum      0non-zero
      +Infinity         0       maximum      0000000000
      -Infinity         1       maximum      0000000000
      Zero              x       0            0000000000
      Denormalised no.  x       0            non-zero
      Normalised no.    x       not 0 or max xxxxxxxxxx
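
The single precision rows of this table can be turned into a small classifier. The following C sketch is illustrative only; the enum and function names are hypothetical. It takes the memory word as an unsigned 32 bit value with the sign in bit 31, the exponent in bits 30..23 and the fraction in bits 22..0, as in the Single diagram.

```c
#include <stdint.h>

enum afp_class {
    AFP_ZERO, AFP_DENORMAL, AFP_NORMAL,
    AFP_INFINITY, AFP_TRAPPING_NAN, AFP_NONTRAPPING_NAN
};

/* Classify a single precision memory word according to the table above. */
enum afp_class afp_classify_single(uint32_t w)
{
    uint32_t exponent = (w >> 23) & 0xFF;   /* bits 30..23 */
    uint32_t fraction = w & 0x7FFFFF;       /* bits 22..0  */

    if (exponent == 0xFF) {                 /* maximum exponent */
        if (fraction == 0)
            return AFP_INFINITY;            /* sign selects + or - */
        /* Top fraction bit set: nontrapping; clear (rest non-zero): trapping. */
        return (fraction & 0x400000) ? AFP_NONTRAPPING_NAN : AFP_TRAPPING_NAN;
    }
    if (exponent == 0)
        return fraction == 0 ? AFP_ZERO : AFP_DENORMAL;
    return AFP_NORMAL;
}
```

The sign bit is deliberately ignored, since the table marks it as "x" for every class except the infinities, where it only selects between +Infinity and -Infinity.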

Extended:            31 30...........16 15 14.............0
                     -------------------------------------------
    First Word       |sign|     zeroes         |15 bit exponent|
                     -------------------------------------------
    Second Word      |J|msb         Fraction                lsb|
                     -------------------------------------------
    Third Word       |msb           Fraction                lsb|
                     -------------------------------------------
                     J is one bit to the left of the binary point

Extended Exceptional values:

                       sign     exponent       J    fraction
      Nontrapping NAN   x       maximum        x    1xxxxxxxxx
      Trapping NAN      x       maximum        x    0non-zero
      +Infinity         0       maximum        0    0000000000
      -Infinity         1       maximum        0    0000000000
      Zero              x       0              0    0000000000
      Denormalised no.  x       0              0    non-zero
      Normalised no.    x       0              1    xxxxxxxxxx
      Normalised no.    x       not 0 or max   1    xxxxxxxxxx
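
The extended format table can be read the same way. The C sketch below is a minimal illustration with hypothetical names; it defines its own three word structure rather than using struct fp_reg, and assumes the members map onto the First, Second and Third Words of the diagram in that order. Bit patterns that the table does not list are reported as AFP_X_UNLISTED rather than guessed at.

```c
#include <stdint.h>

/* Stand-in for the three memory words of the extended format (assumed order). */
struct afp_extended { uint32_t first, second, third; };

enum afp_ext_class {
    AFP_X_ZERO, AFP_X_DENORMAL, AFP_X_NORMAL, AFP_X_INFINITY,
    AFP_X_TRAPPING_NAN, AFP_X_NONTRAPPING_NAN, AFP_X_UNLISTED
};

enum afp_ext_class afp_classify_extended(const struct afp_extended *v)
{
    uint32_t exponent = v->first & 0x7FFF;      /* bits 14..0 of first word  */
    int j = (int)((v->second >> 31) & 1);       /* J: top bit of second word */
    uint32_t frac_hi = v->second & 0x7FFFFFFF;  /* upper 31 fraction bits    */
    uint32_t frac_lo = v->third;                /* lower 32 fraction bits    */
    int frac_zero = (frac_hi == 0 && frac_lo == 0);

    if (exponent == 0x7FFF) {                   /* maximum exponent */
        if (!frac_zero)                         /* NAN: J is "x" in the table */
            return (frac_hi & 0x40000000) ? AFP_X_NONTRAPPING_NAN
                                          : AFP_X_TRAPPING_NAN;
        return j == 0 ? AFP_X_INFINITY : AFP_X_UNLISTED;
    }
    if (j == 1)                                 /* exponent 0 or in range */
        return AFP_X_NORMAL;
    if (exponent == 0)
        return frac_zero ? AFP_X_ZERO : AFP_X_DENORMAL;
    return AFP_X_UNLISTED;                      /* J = 0, exponent in range */
}
```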

Packed Decimal:      31.....................................0
                     -----------------------------------------
    First Word       |sign| e3 | e2 | e1 | e0 |d18 |d17 |d16 |
                     -----------------------------------------
    Second Word      |d15 |d14 |d13 |d12 |d11 |d10 | d9 | d8 |
                     -----------------------------------------
    Third Word       | d7 | d6 | d5 | d4 | d3 | d2 | d1 | d0 |
                     -----------------------------------------

Value is +/- d ∗ 10 ^ +/- e. d18 or e3 is the most significant digit. Sign contains both the number’s sign (top bit) and the exponent’s sign (next bit). The other two bits are 0. The value of d is arranged with decimal point between d18 and d17, and is normalised so that for a normal number 1<=d18<=9. The guaranteed ranges for d and e are 17 and 3 digits respectively: e3 and d0, d1 may always be zero in a particular system. A single precision number has a maximum exponent of 53 and 9 digits of significand; a double precision number has a maximum exponent of 340 and 17 digits in the significand. The result when the packed values are A through F is undefined. Zero will always be output as +zero, but either +0 or -0 may be input.

Packed Decimal Exceptional values:

                      sign top bit, next bit    exponent     digit values
      Nontrapping NAN          x      x          FFFF      d18>7, rest non-zero
      Trapping NAN             x      x          FFFF      d18<8, rest non-zero
      +Infinity                0      x          FFFF           all 0
      -Infinity                1      x          FFFF           all 0
      Zero                     0      0          0000           all 0
      Number                 0,1    0,1        0000-9999 1-9.99999999999999999
      (denormalised numbers do not exist in this format)
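
To make the nibble layout concrete, here is a minimal C sketch that unpacks an ordinary packed decimal number from its three memory words. The structure and function names are hypothetical. It handles numbers only: any exponent or digit nibble in the range A to F (which includes the FFFF exponent of the infinities and NANs) makes it return -1.

```c
#include <stdint.h>

struct afp_packed {
    int negative;      /* top bit of the sign nibble: sign of the number    */
    int exp_negative;  /* next bit of the sign nibble: sign of the exponent */
    int exponent;      /* e3 e2 e1 e0 read as a decimal integer             */
    int digit[19];     /* digit[18] is d18 (most significant), digit[0] d0  */
};

/* Return nibble n of a word, where nibble 7 is the leftmost (bits 31..28). */
static int afp_nibble(uint32_t word, int n)
{
    return (int)((word >> (4 * n)) & 0xF);
}

/* Unpack three memory words; returns 0 on success, -1 on any A..F nibble. */
int afp_unpack_decimal(const uint32_t w[3], struct afp_packed *out)
{
    int i;
    int sign = afp_nibble(w[0], 7);

    out->negative = (sign >> 3) & 1;
    out->exp_negative = (sign >> 2) & 1;

    out->exponent = 0;
    for (i = 6; i >= 3; i--) {                     /* e3 down to e0 */
        int e = afp_nibble(w[0], i);
        if (e > 9)
            return -1;
        out->exponent = out->exponent * 10 + e;
    }
    for (i = 0; i < 3; i++)                        /* d16, d17, d18 */
        out->digit[16 + i] = afp_nibble(w[0], i);
    for (i = 0; i < 8; i++) {
        out->digit[8 + i] = afp_nibble(w[1], i);   /* d8 .. d15 */
        out->digit[i] = afp_nibble(w[2], i);       /* d0 .. d7  */
    }
    for (i = 0; i < 19; i++)
        if (out->digit[i] > 9)
            return -1;
    return 0;
}
```

For example, the words 0x00002150, 0, 0 unpack to a positive number with d18 = 1, d17 = 5 and exponent +2, that is 1.5 * 10 ^ 2.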

Co-Processor Data Operations

31..28 27.24   23................20 19..16 15..12 11..8 7..4 3..0
-------------------------------------------------------------------
|Cond | 1110 |        abcd          | e Fn | j Fd |0001 |fgh0|i Fm| CPDO
-------------------------------------------------------------------
   {ADF|SUF|RSF|MUF|DVF|RDF|POW|RPW}{S|D|E}{P|M|Z}     Fd, Fn, {Fm|#<value>}
{RMF|FML|FDV|FRD|POL}
   {MVF|MNF|ABS|RND|SQT|LOG|LGN|EXP}{S|D|E}{P|M|Z}     Fd, {Fm|#<value>}
{SIN|COS|TAN|ASN|ACS|ATN}
"opcode" - abcd
"dyadic/monadic" - j
"destination size" - ef
"rounding mode" - gh
"constant ROM/Fm" - i
   abcdj
00000 ADF Add:                        Fd := Fn + Fm
00010 MUF Multiply:                   Fd := Fn ∗ Fm
00100 SUF Sub:                        Fd := Fn - Fm
00110 RSF Reverse Subtract:           Fd := Fm - Fn
01000 DVF Divide:                     Fd := Fn / Fm
01010 RDF Reverse Divide:             Fd := Fm / Fn
01100 POW Power:                      Fd := Fn raised to the power of Fm
01110 RPW Reverse Power:              Fd := Fm raised to the power of Fn
10000 RMF Remainder:                  Fd := IEEE remainder of Fn / Fm
10010 FML Fast Multiply:              Fd := Fn ∗ Fm
10100 FDV Fast Divide:                Fd := Fn / Fm
10110 FRD Fast Reverse Divide:        Fd := Fm / Fn
11000 POL Polar angle (ArcTan2):      Fd := polar angle of (Fn, Fm)
11010     trap: undefined instruction
11100     trap: undefined instruction
11110     trap: undefined instruction
00001 MVF Move:                       Fd := Fm
00011 MNF Move Negated:               Fd := - Fm
00101 ABS Absolute value:             Fd := ABS ( Fm )
00111 RND Round to integral value:    Fd := integer value of Fm
01001 SQT Square root:                Fd := square root of Fm
01011 LOG Logarithm to base 10:       Fd := logten of Fm
01101 LGN Logarithm to base e:        Fd := loge of Fm
01111 EXP Exponent:                   Fd := e ∗∗ Fm
10001 SIN Sine:                       Fd := sine of Fm
10011 COS Cosine:                     Fd := cosine of Fm
10101 TAN Tangent:                    Fd := tangent of Fm
10111 ASN Arc Sine:                   Fd := arcsine of Fm
11001 ACS Arc Cosine:                 Fd := arccosine of Fm
11011 ATN Arc Tangent:                Fd := arctangent of Fm
11101     trap: undefined instruction
11111     trap: undefined instruction
   ef suffix Destination Rounding precision
00    S   IEEE Single precision       | The precision must be
01    D   IEEE Double precision       | specified: there is no
10    E   Extended precision          | default.
11        trap: undefined instruction
   gh        Rounding Mode
00        Round to Nearest            | default
01    P   Round towards Plus Infinity
10    M   Round towards Minus Infinity
11    Z   Round towards Zero
   Fm        Constants (when i = 1)
000       0.0
001       1.0
010       2.0
011       3.0
100       4.0
101       5.0
110       0.5
111       10.0

FML, FRD, FDV produce a result only accurate to single precision. Directed rounding is done only at the last stage of a SIN, COS etc. - the calculations to compute the value are done with round to nearest using the full working precision.
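
The CPDO field assignments above can be sketched as an encoder in C. The function and enum names are hypothetical. The five bit abcdj value is taken straight from the opcode table (for example 00000 for ADF, 00001 for MVF), the precision is the two bit ef value and the rounding mode is the two bit gh value. The condition value is passed through unchanged.

```c
#include <stdint.h>

enum afp_precision { AFP_PREC_S = 0, AFP_PREC_D = 1, AFP_PREC_E = 2 }; /* ef */
enum afp_rounding  { AFP_RN = 0, AFP_RP = 1, AFP_RM = 2, AFP_RZ = 3 }; /* gh */

/*
 * Hypothetical helper: build a CPDO instruction word.
 * abcdj       - five bit opcode from the table; the low bit is j
 * fm_is_const - the i bit: non-zero selects the constant numbered fm
 */
uint32_t afp_encode_cpdo(unsigned cond, unsigned abcdj,
                         enum afp_precision prec, enum afp_rounding round,
                         unsigned fd, unsigned fn,
                         int fm_is_const, unsigned fm)
{
    uint32_t w = 0;

    w |= (uint32_t)(cond & 0xF) << 28;          /* bits 31..28: condition */
    w |= (uint32_t)0xE << 24;                   /* bits 27..24: 1110      */
    w |= (uint32_t)((abcdj >> 1) & 0xF) << 20;  /* bits 23..20: abcd      */
    w |= (uint32_t)((prec >> 1) & 1) << 19;     /* bit 19: e              */
    w |= (uint32_t)(fn & 0x7) << 16;            /* bits 18..16: Fn        */
    w |= (uint32_t)(abcdj & 1) << 15;           /* bit 15: j              */
    w |= (uint32_t)(fd & 0x7) << 12;            /* bits 14..12: Fd        */
    w |= (uint32_t)0x1 << 8;                    /* bits 11..8: 0001       */
    w |= (uint32_t)(prec & 1) << 7;             /* bit 7: f               */
    w |= (uint32_t)(round & 0x3) << 5;          /* bits 6..5: gh          */
                                                /* bit 4 stays 0 for CPDO */
    w |= (uint32_t)(fm_is_const ? 1 : 0) << 3;  /* bit 3: i               */
    w |= (uint32_t)(fm & 0x7);                  /* bits 2..0: Fm          */
    return w;
}
```

Taking the "always" condition to be 0xE as on ARM (an assumption, since this page does not list condition codes), ADFD F0, F1, F2 would be afp_encode_cpdo(0xE, 0x00, AFP_PREC_D, AFP_RN, 0, 1, 0, 2), and MVFS F1, #10.0 would use abcdj = 0x01 with fm_is_const = 1 and fm = 7.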

Co-Processor Register Transfer

31..28 27.24   23...........21   20 19..16 15..12 11..8 7..4 3..0
-------------------------------------------------------------------
|Cond | 1110 |        abc     | L/S | e Fn | Rd   |0001 |fgh1|i Fm| CPRT
-------------------------------------------------------------------
   L/S = 1 -> the transfer is TO an ARM register
L/S = 0 -> the transfer is FROM an ARM register
"operation" - abc
"destination size" - ef (see CPDO)
"rounding mode" - gh (see CPDO)
   abcL/S
0000 FLT Integer to Floating Point:    Fn := Rd
0001 FIX Floating point to integer:    Rd := Fm
0010 WFS Write Floating Point Status: FPSR := Rd
0011 RFS Read Floating Point Status:   Rd := FPSR
0100 WFC Write Floating Point Control: FPC := Rd     Supervisor Only
0101 RFC Read Floating Point Control: Rd := FPC     Supervisor Only
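011x      trap: undefined instruction
1000      trap: undefined instruction
1010      trap: undefined instruction
1100      trap: undefined instruction
1110      trap: undefined instruction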

Constants cannot be specified in the Fm field for the FIX instruction since there is no point FIXing a known value into an ARM integer register − a MOV instruction could put it there quicker. The FIX/FLT operations generate signed integers − (int) values. (unsigned int) values cannot be handled.

The Floating Point Status

31..24 23 22 21 20 19 18 17 16 15...6   5   4   3   2   1   0
------------------------------------------------------------------------
| SysId |          |INX|UFL|OFL|DVZ|IVO|           |INX|UFL|OFL|DVZ|IVO|
------------------------------------------------------------------------
                     Interrupt Masks                 Cumulative Flags

Whenever the appropriate condition arises, the Cumulative Flags in bits 0 to 4 will be set to 1. They can only become unset (0) with a user’s WFS instruction. If the relevant Interrupt Mask is set (1), then the same condition that sets the cumulative flags will also cause an exception to be delivered to the user’s program in an operating system specific manner. The floating point system will provide the exception routine with a word indicating (in the same position as the cumulative flags) which floating point exception occurred.

IVO Invalid Operation. The IVO is set when an operand is invalid for the operation to be performed. The result (if the exception is not enabled) is a nontrapping NAN. Invalid operations are:

Any operation on a NAN

Magnitude subtraction of infinities e.g. +infinity + -infinity

Multiplication of 0 by an infinity

Division of 0/0 or infinity/infinity

x REM y where x is infinity or y is 0

Square root of any number less than zero (but SQR(-0) is -0)

Conversion to integer or decimal when overflow, infinity or NAN make it impossible. When an integer is produced the largest positive or negative integers take the place of overflow.

Comparison with exceptions of unordered operands.

ACS, ASN when input absolute value is > 1

SIN, COS, TAN when input is infinite

LOG, LGN when input <= 0

DVZ Division by zero: If the divisor is zero and the dividend is a finite, non-zero number, then either the exception occurs or (if the exception is not enabled) the result is a correctly signed infinity.

OFL Overflow: whenever the destination format’s largest finite number is exceeded by the result after rounding has taken place. As overflow is detected after rounding a result, whether overflow occurs or not after some operations depends on rounding mode.

The untrapped result returned is the correctly signed infinity, independent of the rounding mode - overflow can be seen as a signal that an infinite result has been generated from an operation on finite values.

UFL Underflow: whenever a result is so tiny that it is rounded to zero, but has a non-zero value. As underflow is detected after rounding a result, whether underflow occurs or not after some operations depends on rounding mode.

The untrapped result returned is zero, with the sign set to that of the non-zero value.

INX Inexact: if the rounded result of an operation is not exact (different to the value computable with infinite precision) or overflow has occurred while the OFL trap was disabled. If there is no trap the result will be used directly. OFL or UFL traps take precedence over INX. INX will also be set when computing SIN or COS or TAN of values larger than 10^20 (i.e. values for which the multiple of PI ranging gives a useless answer).

Attempts to write undefined bits (or the SysId) in the FPS will be trapped as an illegal operation. The undefined bits will return 0 when read. The 8 bit SysId allows a user and operating system to distinguish the implementations: the top bit (bit 31) is set for HARDWARE (i.e. fast) systems, and clear for SOFTWARE (i.e. slow) systems. The remaining 7 bits are allocated by Acorn to different systems: the first software and hardware systems will be 0 and &80.
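
A value obtained with RFS can be picked apart as in the following C sketch. The macro and function names are hypothetical. The cumulative flag positions (bits 0 to 4) come from the text; the interrupt mask positions (bits 16 to 20, in the same IVO to INX order) are read off the register diagram and should be treated as an assumption.

```c
#include <stdint.h>

/* Exception bits, in the position used by the cumulative flags (bits 0..4). */
#define AFP_IVO 0x01u   /* invalid operation */
#define AFP_DVZ 0x02u   /* division by zero  */
#define AFP_OFL 0x04u   /* overflow          */
#define AFP_UFL 0x08u   /* underflow         */
#define AFP_INX 0x10u   /* inexact           */

/* Cumulative flags, bits 0..4. */
unsigned afp_fpsr_flags(uint32_t fpsr)
{
    return (unsigned)(fpsr & 0x1F);
}

/* Interrupt masks, assumed to occupy bits 16..20; result is shifted down
 * so that it can be tested with the same AFP_xxx bits as the flags. */
unsigned afp_fpsr_masks(uint32_t fpsr)
{
    return (unsigned)((fpsr >> 16) & 0x1F);
}

/* The 8 bit SysId, bits 31..24. */
unsigned afp_fpsr_sysid(uint32_t fpsr)
{
    return (unsigned)(fpsr >> 24);
}

/* Bit 31 is set for hardware systems and clear for software ones. */
int afp_fpsr_is_hardware(uint32_t fpsr)
{
    return (int)((fpsr >> 31) & 1);
}
```

With these helpers, a test such as (afp_fpsr_flags(fpsr) & AFP_OFL) reports whether overflow has occurred since the flags were last cleared with WFS.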

The Floating Point Control register may only be present in some implementations: it is there to control the hardware in an implementation specific manner, for example to disable the floating point system. The user mode of the ARM is not permitted to use this register (since Acorn reserve the right to alter it between implementations) and the WFC and RFC instructions will trap if tried.

Co-Processor Status Transfer

31..28 27.24 23...........21 20 19..16 15..12 11..8 7..4 3..0
-------------------------------------------------------------------
|Cond | 1110 | abc | 1 | e Fn | 1111 |0001 |fgh1|i Fm| CPST
-------------------------------------------------------------------
"operation" - abc
"constant ROM/Fm" - i (see CPDO)

Operation: Compare, CompareNegated, CompareWithExceptions.

abcefgh
1000000 CMF Compare floating:                        compare Fn with Fm
1010000 CNF Compare negated floating:                compare Fn with -Fm
1100000 CMFE Compare floating with exception:         compare Fn with Fm
1110000 CNFE Compare negated floating with exception: compare Fn with -Fm

Compares are provided with and without the exception that could arise if the numbers are unordered. To comply with IEEE 754, the CMF instruction should be used to test for equality (i.e. when a BEQ or BNE will be used afterwards) or to test for unordered (in the V flag): the CMFE instruction should be used for all other tests (BGT, BGE, BLT, BLE afterwards).

The ARM flags N, Z, C, V refer to the following after compares:

N Less Than (i.e. Fn less than Fm (or -Fm))

Z Equal

C Greater Than or Equal (i.e. Fn greater than or equal to Fm)

V UnOrdered

Note that when two numbers are Not Equal N and C are not necessarily opposites: if the result is UnOrdered they will both be false.
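
The flag meanings listed above can be summarised as a decision function. This C sketch is illustrative only and its names are hypothetical; it takes the four ARM flags as 0 or 1 values and returns the relation between Fn and Fm (or -Fm for the negated compares).

```c
enum afp_relation { AFP_LESS, AFP_EQUAL, AFP_GREATER, AFP_UNORDERED };

/* Interpret the ARM flags N, Z, C, V after a floating point compare. */
enum afp_relation afp_relation_from_flags(int n, int z, int c, int v)
{
    if (v)
        return AFP_UNORDERED;   /* V: operands are unordered        */
    if (z)
        return AFP_EQUAL;       /* Z: equal (C is also set)         */
    if (n)
        return AFP_LESS;        /* N: Fn less than Fm               */
    if (c)
        return AFP_GREATER;     /* C without Z: strictly greater    */
    return AFP_UNORDERED;       /* N and C both false: unordered    */
}
```

V is tested first because, in the unordered case, N and C are both false and so cannot be used to tell the operands apart.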

AUTHORS

RWilson, MClemoes Acorn Computers Ltd

RISC iX — Revision 1.2 of 05/12/88
