Extended Type
The Extended type is an Intel standard floating-point type that uses 10 bytes to store a sign bit, a 15-bit exponent, and a 64-bit mantissa. Extended conforms to the minimum requirements of the IEEE-754 extended double precision type.
The limits of the Extended type are approximately 3.37 × 10^-4932 to 1.18 × 10^4932, with about 19 decimal digits of precision.
Unlike Single and Double, which leave the leading 1 bit of the mantissa implicit, Extended stores all of its significant bits. Normalized values, infinity, and not-a-number have an explicit 1 bit as the most significant bit of the mantissa. Table 5-2 shows the detailed format of finite and special Extended values. Not all bit patterns are valid Extended values. Delphi raises runtime error 6 (EInvalidOp) if you try to use an invalid bit pattern as a floating-point number.
Numeric Class | Sign | Exponent Bits | Mantissa Bits |
Positive | | | |
Normalized | 0 | 0...1 to 1...0 | 10...0 to 11...1 |
Denormalized | 0 | 0...0 | 0...1 to 01...1 |
Zero | 0 | 0...0 | 0...0 |
Infinity | 0 | 1...1 | 10...0 |
Quiet NaN | 0 | 1...1 | 110...0 to 11...1 |
Signaling NaN | 0 | 1...1 | 100...1 to 101...1 |
Negative | | | |
Normalized | 1 | 0...1 to 1...0 | 10...0 to 11...1 |
Denormalized | 1 | 0...0 | 0...1 to 01...1 |
Zero | 1 | 0...0 | 0...0 |
Infinity | 1 | 1...1 | 10...0 |
Quiet NaN | 1 | 1...1 | 110...0 to 11...1 |
Signaling NaN | 1 | 1...1 | 100...1 to 101...1 |
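To see the layout in the table on a concrete value, the following minimal sketch overlays a record on an Extended variable and prints the three fields for 1.0. The record TExtendedBits and its field names are hypothetical helpers invented for this illustration, and the sketch assumes a platform where Extended is the 10-byte Intel format (it checks SizeOf before decoding).

```pascal
program ExtendedLayout;
{$APPTYPE CONSOLE}
uses
  SysUtils;

type
  // Hypothetical helper: low 8 bytes hold the mantissa, and the
  // high 2 bytes hold the sign bit plus the 15-bit exponent.
  TExtendedBits = packed record
    Mantissa: Int64;
    SignExp: Word;
  end;

var
  X: Extended;
  Bits: TExtendedBits absolute X;
begin
  X := 1.0;
  if SizeOf(Extended) <> 10 then
    // Assumption not met: some targets map Extended to Double.
    WriteLn('Extended is not the 10-byte format on this platform.')
  else
  begin
    WriteLn('Sign     = ', Bits.SignExp shr 15);
    WriteLn('Exponent = $', IntToHex(Bits.SignExp and $7FFF, 4));
    WriteLn('Mantissa = $', IntToHex(Bits.Mantissa, 16));
  end;
end.
```

For 1.0 the exponent field holds the bias $3FFF, and the mantissa is $8000000000000000: the explicit 1 bit followed by zeros, which matches the Normalized row of the table.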
Use Extended when you must preserve the maximum precision or exponent range, but realize that you will pay a performance penalty because of its awkward 10-byte size. Delphi sets the floating-point control word to extended precision, so intermediate computations are carried out with the full precision of Extended values. When you save a floating-point result to a Single or Double variable, Delphi discards the extra bits of precision.
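The loss of precision on assignment can be observed directly. The following minimal sketch computes one third as an Extended value, copies it to Double and Single variables, and prints the differences. The program and variable names are illustrative only. On a platform where Extended is not the 10-byte format, the difference for Double may be zero.

```pascal
program PrecisionLoss;
{$APPTYPE CONSOLE}

var
  E: Extended;
  D: Double;
  S: Single;
begin
  E := 1.0;
  E := E / 3.0;  // one third is not exactly representable in binary
  D := E;        // keeps only the bits that fit in a Double
  S := E;        // keeps only the bits that fit in a Single

  WriteLn('Extended: ', E:24:21);
  WriteLn('Double:   ', D:24:21);
  WriteLn('Single:   ', S:24:21);
  WriteLn('Extended - Double = ', E - D);
  WriteLn('Extended - Single = ', E - S);
end.
```

A nonzero difference shows the bits that were present in the Extended result but did not survive the assignment to the smaller type.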
Refer to the Intel architecture manuals (such as the Pentium Developer's Manual, volume 3, Architecture and Programming Manual) or IEEE standard 754 for more information about infinity and NaN (not a number). In Delphi, use of a signaling NaN raises runtime error 6 (EInvalidOp).
type
  TExtended = packed record
    case Integer of
      0: (Float: Extended;);
      1: (Bytes: array[0..9] of Byte;);
      2: (Words: array[0..4] of Word;);
      3: (LongWords: array[0..1] of LongWord; LWExtra: Word;);
      4: (Int64s: array[0..0] of Int64; Exponent: Word;);
  end;

  TFloatClass = (fcPosNorm, fcNegNorm, fcPosDenorm, fcNegDenorm,
                 fcPosZero, fcNegZero, fcPosInf, fcNegInf,
                 fcQNaN, fcSNaN);

// Return the class of a floating-point number: finite, infinity,
// not-a-number; also positive or negative, normalized or denormalized.
// Determine the class by examining the exponent, sign bit, and
// mantissa separately.
function fp_class(X: Extended): TFloatClass; overload;
var
  XParts: TExtended absolute X;
  Negative: Boolean;
  Exponent: LongWord;
  Mantissa: Int64;
begin
  Negative := (XParts.Exponent and $8000) <> 0;
  Exponent := XParts.Exponent and $7FFF;
  Mantissa := XParts.Int64s[0];

  // The first three cases can be positive or negative.
  // Assume positive, and test the sign bit later.
  if (Exponent = 0) and (Mantissa = 0) then
    // Mantissa and exponent are both zero, so the number is zero.
    Result := fcPosZero
  else if (Exponent = 0) and (Mantissa > 0) then
    // If the exponent is zero, and the mantissa is nonzero with a
    // 0 MSBit (positive when viewed as an Int64), the number is
    // denormalized. Note that Extended explicitly stores the
    // 1 MSBit (unlike Single and Double).
    Result := fcPosDenorm
  else if Exponent <> $7FFF then
    // Otherwise, if the exponent is not all 1,
    // the number is normalized.
    Result := fcPosNorm
  else if Mantissa = Low(Int64) then
    // Exponent is all 1, and the mantissa is $8000000000000000
    // (only the MSBit set), which means infinity.
    Result := fcPosInf
  else
  begin
    // Exponent is all 1, and mantissa is non-zero, so the value
    // is not a number. Test for quiet or signaling NaN. MSBit is
    // always 1. The next bit is 1 for quiet or 0 for signaling.
    if (Mantissa and $4000000000000000) <> 0 then
      Result := fcQNaN
    else
      Result := fcSNaN;
    Exit; // Do not distinguish negative NaNs.
  end;

  // Each negative class immediately follows its positive counterpart
  // in TFloatClass, so Inc selects the negative variant.
  if Negative then
    Inc(Result);
end;