16.1 FLOATING-POINT SYSTEM DEFINITION

Assume that a set of real numbers x belonging to the interval

is represented in such a way that the following specifications are satisfied:

d₁ is the maximum distance between small exactly-represented nonzero numbers;

d₂ is the maximum distance between large exactly-represented numbers;

x_min is the maximum distance between 0 and the smallest exactly-represented nonzero numbers,

where the adjectives small and large refer to the absolute value of the corresponding numbers.

Every number x will be represented in the form ±s·b^e, with b ≥ 2, where s is the significand and e the exponent.
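As a rough illustration of this form with b = 2, the following Python sketch splits a machine double into a sign, a significand, and an exponent. It relies on the standard `math.frexp`, which returns a significand in [0.5, 1). Rescaling it to [1, 2) is a convention assumed here, not one stated in the text, and the helper name `decompose` is hypothetical.

```python
import math

def decompose(x):
    """Split x into (sign, s, e) with x == sign * s * 2**e and 1 <= s < 2.

    The range 1 <= s < 2 is an assumed normalization.
    Zero has no such form and is returned as (1, 0.0, 0).
    """
    if x == 0:
        return 1, 0.0, 0
    sign = -1 if x < 0 else 1
    m, e = math.frexp(abs(x))   # abs(x) == m * 2**e with 0.5 <= m < 1
    return sign, m * 2, e - 1   # rescale so that 1 <= s < 2

sign, s, e = decompose(-6.5)
print(sign, s, e)               # prints: -1 1.625 2
```

Here -6.5 = -1.625 · 2², so the significand is 1.625 and the exponent is 2.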

To make the implementation of the arithmetic operations easier (Section 16.2), the following two conditions must be satisfied:

Thus x is expressed in the form

The values of p, e_min, and e_max are chosen in such a way that

Example 16.1 Define a floating-point representation system where

Choose b = 2. A straightforward solution of the system (16.2)–(16.5) is

The smallest exactly-represented positive number is 2⁻³⁰; the distance between small exactly-represented numbers is

the largest exactly-represented positive number is

the distance between large exactly-represented numbers is
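The four quantities named in this example can be computed for any choice of parameters. The sketch below assumes a normalized significand with 1 ≤ s < b that is a multiple of b^(-p). This convention, the function name `system_extremes`, and the values p = 8 and e_max = 30 are illustrative assumptions, not the solution of the example. Only b = 2 comes directly from the text. The value e_min = -30 is what the stated smallest number 2⁻³⁰ would imply under the assumed normalization.

```python
from fractions import Fraction

def system_extremes(b, p, e_min, e_max):
    """Return exact extremes and spacings of a toy floating-point system.

    Assumes numbers of the form s * b**e with e_min <= e <= e_max and
    s a multiple of b**-p satisfying 1 <= s < b (an assumed normalization).
    """
    b = Fraction(b)
    ulp = b ** -p                           # spacing between significands
    return {
        "smallest": b ** e_min,             # s = 1, e = e_min
        "small_spacing": ulp * b ** e_min,  # gap between neighbours at e_min
        "largest": (b - ulp) * b ** e_max,  # s = b - ulp, e = e_max
        "large_spacing": ulp * b ** e_max,  # gap between neighbours at e_max
    }

# p = 8 and e_max = 30 are hypothetical values chosen for illustration.
for name, value in system_extremes(2, 8, -30, 30).items():
    print(name, value)
```

Exact rational arithmetic (`Fraction`) is used so that the reported values are not themselves rounded by the host machine's floating-point format.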
