A basic multiplier deduced from Algorithm 16.5 is shown in Figure 16.6. The rounding circuit is the same as in the case of the adder-subtractor (Figure 16.4).
Example 16.8 (Complete VHDL code available.) Generate the VHDL model of a generic floating-point multiplier. It is made up of four blocks:
1. Multiplication. The multiplication circuit (Figure 16.6) consists of a (p + 1)-by-(p + 1) multiplier, which generates the exact value of the product, an adder for the exponents, and an XOR gate for the sign. Any type of multiplier can be used (Chapter 12). In this model, a simple parallel multiplier has been used:
entity multiplication is
port (
  s1, s2: in digit_vector(0 downto -p);
  sign1, sign2: in std_logic;
  e1, e2: in integer;
  s: out digit_vector(1 downto -2*p);
  sign: out std_logic;
  e: out integer
);
end multiplication;
architecture circuit of multiplication is
  component basic_base_B_mult…end component;
  …
end circuit;
2. Generation of the guard digits. This block computes the sticky digit and concatenates its value with positions 1 down to −(p + 1) of the exact product:
entity guard_digits is
port (
  s: in digit_vector(1 downto -2*p);
  product: out digit_vector(1 downto -(p+2))
);
end guard_digits;
architecture behavior of guard_digits is
begin
  process(s)
    variable acc_or: digit;
  begin
    acc_or:=0;
    for i in -(p+2) downto -2*p loop
      if (s(i)>0) or (acc_or>0) then acc_or:=1; end if;
    end loop;
    product<=s(1 downto -(p+1))&acc_or;
  end process;
end behavior;
3. Normalization. This block updates the significand as well as the exponent if the value of product (Figure 16.6) is greater than or equal to B:
entity normalization is
port (
  e: in natural;
  product: in digit_vector(1 downto -(p+2));
  new_s: out digit_vector(0 downto -(p+3));
  new_e: out natural
);
end normalization;
architecture rtl of normalization is
  signal product_div_B: digit_vector(0 downto -(p+3));
begin
  divide_by_B: for i in -(p+3) to 0 generate
    product_div_B(i)<=product(i+1);
  end generate;
  new_s<=product(0 downto -(p+2))&0 when product(1)=0 else product_div_B;
  new_e<=e when product(1)=0 else e+1;
end rtl;
4. The rounding block is the same as before (Figure 16.4):
entity rounding is
port (
  s: in digit_vector(0 downto -(p+3));
  e: in natural;
  new_s: out digit_vector(0 downto -p);
  new_e: out natural
);
end rounding;
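To make the index arithmetic of blocks 2 and 3 concrete, the following is a minimal binary (B = 2) sketch that merges the guard-digit and normalization steps into one process and checks them on two products with p = 4. It is not the code of the example: the entity name guard_norm_bin, the use of std_logic_vector, and the mapping of position k to bit k + 2p are assumptions made so that the listing is self-contained.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical binary counterpart of blocks 2 and 3.
-- Bit k + 2*p of s stands for position k of the exact product:
-- bit 2*p+1 is position 1, bit 0 is position -2*p.
entity guard_norm_bin is
  generic (p: positive := 4);
  port (
    s:     in  std_logic_vector(2*p+1 downto 0);  -- exact product
    e:     in  integer;
    new_s: out std_logic_vector(p+3 downto 0);    -- positions 0 downto -(p+3)
    new_e: out integer
  );
end guard_norm_bin;

architecture behavior of guard_norm_bin is
begin
  process(s, e)
    variable sticky:  std_logic;
    variable product: std_logic_vector(p+3 downto 0);  -- positions 1 downto -(p+2)
  begin
    -- sticky bit: OR of positions -(p+2) downto -2*p (bits p-2 downto 0)
    sticky := '0';
    for i in p-2 downto 0 loop
      sticky := sticky or s(i);
    end loop;
    -- positions 1 downto -(p+1) (bits 2*p+1 downto p-1), then the sticky bit
    product := s(2*p+1 downto p-1) & sticky;
    if product(p+3) = '0' then
      -- product < 2: drop position 1 and append a 0
      new_s <= product(p+2 downto 0) & '0';
      new_e <= e;
    else
      -- product >= 2: same digits read one position lower, exponent + 1
      new_s <= product;
      new_e <= e + 1;
    end if;
  end process;
end behavior;

library ieee;
use ieee.std_logic_1164.all;

entity guard_norm_bin_tb is
end guard_norm_bin_tb;

architecture test of guard_norm_bin_tb is
  constant p: positive := 4;
  signal s:        std_logic_vector(2*p+1 downto 0);
  signal e, new_e: integer := 0;
  signal new_s:    std_logic_vector(p+3 downto 0);
begin
  dut: entity work.guard_norm_bin
    generic map (p => p)
    port map (s => s, e => e, new_s => new_s, new_e => new_e);

  process
  begin
    -- 1.1000 x 1.1000 = 10.01000000: normalization shift, sticky = 0
    s <= "1001000000"; e <= 0;
    wait for 1 ns;
    assert new_s = "10010000" and new_e = 1
      report "case 1 failed" severity error;
    -- 1.0001 x 1.0001 = 01.00100001: no shift, sticky = 1
    s <= "0100100001"; e <= 0;
    wait for 1 ns;
    assert new_s = "10010010" and new_e = 0
      report "case 2 failed" severity error;
    wait;
  end process;
end test;
```

In the second case the only nonzero digit below position −(p + 1) is the last one, so the exact digits kept in new_s would suggest a tie-free value; the sticky bit records that the discarded part was not zero, which is the information the rounding block needs.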
It remains to assemble the four blocks:
entity fp_multiplier is
port (
  sign1, sign2: in std_logic;
  e1, e2: in integer;
  s1, s2: in digit_vector(0 downto -p);
  sign: out std_logic;
  e: out natural;
  s: out digit_vector(0 downto -p)
);
end fp_multiplier;
architecture circuit of fp_multiplier is
  component multiplication…end component;
  component guard_digits…end component;
  component normalization…end component;
  component rounding…end component;
  signal e_m, e_n: natural;
  signal s_m: digit_vector(1 downto -2*p);
  signal s_g: digit_vector(1 downto -(p+2));
  signal s_n: digit_vector(0 downto -(p+3));
begin
  multiplication_component: multiplication
    port map (s1, s2, sign1, sign2, e1, e2, s_m, sign, e_m);
  guard_digits_component: guard_digits
    port map (s_m, s_g);
  normalization_component: normalization
    port map (e_m, s_g, s_n, e_n);
  rounding_component: rounding
    port map (s_n, e_n, s, e);
end circuit;
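The listings of Example 16.8 use the names p, B, digit and digit_vector without declaring them in this excerpt; they presumably come from a shared package. A minimal package consistent with the way these names are used above could look as follows. The package name fp_pkg and the values given to B and p are assumptions chosen for illustration, not taken from the text.

```vhdl
-- Hypothetical support package (name and constant values are assumptions).
package fp_pkg is
  constant B: natural := 10;  -- base of the number system
  constant p: natural := 7;   -- number of fractional digits of a significand

  -- A digit is a natural number smaller than B, so that comparisons such
  -- as s(i) > 0 and assignments such as acc_or := 1 are legal.
  subtype digit is natural range 0 to B-1;

  -- The index type is integer (not natural) because the listings use
  -- negative positions, e.g. digit_vector(0 downto -p).
  type digit_vector is array (integer range <>) of digit;
end fp_pkg;
```

Under this assumption each design unit would begin with a use work.fp_pkg.all; clause, together with the ieee.std_logic_1164 clause required for the std_logic sign ports.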
If a carry-save multiplier is used, the part of the circuit that generates the value of product can be modified (Figure 16.7). The multiplier generates two (2p + 2)-digit numbers u and v (stored-carry encoding of the product). It then remains to generate the carry cy corresponding to position −(p + 1), as well as the sticky_digit.
The computation of cy can be performed with any one of the methods described in Chapter 11. The sticky_digit can be generated directly from the stored-carry representation (Chapter 8 of [ERC2004]). For that purpose, observe that the equality
where
First encode the result of (16.30) in stored-carry form, that is,
Relation (16.30) is equivalent to
and the preceding relation only holds if, for every position i, the sum s(i) + c(i) is equal to B − 1. Thus the sticky digit is equal to 0 if, and only if,
The corresponding circuit is shown in Figure 16.8. The comp block works as follows:
If B = 2 the comp circuit is a 2-input XOR gate.
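For B = 2 the idea can be sketched in a few lines. The listing below is one standard way to realize it and is not necessarily the exact circuit of Figure 16.8: the n low-order bits of u + v are all zero exactly when u + v + (2^n − 1) has all n low-order bits equal to 1. Adding the all-ones word digit by digit gives the stored-carry pair s(i) = not (u(i) xor v(i)) and c(i + 1) = u(i) or v(i), with c(0) = 0, and the XOR of s(i) and c(i) plays the role of the comp block. The entity name sticky_cs_bin, the generic n, and the convention that bit 0 stands for position −2p are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sticky-bit detector for a binary stored-carry pair (u, v).
-- u and v hold the n low-order bits that are discarded by the rounding.
entity sticky_cs_bin is
  generic (n: positive := 4);
  port (
    u, v:   in  std_logic_vector(n-1 downto 0);
    sticky: out std_logic
  );
end sticky_cs_bin;

architecture behavior of sticky_cs_bin is
begin
  process(u, v)
    variable s, c, ok: std_logic;
  begin
    c  := '0';   -- c(0) = 0
    ok := '1';
    for i in 0 to n-1 loop
      s  := not (u(i) xor v(i));  -- sum bit of u(i) + v(i) + 1
      ok := ok and (s xor c);     -- comp: s(i) + c(i) = 1
      c  := u(i) or v(i);         -- carry of u(i) + v(i) + 1, used at i + 1
    end loop;
    sticky <= not ok;             -- 0 if and only if (u + v) mod 2**n = 0
  end process;
end behavior;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sticky_cs_bin_tb is
end sticky_cs_bin_tb;

architecture test of sticky_cs_bin_tb is
  constant n: positive := 4;
  signal u, v:   std_logic_vector(n-1 downto 0) := (others => '0');
  signal sticky: std_logic;
begin
  dut: entity work.sticky_cs_bin
    generic map (n => n)
    port map (u => u, v => v, sticky => sticky);

  -- Exhaustive comparison against the arithmetic definition.
  process
    variable expected: std_logic;
  begin
    for a in 0 to 2**n - 1 loop
      for b in 0 to 2**n - 1 loop
        u <= std_logic_vector(to_unsigned(a, n));
        v <= std_logic_vector(to_unsigned(b, n));
        wait for 1 ns;
        if (a + b) mod 2**n = 0 then
          expected := '0';
        else
          expected := '1';
        end if;
        assert sticky = expected report "sticky mismatch" severity error;
      end loop;
    end loop;
    wait;
  end process;
end test;
```

No carry is propagated in this test: each position needs only its own bits of u and v and the OR of the two bits immediately below, so the delay is that of the final AND reduction rather than that of an n-bit adder.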