In order to execute Algorithm 8.24, the computation resources that correspond to the procedures should be synthesized. Most of them (invert, by_coefficient, sub) have been studied in the preceding sections. The shift procedure can be implemented with a barrel shifter ([ULL1984]). The implementation of the degree procedure could be based on the following iterative algorithm.
Algorithm 15.1 Degree Computation
state(n):=0; for i in 1..n-1 loop if state (n-i+1)=0 and a(n-i)=0 then state (n-i):=0; else state (n-i):=1; end if; end loop; degree_a:=count (state);
where the count function returns the number of 1's in state. The corresponding iterative circuit includes n − 1 cells. Each of them computes state(i) as a function of a(i) and state(i+1): if state(i+1) = 0 and a(i) = 0 then state(i) = 0; in all other cases state(i) = 1. The circuit that generates the output degree is an (n − 1)-to-log2(n) binary counter.
The data path corresponding to Algorithm 8.24 is shown in Figure 15.14. It is made up of the following computation resources:
degree: implements the degree procedure,
shifter: implements the shift procedure,
coefficient_inverter: implements the invert procedure,
subtractor: implements the sub procedure,
coefficient_multiplier: implements the by_coefficient procedure.
Furthermore, a lot of memory (registers) and connection (multiplexers) resources are necessary. The computation time is proportional to the number of executions of the main iteration (Algorithm 8.24, while t>0 loop…end loop). As the degree of r is reduced at every step, the maximum number of iteration steps is n.
Example 15.9 (Complete VHDL source code available.) Generate the VHDL model of a circuit that computes (a(x))−1 modulo f(x):
entity polynomial_inverter is port ( a: in polynomial; result: out polynomial; start, clk, reset: in std_logic; done: out std_logic ); end polynomial_inverter; architecture circuit of polynomial_inverter is component degree … end component; component selector … end component; component shifter … end component; component coefficient_inverter … end component; component subtractor … end component; component coefficient_multiplier … end component; signal u, next_u, v, next_v, c, next_c, e, next_e, k_by_v, k_by_v_shifted, r, k_by_e, k_by_e_shifted, cc, r_a: polynomial; signal m, next_m, t, next_t, j, deg_v: index; signal uu_m, next_u_m, u_m, v_t, v_t_inverted, k, k_v: coefficient; signal load, sign, t_equal_zero: std_logic; signal mux_control: std_logic_vector (1 downto 0); subtype state_type is natural range 0 to 8; signal state: state_type; begin --data_path f<=irreducible; f_n<=irreducible_n; process (clk) begin if clk'event and clk=‘1’ then if load=‘1’ then u<=next_u; v<=next_v; c<=next_c; e<=next_e; m<=next_m; t<=next_t; u_m<=next_u_m; end if; end if; end process; process (mux_control, f_n, f, v, r, a, e, cc, t, deg_v, k, v_t_inverted, uu_m) begin case mux_control is when “00”=>next_u<=f; next_v<=a; next_c<=zero_polynomial; next_e<=one_polynomial; next_m<=conv_std_logic_vector(n, logn); next_t<=deg_v; next_u_m<=f_n; r_a<=a; k_v<=k; when “01”;=>next_u<=v; next_v<=r; next_c<=e; next_e<=cc; next_m<=t; next_t<=deg_v; next_u_m<=uu_m; r_a<=r; k_v<=k; when “10”=>next_u<=r; next_v<=v; next_c<=cc; next_e<=e; next_m<=deg_v; next_t<=t; next_u_m<=uu_m; r_a<=r; k_v<=k; when others=>next_u<=r; next_v<=v; next_c<=cc; next_e<=e; next_m<=deg_v; next_t<=t; next_u_m<=uu_m; r_a<=r; k_v<=v_t_inverted; end case; end process; j<=m - t; selector1: selector port map (next_u, m, uu_m); selector2: selector port map (v, t, v_t); inverter: coefficient_inverter port map (v_t, v_t_inverted); multiplier1: coefficient_multiplier port map (u_m, v_t_inverted, k); multipliers1: for i in 0 to n-1 generate multiplier2: coefficient_multiplier port map (v(i), k, k_by_v(i)); end generate; shifter1: shifter port map (k_by_v, j, k_by_v_shifted); subtractors1: for i in 0 to n-1 generate subtractor1: subtractor port map (u(i), k_by_v_shifted(i), r(i)); end generate; multipliers2: for i in 0 to n-1 generate multiplier3: coefficient_multiplier port map (e(i), k_v, k_by_e(i)); end generate; shifter2: shifter port map (k_by_e, j, k_by_e_shifted); subtractors2: for i in 0 to n-1 generate subtractor2: subtractor port map (c(i), k_by_e_shifted(i), cc(i)); end generate; degree1: degree port map (r_a, deg_v); sign<=‘0’ when conv_integer(t)>=conv_integer(deg_v) else ‘1’; t_equal_zero<=‘1’ when conv_integer(t)=0 else ‘0’; result<=k_by_e; --control unit: process (clk, reset, state) begin case state is when 0=>load<=‘0’; mux_control<=“11”; done<=‘1’; when 1=>load<=‘0’; mux_control<=“11”; done<=‘1’; when 2=>load<=‘1’; mux_control<=“00”; done<=‘0’; when 3=>load<=‘0’; mux_control<=“00”; done<=‘0’; when 4=>load<=‘1’; mux_control<=“01”; done<=‘0’; when 5=>load<=‘1’; mux_control<=“10”; done<=‘0’; when 6=>load<=‘0’; mux_control<=“01”; done<=‘0’; when 7=>load<=‘0’; mux_control<=“10”; done<=‘0’; when 8=>load<=‘0’; mux_control<=“11”; done<=‘0’; end case; if reset=‘1’ then state<=0; elsif clk'event and clk=‘1’ then case state is when 0=>if start=‘0’ then state<=1; end if; when 1=>if start=‘1’ then state<=2; end if; when 2=>state<=3; when 3=>if t_equal_zero=‘1’ then state<=8; elsif sign=‘0’ then state<=4; else state<=5; end if; when 4=>state<=6; when 5=>state<=7; when 6=>if t_equal_zero=‘1’ then state<=8; elsif sign=‘0’ then state<=4; else state<=5; end if; when 7=>if t_equal_zero=‘1’ then state<=8; elsif sign=‘0’ then state<=4; else state<=5; end if; when 8=>state<=0; end case; end if; end process; end circuit;
As mentioned earlier (Chapter 8), a different method of inversion can be used if f(x) is a binomial. In particular, if
then Algorithm 8.28 can be used. The computation resources corresponding to the procedures must be synthesized. Most of them (multiply, invert, by_ coefficient) have been studied in the preceding sections. The exponentiation procedure can be implemented by a table storing the coefficients fsi (Appendix 8.1) for s = 1, 2, 22, …,2m − 1 and n − 1 coefficient multipliers (Figure 15.15).
Example 15.10 (Complete VHDL source code available.) The circuit of Figure 15.16 implements the inversion in GF(pn) with
As p is small, the inversion in GF(239) is implemented by a table storing x−1 mod 239 for all x in {1, 2, …, 238}.