3.8 Multiplication and division

The classical method of multiplying two binary numbers of length n requires Θ(n²) operations. Let the two numbers r and s of 2k bits each be of the following form:

r = [ A | B ],   s = [ C | D ]

Here, A represents the k most significant bits of r and B the k least significant bits; similarly for C and D with respect to s. In other words, we can write r = A·2^k + B and s = C·2^k + D. From that we obtain

rs = AC·2^{2k} + (AD + BC)·2^k + BD

Instead of following this approach, Karatsuba's algorithm (Anatoly Alexeyevich Karatsuba, 1937–2008) recursively computes the three products AC, (A + B)(C + D), and BD. Then, rs can be computed performing only three multiplications of numbers with bit length at most k + 1:

rs = AC·2^{2k} + ((A + B)(C + D) − AC − BD)·2^k + BD

Let t_mult(n) be the time which this recursive method takes to multiply numbers of n bits. Since n-bit numbers can be added in time O(n), the master theorem yields the following estimate for t_mult(n):

t_mult(n) ≤ 3·t_mult(n/2 + 1) + O(n)   and thus   t_mult(n) ∈ O(n^{log₂ 3})

Thus, by a divide-and-conquer approach, we have reduced the exponent of the classical method from 2 to approximately 1.58496.
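To make the recursion concrete, here is a small Python sketch of the three-product scheme just described (illustrative only, not the book's reference implementation; the cutoff 2^32 below which it falls back to the built-in multiplication is an arbitrary choice):

def karatsuba(r, s):
    # Multiply nonnegative integers r and s with three recursive products.
    if r < 2**32 or s < 2**32:          # small inputs: fall back (threshold is arbitrary)
        return r * s
    k = max(r.bit_length(), s.bit_length()) // 2
    A, B = r >> k, r & ((1 << k) - 1)   # r = A*2^k + B
    C, D = s >> k, s & ((1 << k) - 1)   # s = C*2^k + D
    ac = karatsuba(A, C)
    bd = karatsuba(B, D)
    mid = karatsuba(A + B, C + D) - ac - bd   # = AD + BC
    return (ac << (2 * k)) + (mid << k) + bd

The three recursive calls on operands of roughly half the size are exactly what produces the exponent log₂ 3 in the estimate above.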

Another important arithmetic operation is the computation of a mod m. Because of

a mod m = a − m·⌊a/m⌋

this operation can be reduced to subtraction, multiplication, and integer division. For subtraction, the complexity of the classical method is linear in the number of digits. For multiplication, we have just seen an algorithm with subquadratic running time. So, now division remains to be investigated. To this end, we sketch in the following a method which, using only subtractions and multiplications, is able to compute sufficiently many bits of the reciprocal 1/m of a given integer m. Once we know enough bits of 1/m after the binary point, we can compute ⌊a/m⌋ by multiplying a and 1/m. Those bits of 1/m contributing merely to the fractional part of this product need not be computed.

To approximate 1/m, we can use the Newton method (Sir Isaac Newton, 1643–1727) to find the root of f(x) = 1/x − m. For this purpose, we compute better and better approximations x_i to this root by the rule

x_{i+1} = x_i − f(x_i)/f′(x_i)

In our particular case, this results in the rule

x_{i+1} = x_i·(2 − m·x_i)

If we begin with an initial value x₀ between 0 and 1/m, the sequence (x_i)_{i∈ℕ} very quickly converges to 1/m. A good initial value is, for example, x₀ = 2^{−⌊log₂ m⌋−1}. The value ⌊log₂ m⌋ + 1 can be found easily by counting the number of binary digits of m. Without further computation, the initial value x₀ can then be specified in binary by a sequence of zeros after the binary point, followed by a 1. The problem that we have to work with nonintegers can be eliminated by previously multiplying by a sufficiently large power of 2; this can be achieved by a simple shift operation.
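The following Python sketch carries out the iteration in fixed-point arithmetic: reciprocal(m, p) approximates ⌊2^p/m⌋ (that is, 1/m scaled by 2^p) using only shifts, additions, and multiplications, and divide(a, m) then recovers ⌊a/m⌋. The precision p and the iteration count are chosen generously rather than optimally, and the final correction loops absorb rounding errors:

def reciprocal(m, p):
    # Approximate floor(2**p / m) with the Newton rule x <- x*(2 - m*x),
    # carried out on integers scaled by 2**p.
    x = 1 << (p - m.bit_length())        # initial value ~ 2**(-floor(log2 m)-1), scaled by 2**p
    for _ in range(p.bit_length() + 2):  # quadratic convergence: few iterations suffice
        x = (x * ((2 << p) - m * x)) >> p
    return x

def divide(a, m):
    # Integer quotient a // m via multiplication with the scaled reciprocal.
    p = 2 * max(a.bit_length(), m.bit_length()) + 4
    q = (a * reciprocal(m, p)) >> p
    while (q + 1) * m <= a:              # correct a possible off-by-one from truncation
        q += 1
    while q * m > a:
        q -= 1
    return q

For example, divide(10**30, 12345) returns the exact quotient 10**30 // 12345; the two while loops guarantee the result even if the fixed-point reciprocal is slightly off.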

3.9 Discrete Fourier transform

The goal of the discrete Fourier transform (named after Jean Baptiste Joseph Fourier, 1768–1830) is to multiply polynomials very efficiently. Let R be a commutative ring and b ≥ 1 a natural number with an inverse in R. That is, the multiplicative inverse b⁻¹ ∈ R exists. This is always the case if R is a field of characteristic zero, like ℂ. It also holds in fields of characteristic p such that gcd(b, p) = 1. And it is also true if b = 2^r is a power of two and R = ℤ/nℤ for odd n. Note that ℤ/nℤ may have various zero divisors.

The b-fold direct product R^b of R consists of all vectors (u₀, . . . , u_{b−1}) with u_i ∈ R. For R^b, we can consider two different multiplications to be defined in a natural way. First, we can perform a componentwise multiplication:

(u₀, . . . , u_{b−1}) · (v₀, . . . , v_{b−1}) = (u₀v₀, . . . , u_{b−1}v_{b−1})

Or we interpret a vector (u₀, . . . , u_{b−1}) as a polynomial Σ_i u_i X^i and multiply polynomials in the quotient ring R[X]/(X^b − 1). To distinguish these two kinds of multiplication, we denote the product of the polynomials f and g as f ∗ g (it is called the convolution product). By convention, u_i = 0 for all i < 0 and all i ≥ b. This eases notation because we need no explicit summation limits. In this quotient ring, the identity X^b = 1 holds, and for f = Σ_i u_i X^i and g = Σ_i v_i X^i we obtain

f ∗ g = Σ_{0≤k<b} ( Σ_j u_j v_{k−j} + Σ_j u_j v_{b+k−j} ) X^k

Let now ω ∈ R be a bth root of unity, that is, an element satisfying ω^b = 1. Then, we can evaluate f(X) ∈ R[X]/(X^b − 1) at the powers ω^i, and

f(X) ↦ (f(ω⁰), f(ω¹), . . . , f(ω^{b−1}))

defines a ring homomorphism from R[X]/(X^b − 1) into R^b. A primitive bth root of unity is an element ω ∈ R \ {1} satisfying the following conditions:

(a) ω^b = 1,

(b) Σ_{0≤i<b} ω^{ik} = 0 for all 1 ≤ k < b.

If ω is a primitive bth root of unity, then ω⁻¹ is as well, since the equation in condition (b) can be multiplied by ω^{k(b−1)}. If c ≥ 1 divides b, then ω^c is a primitive (b/c)th root of unity in R because, for all 1 ≤ k < b/c, we have

Σ_{0≤i<b/c} (ω^c)^{ik} = Σ_{0≤i<b/c} ω^{cik} = c⁻¹ · Σ_{0≤i<b} ω^{ik′} = 0

with k′ = kc < b; the inverse c⁻¹ exists because c divides b and b is invertible in R. Note that, in the third term of the equation above, each power is counted exactly c times. From Corollary 1.57, the complex number e^{2πi/b} is a primitive bth root of unity in ℂ. The discrete Fourier transform is possible in each ring R in which b has a multiplicative inverse b⁻¹ and where a primitive bth root of unity ω exists. For such rings R, the polynomial ring R[X]/(X^b − 1) with the identity X^b = 1 and the direct product R^b are isomorphic. The isomorphism can be explained by matrix multiplications. The {0, . . . , b − 1} × {0, . . . , b − 1} matrices F = (ω^{ij})_{i,j} and F̄ = (ω^{−ij})_{i,j} satisfy

F F̄ = ( Σ_{0≤k<b} ω^{(i−j)k} )_{i,j}

Here, (a_{ij})_{i,j} denotes the matrix with entry a_{ij} at the (i, j)-position. For i ≠ j we have Σ_{0≤k<b} ω^{(i−j)k} = 0, and for i = j this sum equals b. Thus, F F̄ is a diagonal matrix with value b on the diagonal. In particular, the matrices F and F̄ are invertible, and F F̄ b⁻¹ is the unit matrix. Now, F⁻¹ = b⁻¹ F̄, and we obtain the following ring isomorphism F : R[X]/(X^b − 1) → R^b, mapping each element f(X) = Σ_i a_i X^i to

F(f) = (f(1), f(ω), f(ω²), . . . , f(ω^{b−1}))

In particular, multiplication with the matrix F can be interpreted as evaluating a polynomial f at the points 1, ω, ω², . . . , ω^{b−1}. The mapping F : R[X]/(X^b − 1) → R^b is called discrete Fourier transform. For the inverse mapping, the matrix F is replaced by F̄, and in the end the result is multiplied by the scalar b⁻¹.
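The identity F F̄ = b·Id is easy to check numerically. The following Python lines do so over the ring ℤ/97ℤ with b = 4 and ω = 22, an arbitrarily chosen example of a primitive 4th root of unity (example values, not taken from the text above):

q, b, w = 97, 4, 22                       # 22 is a primitive 4th root of unity modulo 97
F    = [[pow(w,  i * j, q) for j in range(b)] for i in range(b)]
Fbar = [[pow(w, -i * j, q) for j in range(b)] for i in range(b)]   # negative exponent = modular inverse (Python 3.8+)
prod = [[sum(F[i][k] * Fbar[k][j] for k in range(b)) % q for j in range(b)]
        for i in range(b)]
assert prod == [[b if i == j else 0 for j in range(b)] for i in range(b)]   # F * Fbar = b * Id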

Fig. 3.1. Computing f ∗ g using the discrete Fourier transform.

To compute the coefficients z_i in f(X) ∗ g(X) = Σ_i z_i X^i according to Figure 3.1, we use the following strategy:

(1) Compute F(f(X)) and F(g(X)), that is, f(ω^i) and g(ω^i) for 0 ≤ i < b.

(2) Evaluate the products h_i = f(ω^i)·g(ω^i) in R for 0 ≤ i < b.

(3) Compute F⁻¹(h₀, . . . , h_{b−1}) = b⁻¹·F̄·(h₀, . . . , h_{b−1}).

Since F is a ring isomorphism, we have

F⁻¹(h₀, . . . , h_{b−1}) = F⁻¹(F(f) · F(g)) = f(X) ∗ g(X)
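The three steps translate directly into code. The sketch below works over ℤ/qℤ and evaluates F and F⁻¹ naively (quadratically), since the point here is the strategy rather than speed; q = 97, b = 4, and ω = 22 are again just example parameters:

def dft(coeffs, w, q):
    # Evaluate the polynomial with the given coefficients at w^0, w^1, ..., w^(b-1) in Z/qZ.
    b = len(coeffs)
    return [sum(coeffs[j] * pow(w, i * j, q) for j in range(b)) % q for i in range(b)]

def convolve(f, g, w, q):
    # f, g: coefficient vectors of length b; returns f * g in R[X]/(X^b - 1), R = Z/qZ.
    b = len(f)
    Ff, Fg = dft(f, w, q), dft(g, w, q)                        # step (1)
    h = [x * y % q for x, y in zip(Ff, Fg)]                    # step (2)
    binv = pow(b, -1, q)                                       # b^(-1) in R (Python 3.8+)
    return [c * binv % q for c in dft(h, pow(w, -1, q), q)]    # step (3): inverse transform

print(convolve([1, 2, 0, 0], [3, 4, 0, 0], 22, 97))   # -> [3, 10, 8, 0], since (1+2X)(3+4X) = 3+10X+8X^2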

Next, let us consider the special case that b is a power of 2, that is, b = 2^r for r ≥ 1. Then, the computation of F(f(X)) can efficiently be performed using a divide-and-conquer strategy. This leads to the fast Fourier transform (abbreviated FFT) as follows. We write polynomials f(X) of degree less than b in the form

f(X) = f₀(X²) + X·f₁(X²)

The polynomials f_j have degree less than b/2, and ω² is a primitive (b/2)th root of unity. If f = (a₀, . . . , a_{b−1}), then f₀ = (a₀, a₂, . . . , a_{b−2}) and f₁ = (a₁, a₃, . . . , a_{b−1}).

This yields

f(ω^i) = f₀(ω^{2i}) + ω^i·f₁(ω^{2i})

Let F′ : R[X]/(X^{b/2} − 1) → R^{b/2} be the Fourier transform of dimension b/2 with the root of unity ω². If we have computed F′(f₀) = (u₀, . . . , u_{b/2−1}) and F′(f₁) = (v₀, . . . , v_{b/2−1}), then we can determine F(f) as follows. With the vectors

u = (u₀, . . . , u_{b/2−1}, u₀, . . . , u_{b/2−1})   and   v = (v₀, . . . , v_{b/2−1}, v₀, . . . , v_{b/2−1})

of length b, we determine by componentwise computation

F(f) = u + (ω⁰, ω¹, . . . , ω^{b−1}) · v

The transforms F′(f₀) and F′(f₁) can be recursively computed with the same procedure. The additions in R and the multiplications with ω^i and b⁻¹ are called elementary arithmetic operations. Let t(b) be the number of elementary arithmetic operations necessary in this scheme for computing F(f). Then, t(b) satisfies the recurrence t(b) ≤ 2·t(b/2) + O(b), since to determine the Fourier transform two recursive calls for f₀ and f₁ of half the size are needed. The results of these two recursive calls are combined to obtain the transform of f using a linear number of operations. The master theorem, Theorem 3.2, then yields t(b) ∈ O(b log b).
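A direct transcription of this recursion might look as follows; it is a sketch over ℤ/qℤ and assumes that the length of the coefficient vector is a power of two and that w is a primitive root of unity of that order:

def fft(f, w, q):
    # Fast Fourier transform of the coefficient vector f over Z/qZ,
    # where len(f) = b is a power of two and w is a primitive bth root of unity.
    b = len(f)
    if b == 1:
        return f[:]
    f0, f1 = f[0::2], f[1::2]                 # even and odd coefficients
    u = fft(f0, w * w % q, q)                 # F'(f0) with the root of unity w^2
    v = fft(f1, w * w % q, q)                 # F'(f1)
    res = [0] * b
    for i in range(b):
        # f(w^i) = f0(w^(2i)) + w^i * f1(w^(2i)); the indices repeat with period b/2
        res[i] = (u[i % (b // 2)] + pow(w, i, q) * v[i % (b // 2)]) % q
    return res

Combined with componentwise multiplication and the inverse transform (the same routine applied with ω⁻¹, followed by scaling with b⁻¹), this yields the fast polynomial multiplication described next.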

If we want to compute the product f(X)·g(X) of two polynomials of degree less than d in R[X], then we can choose b as the smallest power of 2 with b ≥ 2d. If a primitive bth root of unity ω is known, then using the FFT according to the scheme in Figure 3.1, the product of f(X) and g(X) can be computed with only O(d log d) elementary arithmetic operations. The naive approach, in contrast, requires Θ(d²) elementary arithmetic operations.
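Using the fft routine above, the whole multiplication becomes a few lines: pad both coefficient vectors with zeros up to the next power of two b ≥ 2d, transform, multiply componentwise, and transform back. This is a sketch, not an optimized implementation, and the caller must supply a primitive bth root of unity matching the padded length b (for instance, over ℤ/97ℤ one may take ω = 22 for b = 4 or ω = 64 for b = 8):

def poly_multiply(f, g, w, q):
    # Product of the polynomials f and g (coefficient lists) in Z/qZ[X],
    # assuming w is a primitive bth root of unity for the padded length b.
    d = max(len(f), len(g))
    b = 1
    while b < 2 * d:
        b *= 2
    f = f + [0] * (b - len(f))
    g = g + [0] * (b - len(g))
    h = [x * y % q for x, y in zip(fft(f, w, q), fft(g, w, q))]
    binv = pow(b, -1, q)
    return [c * binv % q for c in fft(h, pow(w, -1, q), q)]

# example: poly_multiply([1, 1, 1], [2, 3], 64, 97) -> [2, 5, 5, 3, 0, 0, 0, 0],
# the coefficients of (1 + X + X^2)(2 + 3X) = 2 + 5X + 5X^2 + 3X^3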

3.10 Primitive roots of unity

Let b ≥ 1 and let 𝔽 be a field of characteristic zero or of characteristic p with gcd(b, p) = 1. Then, b is invertible in 𝔽. Moreover, the formal derivative of the polynomial X^b − 1 is bX^{b−1}. Therefore, the polynomial X^b − 1 has no multiple roots. In a field extension of 𝔽, the polynomial X^b − 1 splits into linear factors, and the bth roots of unity in the field extension are pairwise distinct. The group of bth roots of unity thus has b elements. It is cyclic and generated by an element ω. Then, ω^b = 1 and ω^k ≠ 1 for all 1 ≤ k < b. This implies (ω^k − 1)·Σ_{0≤i<b} ω^{ik} = ω^{bk} − 1 = 0 for all 1 ≤ k < b. Since fields have no zero divisors, Σ_{0≤i<b} ω^{ik} = 0 follows; and the generator ω is a primitive bth root of unity. If 𝔽 = ℂ, for example, we may choose

ω = e^{2πi/b}

In general, however, a ring has zero divisors, and therefore the properties ω^b = 1 and ω^k ≠ 1 for all 1 ≤ k < b are not sufficient to ensure Σ_{0≤i<b} ω^{ik} = 0. The remaining part of this section is dedicated to the following theorem, which is crucial for Schönhage and Strassen's algorithm for the fast multiplication of large numbers.

Theorem 3.17. Let b = 2^r, let m be a multiple of b, and let n = 2^m + 1. We define ψ = 2^{m/b} and ω = ψ². Then the following properties hold in ℤ/nℤ:

(a) b⁻¹ = −2^{m−r}

(b) ψ^b = −1

(c) ω is a primitive bth root of unity.

Proof: In ℤ/nℤ we have 2^m = −1. The claims b⁻¹ = −2^{m−r} and ψ^b = −1, as well as ω^b = 1, are therefore trivial. We have to show Σ_{0≤i<b} ω^{ik} = 0 for all 1 ≤ k < b. Since b = 2^r is a power of 2, an induction on r yields

Σ_{0≤i<b} ω^{ik} = Π_{0≤j<r} (1 + ω^{2^j k})

Now let k = 2^t·u with an odd number u. Using 0 ≤ t < r, we see that the above product contains the factor 1 + ω^{2^{r−t−1}·k}. Now, ω^{2^{r−t−1}·k} = ψ^{2^{r−t}·k} = 2^{(m/b)·2^{r−t}·2^t·u} = 2^{mu} = (2^m)^u. Since u is odd, we obtain (2^m)^u = (−1)^u = −1, so this factor equals zero and therefore Σ_{0≤i<b} ω^{ik} = 0.
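The theorem is easy to check numerically for small parameters; the following Python lines verify properties (a)-(c) in ℤ/(2^m + 1)ℤ for one arbitrarily chosen pair b, m (any b = 2^r and any multiple m of b should do):

b, r = 8, 3              # b = 2^r, arbitrary example
m = 2 * b                # any multiple of b works
n = 2**m + 1
psi = pow(2, m // b, n)
w = psi * psi % n
assert pow(b, -1, n) == (-pow(2, m - r, n)) % n     # (a)  b^(-1) = -2^(m-r)
assert pow(psi, b, n) == n - 1                      # (b)  psi^b = -1
for k in range(1, b):                               # (c)  the sums over w^(ik) vanish
    assert sum(pow(w, i * k, n) for i in range(b)) % n == 0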

3.11 Schönhage–Strassen integer multiplication

Multiplying two n-bit numbers requires roughly n² bit operations when using the classical method. In 1960, Karatsuba noticed that it can be done much faster with a surprisingly simple divide-and-conquer approach; he showed that asymptotically fewer than n^{1.6} operations are sufficient; see Section 3.8. A major breakthrough came when Arnold Schönhage (born 1934) and Volker Strassen were able to present a method with a nearly linear number of arithmetic operations in 1971 [91]. More specifically, they showed that the multiplication of two n-bit numbers can be done in time O(n log n log log n) on a multitape Turing machine. It took more than 35 years until Martin Fürer (born 1947) could further improve this time bound to O(n log n · 2^{O(log* n)}); see [43]. Here, log* n denotes the number of applications of the logarithm needed to have a result less than 1. The function 2^{O(log* n)} asymptotically grows much slower than log log n; however, on all realistic inputs, log log n is smaller.

In the remainder of this section, we prove the result of Schönhage and Strassen.

Theorem 3.18. The multiplication of two natural numbers of bit size n can be done with O(n log n log log n) bit operations.

The input consists of two natural numbers u and v. We may assume that the binary representation of the product requires at most n bits and that n = 2^s is a power of 2. It is sufficient to compute uv modulo 2^n + 1. In the corresponding quotient ring, 2^n = −1 holds. We define b = 2^{⌈s/2⌉} and ℓ = 2^{⌊s/2⌋}; thus bℓ = n and b ≈ ℓ ≈ √n. As a rule of thumb, we note

2^n + 1 is large and odd.

n = 2^s is the input size and a power of 2.

s is a small number.

We decompose the input into b blocks of length ℓ and write u = Σ_i u_i·2^{iℓ} and v = Σ_i v_i·2^{iℓ}, where 0 ≤ u_i, v_i < 2^ℓ for all i. By convention, u_i = v_i = 0 holds for all i < 0 and for all i ≥ b. Let y_i = Σ_j u_j v_{i−j}; then y_i ≠ 0 can only hold for 0 ≤ i < 2b − 1. For 0 ≤ i < b at most i + 1 summands u_j v_{i−j} in y_i are nonzero, which yields y_i < (i + 1)·2^{2ℓ}; and in y_{b+i} at most b − i − 1 summands do not vanish, resulting in y_{b+i} < (b − i − 1)·2^{2ℓ}. With 2^{bℓ} = −1, we obtain

uv ≡ Σ_{0≤i<b} w_i·2^{iℓ},   where w_i = y_i − y_{b+i},

in ℤ/(2^n + 1)ℤ. Since −(b − i − 1)·2^{2ℓ} < w_i < (i + 1)·2^{2ℓ}, it is sufficient to determine the values w_i modulo b·(2^{2ℓ} + 1). Since b is a power of 2 and 2^{2ℓ} + 1 is odd, according to the Chinese remainder theorem we can first compute the w_i modulo b, then modulo 2^{2ℓ} + 1, and finally combine these two intermediate results for the value modulo b·(2^{2ℓ} + 1). Letting w_i′ ≡ w_i mod b and w_i″ ≡ w_i mod 2^{2ℓ} + 1, we obtain w_i from w_i′ and w_i″ by the computation

w_i ≡ w_i″ + (2^{2ℓ} + 1)·((w_i′ − w_i″) mod b)   mod b·(2^{2ℓ} + 1)

since 2^{2ℓ} ≡ 0 mod b. The resulting values w_i with −(b − i − 1)·2^{2ℓ} < w_i < (i + 1)·2^{2ℓ} each have O(ℓ) bits. From w₀, . . . , w_{b−1} we compute

uv ≡ Σ_{0≤i<b} w_i·2^{iℓ}   mod 2^n + 1

This requires b additions or subtractions, respectively, each on operands of length O(ℓ). These can be performed by the classical methods in linear time, that is, in O(n) bit operations in total. It remains to show how to find the values w_i′ and w_i″.
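The outer structure of the algorithm can be illustrated without the recursive machinery. The following Python sketch splits u and v into b blocks of ℓ bits, forms the differences w_i = y_i − y_{b+i}, and recombines them modulo 2^n + 1; the y_i are computed naively here, which is precisely the part that the FFT and the recursion replace, so this sketch is quadratic rather than O(n log n log log n):

def multiply_mod_fermat(u, v, s):
    # Computes u*v modulo 2^n + 1 for n = 2^s via the block decomposition.
    n = 2**s
    b = 2**((s + 1) // 2)            # b = 2^(ceil(s/2))
    ell = n // b                     # ell = 2^(floor(s/2))
    mask = (1 << ell) - 1
    ub = [(u >> (i * ell)) & mask for i in range(b)]
    vb = [(v >> (i * ell)) & mask for i in range(b)]
    y = [sum(ub[j] * vb[i - j] for j in range(max(0, i - b + 1), min(i, b - 1) + 1))
         for i in range(2 * b)]      # naive convolution; the FFT computes this part
    w = [y[i] - y[b + i] for i in range(b)]
    return sum(w[i] << (i * ell) for i in range(b)) % (2**n + 1)

# e.g. multiply_mod_fermat(u, v, 8) == (u * v) % (2**256 + 1) for u, v < 2**256;
# the result equals u*v exactly whenever u*v <= 2**256.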

We first describe how the values w_i′ are computed. First, we let u_i′ ≡ u_i mod b and v_i′ ≡ v_i mod b for 0 ≤ i < b. This is easy because b is a power of 2 and the u_i and v_i are given in binary. Letting y_i′ = Σ_j u_j′ v_{i−j}′, we have y_i ≡ y_i′ mod b and hence w_i′ ≡ y_i′ − y_{b+i}′ mod b. Therefore, it is sufficient to determine the y_i′ for 0 ≤ i < 2b. We know that y_i′ < b³, and therefore 1 + 3·log b < 4·log b bits for each y_i′ are sufficient. Let u′ = Σ_i u_i′·2^{4i·log b} and v′ = Σ_i v_i′·2^{4i·log b}. This means that u′ can be obtained from the u_i′ by inserting sufficiently long blocks of zeros between them; v′ can be constructed in the same way. Then, we have

u′·v′ = Σ_i y_i′·2^{4i·log b}

The binary lengths of u′ and v′ are bounded by 4b·log b, so with the method of Karatsuba, we can compute the product u′·v′ in time O((b log b)^{1.6}) ⊆ O(b²) = O(n). From the result, one can directly extract the values y_i′ since the corresponding blocks do not overlap.
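The zero-padding trick amounts to packing and unpacking blocks of a fixed width; in the situation above the width is 4·log₂ b, so the blocks of the product u′·v′ do not overlap and the y_i′ can be read off directly. A minimal Python sketch (with hypothetical helper names):

def pack(blocks, width):
    # Concatenate the numbers in `blocks` into one integer, `width` bits per block.
    x = 0
    for i, value in enumerate(blocks):
        x |= value << (i * width)
    return x

def unpack(x, width, count):
    mask = (1 << width) - 1
    return [(x >> (i * width)) & mask for i in range(count)]

# example: blocks with entries < 4, so 8-bit slots are wide enough
u_blocks = [3, 1, 2, 0]
v_blocks = [2, 2, 1, 0]
y = unpack(pack(u_blocks, 8) * pack(v_blocks, 8), 8, 2 * len(u_blocks))
# y is now the convolution of u_blocks and v_blocks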

Next, we want to determine the w_i″. We let N = 2^{2ℓ} + 1 and R = ℤ/Nℤ. In particular, 2^{2ℓ} = −1 in R. Note that N is approximately 2^{2ℓ} with ℓ ≈ √n. The number N is large, but it is much smaller than 2^n. The subsequent calculations are performed in the polynomial ring R[X]/(X^b − 1). The elements ψ = 2^{2ℓ/b} and ω = ψ² have the properties given by Theorem 3.17. Thus, this polynomial ring is isomorphic to the direct product R^b, and multiplication of polynomials can be reduced to multiplications in R as shown in Section 3.9. We consider the following two polynomials:

f(X) = Σ_{0≤i<b} ψ^i u_i X^i   and   g(X) = Σ_{0≤i<b} ψ^i v_i X^i

Since ψ^b = −1 and X^b = 1, we have the following:

f(X) ∗ g(X) = Σ_{0≤i<b} ψ^i·(y_i − y_{b+i})·X^i

By the factor ψ^i in the definition of f and g it is ensured that y_{b+i} obtains the desired negative sign. Let h(X) = f(X) ∗ g(X) ∈ R[X]/(X^b − 1) and write h(X) = Σ_{0≤i<b} z_i X^i. Then, we obtain the w_i″ as follows:

w_i″ ≡ ψ^{−i}·z_i   mod 2^{2ℓ} + 1

We compute the z_i using the fast discrete Fourier transform, in which we recursively use the same algorithm for the multiplications in R. It remains to show how to compute the numbers z_i·ψ^{−i} mod 2^{2ℓ} + 1 efficiently. The time bound O(ℓ) is sufficient. Elements of R are represented by numbers z in {0, . . . , 2^{2ℓ}}. Numbers z ∈ R can efficiently be multiplied with −1 by computing 2^{2ℓ} + 1 − z with the classical method. Because of ψ^{−i} = −ψ^{b−i} in R, it suffices to compute elements z·ψ^j mod N for 0 < j < b. Now, ψ^j = 2^k mod N for k = 2ℓj/b ∈ {1, . . . , 2ℓ − 1}. The value z·2^k can be computed by left-shifting z by k bits. We can split the result of the shift operation into two 2ℓ-bit blocks z′, z″ ∈ {0, . . . , 2^{2ℓ} − 1} such that z·2^k = z′ + z″·2^{2ℓ}. Subtracting z″ from z′ modulo N now yields the desired result since z′ − z″ ≡ z′ + z″·2^{2ℓ} = z·2^k ≡ z·ψ^j mod N.
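In code, the shift-and-subtract trick for multiplying an element z by a power of two modulo N = 2^{2ℓ} + 1 reads as follows (a sketch; ell stands for ℓ):

def mul_power_of_two_mod_fermat(z, k, ell):
    # Computes (z * 2**k) mod (2**(2*ell) + 1) using one shift and one subtraction.
    N = (1 << (2 * ell)) + 1
    t = z << k                              # left shift by k bits
    lo = t & ((1 << (2 * ell)) - 1)         # z' : the lower 2*ell bits
    hi = t >> (2 * ell)                     # z'': the upper bits
    return (lo - hi) % N                    # valid since 2**(2*ell) = -1 mod N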

Overview of the algorithm

In the following, we give a sketch of how the algorithm works. Here, x ∈ R means that the element x from the ring R is given or was computed. Arrows stand for a causal connection. In order to give a better insight into the proportions, b and ℓ are replaced by √n.

The individual steps of the algorithm can be summarized as follows:

(1) The numbers u and v are each decomposed into b blocks of ℓ bits. Each block is reduced modulo b; then u′ and v′ are compiled from the blocks with sufficiently many zeros inserted.

(2) The numbers u′ and v′ are multiplied using the Karatsuba multiplication algorithm.

(3) Due to the shielding by sufficiently many zeros, the blocks y_i′ can be reconstructed from the product u′·v′.

(4) The blocks u_i and v_i are viewed as numbers modulo 2^{2ℓ} + 1 and, multiplied by powers of ψ, interpreted as coefficients of the polynomials f and g.

(5) The Fourier transforms F(f) and F(g) of f and g are computed.

(6) In the computation of F(f)·F(g), for each of the b multiplications in the ring ℤ/(2^{2ℓ} + 1)ℤ the algorithm is called recursively.

(7) Using the inverse Fourier transform, the product of the polynomials f and g is determined.

(8) The values ψ^i·w_i″ appear as coefficients of the polynomial f(X) ∗ g(X).

(9) With the Chinese remainder theorem, the numbers w_i′ ≡ w_i mod b and w_i″ ≡ w_i mod 2^{2ℓ} + 1 can be combined to w_i mod b·(2^{2ℓ} + 1).

(10) The product uv mod 2^n + 1 is computed from the numbers w_i.

Runtime analysis

Computing F(f(X)), F(g(X)), and F⁻¹(h₀, . . . , h_{b−1}) takes O(b log b) elementary arithmetic operations, each of which can be performed with O(ℓ) bit operations. This results in O(ℓ·b·log b) = O(n log n) bit operations. Furthermore, b products h_i = f(ω^i)·g(ω^i) modulo 2^{2ℓ} + 1 have to be computed recursively. Let M(n) be the number of bit operations needed to multiply two n-bit numbers using the Schönhage–Strassen algorithm. Then, we obtain the following recurrence:

M(n) ≤ b·M(2ℓ) + O(n log n)

Thus, M(n)/n ≤ 2·M(2ℓ)/(2ℓ) + O(log n). For n = 2^s and t(s) = M(2^s)/2^s, we obtain

t(s) ≤ 2·t(⌊s/2⌋ + 1) + O(s)

This yields t(s) ∈ O(s log s) by the master theorem (see Theorem 3.2 and Example 3.3).

Thus, with s = log n, we finally have M(n) = n·t(s) ∈ O(n log n log log n).

Exercises

3.1. (Master theorem II) Let and Show that f(n) (n).

3.2. Show that, on input two binary numbers a, b, one can decide in polynomial time whether there exists c with c² = a/b.

3.3. Give an extension of the binary gcd algorithm such that, on input k, ℓ, it computes a, b, t with ak + bℓ = t = gcd(k, ℓ). The algorithm should not use divisions or multiplications by numbers other than 2.

3.4. Show that 1729 is an Eulerian pseudoprime to base 2, but not a strong pseudoprime to base 2. In addition, use the Miller–Rabin scheme to find a nontrivial divisor of 1729.

3.5. (Lucas test; named after Édouard Lucas, 1842–1891) Let n ≥ 1 and a ∈ ℤ. Show that n is prime if the following two conditions are satisfied:

(i) a^{n−1} ≡ 1 mod n and

(ii) a^{(n−1)/q} ≢ 1 mod n for all prime divisors q of n − 1.

3.6. (Pépin test; named after Théophile Pépin, 1826–1904) Let n ≥ 1. Show that the Fermat number f_n = 2^{2^n} + 1 is prime if and only if 3^{(f_n−1)/2} ≡ −1 mod f_n.

3.7. Let p > 2 be a prime number and consider the Mersenne number n = 2^p − 1 (named after Marin Mersenne, 1588–1648).

(a) Show that if n is prime, then f(X) = X² − 4X + 1 ∈ 𝔽_n[X] is irreducible.

(b) Suppose that n is prime and let K = 𝔽_n[X]/f. Show that X^n = 4 − X and (X − 1)^{n+1} = −2 in K.

(c) Suppose that n is prime and let K = 𝔽_n[X]/f. Show that (X − 1)^{n+1} = 2^{(n+1)/2}·X^{(n+1)/2} in K.

(d) Show that n is prime if and only if X^{(n+1)/2} ≡ −1 mod X² − 4X + 1 in (ℤ/nℤ)[X].

3.8. (Lucas–Lehmer test; named after É. Lucas and Derrick Lehmer, 1905–1991) Let p > 2 be a prime number and n = 2^p − 1 the corresponding Mersenne number. The sequence (s_j)_{j∈ℕ} is defined by s₀ = 4 and s_{j+1} ≡ s_j² − 2 mod n. Show that n is prime if and only if s_{p−2} = 0.

Hint: Show that s_j ≡ X^{2^j} + X^{−2^j} mod X² − 4X + 1 in (ℤ/nℤ)[X].

3.9. A linear Diophantine equation has the form a₁X₁ + ··· + a_nX_n = a_{n+1} with a_i ∈ ℤ. Explain how to check whether a solution (x₁, . . . , x_n) ∈ ℤⁿ exists and how to compute it.

3.10. Compute the square root of 2 in the field 𝔽₄₁ = ℤ/41ℤ. Use Tonelli's algorithm and choose g = 3 in the first step.

3.11. Let p be prime with p ≡ 5 mod 8. We consider the field 𝔽_p and a square a ∈ 𝔽_p. Show that if a^{(p−1)/4} = 1, then a^{(p+3)/8} is a square root of a; and if a^{(p−1)/4} = −1, then 2⁻¹·(4a)^{(p+3)/8} is a square root of a.

3.12. Let p be a prime number with p ≡ −1 mod 4 and let f(X) = X² + 1.

(a) Show that the polynomial f(X) is irreducible in 𝔽_p[X].

(b) Show how to efficiently find an element a with 1 ≤ a < p − 1 such that … and …

(c) Show that in 𝔽_{p²} = 𝔽_p[X]/f it is possible to extract roots in deterministic polynomial time.

3.13. Compute the square root of 2 in the field 𝔽₂₃ = ℤ/23ℤ. Use Cipolla's algorithm and choose t = 0 in the first step.

3.14. Use Pollard's p − 1 algorithm to find a nontrivial divisor of n = 253. In the first step, let B = {2, 3} and a = 2; in the second step, B = {2, 3, 5} and a = 2.

3.15. Find a nontrivial divisor of n = 689 with Pollard's ρ algorithm. Use F(x) = x² + 1 mod n and the initial value x₀ = 12.

3.16. Compute the discrete logarithm of 3 to base 2 in the multiplicative group (ℤ/19ℤ)* using Shanks' baby-step giant-step algorithm.

3.17. Let G be a finite group and g ∈ G. Show how to determine the order of g using Shanks' baby-step giant-step algorithm.

3.18. Let G = (ℤ/23ℤ)* and g = 3.

(a) Determine the order of g in G.

(b) Use Pollard's ρ algorithm to compute the discrete logarithm of 18 to base 3 in G. Decompose G with respect to the residue classes modulo 3 and use r₁ = s₁ = 1 as initial values.

3.19. Compute the discrete logarithm of 5 to base 2 in the multiplicative group (ℤ/19ℤ)*. Reduce the problem with the Pohlig–Hellman method to discrete logarithms in groups of smaller order.

3.20. Let p be a Fermat prime. Show that discrete logarithms in (ℤ/pℤ)* can be computed in deterministic polynomial time.

3.21. Let q = 97, b = 4 and ω = 22.

(a) Show that ω is a primitive bth root of unity in 𝔽_q.

(b) Specify the Fourier matrices F and F̄ with respect to ω.

(c) Compute the product f ∗ g of the polynomials f(X) = 1 + X + X² and g(X) = 2 + 3X using the FFT.

3.22. Let p₁, p₂, p₃ ≥ 2^{56} + 1 be distinct primes of at most 64 bits, let ω_i be a primitive 256th root of unity in the field K_i = ℤ/p_iℤ, and let F_i be the Fourier transform in K_i with respect to ω_i. On input two numbers u = Σ_{j≥0} u_j·2^{64j} and v = Σ_{j≥0} v_j·2^{64j} with 0 ≤ u_j, v_j < 2^{64}, we determine U_i(X) = Σ_{j≥0} (u_j mod p_i)·X^j and V_i(X) = Σ_{j≥0} (v_j mod p_i)·X^j. Then, in K_i we compute the coefficients w_{i,j} of

W_i(X) = Σ_{j≥0} w_{i,j}·X^j = U_i(X) ∗ V_i(X)

using the FFT. Finally, we determine the polynomial W(X) = Σ_{j≥0} w_j·X^j with w_j ≡ w_{i,j} mod p_i by means of the Chinese remainder theorem. At last, we return the number w = W(2^{64}). What is the running time of this algorithm? What are the maximum values for u and v to ensure that w = uv holds?

Summary

Notions

addition chain

Carmichael number

square-free

pseudoprime

Eulerian pseudoprime

strong pseudoprime

square, quadratic residue

square root

root extraction in groups

factorization

pseudo-random sequence

B-smooth

discrete logarithm

primitive root of unity

elementary arithmetic operation

Methods and results

master theorem for estimation of recurrence equations

birthday paradox: For random sequences of m events from Ω with m ≥ 1.18·√|Ω|, the probability to have two identical elements in the sequence is greater than 1/2.

fast exponentiation for computing a^n with O(log n) multiplications

binary gcd

Fermat test

properties of Carmichael numbers

The error probability of the Solovay–Strassen primality test is at most 1/2.

description of the Miller–Rabin primality test

The error probability of the Miller–Rabin primality test is at most 1/4.

Miller–Rabin scheme yields an algorithm for the factorization of n if a multiple of φ(n) is known or if φ(n) has only small prime divisors

Miller–Rabin scheme as evidence that the secret RSA key is secure

n Eulerian pseudoprime to base a ⟹ n pseudoprime to base a

n strong pseudoprime to base a ⟹ n Eulerian pseudoprime to base a

root extraction in groups of odd order

root extraction in 𝔽_q with q ≡ −1 mod 4

Tonelli's algorithm for extracting roots in 𝔽_q with odd q

Cipolla's algorithm for extracting roots in 𝔽_q with odd q

Pollard's p − 1 algorithm

Pollard's ρ algorithm for integer factorization

quadratic sieve

Shanks' baby-step giant-step algorithm for discrete logarithms

Pollard's ρ algorithm for discrete logarithms

Pohlig–Hellman algorithm for discrete logarithms: step 1 reduction to prime powers, step 2 reduction to prime divisors

index calculus algorithm

Karatsuba integer multiplication in O(n^{1.58496})

division using Newton method

discrete Fourier transform: ring isomorphism R[X]/(X^b − 1) ≅ R^b

fast Fourier transform: computing the discrete Fourier transform with O(b log b) elementary arithmetic operations

primitive roots of unity in ℤ/(2^m + 1)ℤ

Schönhage–Strassen integer multiplication in O(n log n log log n)
