BCH codes are a class of cyclic codes. They were discovered around 1959 by R. C. Bose and D. K. Ray-Chaudhuri and independently by A. Hocquenghem. One reason they are important is that there exist good decoding algorithms that correct multiple errors (see, for example, [Gallager] or [Wicker]). BCH codes are used in satellites. The special BCH codes called Reed-Solomon codes (see Section 24.9) have numerous applications.
Before describing BCH codes, we need a fact about finite fields. Let FF be a finite field with qq elements. From Section 3.11, we know that q=pmq=pm is a power of a prime number pp. Let nn be a positive integer not divisible by pp. Then it can be proved that there exists a finite field F′F′ containing FF such that F′F′ contains a primitive nnth root of unity αα. This means that αn=1αn=1, but αk≠1αk≠1 for 1≤k<n1≤k<n.
For example, if F=Z2F=Z2, the integers mod 2, and n=3n=3, we may take F′=GF(4)F′=GF(4). The element ωω in the description of GF(4)GF(4) in Section 3.11 is a primitive third root of unity. More generally, a primitive nnth root of unity exists in a finite field F′F′ with q′q′ elements if and only if n|q′−1n|q′−1.
The reason we need the auxiliary field F′F′ is that several of the calculations we perform need to be carried out in this larger field. In the following, when we use an nnth root of unity αα, we’ll implicitly assume that we’re calculating in some field F′F′ that contains αα. The results of the calculations, however, will give results about codes over the smaller field FF.
The following result, often called the BCH bound, gives an estimate for the minimum weight of a cyclic code.
Theorem 24.8
Let CC be a cyclic [n,k,d][n,k,d] code over a finite field FF, where FF has q=pmq=pm elements. Assume p|n.p|n. Let g(X)g(X) be the generating polynomial for CC. Let αα be a primitive nnth root of unity and suppose that for some integers ℓℓ and δδ,
g(αℓ)=g(αℓ+1)=⋯=g(αℓ+δ)=0.
g(αℓ)=g(αℓ+1)=⋯=g(αℓ+δ)=0.
Then d≥δ+2d≥δ+2.
Proof. Suppose (c0,c1,…,cn−1)∈C(c0,c1,…,cn−1)∈C has weight ww with 1≤w<δ+21≤w<δ+2. We want to obtain a contradiction. Let m(X)=c0+c1X+⋯+cn−1Xn−1m(X)=c0+c1X+⋯+cn−1Xn−1. We know that m(X)m(X) is a multiple of g(X)g(X), so
m(αℓ)=m(αℓ+1)=⋯=m(αℓ+δ)=0.
m(αℓ)=m(αℓ+1)=⋯=m(αℓ+δ)=0.
Let ci1,ci2,…,ciwci1,ci2,…,ciw be the nonzero coefficients of m(X)m(X), so
m(X)=ci1Xi1+ci2Xi2+⋯+ciwXiw.
m(X)=ci1Xi1+ci2Xi2+⋯+ciwXiw.
The fact that m(αj)=0m(αj)=0 for l≤j≤ℓ+w−1l≤j≤ℓ+w−1 (note that w−1≤δw−1≤δ) can be rewritten as
We claim that the determinant of the matrix is nonzero. We need the following evaluation of the Vandermonde determinant. The proof can be found in most books on linear algebra.
(The product is over all pairs of integers (i,j)(i,j) with 1≤i<j≤n1≤i<j≤n.) In particular, if x1,…,xnx1,…,xn are pairwise distinct, the determinant is nonzero.
In our matrix, we can factor αℓi1αℓi1 from the first column, αℓi2αℓi2 from the second column, etc., to obtain
Since αi1,⋯,αiwαi1,⋯,αiw are pairwise distinct, the determinant is nonzero. Why are these numbers distinct? Suppose αij=αikαij=αik. We may assume ij≤ikij≤ik. We have 0≤ij≤ik<n0≤ij≤ik<n. Therefore, 0≤ik−ij<n0≤ik−ij<n. Note that αik−ij=1αik−ij=1. Since αα is a primitive nnth root of unity, αi≠1αi≠1 for 1≤i<n1≤i<n. Therefore, ik−ij=0ik−ij=0, so ij=ikij=ik. This means that the numbers αi1,⋯,αiwαi1,⋯,αiw are pairwise distinct, as claimed.
Since the determinant is nonzero, the matrix is nonsingular. This implies that (ci1,…,ciw)=0(ci1,…,ciw)=0, contradicting the fact that these were the nonzero cici’s. Therefore, all nonzero codewords have weight at least δ+2δ+2. This completes the proof of the theorem.
Example
Let F=Z2F=Z2= the integers mod 2, and let n=3n=3. Let g(X)=X2+X+1g(X)=X2+X+1. Then
C={(0,0,0),(1,1,1)},
C={(0,0,0),(1,1,1)},
which is a binary repetition code. Let ωω be a primitive third root of unity, as in the description of GF(4)GF(4) in Section 3.11. Then g(ω)=g(ω2)=0g(ω)=g(ω2)=0. In the theorem, we can therefore take ℓ=1ℓ=1 and δ=1δ=1. We find that the minimal weight of CC is at least 3. In this case, the bound is sharp, since the minimal weight of CC is exactly 3.
Example
Let FF be any finite field and let nn be any positive integer. Let g(X)=X−1g(X)=X−1. Then g(1)=0g(1)=0, so we may take ℓ=0ℓ=0 and δ=0δ=0. We conclude that the minimum weight of the code generated by g(X)g(X) is at least 2 (actually, the theorem assumes that p|n,p|n, but this assumption is not needed for this special case where ℓ=δ=0ℓ=δ=0). We have seen this code before. If (c0,…,cn−1)(c0,…,cn−1) is a vector, and m(X)=c0+⋯+cn−1Xn−1m(X)=c0+⋯+cn−1Xn−1 is the associated polynomial, then m(X)m(X) is a multiple of X−1X−1 exactly when m(1)=0m(1)=0. This means that c0+⋯+cn−1=0c0+⋯+cn−1=0. So a vector is a codeword if and only if the sum of its entries is 0. When F=Z2F=Z2, this is the parity check code, and for other finite fields it is a generalization of the parity check code. The fact that its minimal weight is 2 is easy to see directly: If a codeword has a nonzero entry, then it must contain another nonzero entry to cancel it and make the sum of the entries be 0. Therefore, each nonzero codeword has at least two nonzero entries, and hence has weight at least 2. The vector (1,−1,0,…)(1,−1,0,…) is a codeword and has weight 2, so the minimal weight is exactly 2.
Example
Let’s return to the example of a binary cyclic code of length 7 from Section 24.7. We have F=Z2F=Z2, and g(X)=1+X2+X3+X4g(X)=1+X2+X3+X4. We can factor g(X)=(X−1)(X3+X+1)g(X)=(X−1)(X3+X+1). Let αα be a root of X3+X+1X3+X+1. Then αα is a primitive seventh root of unity (see Exercise 18), and we are working in GF(8)GF(8). Since Z2⊂GF(8)Z2⊂GF(8), we have 2=1+1=02=1+1=0 and −1=1−1=1. Therefore, α3=α+1α3=α+1. Squaring yields α6=α2+2α+1=α2+1α6=α2+2α+1=α2+1. Therefore, (α2)3+(α2)+1=0(α2)3+(α2)+1=0. This means that g(α2)=0g(α2)=0, so
g(1)=g(α)=g(α2)=0.
g(1)=g(α)=g(α2)=0.
In the theorem, we can take ℓ=0ℓ=0 and δ=2δ=2. Therefore, the minimal weight in the code is at least 4 (in fact, it is exactly 4).
To define the BCH codes, we need some more notation. We are going to construct codes of length nn over a finite field FF. Factor Xn−1Xn−1 into irreducible factors over FF:
Xn−1=f1(X)f2(X)⋯fr(X),
Xn−1=f1(X)f2(X)⋯fr(X),
where each fi(X)fi(X) is a polynomial with coefficients in FF, and each fi(X)fi(X) cannot be factored into lower-degree polynomials with coefficients in FF. We may assume that the highest nonzero coefficient of each fi(X)fi(X) is 1. Let αα be a primitive nnth root of unity. Then α0,α1,α2,…,αn−1α0,α1,α2,…,αn−1 are roots of Xn−1Xn−1. This means that
Xn−1=(X−1)(X−α)(X−α2)⋯(X−αn−1).
Xn−1=(X−1)(X−α)(X−α2)⋯(X−αn−1).
Therefore, each fi(X)fi(X) is a product of some of these factors (X−αj)(X−αj), and each αjαj is a root of exactly one of the polynomials fi(X)fi(X). For each jj, let qj(X)qj(X) be the polynomial fi(X)fi(X) such that fi(αj)=0fi(αj)=0. This gives us polynomials q0(X),q1(X),…qn−1(X)q0(X),q1(X),…qn−1(X). Of course, usually these polynomials are not all distinct, since a polynomial fi(X)fi(X) that has two different powers αj,αkαj,αk as roots will serve as both qj(X)qj(X) and qk(X)qk(X) (see the examples given later in this section).
A BCH code of designed distance dd is a code with generating polynomial
g(X)=least common multiple of qk+1(X),qk+2(X),…,qk+d−1(X)
g(X)=least common multiple of qk+1(X),qk+2(X),…,qk+d−1(X)
for some integer kk.
Theorem
A BCH code of designed distance dd has minimum weight greater than or equal to dd.
Proof. Since qj(X)qj(X) divides g(X)g(X) for k+1≤j≤k+d−1k+1≤j≤k+d−1, and qj(αj)=0qj(αj)=0, we have
g(αk+1)=g(αk+2)=⋯=g(αk+d−1)=0.
g(αk+1)=g(αk+2)=⋯=g(αk+d−1)=0.
The BCH bound (with ℓ=k+1ℓ=k+1 and δ=d−2δ=d−2) implies that the code has minimum weight at least d=δ+2d=δ+2.
Example
Let F=Z2F=Z2, and let n=7n=7. Then
X7−1=(X−1)(X3+X2+1)(X3+X+1).
X7−1=(X−1)(X3+X2+1)(X3+X+1).
Let αα be a root of X3+X+1X3+X+1. Then αα is a primitive 7th root of unity, as in the previous example. Moreover, in that example, we showed that α2α2 is also a root of X3+X+1X3+X+1. In fact, we actually showed that the square of a root of X3+X+1X3+X+1 is also a root, so we have that α4=(α2)2α4=(α2)2 is also a root of X3+X+1X3+X+1. (We could square this again, but α8=αα8=α, so we are back to where we started.) Therefore, α,α2,α4α,α2,α4 are the roots of X3+X+1X3+X+1, so
X3+X+1=(X−α)(X−α2)(X−α4).
X3+X+1=(X−α)(X−α2)(X−α4).
The remaining powers of αα must be roots of X3+X2+1X3+X2+1, so
We obtain the cyclic [7,3,4][7,3,4] code discussed in Section 24.7. The theorem says that the minimum weight is at least 3. In this case, we can do a little better. If we take k=−1k=−1 and d=4d=4, then we have a generating polynomial g1(X)g1(X) with
g1(X)=lcm(q0(X),q1(X),q2(X))=g(X).
g1(X)=lcm(q0(X),q1(X),q2(X))=g(X).
This is because q2(X)=q1(X)q2(X)=q1(X), so the least common multiple doesn’t change when q2(X)q2(X) is included. The theorem now tells us that the minimum weight of the code is at least 4. As we have seen before, the minimum weight is exactly 4.
Example (continued)
Let’s continue with the previous example, but take k=0k=0 and d=7d=7. Then
We obtain the repetition code with only two codewords:
{(0,0,0,0,0,0,0),(1,1,1,1,1,1,1)}.
{(0,0,0,0,0,0,0),(1,1,1,1,1,1,1)}.
The theorem says that the minimum distance is at least 7. In fact it is exactly 7.
Example
Let F=Z5={0,1,2,3,4}F=Z5={0,1,2,3,4} = the integers mod 5. Let n=4n=4. Then
X4−1=(X−1)(X−2)(X−3)(X−4)
X4−1=(X−1)(X−2)(X−3)(X−4)
(this is an equality, or congruence if you prefer, in Z5Z5). Let α=2α=2. We have 24=124=1, but 2j≠12j≠1 for 1≤j<41≤j<4. Therefore, 2 is a primitive 4th root of unity in Z5Z5. We have 20=1,22=4,23=320=1,22=4,23=3 (these are just congruences mod 5). Therefore,
q0(X)=X−1,q1(X)=X−2,q2(X)=X−4,q3(X)=X−3.
q0(X)=X−1,q1(X)=X−2,q2(X)=X−4,q3(X)=X−3.
In the theorem, let k=0,d=3k=0,d=3. Then
g(X)=lcm(q1(X),q2(X))=(X−2)(X−4)=X2−6X+8=X2+4X+3.
g(X)=lcm(q1(X),q2(X))=(X−2)(X−4)=X2−6X+8=X2+4X+3.
We obtain a cyclic [4,2][4,2] code over Z5Z5 with generating matrix
(34100341).
(30431401).
The theorem says that the minimum weight is at least 3. Since the first row of the matrix is a codeword of weight 3, the minimum weight is exactly 3. This code is an example of a Reed-Solomon code, which will be discussed in the next section.
24.8.1 Decoding BCH Codes
One of the reason BCH codes are useful is that there are good decoding algorithms. One of the best known is due to Berlekamp and Massey (see [Gallager] or [Wicker]). In the following, we won’t give the algorithm, but, in order to give the spirit of some of the ideas that are involved, we show a way to correct one error in a BCH code with designed distance d≥3d≥3.
Let CC be a BCH code of designed distance d≥3d≥3. Then CC is a cyclic code, say of length nn, with generating polynomial g(X)g(X). There is a primitive nnth root of unity αα such that
HH is not necessarily a parity check matrix for CC, since there might be noncodewords that are also in the null space of HH. However, as we shall see, HH can correct an error.
Suppose the vector r=c+er=c+e is received, where cc is a codeword and e=(e0,…,en−1)e=(e0,…,en−1) is an error vector. We assume that at most one entry of ee is nonzero.
Here is the algorithm for correcting one error.
Write rHT=(s1,s2)rHT=(s1,s2).
If s1=0s1=0, there is no error (or there is more than one error), so we’re done.
If s1≠0s1≠0, compute s2/s1s2/s1. This will be a power αj−1αj−1 of αα. The error is in the jjth position. If we are working over the finite field Z2Z2, we are done, since then ej=1ej=1. But for other finite fields, there are several choices for the value of ejej.
Compute ej=s1/α(j−1)(k+1)ej=s1/α(j−1)(k+1). This is the jjth entry of the error vector ee. The other entries of ee are 0.
Subtract the error vector ee from the received vector rr to obtain the correct codeword cc.
Example
Let’s look at the BCH code over Z2Z2 of length 7 and designed distance 7 considered previously. It is the binary repetition code of length 7 and has two codewords: (0,0,0,0,0,0,0),(1,1,1,1,1,1,1)(0,0,0,0,0,0,0),(1,1,1,1,1,1,1). The algorithm corrects one error. Suppose the received vector is r=(1,1,1,1,0,1,1)r=(1,1,1,1,0,1,1). As before, let αα be a root of X3+X+1X3+X+1. Then αα is a primitive 7th root of unity.
Before proceeding, we need to deduce a few facts about computing with powers of αα. We have α3=α+1α3=α+1. Multiplying this relation by powers of αα yields
Therefore, s1=α+α2s1=α+α2 and s2=αs2=α. We need to calculate s2/s1s2/s1. Since s1=α+α2=α4s1=α+α2=α4, we have
s2/s1=α/α4=α−3=α4.
Therefore, j−1=4, so the error is in position j=5. The fifth entry of the error vector is s1/α4=1, so the error vector is (0,0,0,0,1,0,0). The corrected message is
r−e=(1,1,1,1,1,1,1).
Here is why the algorithm works. Since cHT=0, we have
rHT=cHT=eHT=eHT=(s1,s2).
If e=(0,0,…,ej,0,…) with ej≠0, then the definition of H gives
s1=ejα(j−1)(k+1),s2=ejα(j−1)(k+2).
Therefore, s2/s1=αj−1. Also, s1/α(j−1)(k+1)=ej, as claimed.