3
Linear Block Codes

3.1. Introduction

The aim of channel coding is to protect the data delivered by the source coder against transmission errors.

We have shown that random coding allows us to reach the limit of the channel coding theorem when the size N of the codewords tends to +∞. However, this technique is not feasible due to the complexity of the associated coder and decoder. Indeed, let us consider a random binary code C composed of 2^K codewords, where K is the number of bits of the information word and each information word is associated with a codeword of N bits. To construct a random encoder, it is necessary to first build a set of 2^K codewords drawn randomly. The encoding will correspond to the association of an information word with a unique codeword using a look-up table. Since the code has no specific structure, the decoding will consist of an exhaustive comparison of the received word with all the 2^K codewords of the set in order to determine the maximum likelihood (ML) codeword. The complexity of this decoder increases exponentially with K and is almost always unfeasible in practice.

As a consequence, we have to use codes with an algebraic structure such as the linearity property in order to simplify the coder and decoder. These codes will have to be adapted to the class of errors (random, isolated, bursty, etc.). In this book, we will focus on the three following families of codes:

  • – linear block codes: a q-ary block code C(N, K) is a set composed of q^K codewords. We associate a q-ary codeword composed of N symbols with each q-ary information word composed of K symbols. The linearity means that the N symbols of the codeword are obtained by linear combination of the K symbols of the information word.
  • – convolutional codes: compared to block codes, for a convolutional code of rate k/n, the codeword composed of n symbols is a function of the current information word composed of k symbols but also of a given number of previous information words. The length of the input and output sequences is infinite.
  • – concatenated codes: these codes are obtained by concatenation of linear block codes or convolutional codes.

In this chapter, we will study the main properties of linear block codes applied to error correction and error detection. After introducing some fundamental notions on finite fields in section 3.2, we will study the linear block codes, their structures, properties and their matrix representations in section 3.3. In section 3.4, we will develop the hard and soft input decoding algorithms for those codes. In section 3.5, we will study their theoretical performance and finally, we will focus on the class of cyclic codes and their properties in section 3.6.

3.2. Finite fields

3.2.1. Fields

A field F is a non-empty set that has two operations defined on it, namely addition and multiplication and satisfying the following axioms:

  • – F is an Abelian group under addition. It has the properties of associativity, identity element written 0, existence of symmetric (inverse) elements and commutativity;
  • – the multiplication is associative: if a, b, c ∈ F, then a(bc) = (ab)c;
  • – the multiplication is commutative: if a, b ∈ F, then ab = ba;
  • – the multiplication is right distributive and left distributive with respect to addition: if a, b, c ∈ F, then a(b + c) = ab + ac and (a + b)c = ac + bc;
  • – the field has an identity element denoted as 1 for the multiplication;
  • – each non-zero element of F is invertible; if a ∈ F (a ≠ 0), a^−1 is its inverse with aa^−1 = 1.

3.2.2. Finite fields

A finite field or Galois field is a field with q elements and is denoted by F_q or GF(q) in memory of Évariste Galois. It is possible to build a finite field if q is a prime number or if q = p^m with p a prime number. When q is a prime number, the addition and multiplication in the finite field GF(q) are the addition and multiplication modulo q. Since each finite field should contain the identity elements 0 and 1, the simplest finite field is GF(2). For digital communications, we will mainly use the finite fields GF(2) and GF(q) with q = 2^m since we mainly consider binary elements. In this chapter, we will mainly restrict ourselves to these two finite fields.

Example 3.1.– Addition and multiplication in GF(2).

Table 3.1. Addition and multiplication in GF(2)

+ | 0 1        × | 0 1
--+-----       --+-----
0 | 0 1        0 | 0 0
1 | 1 0        1 | 0 1

The addition in GF(2) is equivalent to the XOR operation while the multiplication is a logical AND.

Example 3.2.– Addition and multiplication in GF(3).

Table 3.2. Addition and multiplication in GF(3)

+ | 0 1 2      × | 0 1 2
--+-------     --+-------
0 | 0 1 2      0 | 0 0 0
1 | 1 2 0      1 | 0 1 2
2 | 2 0 1      2 | 0 2 1

3.2.3. Irreducible and primitive polynomials

Let f(p) be a polynomial whose coefficients are elements of GF(2): f(p) = f0 + f1p + f2p^2 + … + fmp^m with fi ∈ GF(2). The degree of the polynomial is the highest power of p with a non-zero coefficient. If fm ≠ 0, then the polynomial is of degree m.

Definition 3.1.– A polynomial is irreducible in GF(2) if it cannot be written as a product of polynomials b(p)c(p) with b(p) and c(p) of degree higher than or equal to 1.

Theorem 3.1.– All irreducible polynomials in GF(2) of degree m divide p^(2^m−1) − 1.

The proof of this theorem is given for example in [LIN 83].

In Table 3.3, we give the decomposition of the polynomials of the form p^(2^m−1) − 1 into a product of irreducible polynomials for m ≤ 5.

Table 3.3. Decomposition of the polynomials of the form p^(2^m−1) − 1 into a product of irreducible polynomials

m = 2   p^3 − 1 = (1 + p)(1 + p + p^2)
m = 3   p^7 − 1 = (1 + p)(1 + p + p^3)(1 + p^2 + p^3)
m = 4   p^15 − 1 = (1 + p)(1 + p + p^2)(1 + p^3 + p^4)(1 + p + p^4)(1 + p + p^2 + p^3 + p^4)
m = 5   p^31 − 1 = (1 + p)(1 + p^3 + p^5)(1 + p^2 + p^5)(1 + p^2 + p^3 + p^4 + p^5)
        (1 + p + p^3 + p^4 + p^5)(1 + p + p^2 + p^4 + p^5)(1 + p + p^2 + p^3 + p^5)

Definition 3.2.– An irreducible polynomial f(p) of degree m is a primitive polynomial if the smallest positive integer n for which f(p) divides p^n + 1 is n = 2^m − 1. A non-exhaustive list of the primitive polynomials is given in Table 3.4.

Table 3.4. List of primitive polynomials for m ≤ 16

image

Table 3.5. List of the elements of the finite field GF(4)

Element   Polynomial   Binary representation
0         0            00
1         1            01
α         p            10
α^2       1 + p        11

3.2.4. Finite field with 2^m elements

A finite field GF(q) with q = 2^m is isomorphic to the field of polynomials with coefficients in GF(2) modulo an irreducible polynomial f(p) of degree m in GF(2). GF(2) is called the base field. Let α be a root of this polynomial (f(α) = 0). We can show that the successive powers of α generate the 2^m − 1 non-zero elements of the finite field GF(2^m).

Example 3.3.– Addition and multiplication in GF(4).

Let us study the finite field GF(4) (m = 2 case) built from the primitive polynomial 1 + p + p^2. Let α be a root of this polynomial: 1 + α + α^2 = 0. We can check that the successive powers of α generate the 2^2 − 1 = 3 non-zero elements of GF(4). Table 3.5 gives the list of the elements of the finite field GF(4).

Table 3.6. Addition and multiplication in GF(4)

+   | 0    1    α    α^2        ×   | 0    1    α    α^2
----+-------------------        ----+-------------------
0   | 0    1    α    α^2        0   | 0    0    0    0
1   | 1    0    α^2  α          1   | 0    1    α    α^2
α   | α    α^2  0    1          α   | 0    α    α^2  1
α^2 | α^2  α    1    0          α^2 | 0    α^2  1    α

We have α^2 = 1 + α, α^3 = α + α^2 = 1 and α^4 = α. We can check from Table 3.6 that GF(4) is indeed a finite field.

Example 3.4.– List of the elements in GF(16). Let m = 4 and the irreducible polynomial f(p) = 1 + p + p^4. We can check in Table 3.3 that this polynomial of degree 4 is a factor of p^15 − 1 (the other irreducible polynomials of degree 4 are 1 + p^3 + p^4 and 1 + p + p^2 + p^3 + p^4). Let α be a root of this polynomial: 1 + α + α^4 = 0. As previously, the successive powers of α generate the 2^4 − 1 = 15 non-zero elements of GF(16). Table 3.7 gives the list of these elements.

Table 3.7. List of elements of the finite field GF(16) built from the irreducible polynomial 1 + p + p^4

Element   Polynomial            Binary representation
0         0                     0000
1         1                     0001
α         p                     0010
α^2       p^2                   0100
α^3       p^3                   1000
α^4       1 + p                 0011
α^5       p + p^2               0110
α^6       p^2 + p^3             1100
α^7       1 + p + p^3           1011
α^8       1 + p^2               0101
α^9       p + p^3               1010
α^10      1 + p + p^2           0111
α^11      p + p^2 + p^3         1110
α^12      1 + p + p^2 + p^3     1111
α^13      1 + p^2 + p^3         1101
α^14      1 + p^3               1001
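The construction of Table 3.7 can be reproduced programmatically. The following sketch (not from the book; the bit-ordering convention is an assumption, with bit i holding the coefficient of p^i) generates the successive powers of α modulo f(p) = 1 + p + p^4:

def gf2m_powers(m, poly):
    """Return the list alpha^0, alpha^1, ..., alpha^(2^m - 2) as integers."""
    elements = []
    x = 1                      # alpha^0 = 1
    for _ in range(2 ** m - 1):
        elements.append(x)
        x <<= 1                # multiply by alpha (i.e. by p)
        if x & (1 << m):       # degree reached m: reduce modulo f(p)
            x ^= poly          # poly encodes f(p), including the p^m term
    return elements

if __name__ == "__main__":
    f = 0b10011                # f(p) = 1 + p + p^4
    for i, e in enumerate(gf2m_powers(4, f)):
        print(f"alpha^{i:<2} -> {e:04b}")   # binary representation, leftmost bit = p^3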

With each non-zero element of the finite field GF(q), q = 2^m, we associate a minimal polynomial.

Definition 3.3.– The minimal polynomial mi(p) is the polynomial of lowest degree for which α^i is a root.

For the previous example, the 15 minimal polynomials are the following:

[3.1]   m1(p) = m2(p) = m4(p) = m8(p) = 1 + p + p^4
        m3(p) = m6(p) = m9(p) = m12(p) = 1 + p + p^2 + p^3 + p^4
        m5(p) = m10(p) = 1 + p + p^2
        m7(p) = m11(p) = m13(p) = m14(p) = 1 + p^3 + p^4
        m15(p) = 1 + p

Theorem 3.2.– Each minimal polynomial mi(p) is irreducible.

The proof is given in [LIN 83]. We can check that the five polynomials above are irreducible and are factors of p^15 − 1. We will see further in the chapter that the irreducible polynomials of this table are among the main polynomials used for the construction of cyclic codes.

3.3. Linear block codes

3.3.1. Introduction

A q-ary linear block code C(N, K) is a set composed of q^K codewords.

We associate a q-ary codeword composed of N symbols with each q-ary information word composed of K symbols. The linearity means that the N symbols of the codeword are obtained by linear combination of the K symbols of the information word. This property allows us, in particular, to describe the coding operation in a matrix form.

In this section, we will only consider the binary linear block codes for which q = 2.

It is convenient to represent the information word and codeword using vectors. Let u = [u0, u1, …, uK−1] be an information word composed of K information symbols ui ∈ GF(2) and c = [c0, c1, …, cN−1] be the associated codeword composed of N symbols ci ∈ GF(2). We have the following matrix relation between the information word u and the associated codeword c:

[3.2] c = uG

G is the generator matrix of the coder with dimension K × N.

Each codeword is a linear combination of the row vectors gi of G, whose elements belong to GF(2):

[3.3] c = u0g0 + u1g1 + … + uK−1gK−1

Thus, a linear block code can be defined as a vector subspace of dimension K < N built according to [3.3].

It is always possible, by combining the rows, to put the generator matrix G into the following systematic form:

[3.4] G = [IK P]

where IK is the identity matrix of dimension K × K and P is a matrix of dimension K × (N − K). When the generator matrix is systematic, the first K bits of the codeword are the information bits. We have: c = [u0, u1, …, uK−1, cK, cK+1, …, cN−1].
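The encoding operation [3.2] reduces to a matrix product over GF(2). The following sketch (a minimal illustration, not the book's implementation) encodes an information word with the systematic generator matrix of the Hamming code C3 (7, 4) used later in this chapter:

import numpy as np

# systematic generator matrix of the Hamming code C3(7, 4), consistent with Table 3.8
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])

def encode(u, G):
    """Encode the information word u (list of K bits) into a codeword of N bits."""
    return np.mod(np.dot(u, G), 2)

print(encode([1, 1, 1, 0], G))   # the first 4 bits of the codeword are u itself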

Example 3.5.– Repetition code C1 (3, 1) in GF(2):

[3.5] G = [1 1 1]

The information bit is repeated three times:

c = [u0 u0 u0]

Example 3.6.– Parity check code C2 (3, 2) in GF(2):

[3.6] G = | 1 0 1 |
          | 0 1 1 |

c2 is the so-called parity bit:

c2 = u0 + u1

Each codeword is composed of an even number of 1. The codewords of this code C2 are 000, 011, 110 and 101. Figure 3.1 gives a graphical representation of this code in a three-dimension (3D) space.

Example 3.7.– Hamming code C3 (7, 4) in GF(2) is defined by the following generator matrix:

[3.7] images
image

Figure 3.1. Parity check code C2 (3, 2)

Here, we can obtain a systematic generator matrix simply as follows: we add lines 1, 2 and 3 to obtain line 1, lines 2, 3 and 4 to obtain line 2 and lines 3 and 4 to obtain line 3. The last line remains unchanged. This technique allows us to convert any generator matrix into a systematic generator matrix.

The systematic form of this generator matrix is as follows:

[3.8] G = | 1 0 0 0 1 1 0 |
          | 0 1 0 0 0 1 1 |
          | 0 0 1 0 1 1 1 |
          | 0 0 0 1 1 0 1 |

The three parity or redundancy bits are equal to:

c4 = u0 + u2 + u3
c5 = u0 + u1 + u2
c6 = u1 + u2 + u3

The 2^4 = 16 codewords for this code are listed in Table 3.8.

Table 3.8. List of codewords for Hamming code C3 (7, 4)

0000000 0001101 0010111 0011010
0100011 0101110 0110100 0111001
1000110 1001011 1010001 1011100
1100101 1101000 1110010 1111111

Definition 3.4.– The rate R of a block code (N, K) is equal to:

[3.9] R = K/N

Definition 3.5.– Let c1 and c2 be two codewords of code C, α1 and α2 be two elements of the finite field. The linearity implies that α1c1 + α2c2 is also a codeword of C. Consequently, the word c0 = [00 … 0] is always a codeword of any linear code. This codeword is called the null codeword.

3.3.2. Minimum distance of a code

Definition 3.6.– Let c1 and c2 be two codewords of length N of the code C; the Hamming distance dH(c1, c2) is equal to the number of positions in which they differ.

images

Definition 3.7.– The Hamming weight w(c) of a binary codeword c is equal to the number of non-zero elements of this codeword.

images

Definition 3.8.– The minimum distance dmin of the code C is the Hamming distance between the pair of codewords with the smallest Hamming distance:

[3.10] dmin = min dH(c1, c2)   with c1, c2 ∈ C, c1 ≠ c2

When the code is linear, the minimum distance dmin is equal to the minimum Hamming weight of the code C (by excluding the null codeword c0):

[3.11] dmin = min w(c)   with c ∈ C, c ≠ c0

Example 3.8.– The minimum distance of the code C1 (3, 1) and code C3 (7, 4) is equal to 3; the minimum distance of the code C2 (3, 2) is equal to 2.
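For small codes, the minimum distance can be checked by enumerating all the codewords and applying [3.11]. A minimal sketch (illustrative only, here with the systematic generator matrix of the Hamming code C3 (7, 4)):

from itertools import product
import numpy as np

def codewords(G):
    K = G.shape[0]
    return [np.mod(np.dot(u, G), 2) for u in product([0, 1], repeat=K)]

def minimum_distance(G):
    # minimum Hamming weight over the non-zero codewords (equation [3.11])
    return min(int(c.sum()) for c in codewords(G) if c.any())

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])
print(minimum_distance(G))   # -> 3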

Until recently, the minimum distance was the unique criterion to evaluate the performance of error correcting codes. This criterion has been partially challenged with the discovery of very efficient codes imitating random coding [BER 93, BAT 97].

3.3.3. Parity check matrix

There exists a linear block code (N, N − K) associated with each linear block code C (N, K). Let H be the generator matrix of this dual code. Each codeword c of the code C is orthogonal to all the codewords of the dual code:

[3.12] cH^T = 0

Since this relation is true for all the codewords of the code C, we have the relation between the generator matrix G of the code C and H:

[3.13] GH^T = 0

If the generator matrix G is systematic as described in equation [3.4], H can be described as follows:

[3.14] H = [P^T IN−K]

where IN−K is the identity matrix of dimension (N − K) × (N − K).

The matrix H is called the parity check matrix of the code C.

Example 3.9.– The parity check matrix of the Hamming code C3 (7, 4) is as follows:

[3.15] H = | 1 0 1 1 1 0 0 |
           | 1 1 1 0 0 1 0 |
           | 0 1 1 1 0 0 1 |

We can observe that the parity check matrix of the Hamming code C3 is composed of the seven different non-zero column vectors of dimension 3. Each of the three lines of the parity check matrix corresponds to a parity check equation (addition modulo 2 in the case of a binary code) involving the different bits of the codeword c = (c0 c1 c2 c3 c4 c5 c6) = (u0 u1 u2 u3 c4 c5 c6).

[3.16]  c0 + c2 + c3 + c4 = 0
        c0 + c1 + c2 + c5 = 0
        c1 + c2 + c3 + c6 = 0

We find the same three parity check equations as in the previous example.
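The orthogonality relations [3.12] and [3.13] can be verified numerically. A short sketch (illustrative only, using the G and H matrices given above):

import numpy as np
from itertools import product

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])
H = np.array([[1, 0, 1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

# G H^T = 0 (relation [3.13])
assert not np.mod(G @ H.T, 2).any()

# every codeword has a null syndrome c H^T = 0 (relation [3.12])
for u in product([0, 1], repeat=4):
    c = np.mod(np.array(u) @ G, 2)
    assert not np.mod(c @ H.T, 2).any()
print("all parity check equations are satisfied")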

3.3.4. Weight enumerator functions

The weight enumerator functions (WEFs) allow us to study the performance of the linear block codes.

Definition 3.9.– The WEF of a systematic binary block coder (N, K) is defined as follows:

[3.17] A(D) = Σd=0..N Ad D^d

Ad is the number of codewords with length N and Hamming weight d.

Definition 3.10.– The input redundancy weight enumerator function (IRWEF) of a systematic binary block coder (N, K) is defined as follows:

[3.18] A(W, Z) = Σw=0..K Σz=0..N−K Aw,z W^w Z^z

Aw,z is the number of codewords with length N and for which the weight of the information sequence is w and the weight of the redundancy sequence is equal to z.

The IRWEF function can also be written as follows:

[3.19] A(W, Z) = Σw W^w Aw(Z)

with:

[3.20] Aw(Z) = Σz Aw,z Z^z

Definition 3.11.– The input output weight enumerator function (IOWEF) of a systematic binary block coder (N, K) is defined as follows:

[3.21] B(W, D) = Σw=0..K Σd=0..N Bw,d W^w D^d

Bw,d is the number of codewords with length N and Hamming weight d and for which the weight of the information sequence is w.

Example 3.10.– Parity check code C2 (3, 2) in GF(2).

Table 3.9. Weight enumeration of information sequence and codewords for the parity check code (3, 2)

u    c     w  z  d
00   000   0  0  0
01   011   1  1  2
10   101   1  1  2
11   110   2  0  2

The weights of the different codewords are detailed in Table 3.9. The WEF, IRWEF and IOWEF for the parity check code (3, 2) are the following:

A(D) = 1 + 3D^2
A(W, Z) = 1 + 2WZ + W^2
B(W, D) = 1 + 2WD^2 + W^2D^2

Example 3.11.– Hamming code C3 (7, 4). From the list of codewords, the WEF, IRWEF and IOWEF for the code C3 (7, 4) are the following:

A(D) = 1 + 7D^3 + 7D^4 + D^7
A(W, Z) = 1 + W(3Z^2 + Z^3) + W^2(3Z + 3Z^2) + W^3(1 + 3Z) + W^4Z^3
B(W, D) = 1 + W(3D^3 + D^4) + W^2(3D^3 + 3D^4) + W^3(D^3 + 3D^4) + W^4D^7
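The coefficients Ad, Aw,z and Bw,d of a short code can be obtained by exhaustive enumeration of the codewords. A sketch (illustrative only) that reproduces the weight distribution above for the Hamming code C3 (7, 4):

from collections import Counter
from itertools import product
import numpy as np

def weight_enumerators(G):
    K, N = G.shape
    A_d, A_wz, B_wd = Counter(), Counter(), Counter()
    for u in product([0, 1], repeat=K):
        c = np.mod(np.array(u) @ G, 2)
        w, d = sum(u), int(c.sum())          # information weight and codeword weight
        A_d[d] += 1
        A_wz[(w, d - w)] += 1                # z = d - w for a systematic code
        B_wd[(w, d)] += 1
    return A_d, A_wz, B_wd

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])
print(weight_enumerators(G)[0])   # Counter({3: 7, 4: 7, 0: 1, 7: 1})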

3.3.5. Error correction and error detection capabilities of linear block codes

Over a binary symmetric channel, the number of errors e that an error correction code is capable of correcting is directly related to the minimum distance of the code. The 2^K codewords of the code can be seen as the centers of Hamming spheres of radius e. In order that the 2^K spheres do not overlap, we should have the following relation:

[3.22] dmin ≥ 2e + 1, i.e. e = ⌊(dmin − 1)/2⌋

If a received word is inside the Hamming sphere of the transmitted codeword, the decoder will be able to recover the codeword as shown in Figure 3.2. This is possible only if the Hamming distance between the transmitted codeword and the received word is less than or equal to e. As a conclusion, a linear block code (N, K) with minimum distance dmin can correct up to e errors according to relation [3.22].

image

Figure 3.2. Hamming sphere

Furthermore, the same code can also be used as an error detection code. In that case, it will be able to detect up to ed errors with:

[3.23] ed = dmin − 1

A code can correct up to e errors (error correction) or detect up to ed errors (error detection) in a binary symmetric channel.

A code can correct up to dmin − 1 erasures on an erasure channel.

[3.24] images

Example 3.12.– The codes C1 (3, 1) and C3 (7, 4) with minimum distance 3 can correct one error or detect up to two errors.

3.3.5.1. Hamming bound and perfect codes

Definition 3.12.– The Hamming bound is the following relation between K, N and the number of errors e that a code (N, K) in GF(q) can correct:

[3.25] q^K Σi=0..e C(N, i)(q − 1)^i ≤ q^N

where C(N, i) denotes the binomial coefficient. By dividing the two terms by q^K, we obtain:

[3.26] Σi=0..e C(N, i)(q − 1)^i ≤ q^(N−K)

Proof.– The number of words included in a Hamming sphere of radius e is equal to:

[3.27] Σi=0..e C(N, i)(q − 1)^i

The total number of words included in the q^K Hamming spheres cannot be higher than q^N to avoid overlapping of the spheres. Consequently, a code correcting e errors should satisfy the inequality [3.25].

In the binary case, the inequality [3.25] becomes the following:

[3.28] Σi=0..e C(N, i) ≤ 2^(N−K)

Definition 3.13.– A code is perfect if all the q^N possible words are included in the q^K Hamming spheres of radius e. The inequality in [3.28] is then replaced by an equality.
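The perfect-code condition is easy to check numerically from the binary Hamming bound [3.28]. A small sketch (illustrative only):

from math import comb

def is_perfect(N, K, e):
    # sphere volume sum_{i=0}^{e} C(N, i) compared with 2^(N-K)
    sphere = sum(comb(N, i) for i in range(e + 1))
    return sphere == 2 ** (N - K)

print(is_perfect(7, 4, 1))     # Hamming (7, 4):   True
print(is_perfect(23, 12, 3))   # Golay (23, 12):   True
print(is_perfect(24, 12, 3))   # extended Golay:   False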

3.3.6. Bounds on the minimum distance of the linear block codes

In this section, we will give the main upper and lower bounds on the minimum distance of linear block codes as a function of the rate R = K/N.

A first upper bound is the Singleton bound:

[3.29] dmin ≤ N − K + 1

The codes for which the inequality is replaced by an equality in relation [3.29] are called maximum distance separable (MDS) codes.

By dividing the two terms of [3.29] by N, we have:

[3.30] dmin/N ≤ 1 − R + 1/N

When N → +∞ we obtain:

[3.31] dmin/N ≤ 1 − R

A second upper bound is the Plotkin bound that is defined when N → +∞:

[3.32] images

A tighter upper bound is the Elias–Bassalygo bound [BAS 65]

[3.33] images

where Hq(α) = α logq(q − 1) − α logq(α) − (1 − α) logq(1 − α).

For the binary codes, the Elias-Bassalygo bound is given by:

[3.34] images

where H2(α) = −α log2(α) − (1 − α) log2(1 − α).

The upper bound of Hamming is obtained from the Hamming bound developed in section 3.3.5.1:

[3.35] images

Finally, we will give the most popular lower bound of Gilbert–Varshamov:

[3.36] images

For binary codes, we have:

[3.37] images

In Figure 3.3, we show the curves R = f(dmin/N) relative to the lower bound of Gilbert–Varshamov and the upper bounds of Plotkin, Hamming and Elias–Bassalygo in the binary case. All the points of coordinate (dmin/N, R) under the Gilbert–Varshamov bound are reachable and all the points above the Elias–Bassalygo bound are unreachable. We can see that the upper bound of Hamming is better than the Plotkin bound when dmin/N < 0.3.

image

Figure 3.3. Bounds on the minimum distance for linear block codes with q = 2
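The curves of Figure 3.3 can be reproduced from the bounds above. The sketch below uses the standard asymptotic forms of the binary bounds as a function of δ = dmin/N (this is an assumption: the exact expressions of equations [3.31]–[3.37] are not reproduced here):

import numpy as np
import matplotlib.pyplot as plt

def H2(x):
    # binary entropy function, clipped to avoid log(0)
    x = np.clip(x, 1e-12, 1 - 1e-12)
    return -x * np.log2(x) - (1 - x) * np.log2(1 - x)

delta = np.linspace(0.001, 0.5, 500)
plt.plot(delta, 1 - delta, label="Singleton")
plt.plot(delta, np.maximum(1 - 2 * delta, 0), label="Plotkin")
plt.plot(delta, 1 - H2(delta / 2), label="Hamming")
plt.plot(delta, 1 - H2((1 - np.sqrt(1 - 2 * delta)) / 2), label="Elias-Bassalygo")
plt.plot(delta, 1 - H2(delta), label="Gilbert-Varshamov (lower bound)")
plt.xlabel("dmin/N"); plt.ylabel("R"); plt.legend(); plt.show()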

In Figure 3.4, we show the curves R = f(dmin/N) relative to the lower bound of Gilbert–Varshamov and the upper bounds of Plotkin, Hamming and Elias–Bassalygo in the case q = 28. We can see that the results are quite different from the binary case. The Plotkin bound (equivalent to the Singleton bound in that case) is tighter than the upper bound of Elias–Bassalygo when q ≥ 16.

In the non-binary case, there exist MDS codes such as the Reed–Solomon (RS) codes, while for binary codes, only trivial codes such as the parity check codes (N, N − 1) achieve the Singleton bound.

3.3.7. Bounds on the rate in the non-asymptotic regime

From Figure 1.18 given in Chapter 1, we can write the rate R as the ratio H(U)/H(X). When the input of the channel is binary, we have H(X) = 1 Sh/symb and R = H(U). The theorem of channel coding given by relation [1.105] becomes R ≤ C. Information theory gives the expression of the channel capacity C, which is the achievable rate under the assumption that the length of the codewords tends asymptotically toward infinity. From these results, different researchers including Feinstein and Elias [ELI 55] have evaluated the word error rate of communication systems in the non-asymptotic regime, i.e. when the length N of the codewords is not infinite. This probability decreases exponentially with the length N: Pe ≈ exp(−NE(R)), where E(R) is a function of the rate R called the error exponent, as shown in Figure 3.5. E(R) is a positive function for all the values of R less than the capacity.

image

Figure 3.4. Bounds on the minimum distance for linear block codes with q = 256

In 1967, Shannon, Gallager and Berlekamp [SHA 67] introduced a lower bound on the error exponent called sphere packing bound. For the memoryless discrete channel, this bound can be written as follows:

[3.38] images
[3.39] images

with

[3.40] images

and

image

Figure 3.5. Error exponent function versus rate

The maximum in equation [3.41] is calculated over the set of a priori probability vectors q = [q1, …, qM].

For the BSC channel, a tighter upper bound based on the random coding has been proposed by Poltyrev [POL 94] and more recently by Polyanskiy et al. [POL 10]. The authors have shown that for a BSC channel with transition probability p, there exists a linear code (N, K) that can guarantee a given word error probability Pe. We have:

[3.42] images

In Figure 3.6, we show the curves rate versus length N obtained using the Poltyrev bound for different word error probabilities Pe.

image

Figure 3.6. Poltyrev bound rate versus length N

For the additive white Gaussian noise (AWGN), Shannon [SHA 59a] has determined a sphere packing bound as follows:

[3.43] images

where image. This lower bound is computed under the hypothesis that the 2^K vectors associated with the codewords are distributed on the surface of a hypersphere of radius √(NEs). Consequently, Q(Θs, A) is the probability that the received vector is outside the cone corresponding to the transmitted codeword. This probability can be written as:

[3.44] images

where the half angle Θs of the elementary cone is computed such that the fraction between the surface of the spherical cap and the surface of the associated hypersphere is equal to:

[3.45] images

with

[3.46] images

In Figure 3.7, we have given the ratio Eb/N0 as a function of N for R = 1/2 and R = 1/3 obtained using the sphere packing bound.

3.3.8. Main block codes

In this section, we will present some important classes of block codes. Other families of block codes will be developed in section 3.6 on cyclic codes. We will first describe the perfect block codes introduced in section 3.3.5.1. Two classes of perfect block codes exist: the Hamming codes and Golay codes. Finally, we will introduce the class of Reed–Muller codes covering a large range of dimension and minimum distance.

3.3.8.1. Hamming codes

Hamming codes are perfect binary linear block codes (N, K) with N = 2^J − 1 and K = 2^J − 1 − J. A Hamming code (N, K) can be simply described by its parity check matrix H of dimension J × N since J = N − K. Indeed, the columns of H are the N non-null binary vectors containing J elements. For example, for J = 3, the Hamming code is a (7, 4) code and its parity check matrix is given by [3.15].

The minimum distance of these codes is equal to 3, and they can correct up to one error. We will see in section 3.6 that the Hamming codes belong to the class of cyclic codes and can be built from a generator polynomial.

image

Figure 3.7. Sphere packing bound ratio Eb/N0 versus N

3.3.8.2. Golay code

The binary Golay code is a binary linear block code (23, 12) and its minimum distance is equal to 7. This code is perfect since for N = 23, K = 12 and e = 3, we have the equality:

2^12 (1 + C(23, 1) + C(23, 2) + C(23, 3)) = 2^12 (1 + 23 + 253 + 1771) = 2^12 × 2^11 = 2^23

The Golay code is also a cyclic code. The systematic generator matrix of the (23, 12) Golay code is the following:

[3.47] images

Its WEF is equal to:

[3.48] A(D) = 1 + 253D^7 + 506D^8 + 1288D^11 + 1288D^12 + 506D^15 + 253D^16 + D^23

From the (23, 12) Golay code, it is possible to build an extended Golay code (24, 12) by adding a parity bit. The minimum distance of this code is equal to 8. The non-systematic generator matrix of the extended Golay code (24, 12) is the following:

[3.49] images

The WEF is equal to:

[3.50] A(D) = 1 + 759D^8 + 2576D^12 + 759D^16 + D^24

The ternary Golay code is a linear code (11, 6) in GF(3) and its minimum distance is equal to 5. The non-systematic generator matrix of this code is the following:

[3.51] images

It is a perfect code since we have:

3^6 (1 + 2C(11, 1) + 2^2 C(11, 2)) = 3^6 (1 + 22 + 220) = 3^6 × 3^5 = 3^11

3.3.8.3. Reed–Muller codes

The Reed–Muller codes are a class of codes covering a large range of dimensions and minimum distances. They can be graphically described using simple trellises allowing efficient soft decoding. For each integer m and r < m, there is a Reed–Muller code (N, K, dmin) with:

[3.52] N = 2^m, K = Σi=0..r C(m, i), dmin = 2^(m−r)

The generator matrix of a Reed–Muller code of order r is built by the concatenation of r + 1 matrices as follows:

[3.53] G = | G0 |
           | G1 |
           | ⋮  |
           | Gr |

where G0 = [1 1 1 … 1 1] is of dimension 1 × N, and G1 is the matrix of dimension m × N whose columns are the N different binary column vectors of dimension m. The dimension of the matrix G2 is C(m, 2) × N. Each line of G2 is obtained by performing the element-by-element product of two lines of G1. More generally, the dimension of the matrix Gi is C(m, i) × N and its lines are obtained by performing the element-by-element product of i lines of the matrix G1.

Example 3.13.– Let us construct a Reed–Muller code (8, 4, 4) of order r = 1 with m = 3. We have G0 = [1 1 1 1 1 1 1 1] and:

[3.54] images

Consequently, we have:

[3.55] images

In order to construct the Reed–Muller code (8, 7, 2) of order 2, it is just necessary to add to the previous generator matrix the matrix G2 of dimension 3 × 8 with

[3.56] images
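The construction described above can be written compactly. The following sketch (illustrative only; the ordering of the columns of G1 is a free choice and may differ from [3.54]) builds the generator matrix of a Reed–Muller code of order r:

from itertools import combinations, product
import numpy as np

def reed_muller_generator(r, m):
    N = 2 ** m
    cols = list(product([0, 1], repeat=m))            # the N binary column vectors
    G1 = np.array(cols).T                             # dimension m x N
    rows = [np.ones(N, dtype=int)]                    # G0, the all-one row
    for i in range(1, r + 1):
        for idx in combinations(range(m), i):         # C(m, i) rows for Gi
            prod_row = np.ones(N, dtype=int)
            for j in idx:
                prod_row *= G1[j]                     # element-by-element product
            rows.append(prod_row)
    return np.array(rows)

G = reed_muller_generator(1, 3)      # Reed-Muller (8, 4, 4) of order 1
print(G.shape)                       # (4, 8)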

In Table 3.10, we give the list of Reed–Muller codes that can be obtained for m = 3, 4 and 5

3.3.9. Graphical representations of binary linear block codes

We will now present different graphical representations such as the Tanner graph or the trellis diagram which are practical tools to study error correcting codes and derive efficient decoding algorithms.

Table 3.10. List of Reed–Muller codes (N, K, dmin) for m = 3, 4 and 5

          m = 3     m = 4      m = 5
order 0 (8,1,8) (16,1,16) (32,1,32)
order 1 (8,4,4) (16,5,8) (32,6,16)
order 2 (8,7,2) (16,11,4) (32,16,8)
order 3 (16,15,2) (32,26,4)
order 4 (32,31,2)

3.3.9.1. Tanner graph

Generally speaking, any binary linear code can be graphically represented by a Tanner graph. A Tanner graph is a bipartite graph composed of two types of nodes: the binary variable nodes, represented by circles, and the parity check nodes, represented by squares. Each branch indicates a dependency between the variable node and the parity check node that it connects.

The Tanner graph can be deduced directly from the parity check matrix H of the code: each parity check node Ti of the Tanner graph corresponds to the i-th line of the parity check matrix H and each variable node cj corresponds to the j-th column of H. Indeed, each line of H defines a parity check equation between different variables. A branch will connect the parity check node Ti with the variable node cj if and only if hij = 1.

Example 3.14.– Let us consider again the Hamming code C3 (7, 4), the systematic generator matrix of which is given in equation [3.8]. We have shown that each of the three lines of the parity check matrix H corresponds to a parity check equation that defines links between the different bits of the codeword c = (c0 c1 c2 c3 c4 c5 c6) = (u0 u1 u2 u3 c4 c5 c6):

[3.57]  T1: c0 + c2 + c3 + c4 = 0
        T2: c0 + c1 + c2 + c5 = 0
        T3: c1 + c2 + c3 + c6 = 0

The associated Tanner graph is given in Figure 3.8. In version (b), the variable nodes are situated on the left side and the parity check nodes are situated on the right side.

image

Figure 3.8. Two versions of the Tanner graph for the Hamming code (7, 4)

3.3.9.2. Trellis diagram for linear block codes

Wolf [WOL 78], Massey [MAS 78] in 1978 and then Forney [FOR 88] in 1988 have shown that it is possible to represent any block code using a trellis diagram composed of N elementary sections. This trellis fully describes the set of codewords.

The trellis diagram can be deduced from the relation cH^T = 0 with H = [h0 h1 … hN−1] the parity check matrix of the code, where hi denotes the i-th column of H. Let si be the (N − K) × 1 vector representing the state of the trellis at position i. The state vector si is also called the partial syndrome. We have the following recursive relation:

[3.58] si+1 = si + ci hi

ci is chosen under the constraint that the vectors associated with the paths in the trellis diagram constructed until the position i + 1 are partial codewords. If the arrival state is si+1 and the departure state is si, then a branch will connect these two states in the trellis diagram. In the binary case, by convention, a dotted line will correspond to ci = 0 and a plain line will correspond to ci = 1 as shown in Figure 3.9.

image

Figure 3.9. Branch of a trellis section

Let us consider the previous example of the Hamming code C3 (7, 4). From the parity check matrix given in [3.15], we can draw the trellis diagram of this code as shown in Figure 3.10.

image

Figure 3.10. Trellis diagram obtained from the parity check matrix of the code C3

The number of branches in this trellis diagram is equal to 2^1 + 2^2 + 2^3 + 2^4 + 2^3 + 2^2 + 2^1 = 44. We will show in the next section that it is also possible to build an equivalent trellis diagram when the code is cyclic.

Another method to build a trellis diagram is to use the generator matrix [FOR 88]. We must first adapt the generator matrix for the trellis representation i.e. to build a trellis diagram with the minimum number of branches and nodes. We will now describe this method.

Let us consider a codeword c = (c0 c1 … cN−1). We define L(c) as the smallest index i such that ci ≠ 0 and R(c) as the highest index i such that ci ≠ 0. The envelope of c is the sequence of 1s starting at the first non-zero bit of the codeword and finishing at the last non-zero bit of the codeword. For example, the envelope associated with the codeword 1101000 is 1111000. The span of c is defined as span(c) = R(c) − L(c).

A greedy algorithm proposed by Kschischang and Sorokine [KSH 95] allows us to determine a generator matrix with minimum span, i.e. minimizing the number of branches in the trellis diagram. The pseudo-code is given as follows (a sketch implementing this idea is given after the list):

  • – Step 1: find a pair of lines ci and cj in the generator matrix G such that L(ci) = L(cj) and R(ci) ≤ R(cj), or R(ci) = R(cj) and L(ci) ≥ L(cj);
  • – Step 2: if no pair is found at step 1, go to step 4;
  • – Step 3: apply ci = ci + cj, i.e. replace line ci of the generator matrix by the sum of the two lines;
  • – Step 4: the obtained matrix G is a minimum span generator matrix.
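The following sketch is a simplified variant of this greedy procedure (an assumption: it repeatedly replaces a row by the sum of two rows whenever this strictly reduces its span, which captures the spirit of steps 1–4 but may differ from the exact selection rule of [KSH 95]):

import numpy as np

def span(row):
    ones = np.flatnonzero(row)
    return int(ones[-1] - ones[0])          # R(c) - L(c)

def minimum_span_matrix(G):
    G = G.copy()
    K = G.shape[0]
    improved = True
    while improved:
        improved = False
        for i in range(K):
            for j in range(K):
                if i == j:
                    continue
                candidate = np.mod(G[i] + G[j], 2)
                if candidate.any() and span(candidate) < span(G[i]):
                    G[i] = candidate        # replace line ci by ci + cj (step 3)
                    improved = True
    return G

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])
print(minimum_span_matrix(G))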

Example 3.15.– Let us consider the Hamming code C3 (7, 4) defined by the systematic generator matrix given by [3.8] and its associated envelope matrix:

[3.59] images

By applying the above algorithm (for example by summing lines 2 and 3, then lines 3 and 4, then lines 1 and 3 and finally lines 1 and 2), we obtain the following generator matrix and envelope matrix:

[3.60] images

In order to determine the number of branches of the associated trellis diagram, we count the number of 1s per column of the envelope matrix. Let ni be the number of 1s in the i-th column; the total number of branches is equal to Σi 2^(ni). In this example, the total number of branches in the trellis diagram is equal to 44.

Then, to construct the trellis diagram of this code, we have to consider each line of the generator matrix as a subcode (N, 1) composed of only 2 codewords.

Let us construct the trellis diagram from the obtained generator matrix G. The different construction phases of the trellis diagram are given in Figure 3.11. For the first line, the subcode contains the codewords 0000000 and 1101000. The associated trellis diagram contains 2 paths. We then construct the trellis diagram corresponding to the first two lines of the generator matrix. This trellis diagram is simply the product of the trellis diagrams of the subcodes associated with the first and second lines. This procedure is repeated until the end of the construction of the trellis diagram. We can check that the obtained trellis diagram is composed of 44 branches. We have the same low complexity as when using the construction based on the parity check matrix [MCE 96].

Since the complexity of the decoding algorithms based on trellis diagram is proportional to its number of branches, it is important to build a trellis with the smallest number of branches. In this section, we have presented two approaches to obtain such a trellis diagram.

image

Figure 3.11. Trellis diagram of Hamming code (7,4)

3.4. Decoding of binary linear block codes

3.4.1. Introduction

In Chapter 1, we introduced different transmission channel models. In Figure 3.12, we show a point-to-point transmission chain with an AWGN channel.

image

Figure 3.12. Block diagram of a transmission chain with an additive white Gaussian noise channel

So, we can distinguish two cases:

– If the input of the decoder is binary, we will perform hard input decoding. The combination of the modulator, the additive white Gaussian noise channel and the decision device can be seen as an equivalent BSC channel. The error probability p of this BSC channel is equal to the error probability of an uncoded transmission over an additive white Gaussian noise channel, as given by the following equation:

[3.61] p = Q(√(2REb/N0)) = ½ erfc(√(REb/N0))
where:

[3.62] images

In [3.61], the energy per bit Eb of the non-coded case is replaced by REb since we have to take into account the impact of the rate R of the channel encoder. In that case, the hard input decoder will compute Hamming distances. The relation between the received word r and the codeword c can be written as:

[3.63] r = c + e

where e is the binary error vector (here the addition is performed modulo 2).

– If the input of the decoder is real, we will perform soft input decoding. In practice, the input will be quantized using a few bits (3 to 4 bits are usually sufficient not to degrade the performance of the decoding). Compared to hard input decoding, soft input decoding will compute Euclidean distances. The input output relation is the following:

[3.64] y = x + n

where x is the transmitted word, n is the noise vector and y is the received word.

In this section, we will also study the decoding of block codes on the erasure channel.

3.4.2. Optimal decoding

The aim of optimal decoding is to determine the ML sequence denoted as x̂.

Let x = (x0, x1, x2, …, xN−1) be the sequence of length N transmitted over a stationary memoryless discrete channel with conditional probability density p(yi|xi) and y = (y0, y1, y2, …, yN−1) be the received sequence.

In this section, we will consider two decoding criteria: the maximum a posteriori (MAP) criterion and the ML criterion. The MAP decoder searches, among all the possible sequences x, for the estimated sequence x̂ for which the conditional probability Pr(x|y) is the highest:

[3.65] x̂ = arg max Pr(x|y)

We can write:

[3.66] Pr(x|y) = p(y|x) Pr(x) / p(y)

Let us assume that the probability of occurrence of each codeword is the same: Pr(x) = 1/2^K. Under this hypothesis, and since the denominator p(y) is common to all the sequences, the estimated sequence x̂ is the sequence for which the conditional probability p(y|x) is the highest:

[3.67] x̂ = arg max p(y|x)

A decoder using this criterion is called an ML decoder. So, when the codewords are equiprobable, the MAP and ML decoders are the same.

If the channel is memoryless, the transmitted signals are perturbed independently and the conditional probability p(y|x) is equal to the product of the conditional probabilities p(yi|xi):

[3.68] p(y|x) = Πi=0..N−1 p(yi|xi)

For the BSC channel with transition probability p, the probability Pr(r|c) is the following:

[3.69] Pr(r|c) = p^dH(r,c) (1 − p)^(N − dH(r,c))

where dH(r, c) is the Hamming distance between the binary received sequence r and the transmitted codeword c. Since p is between 0 and 0.5, we have 0 < p/(1 − p) < 1. Consequently, maximizing Pr(r|c) is equivalent to minimizing the Hamming distance between r and c:

[3.70] ĉ = arg min dH(r, c) over the codewords c of C

For the BSC channel, the ML criterion implies the computation of the Hamming distance between the received sequence and all the possible codewords.

Let us consider now the case of soft input decoding of a sequence received from an additive white Gaussian noise channel after matched filtering. We have seen that at time i, the output yi can be written as follows:

[3.71] yi = xi + ni

where xi = ±√Es with Es the average energy per symbol, and ni is a sample of centered white Gaussian noise with variance σ² = N0/2. Consequently, the probability density of yi conditionally to xi is:

[3.72] p(yi|xi) = (1/√(πN0)) exp(−(yi − xi)²/N0)

Since the logarithm function is increasing, instead of using the relation [3.67] to determine x̂, we can compute:

[3.73] images
[3.74] images

We obtain a first version of the ML decoder:

[3.75] x̂ = arg min Σi=0..N−1 (yi − xi)²

In this case, the ML criterion implies the computation of the Euclidean distance between the received sequence and all the possible sequences.

A second version is obtained by replacing the Euclidean distance by the Manhattan or L1 distance as follows:

[3.76] x̂ = arg min Σi=0..N−1 |yi − xi|

This suboptimal version achieves a rather good performance in practice.

When the modulation is bipodal (xi = ±√Es), a third version can be obtained:

[3.77] x̂ = arg max Σi=0..N−1 yi xi

Indeed, since Σi yi² and Σi xi² = NEs are common to all the sequences, we can simplify the calculations without any performance degradation.
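For short codes, the correlation form [3.77] of the ML decoder can be applied by exhaustive search over the 2^K codewords. The sketch below (illustrative only) uses the received sequence of the example of section 3.4.4 and recovers the codeword [1110010]:

from itertools import product
import numpy as np

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])

def ml_soft_decode(y, G):
    best, best_metric = None, -np.inf
    for u in product([0, 1], repeat=G.shape[0]):
        c = np.mod(np.array(u) @ G, 2)
        x = 2 * c - 1                       # bipodal mapping 0 -> -1, 1 -> +1
        metric = float(np.dot(y, x))        # correlation sum_i y_i x_i
        if metric > best_metric:
            best, best_metric = c, metric
    return best

y = np.array([-0.2, +0.9, +1.1, -1.3, +0.4, +2.5, -0.7])
print(ml_soft_decode(y, G))                 # -> [1 1 1 0 0 1 0]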

3.4.3. Hard input decoding of binary linear block codes

The received word r is the sum modulo 2 of the transmitted codeword c and an error vector e:

[3.78] r = c + e

A straightforward approach for the decoding is to compare the received word r with all the 2^K codewords of the code C. To minimize the word error probability, the decoder will select as estimated codeword ĉ the codeword at the smallest Hamming distance from the received word r. This approach is however complex to implement and has a limited practical interest.

When multiplying the received word r by the transposed parity check matrix H^T, we obtain the so-called error syndrome s of dimension 1 × (N − K):

[3.79] s = rH^T
[3.80] s = (c + e)H^T = cH^T + eH^T
[3.81] s = eH^T

If there is no transmission error, the error syndrome s is the null vector.

We will now present three hard decoding methods: the standard array, the syndrome decoding and the Viterbi algorithm.

3.4.3.1. Standard array method

Since the error syndrome can take 2^(N−K) different values, each error syndrome s is associated with 2^N/2^(N−K) = 2^K different error vectors. The standard array method performs a partition of the N-dimensional vector space into 2^K disjoint classes. Each class is associated with a codeword.

The standard array as shown in Table 3.11 is built as follows:

  • – the first line is composed of the 2^K codewords starting with the null codeword;
  • – under the null codeword c0, we list the set of error vectors starting with the patterns of weight 1, then of weight 2 (if N + 1 < 2^(N−K)), and so on, until the 2^(N−K) elements of the first column are filled;
  • – under each codeword of the first line, we compute the sum of the codeword and the associated error vector.

Table 3.11. Standard array

image

Each row is a subset of words, or coset, corresponding to an error syndrome and a common representative called the coset leader. The coset leader is the ML error vector for the words of this subset.

The decoding consists of searching the column corresponding to the received word. The decoded codeword will be the first element of the column, i.e. the codeword associated with this column. This method is too complex and is only interesting from a pedagogical point of view.

3.4.3.2. Syndrome decoding

Syndrome decoding is a direct extension of the standard array method. We have seen that the error syndrome can take 2^(N−K) different values. We first compute the syndrome using equation [3.79]. Then, we associate with this syndrome the corresponding error vector ê. This method only requires a look-up table that associates the error syndromes with the error vectors.

Table 3.12. Syndrome table

image

The estimated codeword ĉ is then:

[3.82] images

This method is less complex than the standard array method, but it is limited to short block codes (codes with an error correction capability of a few errors). Indeed, we have to perform the product rH^T and memorize 2^(N−K) error vectors. For example, the decoding of the Golay code (23, 12) using the syndrome decoding method requires a memory of 2048 words of 23 bits.

Example 3.16.– Let us consider the code C4 (5, 2) with the following generator matrix:

[3.83] images

and the associated parity check matrix H:

[3.84] images

The standard array is given in Table 3.13. We can notice that there are different ways to fill the last two lines of the standard array.

Table 3.13. Standard array for the code (5,2)

00000 01011 10101 11110 000
00001 01010 10100 11111 001
00010 01001 10111 11100 010
00100 01111 10001 11010 100
01000 00011 11101 10110 011
10000 11011 00101 01110 101
11000 10011 01101 00110 110
10010 11001 00111 01100 111

Let us consider the information word u = [11]. The associated codeword is c = [11110]. We can check that the syndrome associated with this codeword is the null vector: s = [000]. If we suppose that an error occurs during the transmission of the fourth bit of the transmitted word: e = [00010], the received vector is then r = [11100]. Using the standard array method, we obtain directly ĉ = [11110]. The decoder has been able to recover the error.

We will now use the syndrome decoding. We will first calculate the error syndrome associated with the most probable error vectors (1 error, then 2 errors, etc.). The syndrome decoding table for this code is given in Table 3.14.

Table 3.14. Syndrome decoding table for the (5, 2) code

ê s
00000 000
00001 001
00010 010
00100 100
01000 011
10000 101
11000 110
10010 111

The syndrome associated with the received word r is s = rH^T = [010]. Using the syndrome decoding table, we find ê = [00010]. After adding the estimated error vector ê to the received word r, we have ĉ = [11110] and the error has been corrected.

Since the correction capability of this code is equal to e = 1, this code can correct any error pattern with one error. The last two lines do not guarantee a correct decoding since more than one pattern with two errors is associated with these syndromes. Consequently, it is generally better not to use those syndromes for the decoding.

For the Hamming codes (2^J − 1, 2^J − 1 − J), the syndrome decoding table is composed of the 2^J − 1 non-null error syndromes associated with the 2^J − 1 error vectors with one error.
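A sketch of syndrome decoding for the Hamming code (7, 4) (illustrative only; it uses the parity check matrix [3.15] and the weight-0 and weight-1 error patterns to build the look-up table):

from itertools import product
import numpy as np

H = np.array([[1, 0, 1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])
N = H.shape[1]

# build the syndrome table from the all-zero pattern and the N single-error patterns
table = {}
for pos in [None] + list(range(N)):
    e = np.zeros(N, dtype=int)
    if pos is not None:
        e[pos] = 1
    s = tuple(np.mod(e @ H.T, 2))
    table[s] = e

def decode(r):
    s = tuple(np.mod(r @ H.T, 2))
    return np.mod(r + table[s], 2)          # corrected codeword c_hat = r + e_hat

r = np.array([1, 0, 1, 0, 0, 1, 0])         # codeword 1110010 with an error on its second bit
print(decode(r))                            # -> [1 1 1 0 0 1 0]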

3.4.3.3. Viterbi algorithm

We have seen previously that any linear block code can be graphically described using a trellis diagram. The search of the ML codeword ĉ is equivalent to the search of the ML path in the trellis diagram. The estimated codeword ĉ can then be immediately deduced.

The general principle of the Viterbi algorithm consists, at each section of the trellis diagram, of eliminating the paths (and the associated codewords) that cannot be the highest likelihood path. At each node of the trellis diagram, we will keep only one path. At each section, the Viterbi algorithm performs the following operations:

  • – computation of the branch metric (in the case of hard input decoding, Hamming distance dH(ri, ci) between the received bit ri and the bit associated with the considered branch ci);
  • – at each state, computation of the cumulated metric for each branch arriving to this state (summation of the cumulated metric of the starting state and the branch metric);
  • – for each state, selection of the survival path corresponding to the path arriving to this state with the smallest cumulated metric. The other paths, called concurrent paths, are eliminated.

Finally, the ML path and the associated codeword are obtained by performing trace-back from the right to the left by starting with the node with the smallest cumulated metric.

We will consider an example of a point-to-point transmission over an AWGN channel. We will use the Hamming code C3 (7, 4) defined by the systematic generator matrix given in [3.8] and its associated trellis. Let u = [1110] be the information word and c = [1110010] be the associated codeword. After thresholding, we receive the word r = [1010010]. An error has occurred during the transmission of the second bit of the codeword.

image

Figure 3.13. Branch metric calculation. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

At each step of the Viterbi algorithm, we compute the branch metrics and then the cumulated metrics. At time 0, the cumulated metric is initialized to 0. After calculating the cumulated metrics on the trellis diagram, we obtain the estimated codeword: ĉ = [1110010] which corresponds to the bold path on the trellis diagram given in Figure 3.17. We have been able to recover the initial codeword since only one error occurs in the transmission.

image

Figure 3.14. Cumulated metric calculation after the reception of the 1st bit of the word r. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

3.4.4. Soft input decoding of binary linear block codes

The Viterbi algorithm is mainly used for soft input decoding of binary linear block codes. It allows us to determine the ML sequence ĉ while avoiding the computation of the cumulated metrics associated with each possible transmitted sequence. Compared to hard input decoding, the branch metrics are the squared Euclidean distances (yi − xi)² between the received sample yi and the symbol xi associated with the considered branch.

image

Figure 3.15. Cumulated metric calculation after the reception of the 4th bit of the word r. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

We consider again a point-to-point transmission over an AWGN channel using the Hamming code C3 (7, 4). Let u = [1110] be the information word, c = uG = [1110010] the associated codeword and x = [+1 +1 +1 −1 −1 +1 −1] the transmitted sequence. The received sequence after filtering and sampling is y = [−0.2 +0.9 +1.1 −1.3 +0.4 +2.5 −0.7]. The branch metrics associated with this received sequence are given in Table 3.15.

After calculating the cumulated metrics on the trellis diagram, we obtain the estimated transmitted sequence x̂ = [+1 +1 +1 −1 −1 +1 −1] and the associated codeword ĉ = [1110010], which corresponds to the bold path on the trellis diagram given in Figure 3.21. We have been able to recover the initial codeword. We can also check that x̂ = x.

image

Figure 3.16. Cumulated metric calculation. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

Table 3.15. Branch metric table

image
image

Figure 3.17. Determination of the estimated sequence. Fora color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

3.4.5. Erasure decoding of binary linear block codes

We have seen in section 1.6.3 that the erasure channel is often used to model the packet loss in high-level communication protocols. In an erasure channel, a fraction p of the transmitted symbols is not received at the receiver. These symbols are erased and denoted by X.

Different approaches are possible to recover the erased symbols. A first method is to test all the possible words obtained by replacing the erased symbols by 0 and 1. If only one of these words is a codeword, then we are sure that it is the transmitted codeword. If we find no codeword or more than one codeword among the possible words, then the decoding is a failure.

image

Figure 3.18. Branch metric calculation. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

Example 3.17.– We consider again the Hamming code C3 (7, 4). Let us assume that the transmitted codeword is c = [1111111] and that the received word is r = [1XX1111]. The set of all possible words is {1001111; 1011111; 1101111; 1111111} and their respective error syndromes are {100; 011; 111; 000}. Since only one word of the set is a codeword, we can recover the transmitted message. The decoded codeword is ĉ = [1111111].

image

Figure 3.19. Cumulated metric calculation. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

A second method consists of solving the system composed of the N − K parity check equations by replacing the binary variables with the associated received bits. As previously, if there exists more than one solution, the decoding is a failure.

image

Figure 3.20. Cumulated metric calculation. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

Example 3.18.– From the three parity check equations of the code [3.57], we have:

[3.85]  1 + c2 + 1 + 1 = 0
        1 + c1 + c2 + 1 = 0
        c1 + c2 + 1 + 1 = 0
image

Figure 3.21. Determination of the estimated sequence. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

After solving this system, we have a unique solution

[3.86] c1 = 1, c2 = 1

and the decoded word is ĉ = [1111111] as previously.

A third suboptimal method uses the Tanner graph of the code to iteratively recover the erasures. At each iteration, we build a vector e = [e0 e1 … eN−1] indicating the positions of the erasures:

[3.87] ei = 1 if ci is erased, ei = 0 otherwise

From the vector e, we can compute the number of erased bits ij involved in each parity check equation Tj. At each iteration, the decoding algorithm searches for a parity check equation with only one erasure. If such an equation exists, then the erased symbol can be simply corrected. This procedure is repeated after updating the number of erased bits ij for each equation Tj. The algorithm ends when there are no more erasures, or in case of failure, i.e. if no parity check equation contains exactly one erasure while erasures remain. This algorithm is described in Figure 3.22 considering the previous example.

image

Figure 3.22. Iterative decoding on the Tanner graph
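A sketch of this iterative procedure (illustrative only, using the parity check matrix [3.15] of the Hamming code (7, 4)):

import numpy as np

H = np.array([[1, 0, 1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

def erasure_decode(r):
    """r is a list of 0, 1 or None (None = erased symbol X)."""
    r = list(r)
    progress = True
    while progress and None in r:
        progress = False
        for row in H:
            erased = [i for i in np.flatnonzero(row) if r[i] is None]
            if len(erased) == 1:                       # single unknown in this check
                known = sum(r[i] for i in np.flatnonzero(row) if r[i] is not None)
                r[erased[0]] = known % 2               # value forced by the parity check
                progress = True
    return r                                           # may still contain None on failure

print(erasure_decode([1, None, None, 1, 1, 1, 1]))     # -> [1, 1, 1, 1, 1, 1, 1]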

We can define the erasure enumeration function T(x) as follows:

[3.88] T(x) = Σi Ti x^i

where Ti is the number of non-recoverable erasure patterns of weight i (i.e. containing i erasures). In an erasure channel, the probability of a given pattern of i erasures is equal to p^i (1 − p)^(N−i), where p is the erasure probability of the channel.

The word error probability can be easily obtained from T(x):

[3.89] Pe = Σi Ti p^i (1 − p)^(N−i)

For the Hamming code C3 (7, 4), the erasure enumeration function calculated using the first and second methods is equal to:

[3.90] T(x) = 7x^3 + 35x^4 + 21x^5 + 7x^6 + x^7

Consequently, the decoder can correct all the patterns composed of one or two erasures. Seven of the C(7, 3) = 35 patterns composed of three erasures are not corrected, as well as all the patterns with 4, 5, …, 7 erasures.

When using the suboptimal iterative method, the erasure enumeration function is slightly different:

[3.91] images

While the decoder is able to correct all the words containing one or two erasures, 10 of the 35 patterns composed of three erasures are not corrected. Schwartz and Vardy [SCH 06] have shown that by adding some complementary parity check equations in the Tanner graph, it is possible to correct more patterns and consequently to obtain results close to the optimal solution. These equations are linear combinations of the parity check equations of the matrix H. Another solution is to use the Viterbi algorithm.

3.5. Performances of linear block codes

3.5.1. Performances of linear block codes with hard input decoding

For the binary symmetric channel with error probability p, we have shown that the probability Pr(r|c) is given by:

[3.92] Pr(r|c) = p^dH(r,c) (1 − p)^(N − dH(r,c))

where dH(r, c) is the Hamming distance between the received sequence r and the transmitted codeword c. For a binary symmetric channel, the probability that a received sequence of length N bits contains i errors is the following:

[3.93] PN(i) = C(N, i) p^i (1 − p)^(N−i)

where p is the error probability of the binary symmetric channel.

Consequently, for a linear block code with an error correction capability of e errors, we can derive an upper bound on the word error rate after decoding by summing the probabilities PN(i) corresponding to the cases where the number of errors at the output of the channel is higher than the error correction capability of the code. We have:

[3.94] Pe ≤ Σi=e+1..N C(N, i) p^i (1 − p)^(N−i)

The inequality can be replaced by an equality only when a perfect code is used.
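The bound [3.94] is straightforward to evaluate. A minimal sketch:

from math import comb

def wer_hard_bound(N, e, p):
    # word error rate bound after hard decoding over a BSC with error probability p
    return sum(comb(N, i) * p**i * (1 - p)**(N - i) for i in range(e + 1, N + 1))

print(wer_hard_bound(7, 1, 0.01))    # Hamming (7, 4), p = 1e-2
print(wer_hard_bound(23, 3, 0.01))   # Golay (23, 12), equality since the code is perfect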

3.5.2. Union bound

The union bound is a simple technique to obtain an upper bound on the error probability.

We consider an ML decoder searching for the most probable codeword, i.e. the closest codeword according to the Euclidean distance:

[3.95] images

The probability of decoding a wrong codeword knowing that the codeword xi was transmitted can be upper bounded as follows:

[3.96] Pr(y ∉ Λi|xi) ≤ Σj≠i Pr(y ∈ Λij|xi)

where Λi is the decision region associated with the codeword xi and Λij is the decision region associated with the codeword xj when ignoring all the codewords except xi and xj.

The probability Pr(y ∈ Λij|xi) that y is closer to xj than to xi is called the pairwise error probability (PEP) and is denoted Pr(xi → xj):

[3.97] Pr(xi → xj) = Pr(y ∈ Λij|xi)

Knowing that xi has been transmitted, the probability that y is not in the decision region Λi is less than or equal to the sum of the pairwise error probabilities Pr(xi → xj) for all j ≠ i.

The word error rate (WER) is given by:

[3.98] Pe = Σi Pr(xi) Pr(y ∉ Λi|xi)

Finally, the word error rate can be upper bounded by:

[3.99] Pe ≤ (1/2^K) Σi Σj≠i Pr(xi → xj)

where Pr(xi → xj) is the pairwise error probability, i.e. the probability of decoding xj knowing that xi has been transmitted, ignoring the other codewords.

Generally speaking, the union bound given in [3.99] is a practical tool to evaluate the performance of coded transmission systems, but it is only accurate at high signal-to-noise ratio.

Example 3.19.– Let us consider the set of four codewords x1,x2,x3 and x4 and their associated decision regions Λ1, Λ2, Λ3 and Λ4 as shown in Figure 3.23.

image

Figure 3.23. Decision regions associated with the codewords

We can upper bound the word error probability assuming that the codeword x1 has been transmitted as follows:

Pr(error|x1) ≤ Pr(x1 → x2) + Pr(x1 → x3) + Pr(x1 → x4)

This example is illustrated in Figure 3.24.

3.5.3. Performances of linear block codes with soft input decoding

We now consider the case of a bipodal modulation transmitted over an AWGN channel. Since we have Es = REb, the amplitude of the coded samples can only be +√(REb) or −√(REb) (where Eb is the energy per bit and R = K/N is the code rate). At the output of the matched filter, we have:

yi = xi + ni
image

Figure 3.24. Example

For an AWGN channel, the pairwise error probability Pr(xi → xj) is given by:

[3.100] Pr(xi → xj) = ½ erfc(d(xi, xj)/(2√N0))

where d(xi, xj) is the Euclidean distance between the two vectors xi and xj. The proof of the relation [3.100] is given in the second volume of this book.

Let ci and cj be the two codewords associated with the vectors xi and xj. If the Hamming distance between ci and cj is equal to dH(ci, cj) = d, the Euclidean distance between xi and xj will be:

[3.101] d(xi, xj) = √(4dREb) = 2√(dREb)

Then we have:

[3.102] Pr(xi → xj) = ½ erfc(√(dREb/N0))
Example 3.20.– Parity check code (3, 2)

The Hamming distance between two distinct codewords is 2 and the code rate is R = 2/3. Consequently, we have:

[3.103] Pr(xi → xj) = ½ erfc(√(4Eb/3N0))

We will now use the union bound to obtain an upper bound on the word error rate and bit error rate of the ML decoding of a linear block code (N, K) over an AWGN channel. We have shown that the union bound is obtained by considering the codewords two by two. Assuming that the codewords are equiprobable, the word error rate can be bounded as follows:

[3.104] Pe ≤ Σd=dmin..N Ad ½ erfc(√(dREb/N0))
where Ad is the number of codewords with Hamming weight d and dmin is the minimum distance of the block code.
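The union bound [3.104] only requires the weight distribution {Ad} of the code. A minimal sketch (illustrative only):

from math import erfc, sqrt

def wer_union_bound(A, R, ebn0_db):
    """A maps Hamming weights d to A_d; R is the code rate; ebn0_db is Eb/N0 in dB."""
    ebn0 = 10 ** (ebn0_db / 10)
    return sum(Ad * 0.5 * erfc(sqrt(d * R * ebn0)) for d, Ad in A.items())

A_hamming74 = {3: 7, 4: 7, 7: 1}                  # weight distribution of the (7, 4) code
print(wer_union_bound(A_hamming74, 4 / 7, 7.0))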

Using the inequality:

[3.105] images

it is possible to obtain a first union-Bhattacharyya bound [BEN 99] on the bit error rate:

with:

[3.108] images

Bd can also be calculated using the IOWEF function:

[3.109] images

In equation [3.106], we can distinguish the contribution of the information words with weight w, while in equation [3.107] we have gathered the contributions of the codewords with weight d by introducing the coefficient Bd. The coefficient Bd is the average fraction of non-zero information bits associated with the codewords of weight d.

A tighter bound on the bit error rate [BEN 99] can be obtained from [3.102] and using the following inequality:

images

Then we have:

images

Finally, this upper bound on the bit error rate is given as follows:

[3.110] images

By keeping only the first terms, we obtain the following upper bound:

The minimum distance dmin is the main parameter at high signal-to-noise ratio Eb/N0. However, the term Bdmin also has a significant impact on the BER.

This bound can be derived using the union bound over the set of information words and taking into account the contribution of the weight wi of each information word:

[3.112] images

By gathering the information words generating codewords with the same weight, we find again the upper bound [3.111].

Example 3.21.– For the parity check code (3, 2), from the WEF A(W, Z) or B(W,D) we obtain:

[3.113] images

and

[3.114] images

Example 3.22.– For the Hamming code (7, 4), using the WEF A(W, Z) or B(W,D) we obtain:

[3.115] images

3.5.4. Coding gain

In Chapter 1, we saw that for an AWGN channel, it is theoretically possible to perform an error-free transmission at Eb/N0 = 0 dB using a code with rate 1/2. The performance difference between a transmission chain using uncoded bipodal modulation and this theoretical limit is 9.6 dB (considering a word error rate of 10^−5). By adding an error correcting code, we can get close to this theoretical limit.

Definition 3.14.– For a given (word, bit or symbol) error rate, we define the coding gain of an error correcting code as the Eb/N0 difference between the uncoded and coded transmission chain.

In Figure 3.25, we have shown the theoretical performance WER = f(Eb/N0) (obtained using equation [3.104]) of three different transmission chains using soft input decoding and the performance of the uncoded transmission chain (bipodal modulation). For a word error rate of 10–5, the coding gain of the parity check code (3, 2), the Hamming code (7, 4) and the Golay code (23, 12) are 1.3 dB, 2.3 dB and 3.8 dB, respectively.

image

Figure 3.25. Coding gain of different codes

In the previous section, we have obtained an upper bound on the word error rate:

[3.116] Pe ≤ Σd=dmin…N Ad ½ erfc(√(dREb/N0))

A reasonable approximation at a high signal-to-noise ratio consists of keeping only the contribution of the codewords with a weight equal to the minimum distance. Using this approximation, we have:

[3.117] Pe ≈ Admin ½ erfc(√(dminREb/N0))

If we do not take into account the term Admin the asymptotic coding gain can be given as follows:

[3.118] GC ≈ 10 log10(Rdmin)   (in dB)

If we take into account the number of codewords at the minimum distance Admin, we can approximate the coding gain as follows [FOR 98]:

[3.119] GC ≈ 10 log10(Rdmin) − 0.2 log2(Admin)   (in dB)

The term 0.2 log2 (Admin) corresponds to the impact on the coding gain of the number of codewords at the minimum distance. The 0.2 factor is related to the slope of the erfc(.) function in the region 10−4 − 10−5. This equation is only an approximation.

For example, for the Golay code (23, 12), we have R = 12/23 ≈ 0.52, dmin = 7 and Admin = 253, corresponding to a coding gain of about 4 dB.
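As a check, the approximations [3.118] and [3.119] can be evaluated with a few lines of Python (function names chosen for illustration); for the Golay (23, 12) code they give about 5.6 dB and 4 dB respectively.

import math

def asymptotic_gain(R, dmin):
    # Asymptotic coding gain [3.118]
    return 10 * math.log10(R * dmin)

def approximate_gain(R, dmin, Admin):
    # Approximation [3.119] taking the multiplicity Admin into account
    return 10 * math.log10(R * dmin) - 0.2 * math.log2(Admin)

# Golay (23, 12): R = 12/23, dmin = 7, Admin = 253
print(round(asymptotic_gain(12 / 23, 7), 2))        # about 5.6 dB
print(round(approximate_gain(12 / 23, 7, 253), 2))  # about 4 dB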

Instead of the WER, it is also possible to use the word error rate per information bit (WERB) to define the coding gain:

[3.120] WERB = WER/K

In that case, we have the following relation:

[3.121] GC,WERB ≈ 10 log10(Rdmin) − 0.2 log2(Admin/K)   (in dB)

For the previous example, the coding gain is now 4.75 dB.

In Table 3.16, we have shown the coding gain for different error correcting codes.

Table 3.16. Coding gain for different linear block codes

code (N, K)           R      dmin   Admin   GC asympt (dB)   GC WERB (dB)
Hamming (7, 4)        0.57   3      7       2.3              2.2
Reed–Muller (8, 4)    0.5    4      14      3                2.6
Hamming (15, 11)      0.733  3      15      3.4              3.3
Reed–Muller (16, 5)   0.31   8      30      4                3.5
Golay (23, 12)        0.52   7      253     5.6              4.7
Golay (24, 12)        0.5    8      759     6                4.8

Another approach is to use the BER but in that case, we have to compute Bdmin.

3.5.5. Performance comparison of hard and soft input decoders

In Figure 3.26, we show the performance curves WER = f(Eb/N0) of a transmission chain using a Hamming code (7, 4) and a Golay code (23, 12). The solid curves have been obtained with a hard input decoder (using equation [3.94]) and the dashed curves have been obtained with a soft input decoder (using equation [3.104]). For the Golay code, we can observe that soft input decoding brings a gain of 2 dB compared to hard input decoding.

3.6. Cyclic codes

3.6.1. Definition and properties

Note.– In this section, the most significant bit (MSB) of the vectors is on the right.

image

Figure 3.26. Performance comparison of hard and soft input decoding

The cyclic codes are a subset of the linear block codes. While a general linear block code requires K independent codewords (the rows of its generator matrix) to determine the set of 2K codewords, a cyclic code is entirely determined by a single codeword.

The most important linear block codes, such as the Hamming, Golay, BCH and RS codes, belong to this class. Due to their properties, the complexity of the coding and decoding tasks is reduced. In this section, we will present cyclic codes defined in the finite field image, but they can be extended to image.

The main property of the cyclic codes is the following: if c = [c0 c1cN−2 cN−1] is a codeword, then the word obtained by performing a right cyclic shift of one position c’ = [cN−1 c0cN−3 cN−2] is also a codeword.

In order to describe a cyclic code (N, K), it is convenient to associate a polynomial c(p) of degree lower or equal to N − 1 to each codeword c =[c0 c1 … cN−2 cN−1]:

c(p) = c0 + c1p + … + cN−2pN−2 + cN−1pN−1

We will show that the properties of the cyclic codes can be obtained easily using the algebra of the polynomials modulo pN − 1.

Let us compute the polynomial pc(p):

pc(p) = c0p + c1p2 + … + cN−2pN−1 + cN−1pN

No codeword is associated with this polynomial since its degree is higher than N − 1.

By adding cN−1 and subtracting cN−1, this expression can be rewritten as follows:

pc(p) = cN−1 + c0p + c1p2 + … + cN−2pN−1 + cN−1(pN − 1)

Since the polynomial c′(p) associated with the codeword c’ is equal to:

c′(p) = cN−1 + c0p + c1p2 + … + cN−3pN−2 + cN−2pN−1

we also have:

pc(p) = c′(p) + cN−1(pN − 1)

Let us compute pc(p) modulo pN − 1. We obtain:

pc(p) mod (pN − 1) = c′(p)

So, a right cyclic shift of one position is equivalent to the multiplication by p modulo pN − 1.

More generally, a right cyclic shift of i positions corresponds to a multiplication by pi modulo pN − 1.
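This equivalence can be checked with a few lines of Python, representing a GF(2) polynomial by its coefficient list [c0, c1, …, cN−1]; the helper names are chosen for illustration only.

def cyclic_shift(c, i):
    # Right cyclic shift of i positions
    N = len(c)
    return [c[(k - i) % N] for k in range(N)]

def multiply_by_p_i(c, i):
    # Multiplication by p^i modulo p^N - 1: p^(k+i) is reduced to p^((k+i) mod N)
    N = len(c)
    result = [0] * N
    for k, ck in enumerate(c):
        result[(k + i) % N] ^= ck
    return result

c = [1, 1, 0, 1, 0, 0, 0]   # codeword of the Hamming (7, 4) code built with g(p) = 1 + p + p^3
print(cyclic_shift(c, 1) == multiply_by_p_i(c, 1))   # True
print(cyclic_shift(c, 3) == multiply_by_p_i(c, 3))   # True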

3.6.2. Properties of the cyclic codes

Property.– If c(p) is a polynomial of degree N − 1 associated with a codeword of a cyclic code (N, K), then:

[3.122] pic(p) modulo (pN − 1)

is also a polynomial associated with a codeword.

Combined with the linearity of the code, this relation means that from a single polynomial c(p) it is possible to find the set of the 2K codewords of the cyclic code.

Property.– It is possible to build a cyclic code (N, K) from the generator polynomial denoted as g(p) of degree N − K:

g(p) = g0 + g1p + … + gN−K−1pN−K−1 + pN−K with g0 = 1

g(p) is the non-zero polynomial of minimum degree among the 2K codeword polynomials of the cyclic code.

Property.– The polynomial g(p) of degree N − K must be a factor of pN − 1.

Property.– The set of the 2K polynomials associated with the 2K codewords of a cyclic code (N, K) can be obtained by performing the multiplication of g(p) by the 2K polynomials of degree lower or equal to K − 1.

If we define the polynomial u(p) associated with the information word u = [u0 u1 … uK−2 uK−1] as:

u(p) = u0 + u1p + … + uK−2pK−2 + uK−1pK−1

We have the following relation between the polynomial u(p) of degree K − 1 and the polynomial c(p) of degree N − 1:

[3.123] c(p) = u(p)g(p)

Property.– Each polynomial factor of pN − 1 can generate a cyclic code.

In Table 3.3, we give the decomposition into a product of irreducible polynomials of the polynomials pN − 1 with N = 2m − 1 for m ≤ 5. The irreducible polynomials of this table are the main polynomials used for the construction of cyclic codes. For example, p7 − 1 can be decomposed into the product of three irreducible polynomials. We can build three cyclic codes of length N = 7 using these irreducible polynomials:

  • – a code (7, 6) with g(p) = 1 + p;
  • – a code (7, 4) with g(p) = 1 + p + p3;
  • – a code (7, 4) with g(p) = 1 + p2 + p3.

The two obtained cyclic codes (7, 4) are the Hamming codes considered previously.

Example 3.23.– Let us consider the Hamming code built from the generator polynomial g(p) = 1 + p + p3. The 16 codewords are obtained by multiplying the polynomials associated with the information word with g(p). The list of these 16 codewords is given in Table 3.17.

Table 3.17. List of codewords for the Hamming code (7, 4) built with g(p) = 1 + p + p3

u0 u1 u2 u3 c0 c1 c2 c3 c4 c5 c6
0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 1 0 1 0 0 0
0 1 0 0 0 1 1 0 1 0 0
1 1 0 0 1 0 1 1 1 0 0
0 0 1 0 0 0 1 1 0 1 0
1 0 1 0 1 1 1 0 0 1 0
0 1 1 0 0 1 0 1 1 1 0
1 1 1 0 1 0 0 0 1 1 0
0 0 0 1 0 0 0 1 1 0 1
1 0 0 1 1 1 0 0 1 0 1
0 1 0 1 0 1 1 1 0 0 1
1 1 0 1 1 0 1 0 0 0 1
0 0 1 1 0 0 1 0 1 1 1
1 0 1 1 1 1 1 1 1 1 1
0 1 1 1 0 1 0 0 0 1 1
1 1 1 1 1 0 0 1 0 1 1

Let u = [1 1 1 0] (notation MSB on the right) and its associated polynomial u(p) = 1 + p + p2. The polynomial c(p) is given by:

c(p) = u(p)g(p) = (1 + p + p2)(1 + p + p3) = 1 + p4 + p5

and the associated codeword c = [1000110].
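The encoding relation [3.123] can be reproduced with a short Python sketch (polynomials as coefficient lists, lowest degree first; helper names are ours); it returns the codeword of Example 3.23 and the rows of Table 3.17.

def gf2_poly_mul(a, b):
    # Product of two GF(2) polynomials given as coefficient lists (lowest degree first)
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[i + j] ^= ai & bj
    return result

def encode(u, g, N):
    # Non-systematic cyclic encoding c(p) = u(p) g(p), padded to length N
    c = gf2_poly_mul(u, g)
    return c + [0] * (N - len(c))

g = [1, 1, 0, 1]                      # g(p) = 1 + p + p^3
print(encode([1, 1, 1, 0], g, 7))     # [1, 0, 0, 0, 1, 1, 0], as in Example 3.23
print(encode([1, 0, 0, 0], g, 7))     # [1, 1, 0, 1, 0, 0, 0], second row of Table 3.17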

The construction of the generator matrix from the generator polynomial is straightforward. We have seen previously that the generator matrix of a linear block code can be obtained from K independent codewords. In the case of the cyclic codes, we can choose the codewords associated with the following polynomials:

g(p), pg(p), p2g(p), …, pK−1g(p)

For example, for the Hamming code (7, 4) with g(p) = 1 + p + p3, the generator matrix is the following:

[3.124]
     | 1 1 0 1 0 0 0 |
G =  | 0 1 1 0 1 0 0 |
     | 0 0 1 1 0 1 0 |
     | 0 0 0 1 1 0 1 |

This generator matrix is not systematic. From a practical point of view, it is advisable to have a systematic code. Let the information word be u = [u0 u1 … uK−2 uK−1] and u(p) = u0+u1p+…+uK−2pK−2+uK−1pK−1 its associated polynomial. We multiply u(p) by pN−K:

pN−Ku(p) = u0pN−K + u1pN−K+1 + … + uK−2pN−2 + uK−1pN−1

The polynomial c(p) associated with a codeword with a systematic form c = [c0 c1 … cN−1] = [c0 c1 … cN−K−1 u0 u1 … uK−2 uK−1] can be written as:

[3.125] images
[3.126] images

Let us divide pN−K u(p) by g(p). We obtain:

pN−Ku(p) = q(p)g(p) + t(p)

where q(p) is the quotient and t(p) is the remainder of the division of degree less than N − K.

To summarize, to obtain a systematic codeword c(p) = q(p)g(p) we have to:

  • – multiply the polynomial u(p) by pN − K;
  • – perform the division of pN − Ku(p) by g(p) to obtain the remainder t(p);
  • – add the remainder t(p) to pN − Ku(p):
c(p) = pN−Ku(p) + t(p) = q(p)g(p)

Example 3.24.– Let us consider the information word u = [1 1 1 0]. The polynomial associated with u is u(p) = 1 + 1p + 1p2 + 0p3 = 1 + p + p2. We have:

p3u(p) = p3 + p4 + p5

The remainder of the division of p3 + p4 + p5 by g(p) = 1 + p + p3 is p. So the polynomial c(p) is given by:

c(p) = p3u(p) + t(p) = p + p3 + p4 + p5

The associated codeword is the following:

c = [0 1 0 1 1 1 0]

This notation is slightly different from the systematic form that we have presented in section 3.3 but we find the same codeword as the one obtained using the systematic generator matrix [3.8].
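A minimal Python sketch of this systematic encoding procedure is given below (polynomials stored as coefficient lists, lowest degree first; helper names are chosen for illustration); it reproduces the codeword of Example 3.24.

def gf2_poly_mod(a, g):
    # Remainder of the division of a(p) by g(p) over GF(2)
    a = a[:]
    dg = len(g) - 1
    for i in range(len(a) - 1, dg - 1, -1):
        if a[i]:
            for j in range(len(g)):
                a[i - dg + j] ^= g[j]
    return a[:dg]

def systematic_encode(u, g, N):
    # c(p) = p^(N-K) u(p) + t(p), with t(p) the remainder of p^(N-K) u(p) divided by g(p)
    K = len(u)
    shifted = [0] * (N - K) + u
    t = gf2_poly_mod(shifted, g)
    return t + u                      # [t0 ... t(N-K-1) u0 ... u(K-1)]

g = [1, 1, 0, 1]                      # g(p) = 1 + p + p^3
print(systematic_encode([1, 1, 1, 0], g, 7))   # [0, 1, 0, 1, 1, 1, 0], as in Example 3.24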

3.6.3. Error detection using CRC and automatic repeat request

In order to obtain a highly reliable transmission, it is possible to implement an automatic repeat request (ARQ) mechanism using acknowledgment (ACK)/negative acknowledgment (NACK) messages, provided that a return channel is available.

To detect the presence of errors in the received word, most protocols use a cyclic redundancy check (CRC). This technique is based on a cyclic code (N, K).

Let g(p) be the generator polynomial of a CRC. From the information polynomial u(p), the polynomial associated with the codeword c(p) is built in a systematic form:

c(p) = pN−Ku(p) + t(p) where t(p) is the remainder of the division of pN−Ku(p) by g(p)

At the reception, we divide the received word by g(p). If the remainder of the division is zero, then the received word is a codeword. We will then assume that it is the codeword that has been transmitted. If the remainder of the division is non-zero, then the received word is not a codeword and there are one or more errors in the received word. The receiver sends back a message ACK in the event of success. In case of failure, the receiver sends back a message NACK and the codeword is retransmitted.
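The following Python sketch illustrates this mechanism with the CRC5 generator polynomial g(p) = p5 + p2 + 1 of Table 3.18; the frame content and the helper names are chosen for illustration only.

def gf2_poly_mod(a, g):
    # Remainder of the division of a(p) by g(p) over GF(2) (coefficient lists, lowest degree first)
    a = a[:]
    dg = len(g) - 1
    for i in range(len(a) - 1, dg - 1, -1):
        if a[i]:
            for j in range(len(g)):
                a[i - dg + j] ^= g[j]
    return a[:dg]

def crc_encode(u, g):
    # Systematic encoding: CRC bits (remainder of p^(N-K) u(p)) followed by the data bits
    r = len(g) - 1
    t = gf2_poly_mod([0] * r + u, g)
    return t + u

def crc_check(frame, g):
    # True if the received frame is divisible by g(p), i.e. if it is a codeword
    return not any(gf2_poly_mod(frame, g))

g = [1, 0, 1, 0, 0, 1]                    # g(p) = 1 + p^2 + p^5
frame = crc_encode([1, 1, 0, 1, 0, 0, 1, 0], g)
print(frame, crc_check(frame, g))         # the encoded frame passes the check (True)
frame[3] ^= 1                             # introduce a single transmission error
print(crc_check(frame, g))                # the error is detected (False)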

Table 3.18 gives some of the most common generator polynomials used by the different systems.

The two classical protocols in ARQ systems are the stop-and-wait protocol and the go-back-N protocol.

The stop-and-wait protocol is the simplest protocol but is not very efficient. The receiver sends an ACK after each frame and the transmitter waits for an ACK or NACK before transmitting a new frame or retransmitting the previous frame. The stop-and-wait ARQ protocol is illustrated in Figure 3.27(a).

Table 3.18. Generator polynomials for CRC

CRC name Generator polynomial System
CRC4 g(p) = p4 + p + 1 G.704
CRC5 g(p) = p5 + p2 + 1 USB
CRC6 g(p) = p6 + p + 1 G.704
CRC8-CCITT g(p) = p8 + p2 + p + 1 ATM, ISDN
CRC16 g(p) = p16 + p15 + p2 +1 USB, ANSI
CRC16-CCITT g(p) = p16 + p12 + p5 + 1 X25, SDLC, HDLC
CRC16-DECT g(p) = p16 + p10 + p8 + p7 + p3 + 1 DECT
CRC24 g(p) = p24 + p22 + p20 + p19 + p18 + p16 + p14 + p13 +p11 +p10 +p8 +p7 +p6 +p3 +p+1
CRC32 g(p) = p32 + p26 + p23 + p22 + p16 + p12 + p11 + p10 + p8 + p7 + p5 + p4 + p2 + p + 1 V.42, Ethernet, MPEG2

In the go-back-N protocol, the transmitter is allowed to consecutively transmit a maximum number of Nf frames gathered in a window without receiving an ACK. Whenever the transmitter receives a NACK associated with the frame i, it stops transmitting new frames and proceeds to the retransmission of the frame i and of the Nf − 1 next frames. At the receiving end, the receiver discards the erroneously received frame i and the Nf − 1 subsequently received frames, no matter whether they were correct or not. Retransmissions are repeated until the frame i is correctly received, and the process then resumes. The go-back-N protocol is illustrated in Figure 3.27(b) for Nf = 4.

To improve the reliability of the ARQ systems, these classical ARQ schemes can be applied in conjunction with error correcting codes. These schemes are called hybrid automatic repeat request (HARQ).

In a type I HARQ protocol, the error correcting code reduces the frequency of retransmissions by correcting the most frequent error patterns. When the receiver cannot decode the frame, it requests a retransmission and discards the corrupted frame.

image

Figure 3.27. a) Stop-and-wait protocol b) go-back-N protocol with Nf = 4

If the receiver can buffer the previously received frames, the optimal solution is to use maximum ratio combining (MRC) to combine these multiple frames. This version with MRC is known as the Chase combining (CC) scheme [CHA 73].

In the type II HARQ scheme, also called incremental redundancy (IR) [MAN 74], when a NACK is received the transmitter sends additional coded frames instead of repeating the same ones. The different received frames are exploited to provide a stronger error correction capability. The IR protocol is described as follows:

1) Compute the codeword from the information word using a linear block code (N, K);

2) Divide the codeword into M subcodewords of size Nk where N = N1 + N2 + … + NM, and store the complete codeword (systematic and parity parts) for potential transmissions;

3) Initialize m = 1;

4) Transmit the mth subcodeword;

5) At the reception, attempt to decode the code using all the symbols received so far. If the information is correctly decoded, send an ACK to the transmitter. Otherwise, set m = m + 1, send a NACK to the transmitter and return to step 4). If the information sequence is still not successfully decoded when m = M, the transmission fails.

The rate of the code after the mth transmission is K/(N1 + N2 + … + Nm) and the average rate is equal to:

[3.127] images

where Sk is the event “data correctly decoded” during the kth slot and S̄k is its complement.

The complexity of the IR protocol is higher than the other ARQ systems but its performance is significantly better.

It should be said that the implementation of ARQ systems is not always possible, in particular in broadcast communication systems (satellite, terrestrial television, etc.). Another drawback of ARQ systems is the delay due to the retransmissions.

3.6.4. Encoding of cyclic codes

In this section, we will study the hardware implementation of the cyclic encoders. We will show that the cyclic encoder is only composed of modulo 2 adders and shift registers.

3.6.4.1. Implementation of a division of polynomials

Let us consider the division of a dividend polynomial a(p) by a divisor polynomial g(p) of degree d in the finite field image. The remainder t(p) and the quotient q(p) of this division are given by the relation:

[3.128] a(p) = q(p)g(p) + t(p)

Example 3.25.– Division of 1 + p2 + p6 by 1 + p + p2.

images

When we analyze the different steps to compute this division, we can make the following comments:

  • – During the division, the degree of the dividend decreases: the modifications of the dividend are performed from the left to the right (MSBs toward LSBs).
  • – For a divisor of degree d, at each iteration, the modifications of the dividend concern the (d + 1) leftmost terms. The other terms can be temporarily ignored.
  • – When a modification of the dividend occurs, it is a term by term addition of the d + 1 coefficients of the divisor.

From these comments, we can deduce a hardware scheme composed of shift registers to implement this division. The number of shift registers is equal to the degree of the divisor. The hardware structure is given in Figure 3.28 for the considered example.

The bits enter the structure MSB first. At each positive transition of the clock, a new coefficient of the polynomial a(p) enters the structure. After d clock transitions, the first non-zero coefficient of the quotient appears at the output of the last shift register. This coefficient is multiplied by g(p) and then subtracted as in a classical division.

image

Figure 3.28. Hardware structure of the division by 1 +p + p2

The sequencing of the division calculated previously is given in Figure 3.29 and in Table 3.19. We can check that, at each step, the internal registers contain only the relevant part of the dividend. The quotient is obtained from the second clock transition onwards at the output of the register s2: q(p) = 1p4 + 1p3 + 0p2 + 1p + 0. At the last positive clock transition, the remainder of the division t(p) = 1p + 1 is available at the output of the shift registers.

image

Figure 3.29. Sequencing of the division by 1 + p + p2
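The following Python sketch simulates this circuit register by register and reproduces the register contents of Table 3.19; it assumes, as in Figure 3.28, that s1 holds the constant term of the remainder and s2 the coefficient of p (the function name is ours).

def divide_by_1_plus_p_plus_p2(bits_msb_first):
    # Division circuit for g(p) = 1 + p + p^2: two registers, dividend bits entering MSB first
    s1 = s2 = 0
    quotient = []
    print("input  s1  s2")
    for b in bits_msb_first:
        feedback = s2                            # quotient coefficient leaving the last register
        quotient.append(feedback)
        s1, s2 = b ^ feedback, s1 ^ feedback     # feedback taps on g0 = 1 and g1 = 1
        print(f"  {b}     {s1}   {s2}")
    return quotient, [s1, s2]                    # remainder coefficients [t0, t1]

a = [1, 0, 0, 0, 1, 0, 1]                        # coefficients of 1 + p^2 + p^6, MSB first
q, t = divide_by_1_plus_p_plus_p2(a)
print("quotient (MSB first):", q[2:])            # [1, 1, 0, 1, 0], i.e. q(p) = p^4 + p^3 + p
print("remainder [t0, t1]:", t)                  # [1, 1], i.e. t(p) = 1 + p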

3.6.4.2. Hardware structures of cyclic encoder

In the case of a cyclic code (N, K), we have seen that before calculating the remainder t(p) of the division, it is necessary to premultiply the polynomial associated with the information word u(p) by pN−K. The multiplication by pN−K is equivalent to adding pN−K zeros.

Table 3.19. Sequencing of the division

Input Clock transition s1 s2
0 0 0
MSB 1 1 1 0
0 2 0 1
0 3 1 1
0 4 1 0
1 5 1 1
0 6 1 0
LSB 1 7 1 1

Example 3.26.– Let us consider again the Hamming cyclic code (7, 4) with g(p) = 1 + p + p3.

The hardware structure of the encoder is given in Figure 3.30. As previously, the bits are introduced into the encoder MSB first.

image

Figure 3.30. Hardware structure of the Hamming encoder with g(p) = 1 + p + p3

We consider the information word u = [1 1 1 0] associated with the polynomial u(p) = 1 + p + p2. After premultiplication, we have p3u(p) = p3 + p4 + p5 and the word transmitted to the divisor is [0001110].

Table 3.20. Hamming encoding

Input Clock transition s1 s2 s3
0 0 0 0
MSB 0 1 0 0 0
1 2 1 0 0
1 3 1 1 0
1 4 1 1 1
0 5 1 0 1
0 6 1 0 0
LSB 0 7 0 1 0

We find the same remainder as the one previously computed t(p) = 0 + 1p + 0p2. Using this structure, we need seven clock transitions to obtain the three bits of the remainder [0 1 0] as shown in Table 3.20.

It is possible to reduce the number of clock transitions by exploiting the fact that the last N − K bits of the polynomial pN−Ku(p) are zeros. Instead of modifying the N − K + 1 leftmost coefficients at each iteration, we can decompose the division into K successive divisions.

Let us consider the previous example to illustrate this principle: u(p) = 1 + p + p2 and g(p) = 1 + p + p3.

The associated hardware structure is given in Figure 3.31 and the sequencing of the encoder is given in Table 3.21.

image

Figure 3.31. Hardware structure of the Hamming encoder with premultiplication for g(p) = 1 + p + p3

Table 3.21. Sequencing of the Hamming encoder after premultiplication

Input Clock transition s1 s2 s3
0 0 0 0
MSB 0 1 0 0 0
1 2 1 1 0
1 3 1 0 1
1 4 0 1 0

This structure allows us to avoid N − K clock transitions corresponding to the last N − K bits of the polynomial pN − Ku(p).

We can now draw the complete hardware structure of the encoder for the Hamming code (7, 4) defined by the polynomial g(p) = 1 + p + p3 as shown in Figure 3.32.

image

Figure 3.32. Complete hardware structure of the Hamming encoder for g (p) = 1 + p + p3

The encoding is composed of two phases:

  • – the switch P0 is closed and P1 is in position A. K = 4 clock transitions are applied to send the K first bits of the codeword and calculate the remainder t(p);
  • – the switch P0 is open and P1 is in position B. N − K = 3 clock transitions are applied to send the last N − K bits of the codeword (corresponding to the remainder).

3.6.4.3. Trellis diagram of the cyclic code

We saw in section 3.3.9 how to obtain the trellis diagram of a linear block code. From the hardware structure of the encoder, it is also possible to directly determine the trellis diagram of a cyclic code. The trellis diagram of the Hamming code defined by the polynomial g(p) = 1 + p + p3 is given in Figure 3.33.

image

Figure 3.33. Trellis diagram of the Hamming code defined by the polynomial g(p) = 1 + p + p3

The maximum number of states of this trellis diagram is equal to the number of states of the hardware structure i.e. 23 = 8 states. This trellis diagram is the same as the one obtained in section 3.3 if we reverse the order of the N bits. Indeed, the first three sections are related to the parity bits while the last four sections correspond to the information bits.

3.6.5. Decoding of cyclic codes

The decoding of cyclic codes is composed of two phases:

  • – syndrome calculation;
  • – error localization.

The syndrome calculation is performed by dividing the polynomial associated with the received word r(p) by g(p):

[3.129] r(p) = q(p)g(p) + s(p)

The syndrome is the remainder of this division:

  • – if s(p) = 0, then the received word r(p) is a codeword;
  • – the syndrome s(p) is a polynomial of degree less or equal to N − K − 1.

From the value of s(p) it is possible to estimate the error vector ê(p).

In Figure 3.34, we show the hardware structure of the decoder associated with the Hamming code (7, 4) defined by the polynomial g(p) = 1 + p + p3.

image

Figure 3.34. Hardware structure of the decoder for the Hamming code with g(p) = 1 + p + p3

The decoding logic associates an estimated error vector ê(p) to each of the values of the syndrome s(p). Then, we add this error vector to the received vector r(p):

[3.130] ĉ(p) = r(p) + ê(p)

The complexity of this decoder is related to this decoding logic. This solution is interesting when the number of errors to be corrected is small.

3.6.5.1. Correction of a single error

Let us consider a single error e(p) = pj. The associated syndrome has the following specific form:

[3.131] s(p) = pj mod g(p)

In Figure 3.35 we show the hardware decoding structure for the Hamming code (7, 4) defined by the generator polynomial g(p) = 1 + p + p3.

image

Figure 3.35. Hardware structure of the decoder for the Hamming code (7, 4)

For this example, the syndrome table between the error vectors and syndromes s(p) is given in Table 3.22.

Table 3.22. Syndrome table

Error vector e(p) s(p)
1000000 1 1
0100000 p p
0010000 p2 p2
0001000 p3 1+ p
0000100 p4 p + p2
0000010 p5 1+ p + p2
0000001 p6 1+ p2
LSB    MSB

The result of the multiplication of the syndrome s(p) by pN−j modulo g(p) is equal to 1:

[3.132] s(p)pN−j mod g(p) = pjpN−j mod g(p) = pN mod g(p) = 1

Consequently, to determine the position of the error, we just have to successively multiply s(p) by p until the result is equal to 1.

Example 3.27.– Let us consider the codeword c = [1 1 1 1 1 1 1] and the error vector e = [0 0 0 0 1 0 0] (e(p) = p4, i.e. a single error occurs on the fifth bit from the LSB):

[3.133] r(p) = c(p) + e(p) = 1 + p + p2 + p3 + p5 + p6

The content of the shift registers after each clock transition is given in Table 3.23.

Table 3.23. Content of the shift registers after each clock transition

Input Clock transition s1 s2 s3
0 0 0 0
MSB 1 1 1 0 0
1 2 1 1 0
0 3 0 1 1
1 4 0 1 1
1 5 0 1 1
1 6 0 1 1
LSB 1 7 0 1 1 → s(p) = p + p2
8 1 1 1 → s(p)p
9 1 0 1 → s(p)p2
10 1 0 0 → s(p)p3 = 1

We have j = N − 3 = 7 − 3 = 4.

As expected, we have been able to correct this single error after three clock transitions.
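The following Python sketch applies this procedure to the received word of Example 3.27 (polynomials as coefficient lists, lowest degree first; helper names are ours).

def gf2_poly_mod(a, g):
    # Remainder of the division of a(p) by g(p) over GF(2)
    a = a[:] + [0] * max(0, len(g) - len(a))
    dg = len(g) - 1
    for i in range(len(a) - 1, dg - 1, -1):
        if a[i]:
            for j in range(len(g)):
                a[i - dg + j] ^= g[j]
    return a[:dg]

def correct_single_error(r, g, N):
    # Multiply s(p) by p modulo g(p) until the result equals 1: the error is at position N - k
    s = gf2_poly_mod(r, g)
    if not any(s):
        return r                                  # the received word is already a codeword
    for k in range(N):
        if s == [1] + [0] * (len(g) - 2):         # s(p) p^k = 1 modulo g(p)
            corrected = r[:]
            corrected[(N - k) % N] ^= 1           # error position given by [3.132]
            return corrected
        s = gf2_poly_mod([0] + s, g)              # multiplication of s(p) by p modulo g(p)
    return r

g = [1, 1, 0, 1]                                  # g(p) = 1 + p + p^3
r = [1, 1, 1, 1, 0, 1, 1]                         # codeword [1 1 1 1 1 1 1] with an error in position 4
print(correct_single_error(r, g, 7))              # [1, 1, 1, 1, 1, 1, 1]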

3.6.6. BCH codes

3.6.6.1. Definition

The BCH codes are binary cyclic codes introduced by Hocquenghem [HOC 59] and independently by Bose and Ray–Chaudhuri [BOS 60]. These codes have the following parameters:

[3.134] N = 2m − 1, N − K ≤ me, dmin ≥ 2e + 1 (e being the number of errors the code can correct)

Since a BCH code is a cyclic code, it is defined by its generator polynomial g(p). The roots of g(p) are the 2e consecutive powers of α: α, α2, …, α2e. Consequently, g(p) is the product of the minimal polynomials associated with α, α2, α3, …, α2e without repetition:

[3.135] g(p) = LCM{m1(p), m2(p), …, m2e(p)}

where LCM is the least common multiple and mi(p) is the minimal polynomial of αi. Since in image with q = 2m, αi and α2i are roots of the same minimal polynomial, for the determination of g(p), we can only consider the odd powers of α:

[3.136] g(p) = LCM{m1(p), m3(p), …, m2e−1(p)}

Example 3.28.– Let us build a BCH code with N = 15 able to correct two errors (e = 2). Since N = 2m − 1, we obtain m = 4, and the minimum distance is greater than or equal to 5. Its generator polynomial g(p) is as follows:

g(p) = LCM{m1(p), m3(p)} = m1(p)m3(p)

From the list of minimal polynomials for image, we obtain:

m1(p) = 1 + p + p4 and m3(p) = 1 + p + p2 + p3 + p4, hence g(p) = (1 + p + p4)(1 + p + p2 + p3 + p4) = 1 + p4 + p6 + p7 + p8

Since the degree of g(p) is N − K = 8, the obtained code will be a (15, 7) BCH code. Its minimum distance is exactly equal to 5.

Table 3.24 gives the generator polynomials of the BCH codes correcting up to 3 errors and N ≤ 63.

Table 3.24. List of generator polynomials of BCH codes with N ≤ 63

image

We know that α, α2, …, α2e are roots of the generator polynomial g(p) for a BCH code. Consequently, each codeword will also have α, α2, α3, …, α2e as roots:

c(αi) = 0 for i = 1, 2, …, 2e

This property can also be written as follows:

[3.137] c0 + c1αi + c2α2i + … + cN−1α(N−1)i = 0, i = 1, 2, …, 2e

Using the matrix notation, this relation can be also written as:

[3.138] images

Since we have the relation cHT = 0 between the parity check matrix and the codewords, the parity check matrix is equal to:

It is possible to demonstrate that this code can correct e errors.

Example 3.29.– The rows of the parity check matrix [3.139] corresponding to the same minimal polynomial do not need to be repeated. Finally, we obtain for the BCH code (15, 7) defined with g(p) = 1 + p4 + p6 + p7 + p8 the following truncated parity check matrix by only keeping the first and third rows:

[3.140] images

In order to obtain a binary representation of the parity check matrix H, we have to replace each power of α by its binary representation as shown in section 3.2.4. We have:

[3.141] images

3.6.6.2. Decoding of BCH codes

The hard input decoding of BCH codes can be divided into three phases:

  • – calculation of the 2e components of the syndrome s = (s1s2 s2e);
  • – determination of the error-location polynomial σ(p);
  • – search of the roots of the polynomial σ(p).

Syndrome calculation

Let c(p) be the transmitted codeword, r(p) the received word and e(p) the error vector. Then we have:

[3.142] r(p) = c(p) + e(p)

The first phase consists in calculating the syndrome s. By definition, the syndrome is equal to the product of the received vector r by the transpose parity check matrix HT :

[3.143] s = rHT

The 2e elements of the syndrome can be computed as follows:

[3.144] si = r(αi) = c(αi) + e(αi) = e(αi), i = 1, 2, …, 2e

When there is no error (e(p) = 0), the 2e elements of the syndrome are zero.

Let us assume that the error vector is composed of v errors (with v ≤ e):

[3.145] e(p) = pj1 + pj2 + … + pjv

where 0 ≤ j1 < j2 < … < jv ≤ N − 1.

From [3.144], we obtain the following system of equations:

[3.146]
s1 = αj1 + αj2 + … + αjv
s2 = (αj1)2 + (αj2)2 + … + (αjv)2
…
s2e = (αj1)2e + (αj2)2e + … + (αjv)2e

By defining βk = αjk, 1 ≤ k ≤ v, the system of equations becomes:

[3.147] si = β1i + β2i + … + βvi, i = 1, 2, …, 2e

We define the following polynomial:

[3.148] σ(p) = (1 + β1p)(1 + β2p) … (1 + βvp) = 1 + σ1p + σ2p2 + … + σvpv

This polynomial σ(p) is called the error-location polynomial since the inverses of its roots are equal to βk = αjk, which allows us to localize the positions jk of the errors.

The coefficients σk of the polynomial σ(p) satisfy the following equations:

[3.149]
σ1 = β1 + β2 + … + βv
σ2 = β1β2 + β1β3 + … + βv−1βv
…
σv = β1β2 … βv

In order to determine the coefficients σk, we will use the so-called Newton relations as follows:

[3.150]
s1 + σ1 = 0
s2 + σ1s1 + 2σ2 = 0
s3 + σ1s2 + σ2s1 + 3σ3 = 0
…
sv + σ1sv−1 + … + σv−1s1 + vσv = 0

and for i > v:

si + σ1si−1 + σ2si−2 + … + σvsi−v = 0

It should be noted that in the binary case, the term iσi is equal to 0 if i is even and to σi if i is odd.

Determination of the error-location polynomial using the matrix approach

In the matrix approach, we solve directly the Newton system [3.150] to determine the error-location polynomial σ(p).

In the binary case, the Newton system [3.150] can be written as follows:

For example, if v = 2, we have the following system:

[3.152] images

and the solutions are:

σ1 = s1 and σ2 = (s3 + s13)/s1

The matrix approach can be summarized as follows. We first assume that v = e and we try to solve the Newton system [3.151] by replacing v with e. If v is equal to e or e − 1, a unique solution will be obtained. If v < e − 1, then the system will have no solution. In that case, we should assume that v = e − 2 and try to solve the system [3.151] again by replacing v with e − 2. We repeat this procedure until we obtain a solution. The matrix approach is interesting when the correction capability of the code is small (typically when e ≤ 3).

We will now study a more general algorithm: the Berlekamp–Massey algorithm.

Determination of the error-location polynomial using the Berlekamp-Massey algorithm

This algorithm has been proposed by Berlekamp [BER 68] and developed by Massey [MAS 69]. In this book, we will only describe the algorithm without providing the proofs.

The first phase is to determine a polynomial σ(1)(p) of minimal degree satisfying the first equation of the Newton system [3.150]. If the coefficients of the polynomial also satisfy the second equation of the system, we write σ(2)(p) = σ(1)(p). Otherwise, we add a correction term to σ(1)(p) to obtain σ(2)(p). The polynomial σ(2)(p) is then the polynomial with minimum degree satisfying the first two equations of the Newton system. We repeat this procedure to obtain σ(3)(p), σ(4)(p) until σ(2e)(p). This last polynomial is the error-location polynomial σ(p).

Let σ(μ)(p) be the polynomial of degree lμ determined at the μth phase and verifying the μ first equations. To determine σ(μ+1)(p), we will first calculate dμ as follows:

dμ = sμ+1 + σ1(μ)sμ + σ2(μ)sμ−1 + … + σlμ(μ)sμ+1−lμ

If dμ = 0, then σ(μ+1)(p) = σ(μ)(p). Otherwise, we go back to the iteration ρ such that dρ ≠ 0 and ρ − lρ is maximum (where lρ is the degree of the polynomial σ(ρ)(p)). We then obtain:

σ(μ+1)(p) = σ(μ)(p) + dμdρ−1pμ−ρσ(ρ)(p)

Example 3.30.– Let us consider the (15, 7) BCH code defined by the generator polynomial g(p) = 1 + p4 + p6 + p7 + p8 and that is able to correct up to two errors.

We assume that the null codeword c(p) = 0 has been transmitted and that the received word is r(p) = p4 + p9

We start by computing the elements of the syndrome:

[3.153]
s1 = r(α) = α4 + α9 = α14
s2 = r(α2) = α8 + α18 = α13
s3 = r(α3) = α12 + α27 = 0
s4 = r(α4) = α16 + α36 = α11

Since this example is rather simple, we can determine the error-location polynomial σ(p) using the matrix approach. We obtain:

[3.154] σ1 = s1 = α14 and σ2 = (s3 + s13)/s1 = s12 = α13, i.e. σ(p) = 1 + α14p + α13p2

We will now perform the Berlekamp–Massey algorithm.

– for μ = 0. We initialize σ(0) (p) = 1 and we calculate d0:

d0 = s1 = α14

We obtain the partial Table 3.25.

Table 3.25. Partial table for μ = 0

μ σ(μ) (p) dμ lμ μ− lμ
−1 1 1 0 −1
0 1 α14 0 0

Since d0 is non-zero, we must add a correcting term to σ(0) (p) to obtain σ(1)(p):

[3.155] σ(1)(p) = σ(0)(p) + d0d−1−1p0−(−1)σ(−1)(p) = 1 + α14p

– for μ = 1. We calculate d1.

[3.156] d1 = s2 + σ1(1)s1 = α13 + α14α14 = α13 + α13 = 0

We obtain the partial Table 3.26 for μ = 1.

Table 3.26. Partial table for μ = 1

μ σ(μ) (p) dμ lμ μ−lμ
−1 1 1 0 −1
0 1 α14 0 0
1 1+ α14p 0 1 0

Since d1 is zero we have:

σ(2)(p) = σ(1)(p) = 1 + α14p

– for μ = 2. We calculate d2:

[3.157] d2 = s3 + σ1(2)s2 = 0 + α14α13 = α12

We obtain the partial Table 3.27 for μ = 2.

Table 3.27. Partial table for μ = 2

μ σ(μ) (p) dμ lμ μ− lμ
−1 1 1 0 −1
0 1 α14 0 0
1 1+ α14p 0 1 0
2 1+ α14p α12 1 1

Since d2 is non-zero, we add a correcting term to σ(2) (p) to obtain σ(3) (p):

[3.158] σ(3)(p) = σ(2)(p) + d2d0−1p2σ(0)(p) = 1 + α14p + α13p2

– for μ = 3. We calculate d3:

[3.159] d3 = s4 + σ1(3)s3 + σ2(3)s2 = α11 + 0 + α13α13 = α11 + α11 = 0

The final table is given in Table 3.28.

Table 3.28. Final table

μ σ(μ) (p) dμ lμ μ− lμ
−1 1 1 0 −1
0 1 α14 0 0
1 1+ α14p 0 1 0
2 1+ α14p α12 1 1
3 1+ α14p + α13p2 0 2 1

Since d3 is zero, we have σ(4)(p) = σ(3)(p) and we finally obtain the following error-location polynomial σ(p):

σ(p) = σ(4)(p) = σ(3)(p) = 1 + α14p + α13p2

We must now determine the positions of the errors.

Search of the roots of the error-location polynomial

We have seen previously that the inverses of the roots of the error-location polynomial are equal to βk = αjk and consequently allow us to localize the positions jk of the errors.

If the degree of σ(p) is equal to 1 or 2, the roots can be computed directly. When the degree of σ(p) is higher than 2, we can test each power of α but it is not an efficient solution. A practical solution is to apply the Chien search method [CHI 64].

– Initialization: Qk = σk for k = 1, …, v;

– for i = 1, …, N: replace each Qk by Qkαk (k = 1, …, v).

If Q1 + Q2 + … + Qv = 1, then there is an error in position N − i.

The last line can be explained by the relation σ(αi) = 1 + Q1 + Q2 + … + Qv.

Example 3.31.– We can check that:

[3.160] σ(p) = 1 + α14p + α13p2 = (1 + α4p)(1 + α9p)

The roots of σ(p) are equal to α11 and α6. The inverses of the roots are β1 = α15−11 = α4 and β2 = α15−6 = α9. We deduce that the errors are located at positions j1 = 4 and j2 = 9.

We can now apply the Chien search method to localize the position of the errors. The results are given in Table 3.29.

We find again that the errors are located in positions j1 = 4 and j2 = 9.

Table 3.29. Finding the positions of the errors using the Chien search method

i Q1 Q2 Q1 + Q2
0 α14 α13 α2
1 1 1 0
2 α α2 α5
3 α2 α4 α10
4 α3 α6 α2
5 α4 α8 α5
6 α5 α10 1 → j1 =15 − 6=9
7 α6 α12 α4
8 α7 α14 α
9 α8 α α10
10 α9 α3 α
11 α10 α5 1 → j2 =15 − 11 = 4
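The following Python sketch can be used to check these results; it assumes the representation of GF(16) generated by the primitive polynomial p4 + p + 1 (log/antilog tables) and recomputes the syndromes of r(p) = p4 + p9 before running the Chien search on σ(p) = 1 + α14p + α13p2 (variable names are ours).

# Log/antilog tables of GF(16) built from the primitive polynomial p^4 + p + 1
EXP = [0] * 30
x = 1
for i in range(15):
    EXP[i] = EXP[i + 15] = x
    x <<= 1
    if x & 0x10:
        x ^= 0x13                                # reduction by p^4 + p + 1 (binary 10011)
LOG = {EXP[i]: i for i in range(15)}

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def alpha(i):
    return EXP[i % 15]

# Syndromes s_i = r(alpha^i) for r(p) = p^4 + p^9
for i in range(1, 5):
    s = alpha(4 * i) ^ alpha(9 * i)
    print("s%d =" % i, "0" if s == 0 else "alpha^%d" % LOG[s])

# Chien search on sigma(p) = 1 + alpha^14 p + alpha^13 p^2
sigma = {0: 1, 1: alpha(14), 2: alpha(13)}
for i in range(15):
    value = 0
    for degree, coeff in sigma.items():
        value ^= gf_mul(coeff, alpha(degree * i))
    if value == 0:
        print("alpha^%d is a root -> error in position %d" % (i, (15 - i) % 15))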

3.6.7. Reed–Solomon codes

RS codes are constructed over the finite field image with q = 2m where m is an integer. The 2m − 1 non-zero elements of the field image are the successive powers of the primitive element α, which satisfies αN = 1 with N = 2m − 1. Each minimal polynomial associated with a non-zero element αi (i = 0, 1, …, 2m − 2) can be written in the form mi(p) = p − αi.

The generator polynomial g(p) of a RS (N, K) code of length N = 2m − 1 and dimension K is the product of the N − K minimum polynomials mi(p) with i = 1,2,…, N − K:

[3.161] g(p) = (p − α)(p − α2) … (p − αN−K)

Like for the other cyclic codes, the degree of g(p) is N − K. The minimum Hamming distance dmin of these codes is equal to N − K + 1. Consequently, the RS codes reach the Singleton bound.

Compared to the binary codes, the correction capability of RS codes is e = ⌊(N − K)/2⌋ symbols composed of m bits, i.e. a maximum of me bits. For example, the (255, 239, 17) RS code in image can correct up to 8 octets independently of the number of erroneous bits per octet. This code will be able to correct up to 64 bits if the erroneous bits are localized inside only 8 octets. The generator polynomial of the (255, 239) RS code in image built from the primitive polynomial p8 + p4 + p3 + p2 + 1 is the following:

[3.162] images

The word error rate is equal to:

[3.163] images

where SERI is the symbol error rate at the input. Consequently, the symbol error rate SERO at the output of the hard input RS decoder is the following:

[3.164] images

The performance curve SERO = f(SERI) for the (255, 239,17) RS code is given in Figure 3.36.
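The curve can be reproduced numerically with the short Python sketch below; it uses the usual approximations, namely the probability of more than e erroneous symbols per block for the word error rate and the corresponding weighted sum for SERO, which may differ slightly from the exact relations [3.163] and [3.164].

from math import comb

def rs_hard_decoding(N, K, ser_in):
    # e = (N - K) / 2 correctable symbols; decoding fails when more than e symbols are erroneous
    e = (N - K) // 2
    terms = [comb(N, i) * ser_in**i * (1 - ser_in)**(N - i) for i in range(e + 1, N + 1)]
    wer = sum(terms)
    ser_out = sum(i * t for i, t in zip(range(e + 1, N + 1), terms)) / N
    return wer, ser_out

for ser_in in (1e-2, 2e-2, 5e-2):
    wer, ser_out = rs_hard_decoding(255, 239, ser_in)
    print(f"SERI = {ser_in:.0e}  WER = {wer:.2e}  SERO = {ser_out:.2e}")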

The RS codes are decoded using the same methodology as the BCH codes. However, we need to add one more operation consisting in evaluating the so-called amplitude of the errors, i.e. determining the positions of the erroneous bits within the erroneous symbols.

3.7. Applications

The first field of application of block codes was space communications. The (32, 6, 16) Reed–Muller block code with soft input decoding was used in 1969 and 1977 for the Mariner and Viking missions (Mars).

image

Figure 3.36. SERO = f(SERI) for the (255,239,17) Reed–Solomon code

Among the other applications of block codes, we can quote the (8, 4) extended Hamming code that was used for teletext and the (273, 191) difference set cyclic code used for digital audio broadcast (DAB).

The Reed–Solomon codes are well adapted to the correction of bursts of errors in communication systems. The most popular RS codes are the (255, 239, 17), (255, 223, 33) and (255, 251, 5) codes in image. This last code is part of the cross-interleaved Reed–Solomon code (CIRC) used for the compact disc (CD) application. The CIRC is composed of a (28, 24, 5) RS code, an interleaver and a (32, 28, 5) RS code [WIC 99]. The (28, 24) RS code is built from a (255, 251, 5) RS code. For that, we add 227 null octets to the 24 information octets in order to obtain a total of 251 octets, then apply the (255, 251, 5) RS code. After coding, we remove the 227 null octets to obtain a codeword composed of 28 octets including 4 redundancy octets. The same principle is applied for the (32, 28, 5) code and for the decoding of these codes.

For the CD-ROM, a (1170, 1032) product code is used to correct large bursts of errors. It is composed of row RS (26, 24) codes and column RS (45, 43) codes in image. We will come back to these concatenated codes in Chapter 5.

3.8. Exercises

3.8.1. Exercise 1: repetition code

We consider a binary equiprobable source and a BSC channel with p = 10−3. To limit the noise effect, we use a repetition code: a “0” is encoded by the codeword [0 0 0] and a “1” is encoded by the codeword [1 1 1].

1) Calculate the probabilities that 1, 2 or 3 errors occur during the transmission of a codeword.

2) We use the majority logic decoding (the decoded bit will be “1” if we receive at least two “1”). A received word is correctly decoded if two of the three bits of the word are without error. Compute the new error probability.

We generalize this principle: “0” is encoded by 2m + 1 “0”

“1“ is encoded by 2m + 1 “1”

3) What is the minimum Hamming distance between two codewords when m = 2, then for any m?

4) As previously, we use the majority logic decoding. Deduce the maximum number of errors that we can correct.

5) What is the word error probability Pe after decoding? We will write this probability as a function of m. We can show that when p << 1, the error probability can be approximated by Pe ≈ αpj. We will give the expressions of α and j as functions of m. Do the numerical application for m = 1, m = 2 and m = 3.

6) We want to achieve a word error probability after decoding and correction equal to or better than Pe = 5 × 10−7. What is the minimum value of m required to reach this performance?

3.8.2. Exercise 2: error probability on a binary symmetric channel

We consider a binary source and a binary symmetric channel with a transition probability p equal to 0.1. We have seen that the capacity of this channel is 0.52 Shannon/symb.

1) Determine the probability that a codeword of 23 bits is received without error at the output of the channel.

2) Perform the same calculation for 1, 2 and 3 errors.

3) We use a Golay code (23, 12) able to correct up to three transmission errors. Determine the word error probability after decoding.

4) Comment your results with respect to the capacity of the BSC channel.

3.8.3. Exercise 3: re-emission strategy

We transmit a binary message or word of length N with equiprobable bits over a binary symmetric channel.

Let p be the conditional probability p = Pr(Y = 0|X = 1) = Pr(Y = 1|X = 0).

1) Determine the probability to have no error during the transmission message.

2) Determine the probability to have exactly one error during the transmission message.

3) Determine the probability to have at least one error during the transmission message.

4) Perform the numerical application for p = 0.1 and N = 12.

If the message is erroneous, we repeat it until the message is received without errors.

5) What is the probability that the message will be correct without reemission?

6) What is the probability that the message will be correct after 1, 2, 3, …, n re-emissions?

The repetition of the message three times allows us to use the majority logic decoding (at each position, the decoded bit will be “1” if we receive at least two “1”). We can show that this strategy is equivalent to considering that the message has been transmitted over a binary symmetric channel with conditional probability p′ = 3p2.

7) Calculate the probability to decode the message of N bits without error.

Another solution is to use an error correcting code. We consider the Golay code (24, 12) with minimum distance 8.

8) Calculate the word error probability after decoding.

3.8.4. Exercise 4: simplex code (7, 3)

Let us define the generator matrix of the simplex code (7, 3):

images
  • 1) Transform G into a systematic generator matrix and determine the parity check matrix H.
  • 2) What is the link between this code and the Hamming code (7, 4)?
  • 3) Compute the WEF A(D) and deduce the minimum distance of this code.
  • 4) Build the syndrome table.
  • 5) Show that the codeword corresponding to the information word u = [1 0 1] is orthogonal to H.

3.8.5. Exercise 5: linear block code (6,3)

Let us consider the generator matrix of the following linear block code (6, 3):

images
  • 1) Transform G into a systematic generator matrix and determine the parity check matrix H.
  • 2) Compute the WEF A(D) and deduce the minimum distance of this code.
  • 3) Build the syndrome table.
  • 4) Let us consider the following information word u = [1 0 1]. Determine the corresponding codeword. We assume that e = [0 0 1 0 0 0]. Can the decoder correct this error?

3.8.6. Exercise 6: Hamming code (7,4)

Let us consider the generator matrix of the Hamming code (7, 4):

images
  • 1) Transform G into a systematic generator matrix and determine the parity check matrix H.
  • 2) Compute the WEF A(D) and deduce the minimum distance of this code.
  • 3) Build the syndrome table.
  • 4) Let us consider the following information word u = [1 1 1 1]. Determine the corresponding codeword. We assume that e = [1 0 0 0 0 0 0]. Deduce the received word r = c + e. Estimate the codeword using the syndrome table.

3.8.7. Exercise 7: Reed–Muller code (8,4)

Let us consider the generator matrix of the Hamming code (7, 4):

images
  • 1) Determine the systematic generator matrix and the parity check matrix H. We add a redundancy bit c7 satisfying the following parity check equation: c7 = c0 + c1 + c2 + c3 + c4 + c5 + c6. The new code is a first order Reed–Muller (8, 4) or extended Hamming code.
  • 2) Determine the systematic generator matrix G′ of this code.
  • 3) Show that this code is self-dual (G′ = H′).
  • 4) Determine the WEF A(D) of the code (8, 4) and deduce its minimum distance.
  • 5) Is it possible to link this code with the trellis diagram given in Figure 3.37?
image

Figure 3.37. Trellis diagram of the (8, 4) Reed–Muller code

3.8.8. Exercise 8: hard trellis based decoding of block code

Let us consider the trellis diagram given in Figure 3.38 and corresponding to the Hamming code defined by the following systematic generator matrix:

images

Let u = [1 1 1 1] be an information word

1) Determine the associated codeword.

We consider that an error occurs during the transmission e = [1 0 0 0 0 0 0].

image

Figure 3.38. Trellis diagram of the (7, 4) code

2) Determine the received word r = c + e.

3) Perform the hard decoding using the Viterbi algorithm.

3.8.9. Exercise 9: soft trellis based decoding of block code

Let us consider the trellis given in Figure 3.38 and corresponding to the Hamming code defined by the following systematic generator matrix:

images

Let u = [1 1 1 1] be an information word

  • 1) Determine the associated codeword.

    Before the transmission the bits are converted into a bipodal signal bit 0 ⇒ −1 and bit 1⇒ +1. At the output of the AWGN channel after filtering and sampling, the received vector is y = [−0.2 + 0.8 + 1.0 + 1.4 −0.5 + 2.5 + 0.7].

  • 2) Perform the soft decoding using the Viterbi algorithm.

3.8.10. Exercise 10: soft input decoding

We consider the systematic linear block code (3, 2) defined by its parity check equation u0 + u1 + c2 = 0 and a transmission over an additive white Gaussian channel with variance σ2 = N0/2.

  • 1) Give the systematic generator matrix, the list of codewords and the WEF. Deduce the minimum distance, the error correction capability and the error detection capability.
  • 2) Draw graphically the different words x = [x0, x1, x2] in a three-dimensional space. Determine the Euclidean distance between two words.

The pairwise error probability is given as:

images
  • 3) Determine the word error rate after soft decoding by using the union bound.

The trellis diagram associated with this code is given in Figure 3.39.

image

Figure 3.39. Trellis diagram of the (3, 2) code

We receive the word y = [+0.9 + 0.1 + 1.4].

  • 4) Apply the soft Viterbi decoding (we will assume that Es = +1).
  • 5) From the received word, compute the conditional probabilities Pr(u1 = 1|y) and Pr(u1 = 0|y) assuming that N0 = 4.

3.8.11. Exercise 11: comparison of hard and soft decoding

Let us consider a linear block code defined by the following generator matrix:

images

and its associated trellis diagram given in Figure 3.40.

image

Figure 3.40. Trellis diagram

  • 1) Determine the list of codewords and the WEF. Deduce the minimum distance, the error correction capability and the error detection capability.

We use this code for a transmission over an AWGN with variance σ2. Before the transmission we convert the bits into a bipodal signal bit 0 ⇒ −1 and bit 1⇒ +1.

We select the information word u = [0 0 0].

  • 2) Determine the associated codeword c and the transmitted vector x = [x0 x1 x2 x3 x4 x5 x6 x7].

The error vector is as follows:

images
  • 3) Determine the received vector y with and without thresholding.
  • 4) Perform a hard input decoding by applying the Viterbi algorithm on the trellis diagram. Conclusion?
  • 5) Prove that for an AWGN, we have:

log (∏p(yi|xi)) ∝ − Σ (yi − xi)2 where the sign ∝ means “proportional to”.

  • 6) Perform a soft input decoding (instead of using the Euclidean metric (yi − xi)2 we will use the metric |yi − xi| to simplify the calculations). Conclusion?

3.8.12. Exercise 12: Hamming code

We consider a transmission over a binary symmetric channel using a Hamming code (15, 11).

  • 1) The generator polynomial of this code is g(p) = 1 + p + p4. Justify this choice.
  • 2) The information word is u = [11010010100]. Determine the associated systematic codeword.
  • 3) Give the hardware structure of this coder with premultiplication. Determine the remainder of the polynomial division using the hardware structure.
  • 4) We receive the word r = [011011010101000]. Check that this word is not a codeword.
  • 5) Give the hardware structure of the decoder.

3.8.13. Exercise 13: cyclic code

Let us consider the following generator matrix G:

images
  • 1) Determine K and N.
  • 2) Transform G in a systematic generator matrix and deduce the parity check matrix H.
  • 3) The minimum distance of the code is 4. Determine the correction and detection error capabilities.

Let us consider the cyclic code defined by the generator polynomial g(p) = (1 + p)(1 + p + p3). The polynomials 1 + p and 1 + p + p3 are irreducible factors of 1 + p7.

  • 4) Determine K and N.
  • 5) Let u = [0 0 1] be an information word. Calculate the associated systematic codeword.
  • 6) Give the hardware structure of the encoder.
  • 7) We receive the following word [0 1 0 1 1 1 1]. Check if this word is a codeword.

For this code, each codeword must be divisible by 1 + p + p3 and 1 + p (and consequently by g(p)).

  • 8) Fill in the table by indicating the number of errors corresponding to each case situation.
1+ p + p3 1+ p Number of errors
divisible divisible
non-divisible non-divisible
non-divisible divisible

We now assume that the decoder only verifies the divisibility by 1 + p + p3.

  • 9) Calculate the remainder of the division of the polynomial associated with the received word by 1 + p + p3.
  • 10) Give the hardware structure of the associated polynomial divisor. Give the output of the shift registers after each clock transition.
  • 11) Deduce the position of the error.
  • 12) Make the link between the error correcting code defined by the systematic matrix generator and the code defined by g(p).

3.8.14. Exercise 14: Hamming code

We consider the Hamming code (7, 4) defined by the generator polynomial g(p) = 1 + p + p3.

  • 1) Let u = [0 1 1 1] be an information word. Calculate the associated codeword c.
  • 2) Draw the hardware structure of the coder with premultiplication.
  • 3) Verify the result obtained in the first question.

3.8.15. Exercise 15: performance of the Reed–Solomon code (255, 191)

Let us consider the (255, 191) RS code defined in image. This code is used for example in the DVB-H standard.

  • 1) Since this code is MDS (dmin = N − K + 1), deduce the error correction capability.
  • 2) What is the maximum and minimum number of bits that this code is able to correct?
  • 3) Give the relation between the word error rate at the output of the hard decoding decoder and the symbol error rate at the input (SERI).
  • 4) Deduce the relation between the symbol error rate at the output (SERO) of the hard decoding decoder and the symbol error rate at the input (SERI).
  • 5) Numerical application: for SERI = 2 × 10−2, determine the symbol error rate at the output (we will only consider the first term in the sum). We can use the following relations: image

3.8.16. Exercise 16: CRC

In the radio frequency identification standard (RFID), a CRC of length 5 bits is added at each frame to detect the frame error at the receiver side. The generator polynomial used in this standard is g(p) = p5 + p2 + 1.

  • 1) Let us consider the information frame of length K = 6, u = [101001]. Determine the five CRC bits to be added in order to build the coded frame. Check that the coded frame is a codeword.
  • 2) Draw the hardware structure of the encoder.
  • 3) Let 10110001001 be the received sequence. Is it a codeword?

3.8.17. Exercise 17: Reed-Solomon code

Let us consider a (255, 239) RS code in image.

  • 1) Give the minimum distance and error capability of this error correcting code.
  • 2) Give the relation between the symbol error rate at the output SERO of the hard decoding decoder and the symbol error rate at the input SERI.
  • 3) Plot the curve SERO = f(SERI).