3
Linear Block Codes

3.1. Introduction

The aim of channel coding is to protect the data delivered by the source coder against transmission errors.

We have shown that random coding allows us to reach the limit of the channel coding theorem when the size N of the codewords tends to +∞. However, this technique is not feasible due to the complexity of the associated coder and decoder. Indeed, let us consider a random binary code C composed of 2^K codewords, where K is the number of bits of the information word and each information word is associated with a codeword of N bits. To construct a random encoder, it is necessary to first build a set of 2^K codewords drawn randomly. The encoding will correspond to the association of an information word with a unique codeword using a look-up table. Since the code has no specific structure, the decoding will consist of an exhaustive comparison of the received word with all the 2^K codewords of the set in order to determine the maximum likelihood (ML) codeword. The complexity of this decoder increases exponentially with K and is almost always unfeasible in practice.

As a consequence, we have to use codes with an algebraic structure such as the linearity property in order to simplify the coder and decoder. These codes will have to be adapted to the class of errors (random, isolated, bursty, etc.). In this book, we will focus on the three following families of codes:

  • – linear block codes: a q-ary block code C(N, K) is a set composed of q^K codewords. We associate a q-ary codeword composed of N symbols with each q-ary information word composed of K symbols. The linearity means that the N symbols of the codeword are obtained by linear combination of the K symbols of the information word.
  • – convolutional codes: compared to block codes, for a convolutional code of rate k/n, the codeword composed of n symbols is a function of the current information word composed of k symbols but also of a given number of previous information words. The length of the input and output sequences is infinite.
  • – concatenated codes: these codes are obtained by concatenation of linear block codes or convolutional codes.

In this chapter, we will study the main properties of linear block codes applied to error correction and error detection. After introducing some fundamental notions on finite fields in section 3.2, we will study the linear block codes, their structures, properties and their matrix representations in section 3.3. In section 3.4, we will develop the hard and soft input decoding algorithms for those codes. In section 3.5, we will study their theoretical performance and finally, we will focus on the class of cyclic codes and their properties in section 3.6.

3.2. Finite fields

3.2.1. Fields

A field F is a non-empty set that has two operations defined on it, namely addition and multiplication and satisfying the following axioms:

  • – F is an Abelian group under addition. It has the properties of associativity, identity element written 0, existence of symmetric (inverse) elements and commutativity;
  • – the multiplication is associative: if a, b, c ∈ F, then a(bc) = (ab)c;
  • – the multiplication is commutative: if a, b ∈ F, then ab = ba;
  • – the multiplication is right distributive and left distributive with respect to addition: if a, b, c ∈ F, then a(b + c) = ab + ac and (a + b)c = ac + bc;
  • – the field has an identity element denoted as 1 for the multiplication;
  • – each non-zero element of F is invertible; if a ∈ F (a ≠ 0), a^−1 is its inverse with aa^−1 = 1.

3.2.2. Finite fields

A finite field or Galois field is a field with q elements and is denoted by F_q or GF(q) in memory of Évariste Galois. It is possible to build a finite field if q is a prime number or if q = p^m with p a prime number. When q is a prime number, the addition and multiplication in the finite field GF(q) are the addition and multiplication modulo q. Since each finite field should contain the identity elements 0 and 1, the simplest finite field is GF(2). For digital communications, we will mainly use the finite fields GF(2) and GF(q) with q = 2^m since we mainly consider binary elements. In this chapter, we will mainly restrict ourselves to these two finite fields.

Example 3.1.– Addition and multiplication in GF(2).

Table 3.1. Addition and multiplication in GF(2)

+ | 0 1        × | 0 1
--+-----       --+-----
0 | 0 1        0 | 0 0
1 | 1 0        1 | 0 1

The addition in GF(2) is equivalent to the XOR operation while the multiplication is a logical AND.

Example 3.2.– Addition and multiplication in GF(3).

Table 3.2. Addition and multiplication in GF(3)

+ | 0 1 2      × | 0 1 2
--+-------     --+-------
0 | 0 1 2      0 | 0 0 0
1 | 1 2 0      1 | 0 1 2
2 | 2 0 1      2 | 0 2 1

3.2.3. Irreducible and primitive polynomials

Let f(p) be a polynomial whose coefficients are elements of GF(2): f(p) = f0 + f1p + f2p^2 + … + fmp^m with fi ∈ GF(2). The degree of the polynomial is the highest power of p with a non-zero coefficient. If fm ≠ 0, then the polynomial is of degree m.

Definition 3.1.– A polynomial is irreducible in GF(2) if it cannot be written as a product of polynomials b(p)c(p) with b(p) and c(p) of degree higher than or equal to 1.

Theorem 3.1.– All irreducible polynomials in GF(2) of degree m divide p^(2^m−1) − 1.

The proof of this theorem is given for example in [LIN 83].

In Table 3.3, we give the decomposition of the polynomials of the form p^(2^m−1) − 1 into a product of irreducible polynomials for m ≤ 5.

Table 3.3. Decomposition of the polynomials of the form p^(2^m−1) − 1 into a product of irreducible polynomials

m = 2   p^3 − 1 = (1 + p)(1 + p + p^2)
m = 3   p^7 − 1 = (1 + p)(1 + p + p^3)(1 + p^2 + p^3)
m = 4   p^15 − 1 = (1 + p)(1 + p + p^2)(1 + p^3 + p^4)(1 + p + p^4)(1 + p + p^2 + p^3 + p^4)
m = 5   p^31 − 1 = (1 + p)(1 + p^3 + p^5)(1 + p^2 + p^5)(1 + p^2 + p^3 + p^4 + p^5)
        (1 + p + p^3 + p^4 + p^5)(1 + p + p^2 + p^4 + p^5)(1 + p + p^2 + p^3 + p^5)

Definition 3.2.– An irreducible polynomial f(p) of degree m is a primitive polynomial if the smallest positive integer n for which f(p) divides p^n + 1 is n = 2^m − 1. A non-exhaustive list of the primitive polynomials is given in Table 3.4.

Table 3.4. List of primitive polynomials for m ≤ 16

image

Table 3.5. List of the elements of the finite field GF(4)

Element   Polynomial   Binary representation
0         0            00
1         1            01
α         p            10
α^2       1 + p        11

3.2.4. Finite field with 2^m elements

A finite field GF(q) with q = 2^m is isomorphic to the field of polynomials with coefficients in GF(2) modulo an irreducible polynomial f(p) of degree m in GF(2). GF(2) is called the base field. Let α be a root of this polynomial (f(α) = 0). We can show that the successive powers of α generate the 2^m − 1 non-zero elements of the finite field GF(2^m).

Example 3.3.– Addition and multiplication in GF(4).

Let us study the finite field GF(4) (m = 2 case) built from the primitive polynomial 1 + p + p^2. Let α be a root of this polynomial: 1 + α + α^2 = 0. We can check that the successive powers of α generate the 2^2 − 1 = 3 non-zero elements of GF(4). Table 3.5 gives the list of the elements of the finite field GF(4).

Table 3.6. Addition and multiplication in GF(4)

+   | 0    1    α    α^2        ×   | 0    1    α    α^2
----+-------------------        ----+-------------------
0   | 0    1    α    α^2        0   | 0    0    0    0
1   | 1    0    α^2  α          1   | 0    1    α    α^2
α   | α    α^2  0    1          α   | 0    α    α^2  1
α^2 | α^2  α    1    0          α^2 | 0    α^2  1    α

We have α^2 = 1 + α, α^3 = α + α^2 = 1 and α^4 = α. We can check from Table 3.6 that GF(4) is indeed a finite field.

Example 3.4.– List of the elements in GF(16). Let m = 4 and the irreducible polynomial f(p) = 1 + p + p^4. We can check in Table 3.3 that this polynomial of degree 4 is a factor of p^15 − 1 (the other irreducible polynomials of degree 4 are 1 + p^3 + p^4 and 1 + p + p^2 + p^3 + p^4). Let α be a root of this polynomial: 1 + α + α^4 = 0. As previously, the successive powers of α generate the 2^4 − 1 = 15 non-zero elements of GF(16). Table 3.7 gives the list of these elements.

Table 3.7. List of elements of the finite field GF(16) built from the irreducible polynomial 1 + p + p^4

Element   Polynomial            Binary representation
0         0                     0000
1         1                     0001
α         p                     0010
α^2       p^2                   0100
α^3       p^3                   1000
α^4       1 + p                 0011
α^5       p + p^2               0110
α^6       p^2 + p^3             1100
α^7       1 + p + p^3           1011
α^8       1 + p^2               0101
α^9       p + p^3               1010
α^10      1 + p + p^2           0111
α^11      p + p^2 + p^3         1110
α^12      1 + p + p^2 + p^3     1111
α^13      1 + p^2 + p^3         1101
α^14      1 + p^3               1001
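The construction of Table 3.7 can be reproduced programmatically. The following sketch (not from the book; the bit-ordering convention is an assumption, with bit i holding the coefficient of p^i) generates the successive powers of α modulo f(p) = 1 + p + p^4:

def gf2m_powers(m, poly):
    """Return the list alpha^0, alpha^1, ..., alpha^(2^m - 2) as integers."""
    elements = []
    x = 1                      # alpha^0 = 1
    for _ in range(2 ** m - 1):
        elements.append(x)
        x <<= 1                # multiply by alpha (i.e. by p)
        if x & (1 << m):       # degree reached m: reduce modulo f(p)
            x ^= poly          # poly encodes f(p), including the p^m term
    return elements

if __name__ == "__main__":
    f = 0b10011                # f(p) = 1 + p + p^4
    for i, e in enumerate(gf2m_powers(4, f)):
        print(f"alpha^{i:<2} -> {e:04b}")   # binary representation, leftmost bit = p^3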

With each non-zero element of the finite field GF(q), q = 2^m, we associate a minimal polynomial.

Definition 3.3.– The minimal polynomial mi(p) is the polynomial of lowest degree for which α^i is a root.

For the previous example, the 15 minimal polynomials are the following:

[3.1]   m1(p) = m2(p) = m4(p) = m8(p) = 1 + p + p^4
        m3(p) = m6(p) = m9(p) = m12(p) = 1 + p + p^2 + p^3 + p^4
        m5(p) = m10(p) = 1 + p + p^2
        m7(p) = m11(p) = m13(p) = m14(p) = 1 + p^3 + p^4
        m15(p) = 1 + p

Theorem 3.2.– Each minimal polynomial mi(p) is irreducible.

The proof is given in [LIN 83]. We can check that the five polynomials above are irreducible and are factors of p^15 − 1. We will see further in the chapter that the irreducible polynomials of this table are among the main polynomials used for the construction of cyclic codes.

3.3. Linear block codes

3.3.1. Introduction

A q-ary linear block code C(N, K) is a set composed of q^K codewords.

We associate a q-ary codeword composed of N symbols with each q-ary information word composed of K symbols. The linearity means that the N symbols of the codeword are obtained by linear combination of the K symbols of the information word. This property allows us, in particular, to describe the coding operation in a matrix form.

In this section, we will only consider the binary linear block codes for which q = 2.

It is convenient to represent the information word and codeword using vectors. Let u = [u0, u1, …, uK−1] be an information word composed of K information symbols ui ∈ GF(2) and c = [c0, c1, …, cN−1] be the associated codeword composed of N symbols ci ∈ GF(2). We have the following matrix relation between the information word u and the associated codeword c:

[3.2] c = uG

G is the generator matrix of the coder with dimension K × N.

Each codeword is a linear combination of the row vectors gi of G, whose elements belong to GF(2):

[3.3] c = u0g0 + u1g1 + … + uK−1gK−1

Thus, a linear block code can be defined as a vector subspace of dimension K < N built according to [3.3].

It is always possible, by combining the rows, to put the generator matrix G into the following systematic form:

[3.4] G = [IK P]

where IK is the identity matrix of dimension K × K and P is a matrix of dimension K × (N − K). When the generator matrix is systematic, the first K bits of the codeword are the information bits. We have: c = [u0, u1, …, uK−1, cK, cK+1, …, cN−1].
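The encoding operation [3.2] reduces to a matrix product over GF(2). The following sketch (a minimal illustration, not the book's implementation) encodes an information word with the systematic generator matrix of the Hamming code C3 (7, 4) used later in this chapter:

import numpy as np

# systematic generator matrix of the Hamming code C3(7, 4), consistent with Table 3.8
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])

def encode(u, G):
    """Encode the information word u (list of K bits) into a codeword of N bits."""
    return np.mod(np.dot(u, G), 2)

print(encode([1, 1, 1, 0], G))   # the first 4 bits of the codeword are u itself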

Example 3.5.– Repetition code C1 (3, 1) in GF(2):

[3.5] G = [1 1 1]

The information bit is repeated three times:

c = [u0 u0 u0]

Example 3.6.– Parity check code C2 (3, 2) in GF(2):

[3.6] G = | 1 0 1 |
          | 0 1 1 |

c2 is the so-called parity bit:

c2 = u0 + u1

Each codeword is composed of an even number of 1. The codewords of this code C2 are 000, 011, 110 and 101. Figure 3.1 gives a graphical representation of this code in a three-dimension (3D) space.

Example 3.7.– Hamming code C3 (7, 4) in GF(2) is defined by the following generator matrix:

[3.7] images
image

Figure 3.1. Parity check code C2 (3, 2)

Here, we can obtain a systematic generator matrix simply as follows: we add lines 1, 2 and 3 to obtain line 1, lines 2, 3 and 4 to obtain line 2 and lines 3 and 4 to obtain line 3. The last line remains unchanged. This technique allows us to convert any generator matrix into a systematic generator matrix.

The systematic form of this generator matrix is as follows:

[3.8] G = | 1 0 0 0 1 1 0 |
          | 0 1 0 0 0 1 1 |
          | 0 0 1 0 1 1 1 |
          | 0 0 0 1 1 0 1 |

The three parity or redundancy bits are equal to:

c4 = u0 + u2 + u3
c5 = u0 + u1 + u2
c6 = u1 + u2 + u3

The 2^4 = 16 codewords for this code are listed in Table 3.8.

Table 3.8. List of codewords for Hamming code C3 (7, 4)

0000000 0001101 0010111 0011010
0100011 0101110 0110100 0111001
1000110 1001011 1010001 1011100
1100101 1101000 1110010 1111111

Definition 3.4.– The rate R of a block code (N, K) is equal to:

[3.9] R = K/N

Definition 3.5.– Let c1 and c2 be two codewords of code C, α1 and α2 be two elements of the finite field. The linearity implies that α1c1 + α2c2 is also a codeword of C. Consequently, the word c0 = [00 … 0] is always a codeword of any linear code. This codeword is called the null codeword.

3.3.2. Minimum distance of a code

Definition 3.6.– Let c1 and c2 be two codewords of length N of the code C; the Hamming distance dH(c1, c2) is equal to the number of positions in which they differ.

images

Definition 3.7.– The Hamming weight w(c) of a binary codeword c is equal to the number of non-zero elements of this codeword.

images

Definition 3.8.– The minimum distance dmin of the code C is the Hamming distance between the pair of codewords with the smallest Hamming distance:

[3.10] dmin = min dH(c1, c2)   with c1, c2 ∈ C, c1 ≠ c2

When the code is linear, the minimum distance dmin is equal to the minimum Hamming weight of the code C (by excluding the null codeword c0):

[3.11] dmin = min w(c)   with c ∈ C, c ≠ c0

Example 3.8.– The minimum distance of the code C1 (3, 1) and code C3 (7, 4) is equal to 3; the minimum distance of the code C2 (3, 2) is equal to 2.
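For small codes, the minimum distance can be checked by enumerating all the codewords and applying [3.11]. A minimal sketch (illustrative only, here with the systematic generator matrix of the Hamming code C3 (7, 4)):

from itertools import product
import numpy as np

def codewords(G):
    K = G.shape[0]
    return [np.mod(np.dot(u, G), 2) for u in product([0, 1], repeat=K)]

def minimum_distance(G):
    # minimum Hamming weight over the non-zero codewords (equation [3.11])
    return min(int(c.sum()) for c in codewords(G) if c.any())

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])
print(minimum_distance(G))   # -> 3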

Until recently, the minimum distance was the unique criterion to evaluate the performance of error correcting codes. This criterion has been partially challenged with the discovery of very efficient codes imitating random coding [BER 93, BAT 97].

3.3.3. Parity check matrix

There exists a linear block code (N, N − K) associated with each linear block code C (N, K). Let H be the generator matrix of this dual code. Each codeword c of the code C is orthogonal to all the codewords of the dual code:

[3.12] cH^T = 0

Since this relation is true for all the codewords of the code C, we have the relation between the generator matrix G of the code C and H:

[3.13] GH^T = 0

If the generator matrix G is systematic as described in equation [3.4], H can be described as follows:

[3.14] H = [P^T IN−K]

where IN−K is the identity matrix of dimension (N − K) × (N − K).

The matrix H is called the parity check matrix of the code C.

Example 3.9.– The parity check matrix of the Hamming code C3 (7, 4) is as follows:

[3.15] H = | 1 0 1 1 1 0 0 |
           | 1 1 1 0 0 1 0 |
           | 0 1 1 1 0 0 1 |

We can observe that the parity check matrix of the Hamming code C3 is composed of the seven different non-zero column vectors of dimension 3. Each of the three lines of the parity check matrix corresponds to a parity check equation (addition modulo 2 in the case of a binary code) involving the different bits of the codeword c = (c0 c1 c2 c3 c4 c5 c6) = (u0 u1 u2 u3 c4 c5 c6).

[3.16]  c0 + c2 + c3 + c4 = 0
        c0 + c1 + c2 + c5 = 0
        c1 + c2 + c3 + c6 = 0

We find the same three parity check equations as in the previous example.
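The orthogonality relations [3.12] and [3.13] can be verified numerically. A short sketch (illustrative only, using the G and H matrices given above):

import numpy as np
from itertools import product

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])
H = np.array([[1, 0, 1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

# G H^T = 0 (relation [3.13])
assert not np.mod(G @ H.T, 2).any()

# every codeword has a null syndrome c H^T = 0 (relation [3.12])
for u in product([0, 1], repeat=4):
    c = np.mod(np.array(u) @ G, 2)
    assert not np.mod(c @ H.T, 2).any()
print("all parity check equations are satisfied")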

3.3.4. Weight enumerator functions

The weight enumerator functions (WEFs) allow us to study the performance of the linear block codes.

Definition 3.9.– The WEF of a systematic binary block coder (N, K) is defined as follows:

[3.17] A(D) = Σd=0..N Ad D^d

Ad is the number of codewords with length N and Hamming weight d.

Definition 3.10.– The input redundancy weight enumerator function (IRWEF) of a systematic binary block coder (N, K) is defined as follows:

[3.18] A(W, Z) = Σw=0..K Σz=0..N−K Aw,z W^w Z^z

Aw,z is the number of codewords with length N and for which the weight of the information sequence is w and the weight of the redundancy sequence is equal to z.

The IRWEF function can also be written as follows:

[3.19] A(W, Z) = Σw W^w Aw(Z)

with:

[3.20] Aw(Z) = Σz Aw,z Z^z

Definition 3.11.– The input output weight enumerator function (IOWEF) of a systematic binary block coder (N, K) is defined as follows:

[3.21] B(W, D) = Σw=0..K Σd=0..N Bw,d W^w D^d

Bw,d is the number of codewords with length N and Hamming weight d and for which the weight of the information sequence is w.

Example 3.10.– Parity check code C2 (3, 2) in GF(2).

Table 3.9. Weight enumeration of information sequence and codewords for the parity check code (3, 2)

u    c     w  z  d
00   000   0  0  0
01   011   1  1  2
10   101   1  1  2
11   110   2  0  2

The weights of the different codewords are detailed in Table 3.9. The WEF, IRWEF and IOWEF for the parity check code (3, 2) are the following:

A(D) = 1 + 3D^2
A(W, Z) = 1 + 2WZ + W^2
B(W, D) = 1 + 2WD^2 + W^2D^2

Example 3.11.– Hamming code C3 (7, 4). From the list of codewords, the WEF, IRWEF and IOWEF for the code C3 (7, 4) are the following:

A(D) = 1 + 7D^3 + 7D^4 + D^7
A(W, Z) = 1 + W(3Z^2 + Z^3) + W^2(3Z + 3Z^2) + W^3(1 + 3Z) + W^4Z^3
B(W, D) = 1 + W(3D^3 + D^4) + W^2(3D^3 + 3D^4) + W^3(D^3 + 3D^4) + W^4D^7
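The coefficients Ad, Aw,z and Bw,d of a short code can be obtained by exhaustive enumeration of the codewords. A sketch (illustrative only) that reproduces the weight distribution above for the Hamming code C3 (7, 4):

from collections import Counter
from itertools import product
import numpy as np

def weight_enumerators(G):
    K, N = G.shape
    A_d, A_wz, B_wd = Counter(), Counter(), Counter()
    for u in product([0, 1], repeat=K):
        c = np.mod(np.array(u) @ G, 2)
        w, d = sum(u), int(c.sum())          # information weight and codeword weight
        A_d[d] += 1
        A_wz[(w, d - w)] += 1                # z = d - w for a systematic code
        B_wd[(w, d)] += 1
    return A_d, A_wz, B_wd

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])
print(weight_enumerators(G)[0])   # Counter({3: 7, 4: 7, 0: 1, 7: 1})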

3.3.5. Error correction and error detection capabilities of linear block codes

Over a binary symmetric channel, the number of errors e that an error correction code is capable of correcting is directly related to the minimum distance of the code. The 2^K codewords of the code can be seen as the centers of Hamming spheres of radius e. In order that the 2^K spheres do not overlap, we should have the following relation:

[3.22] dmin ≥ 2e + 1, i.e. e = ⌊(dmin − 1)/2⌋

If a received word is inside the Hamming sphere of the transmitted codeword, the decoder will be able to recover the codeword as shown in Figure 3.2. This is possible only if the Hamming distance between the transmitted codeword and the received word is less than or equal to e. As a conclusion, a linear block code (N, K) with minimum distance dmin can correct up to e errors according to relation [3.22].

image

Figure 3.2. Hamming sphere

Furthermore, the same code can also be used as an error detection code. In that case, it will be able to detect up to ed errors with:

[3.23] ed = dmin − 1

A code can correct up to e errors (error correction) or detect up to ed errors (error detection) in a binary symmetric channel.

A code can correct up to dmin − 1 erasures on an erasure channel.

[3.24] images

Example 3.12.– The codes C1 (3, 1) and C3 (7, 4) with minimum distance 3 can correct one error or detect up to two errors.

3.3.5.1. Hamming bound and perfect codes

Definition 3.12.– The Hamming bound is the following relation between K, N and the number of errors e that a code (N, K) in GF(q) can correct:

[3.25] q^K Σi=0..e C(N, i)(q − 1)^i ≤ q^N

where C(N, i) denotes the binomial coefficient. By dividing the two terms by q^K, we obtain:

[3.26] Σi=0..e C(N, i)(q − 1)^i ≤ q^(N−K)

Proof.– The number of words included in a Hamming sphere of radius e is equal to:

[3.27] Σi=0..e C(N, i)(q − 1)^i

The total number of words included in the q^K Hamming spheres cannot be higher than q^N to avoid overlapping of the spheres. Consequently, a code correcting e errors should satisfy the inequality [3.25].

In the binary case, the inequality [3.25] becomes the following:

[3.28] Σi=0..e C(N, i) ≤ 2^(N−K)

Definition 3.13.– A code is perfect if all the q^N possible words are included in the q^K Hamming spheres of radius e. The inequality in [3.28] is then replaced by an equality.
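The perfect-code condition is easy to check numerically from the binary Hamming bound [3.28]. A small sketch (illustrative only):

from math import comb

def is_perfect(N, K, e):
    # sphere volume sum_{i=0}^{e} C(N, i) compared with 2^(N-K)
    sphere = sum(comb(N, i) for i in range(e + 1))
    return sphere == 2 ** (N - K)

print(is_perfect(7, 4, 1))     # Hamming (7, 4):   True
print(is_perfect(23, 12, 3))   # Golay (23, 12):   True
print(is_perfect(24, 12, 3))   # extended Golay:   False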

3.3.6. Bounds on the minimum distance of the linear block codes

In this section, we will give the main upper and lower bounds on the minimum distance of linear block codes as a function of the rate R = K/N.

A first upper bound is the Singleton bound:

[3.29] dmin ≤ N − K + 1

The codes for which the inequality is replaced by an equality in relation [3.29] are called maximum distance separable (MDS) codes.

By dividing the two terms of [3.29] by N, we have:

[3.30] dmin/N ≤ 1 − R + 1/N

When N → +∞ we obtain:

[3.31] dmin/N ≤ 1 − R

A second upper bound is the Plotkin bound that is defined when N → +∞:

[3.32] images

A tighter upper bound is the Elias–Bassalygo bound [BAS 65]

[3.33] images

where Hq(α) = α logq(q − 1) − α logq(α) − (1 − α) logq(1 − α).

For the binary codes, the Elias-Bassalygo bound is given by:

[3.34] images

where H2(α) = −α log2(α) − (1 − α) log2(1 − α).

The upper bound of Hamming is obtained from the Hamming bound developed in section 3.3.5.1:

[3.35] images

Finally, we will give the most popular lower bound of Gilbert–Varshamov:

[3.36] images

For binary codes, we have:

[3.37] images

In Figure 3.3, we show the curves R = f(dmin/N) relative to the lower bound of Gilbert–Varshamov and the upper bounds of Plotkin, Hamming and Elias–Bassalygo in the binary case. All the points of coordinate (dmin/N, R) under the Gilbert–Varshamov bound are reachable and all the points above the Elias–Bassalygo bound are unreachable. We can see that the upper bound of Hamming is better than the Plotkin bound when dmin/N < 0.3.

image

Figure 3.3. Bounds on the minimum distance for linear block codes with q = 2
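The curves of Figure 3.3 can be reproduced from the bounds above. The sketch below uses the standard asymptotic forms of the binary bounds as a function of δ = dmin/N (this is an assumption: the exact expressions of equations [3.31]–[3.37] are not reproduced here):

import numpy as np
import matplotlib.pyplot as plt

def H2(x):
    # binary entropy function, clipped to avoid log(0)
    x = np.clip(x, 1e-12, 1 - 1e-12)
    return -x * np.log2(x) - (1 - x) * np.log2(1 - x)

delta = np.linspace(0.001, 0.5, 500)
plt.plot(delta, 1 - delta, label="Singleton")
plt.plot(delta, np.maximum(1 - 2 * delta, 0), label="Plotkin")
plt.plot(delta, 1 - H2(delta / 2), label="Hamming")
plt.plot(delta, 1 - H2((1 - np.sqrt(1 - 2 * delta)) / 2), label="Elias-Bassalygo")
plt.plot(delta, 1 - H2(delta), label="Gilbert-Varshamov (lower bound)")
plt.xlabel("dmin/N"); plt.ylabel("R"); plt.legend(); plt.show()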

In Figure 3.4, we show the curves R = f(dmin/N) relative to the lower bound of Gilbert–Varshamov and the upper bounds of Plotkin, Hamming and Elias–Bassalygo in the case q = 28. We can see that the results are quite different from the binary case. The Plotkin bound (equivalent to the Singleton bound in that case) is tighter than the upper bound of Elias–Bassalygo when q ≥ 16.

In the non-binary case, there exist MDS codes such as the Reed–Solomon (RS) codes, while for binary codes, only trivial codes such as the parity check codes (N, N − 1) achieve the Singleton bound.

3.3.7. Bounds on the rate in the non-asymptotic regime

From Figure 1.18 given in Chapter 1, we can write the rate R as the ratio H(U)/H(X). When the input of the channel is binary, we have H(X) = 1 Sh/symb and R = H(U). The theorem of channel coding given by relation [1.105] becomes R ≤ C. Information theory gives the expression of the channel capacity C, which is the achievable rate under the assumption that the length of the codewords tends asymptotically toward infinity. From these results, different researchers including Feinstein and Elias [ELI 55] have evaluated the word error rate of communication systems in the non-asymptotic regime, i.e. when the length N of the codewords is not infinite. This probability decreases exponentially with the length N: Pe ≈ exp(−NE(R)), where E(R) is a function of the rate R called the error exponent, as shown in Figure 3.5. E(R) is a positive function for all the values of R less than the capacity.

image

Figure 3.4. Bounds on the minimum distance for linear block codes with q = 256

In 1967, Shannon, Gallager and Berlekamp [SHA 67] introduced a lower bound on the error exponent called sphere packing bound. For the memoryless discrete channel, this bound can be written as follows:

[3.38] images
[3.39] images

with

[3.40] images

and

image

Figure 3.5. Error exponent function versus rate

The maximum in equation [3.41] is calculated over the set of a priori probability vectors q = [q1, …, qM].

For the BSC channel, a tighter upper bound based on the random coding has been proposed by Poltyrev [POL 94] and more recently by Polyanskiy et al. [POL 10]. The authors have shown that for a BSC channel with transition probability p, there exists a linear code (N, K) that can guarantee a given word error probability Pe. We have:

[3.42] images

In Figure 3.6, we show the curves rate versus length N obtained using the Poltyrev bound for different word error probabilities Pe.

image

Figure 3.6. Poltyrev bound rate versus length N

For the additive white Gaussian noise (AWGN), Shannon [SHA 59a] has determined a sphere packing bound as follows:

[3.43] images

where image. This lower bound is computed under the hypothesis that the 2^K vectors associated with the codewords are distributed on the surface of a hypersphere of radius √(NEs). Consequently, Q(Θs, A) is the probability that the received vector is outside the cone corresponding to the transmitted codeword. This probability can be written as:

[3.44] images

where the half angle Θs of the elementary cone is computed such that the fraction between the surface of the spherical cap and the surface of the associated hypersphere is equal to:

[3.45] images

with

[3.46] images

In Figure 3.7, we have given the ratio Eb/N0 as a function of N for R = 1/2 and R = 1/3 obtained using the sphere packing bound.

3.3.8. Main block codes

In this section, we will present some important classes of block codes. Other families of block codes will be developed in section 3.6 on cyclic codes. We will first describe the perfect block codes introduced in section 3.3.5.1. Two classes of perfect block codes exist: the Hamming codes and Golay codes. Finally, we will introduce the class of Reed–Muller codes covering a large range of dimension and minimum distance.

3.3.8.1. Hamming codes

Hamming codes are perfect binary linear block codes (N, K) with N = 2^J − 1 and K = 2^J − 1 − J. A Hamming code (N, K) can be simply described by its parity check matrix H of dimension J × N since J = N − K. Indeed, the columns of H are the N non-null binary vectors containing J elements. For example, for J = 3, the Hamming code is a (7, 4) code and its parity check matrix is given by [3.15].

The minimum distance of these codes is equal to 3, and they can correct up to one error. We will see in section 3.6 that the Hamming codes belong to the class of cyclic codes and can be built from a generator polynomial.

image

Figure 3.7. Sphere packing bound ratio Eb/N0 versus N

3.3.8.2. Golay code

The binary Golay code is a binary linear block code (23, 12) and its minimum distance is equal to 7. This code is perfect since for N = 23, K = 12 and e = 3, we have the equality:

2^12 (1 + C(23, 1) + C(23, 2) + C(23, 3)) = 2^12 (1 + 23 + 253 + 1771) = 2^12 × 2^11 = 2^23

The Golay code is also a cyclic code. The systematic generator matrix of the (23, 12) Golay code is the following:

[3.47] images

Its WEF is equal to:

[3.48] A(D) = 1 + 253D^7 + 506D^8 + 1288D^11 + 1288D^12 + 506D^15 + 253D^16 + D^23

From the (23, 12) Golay code, it is possible to build an extended Golay code (24, 12) by adding a parity bit. The minimum distance of this code is equal to 8. The non-systematic generator matrix of the extended Golay code (24, 12) is the following:

[3.49] images

The WEF is equal to:

[3.50] A(D) = 1 + 759D^8 + 2576D^12 + 759D^16 + D^24

The ternary Golay code is a linear code (11, 6) in GF(3) and its minimum distance is equal to 5. The non-systematic generator matrix of this code is the following:

[3.51] images

It is a perfect code since we have:

3^6 (1 + 2C(11, 1) + 2^2 C(11, 2)) = 3^6 (1 + 22 + 220) = 3^6 × 3^5 = 3^11

3.3.8.3. Reed–Muller codes

The Reed–Muller codes are a class of codes covering a large range of dimensions and minimum distances. They can be graphically described using simple trellises allowing efficient soft decoding. For each integer m and r < m, there is a Reed–Muller code (N, K, dmin) with:

[3.52] N = 2^m, K = Σi=0..r C(m, i), dmin = 2^(m−r)

The generator matrix of a Reed–Muller code of order r is built by the concatenation of r + 1 matrices as follows:

[3.53] G = | G0 |
           | G1 |
           | ⋮  |
           | Gr |

where G0 = [1 1 1 … 1 1] is of dimension 1 × N, and G1 is the matrix of dimension m × N whose columns are the N different binary column vectors of dimension m. The dimension of the matrix G2 is C(m, 2) × N. Each line of G2 is obtained by performing the element-by-element product of two lines of G1. More generally, the dimension of the matrix Gi is C(m, i) × N and its lines are obtained by performing the element-by-element product of i lines of the matrix G1.

Example 3.13.– Let us construct a Reed–Muller code (8, 4, 4) of order r = 1 with m = 3. We have G0 = [1 1 1 1 1 1 1 1] and:

[3.54] images

Consequently, we have:

[3.55] images

In order to construct the Reed–Muller code (8, 7, 2) of order 2, it is just necessary to add to the previous generator matrix the matrix G2 of dimension 3 × 8 with

[3.56] images
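The construction described above can be written compactly. The following sketch (illustrative only; the ordering of the columns of G1 is a free choice and may differ from [3.54]) builds the generator matrix of a Reed–Muller code of order r:

from itertools import combinations, product
import numpy as np

def reed_muller_generator(r, m):
    N = 2 ** m
    cols = list(product([0, 1], repeat=m))            # the N binary column vectors
    G1 = np.array(cols).T                             # dimension m x N
    rows = [np.ones(N, dtype=int)]                    # G0, the all-one row
    for i in range(1, r + 1):
        for idx in combinations(range(m), i):         # C(m, i) rows for Gi
            prod_row = np.ones(N, dtype=int)
            for j in idx:
                prod_row *= G1[j]                     # element-by-element product
            rows.append(prod_row)
    return np.array(rows)

G = reed_muller_generator(1, 3)      # Reed-Muller (8, 4, 4) of order 1
print(G.shape)                       # (4, 8)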

In Table 3.10, we give the list of Reed–Muller codes that can be obtained for m = 3, 4 and 5

3.3.9. Graphical representations of binary linear block codes

We will now present different graphical representations such as the Tanner graph or the trellis diagram which are practical tools to study error correcting codes and derive efficient decoding algorithms.

Table 3.10. List of Reed–Muller codes (N, K, dmin) for m = 3, 4 and 5

          m = 3     m = 4      m = 5
order 0 (8,1,8) (16,1,16) (32,1,32)
order 1 (8,4,4) (16,5,8) (32,6,16)
order 2 (8,7,2) (16,11,4) (32,16,8)
order 3 (16,15,2) (32,26,4)
order 4 (32,31,2)

3.3.9.1. Tanner graph

Generally speaking, any binary linear code can be graphically represented by a Tanner graph. A Tanner graph is a bipartite graph composed of two types of nodes: the binary variable nodes, represented by circles, and the parity check nodes, represented by squares. Each branch indicates a dependency between the variable node and the parity check node that it connects.

The Tanner graph can be deduced directly from the parity check matrix H of the code: each parity check node Ti of the Tanner graph corresponds to the i-th line of the parity check matrix H and each variable node cj corresponds to the j-th column of H. Indeed, each line of H defines a parity check equation between different variables. A branch will connect the parity check node Ti with the variable node cj if and only if hij = 1.

Example 3.14.– Let us consider again the Hamming code C3 (7, 4), the systematic generator matrix of which is given in equation [3.8]. We have shown that each of the three lines of the parity check matrix H corresponds to a parity check equation that defines links between the different bits of the codeword c = (c0 c1 c2 c3 c4 c5 c6) = (u0 u1 u2 u3 c4 c5 c6):

[3.57]  T1: c0 + c2 + c3 + c4 = 0
        T2: c0 + c1 + c2 + c5 = 0
        T3: c1 + c2 + c3 + c6 = 0

The associated Tanner graph is given in Figure 3.8. In version (b), the variable nodes are situated on the left side and the parity check nodes are situated on the right side.

image

Figure 3.8. Two versions of the Tanner graph for the Hamming code (7, 4)

3.3.9.2. Trellis diagram for linear block codes

Wolf [WOL 78], Massey [MAS 78] in 1978 and then Forney [FOR 88] in 1988 have shown that it is possible to represent any block code using a trellis diagram composed of N elementary sections. This trellis fully describes the set of codewords.

The trellis diagram can be deduced from the relation cH^T = 0 with H = [h0 h1 … hN−1] the parity check matrix of the code, where hi denotes the i-th column of H. Let si be the (N − K) × 1 vector representing the state of the trellis at position i. The state vector si is also called the partial syndrome. We have the following recursive relation:

[3.58] si+1 = si + ci hi

ci is chosen under the constraint that the vectors associated with the paths in the trellis diagram constructed until the position i + 1 are partial codewords. If the arrival state is si+1 and the departure state is si, then a branch will connect these two states in the trellis diagram. In the binary case, by convention, a dotted line will correspond to ci = 0 and a plain line will correspond to ci = 1 as shown in Figure 3.9.

image

Figure 3.9. Branch of a trellis section

Let us consider the previous example of the Hamming code C3 (7, 4). From the parity check matrix given in [3.15], we can draw the trellis diagram of this code as shown in Figure 3.10.

image

Figure 3.10. Trellis diagram obtained from the parity check matrix of the code C3

The number of branches in this trellis diagram is equal to 2^1 + 2^2 + 2^3 + 2^4 + 2^3 + 2^2 + 2^1 = 44. We will show in the next section that it is also possible to build an equivalent trellis diagram when the code is cyclic.

Another method to build a trellis diagram is to use the generator matrix [FOR 88]. We must first adapt the generator matrix for the trellis representation i.e. to build a trellis diagram with the minimum number of branches and nodes. We will now describe this method.

Let us consider a codeword c = (c0 c1 … cN−1). We define L(c) as the smallest index i such that ci ≠ 0 and R(c) as the highest index i such that ci ≠ 0. The envelope of c is the sequence of 1s starting at the first non-zero bit of the codeword and finishing at the last non-zero bit of the codeword. For example, the envelope associated with the codeword 1101000 is 1111000. The span of c is defined as span(c) = R(c) − L(c).

A greedy algorithm proposed by Kschischang and Sorokine [KSH 95] allows us to determine a generator matrix with minimum span, i.e. minimizing the number of branches in the trellis diagram. The pseudo-code is given as follows (a sketch implementing this idea is given after the list):

  • – Step 1: find a pair of lines ci and cj in the generator matrix G such that L(ci) = L(cj) and R(ci) ≤ R(cj), or R(ci) = R(cj) and L(ci) ≥ L(cj);
  • – Step 2: if no pair is found at step 1, go to step 4;
  • – Step 3: apply ci = ci + cj, i.e. replace line ci of the generator matrix by the sum of the two lines;
  • – Step 4: the obtained matrix G is a minimum span generator matrix.
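The following sketch is a simplified variant of this greedy procedure (an assumption: it repeatedly replaces a row by the sum of two rows whenever this strictly reduces its span, which captures the spirit of steps 1–4 but may differ from the exact selection rule of [KSH 95]):

import numpy as np

def span(row):
    ones = np.flatnonzero(row)
    return int(ones[-1] - ones[0])          # R(c) - L(c)

def minimum_span_matrix(G):
    G = G.copy()
    K = G.shape[0]
    improved = True
    while improved:
        improved = False
        for i in range(K):
            for j in range(K):
                if i == j:
                    continue
                candidate = np.mod(G[i] + G[j], 2)
                if candidate.any() and span(candidate) < span(G[i]):
                    G[i] = candidate        # replace line ci by ci + cj (step 3)
                    improved = True
    return G

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])
print(minimum_span_matrix(G))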

Example 3.15.– Let us consider the Hamming code C3 (7, 4) defined by the systematic generator matrix given by [3.8] and its associated envelope matrix:

[3.59] images

By applying the above algorithm (for example by summing lines 2 and 3, then lines 3 and 4, then lines 1 and 3 and finally lines 1 and 2), we obtain the following generator matrix and envelope matrix:

[3.60] images

In order to determine the number of branches of the associated trellis diagram, we count the number of 1s per column of the envelope matrix. Let ni be the number of 1s in the i-th column; the total number of branches is equal to Σi 2^(ni). In this example, the total number of branches in the trellis diagram is equal to 44.

Then, to construct the trellis diagram of this code, we have to consider each line of the generator matrix as a subcode (N, 1) composed of only 2 codewords.

Let us construct the trellis diagram from the obtained generator matrix G. The different construction phases of the trellis diagram are given in Figure 3.11. For the first line, the subcode contains the codewords 0000000 and 1101000. The associated trellis diagram contains 2 paths. We then construct the trellis diagram corresponding to the first two lines of the generator matrix. This trellis diagram is simply the product of the trellis diagrams of the subcodes associated with the first and second lines. This procedure is repeated until the end of the construction of the trellis diagram. We can check that the obtained trellis diagram is composed of 44 branches. We have the same low complexity as when using the construction based on the parity check matrix [MCE 96].

Since the complexity of the decoding algorithms based on trellis diagram is proportional to its number of branches, it is important to build a trellis with the smallest number of branches. In this section, we have presented two approaches to obtain such a trellis diagram.

image

Figure 3.11. Trellis diagram of Hamming code (7,4)

3.4. Decoding of binary linear block codes

3.4.1. Introduction

In Chapter 1, we introduced different transmission channel models. In Figure 3.12, we show a point-to-point transmission chain with an AWGN channel.

image

Figure 3.12. Block diagram of a transmission chain with an additive white Gaussian noise channel

So, we can distinguish two cases:

– If the input of the decoder is binary, we will perform hard input decoding. The combination of the modulator, the additive white Gaussian noise channel and the decision device can be seen as an equivalent BSC channel. The error probability p of this BSC channel is equal to the error probability of an uncoded transmission over an additive white Gaussian noise channel, as given by the following equation:

[3.61] p = Q(√(2REb/N0)) = ½ erfc(√(REb/N0))
where:

[3.62] images

In [3.61], the energy per bit Eb of the non-coded case is replaced by REb since we have to take into account the impact of the rate R of the channel encoder. In that case, the hard input decoder will compute Hamming distances. The relation between the received word r and the codeword c can be written as:

[3.63] r = c + e

where e is the binary error vector (here the addition is performed modulo 2).

– If the input of the decoder is real, we will perform soft input decoding. In practice, the input will be quantized using a few bits (3 to 4 bits are usually sufficient not to degrade the performance of the decoding). Compared to hard input decoding, soft input decoding will compute Euclidean distances. The input output relation is the following:

[3.64] y = x + n

where x is the transmitted word, n is the noise vector and y is the received word.

In this section, we will also study the decoding of block codes on the erasure channel.

3.4.2. Optimal decoding

The aim of optimal decoding is to determine the ML sequence denoted as x̂.

Let x = (x0, x1, x2, …, xN−1) be the sequence of length N transmitted over a stationary memoryless discrete channel with conditional probability density p(yi|xi) and y = (y0, y1, y2, …, yN−1) be the received sequence.

In this section, we will consider two decoding criteria: the maximum a posteriori (MAP) criterion and the ML criterion. The MAP decoder searches, among all the possible sequences x, for the estimated sequence x̂ for which the conditional probability Pr(x|y) is the highest:

[3.65] x̂ = arg max Pr(x|y)

We can write:

[3.66] Pr(x|y) = p(y|x) Pr(x) / p(y)

Let us assume that the probability of occurrence of each codeword is the same: Pr(x) = 1/2^K. Under this hypothesis, and since the denominator p(y) is common to all the sequences, the estimated sequence x̂ is the sequence for which the conditional probability p(y|x) is the highest:

[3.67] x̂ = arg max p(y|x)

A decoder using this criterion is called an ML decoder. So, when the codewords are equiprobable, the MAP and ML decoders are the same.

If the channel is memoryless, the transmitted signals are perturbed independently and the conditional probability p(y|x) is equal to the product of the conditional probabilities p(yi|xi):

[3.68] p(y|x) = Πi=0..N−1 p(yi|xi)

For the BSC channel with transition probability p, the probability Pr(r|c) is the following:

[3.69] Pr(r|c) = p^dH(r,c) (1 − p)^(N − dH(r,c))

where dH(r, c) is the Hamming distance between the binary received sequence r and the transmitted codeword c. Since p is between 0 and 0.5, we have 0 < p/(1 − p) < 1. Consequently, maximizing Pr(r|c) is equivalent to minimizing the Hamming distance between r and c:

[3.70] ĉ = arg min dH(r, c) over the codewords c of C

For the BSC channel, the ML criterion implies the computation of the Hamming distance between the received sequence and all the possible codewords.

Let us consider now the case of soft input decoding of a sequence received from an additive white Gaussian noise channel after matched filtering. We have seen that at time i, the output yi can be written as follows:

[3.71] yi = xi + ni

where xi = ±√Es with Es the average energy per symbol, and ni is a sample of centered white Gaussian noise with variance σ² = N0/2. Consequently, the probability density of yi conditionally to xi is:

[3.72] p(yi|xi) = (1/√(πN0)) exp(−(yi − xi)²/N0)

Since the logarithm function is increasing, instead of using the relation [3.67] to determine x̂, we can compute:

[3.73] images
[3.74] images

We obtain a first version of the ML decoder:

[3.75] x̂ = arg min Σi=0..N−1 (yi − xi)²

In this case, the ML criterion implies the computation of the Euclidean distance between the received sequence and all the possible sequences.

A second version is obtained by replacing the Euclidean distance by the Manhattan or L1 distance as follows:

[3.76] x̂ = arg min Σi=0..N−1 |yi − xi|

This suboptimal version achieves a rather good performance in practice.

When the modulation is bipodal (xi = ±√Es), a third version can be obtained:

[3.77] x̂ = arg max Σi=0..N−1 yi xi

Indeed, since Σi yi² and Σi xi² = NEs are common to all the sequences, we can simplify the calculations without any performance degradation.
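For short codes, the correlation form [3.77] of the ML decoder can be applied by exhaustive search over the 2^K codewords. The sketch below (illustrative only) uses the received sequence of the example of section 3.4.4 and recovers the codeword [1110010]:

from itertools import product
import numpy as np

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])

def ml_soft_decode(y, G):
    best, best_metric = None, -np.inf
    for u in product([0, 1], repeat=G.shape[0]):
        c = np.mod(np.array(u) @ G, 2)
        x = 2 * c - 1                       # bipodal mapping 0 -> -1, 1 -> +1
        metric = float(np.dot(y, x))        # correlation sum_i y_i x_i
        if metric > best_metric:
            best, best_metric = c, metric
    return best

y = np.array([-0.2, +0.9, +1.1, -1.3, +0.4, +2.5, -0.7])
print(ml_soft_decode(y, G))                 # -> [1 1 1 0 0 1 0]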

3.4.3. Hard input decoding of binary linear block codes

The received word r is the sum modulo 2 of the transmitted codeword c and an error vector e:

[3.78] r = c + e

A straightforward approach for the decoding is to compare the received word r with all the 2^K codewords of the code C. To minimize the word error probability, the decoder will select as estimated codeword ĉ the codeword at the smallest Hamming distance from the received word r. This approach is however complex to implement and has a limited practical interest.

When multiplying the received word r by the transposed parity check matrix H^T, we obtain the so-called error syndrome s of dimension 1 × (N − K):

[3.79] s = rH^T
[3.80] s = (c + e)H^T = cH^T + eH^T
[3.81] s = eH^T

If there is no transmission error, the error syndrome s is the null vector.

We will now present three hard decoding methods: the standard array, the syndrome decoding and the Viterbi algorithm.

3.4.3.1. Standard array method

Since the error syndrome can take 2^(N−K) different values, each error syndrome s is associated with 2^N/2^(N−K) = 2^K different error vectors. The standard array method performs a partition of the N-dimensional vector space into 2^K disjoint classes. Each class is associated with a codeword.

The standard array as shown in Table 3.11 is built as follows:

  • – the first line is composed of the 2^K codewords starting with the null codeword;
  • – under the null codeword c0, we list the set of error vectors starting with the patterns of weight 1, then of weight 2 (if N + 1 < 2^(N−K)), and so on, until the 2^(N−K) elements of the first column are filled;
  • – under each codeword of the first line, we compute the sum of the codeword and the associated error vector.

Table 3.11. Standard array

image

Each row is a subset of words, or coset, corresponding to an error syndrome and a common representative called the coset leader. The coset leader is the ML error vector for the words of this subset.

The decoding consists of searching the column corresponding to the received word. The decoded codeword will be the first element of the column, i.e. the codeword associated with this column. This method is too complex and is only interesting from a pedagogical point of view.

3.4.3.2. Syndrome decoding

Syndrome decoding is a direct extension of the standard array method. We have seen that the error syndrome can take 2^(N−K) different values. We first compute the syndrome using equation [3.79]. Then, we associate with this syndrome the corresponding error vector ê. This method only requires a look-up table that associates the error syndromes with the error vectors.

Table 3.12. Syndrome table

image

The estimated codeword ĉ is then:

[3.82] images

This method is less complex than the standard array method, but it is limited to short block codes (codes with an error correction capability of a few errors). Indeed, we have to perform the product rH^T and memorize 2^(N−K) error vectors. For example, the decoding of the Golay code (23, 12) using the syndrome decoding method requires a memory of 2048 words of 23 bits.

Example 3.16.– Let us consider the code C4 (5, 2) with the following generator matrix:

[3.83] images

and the associated parity check matrix H:

[3.84] images

The standard array is given in Table 3.13. We can notice that there are different ways to fill the last two lines of the standard array.

Table 3.13. Standard array for the code (5,2)

00000 01011 10101 11110 000
00001 01010 10100 11111 001
00010 01001 10111 11100 010
00100 01111 10001 11010 100
01000 00011 11101 10110 011
10000 11011 00101 01110 101
11000 10011 01101 00110 110
10010 11001 00111 01100 111

Let us consider the information word u = [11]. The associated codeword is c = [11110]. We can check that the syndrome associated with this codeword is the null vector: s = [000]. If we suppose that an error occurs during the transmission of the fourth bit of the transmitted word: e = [00010], the received vector is then r = [11100]. Using the standard array method, we obtain directly ĉ = [11110]. The decoder has been able to recover the error.

We will now use the syndrome decoding. We will first calculate the error syndrome associated with the most probable error vectors (1 error, then 2 errors, etc.). The syndrome decoding table for this code is given in Table 3.14.

Table 3.14. Syndrome decoding table for the (5, 2) code

ê s
00000 000
00001 001
00010 010
00100 100
01000 011
10000 101
11000 110
10010 111

The syndrome associated with the received word r is s = rH^T = [010]. Using the syndrome decoding table, we find ê = [00010]. After adding the estimated error vector ê to the received word r, we have ĉ = [11110] and the error has been corrected.

Since the correction capability of this code is equal to e = 1, this code can correct any error pattern with one error. The last two lines do not guarantee a correct decoding since more than one pattern with two errors is associated with these syndromes. Consequently, it is generally better not to use those syndromes for the decoding.

For the Hamming codes (2^J − 1, 2^J − 1 − J), the syndrome decoding table is composed of the 2^J − 1 non-null error syndromes associated with the 2^J − 1 error vectors with one error.
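A sketch of syndrome decoding for the Hamming code (7, 4) (illustrative only; it uses the parity check matrix [3.15] and the weight-0 and weight-1 error patterns to build the look-up table):

from itertools import product
import numpy as np

H = np.array([[1, 0, 1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])
N = H.shape[1]

# build the syndrome table from the all-zero pattern and the N single-error patterns
table = {}
for pos in [None] + list(range(N)):
    e = np.zeros(N, dtype=int)
    if pos is not None:
        e[pos] = 1
    s = tuple(np.mod(e @ H.T, 2))
    table[s] = e

def decode(r):
    s = tuple(np.mod(r @ H.T, 2))
    return np.mod(r + table[s], 2)          # corrected codeword c_hat = r + e_hat

r = np.array([1, 0, 1, 0, 0, 1, 0])         # codeword 1110010 with an error on its second bit
print(decode(r))                            # -> [1 1 1 0 0 1 0]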

3.4.3.3. Viterbi algorithm

We have seen previously that any linear block code can be graphically described using a trellis diagram. The search of the ML codeword ĉ is equivalent to the search of the ML path in the trellis diagram. The estimated codeword ĉ can then be immediately deduced.

The general principle of the Viterbi algorithm consists, at each section of the trellis diagram, of eliminating the paths (and the associated codewords) that cannot be the highest likelihood path. At each node of the trellis diagram, we will keep only one path. At each section, the Viterbi algorithm performs the following operations:

  • – computation of the branch metric (in the case of hard input decoding, Hamming distance dH(ri, ci) between the received bit ri and the bit associated with the considered branch ci);
  • – at each state, computation of the cumulated metric for each branch arriving to this state (summation of the cumulated metric of the starting state and the branch metric);
  • – for each state, selection of the survival path corresponding to the path arriving to this state with the smallest cumulated metric. The other paths, called concurrent paths, are eliminated.

Finally, the ML path and the associated codeword are obtained by performing trace-back from the right to the left by starting with the node with the smallest cumulated metric.

We will consider an example of a point-to-point transmission over an AWGN channel. We will use the Hamming code C3 (7, 4) defined by the systematic generator matrix given in [3.8] and its associated trellis. Let u = [1110] be the information word and c = [1110010] be the associated codeword. After thresholding, we receive the word r = [1010010]. An error has occurred during the transmission of the second bit of the codeword.

image

Figure 3.13. Branch metric calculation. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

At each step of the Viterbi algorithm, we compute the branch metrics and then the cumulated metrics. At time 0, the cumulated metric is initialized to 0. After calculating the cumulated metrics on the trellis diagram, we obtain the estimated codeword: ĉ = [1110010] which corresponds to the bold path on the trellis diagram given in Figure 3.17. We have been able to recover the initial codeword since only one error occurs in the transmission.

image

Figure 3.14. Cumulated metric calculation after the reception of the 1st bit of the word r. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

3.4.4. Soft input decoding of binary linear block codes

The Viterbi algorithm is mainly used for soft input decoding of binary linear block codes. It allows us to determine the ML sequence ĉ while avoiding the computation of the cumulated metrics associated with each possible transmitted sequence. Compared to hard input decoding, the branch metrics are the squared Euclidean distances (yi − xi)² between the received sample yi and the symbol xi associated with the considered branch.

image

Figure 3.15. Cumulated metric calculation after the reception of the 4th bit of the word r. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

We consider again a point-to-point transmission over an AWGN channel using the Hamming code C3 (7, 4). Let u = [1110] be the information word, c = uG = [1110010] the associated codeword and x = [+1 +1 +1 −1 −1 +1 −1] the transmitted sequence. The received sequence after filtering and sampling is y = [−0.2 +0.9 +1.1 −1.3 +0.4 +2.5 −0.7]. The branch metrics associated with this received sequence are given in Table 3.15.

After calculating the cumulated metrics on the trellis diagram, we obtain the estimated transmitted sequence x̂ = [+1 +1 +1 −1 −1 +1 −1] and the associated codeword ĉ = [1110010], which corresponds to the bold path on the trellis diagram given in Figure 3.21. We have been able to recover the initial codeword. We can also check that x̂ = x.

image

Figure 3.16. Cumulated metric calculation. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

Table 3.15. Branch metric table

image
image

Figure 3.17. Determination of the estimated sequence. Fora color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

3.4.5. Erasure decoding of binary linear block codes

We have seen in section 1.6.3 that the erasure channel is often used to model the packet loss in high-level communication protocols. In an erasure channel, a fraction p of the transmitted symbols is not received at the receiver. These symbols are erased and denoted by X.

Different approaches are possible to recover the erased symbols. A first method is to test all the possible words obtained by replacing the erased symbols by 0 and 1. If only one of these words is a codeword, then we are sure that it is the transmitted codeword. If we find no codeword or more than one codeword among the possible words, then the decoding is a failure.

image

Figure 3.18. Branch metric calculation. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

Example 3.17.– We consider again the Hamming code C3 (7, 4). Let us assume that the transmitted codeword is c = [1111111] and that the received word is r = [1XX1111]. The set of all possible words is {1001111; 1011111; 1101111; 1111111} and their respective error syndromes are {100; 011; 111; 000}. Since only one word of the set is a codeword, we can recover the transmitted message. The decoded codeword is ĉ = [1111111].

image

Figure 3.19. Cumulated metric calculation. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

A second method consists of solving the system composed of the N − K parity check equations by replacing the binary variables with the associated received bits. As previously, if there exists more than one solution, the decoding is a failure.

image

Figure 3.20. Cumulated metric calculation. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

Example 3.18.– From the three parity check equations of the code [3.57], we have:

[3.85]  1 + c2 + 1 + 1 = 0
        1 + c1 + c2 + 1 = 0
        c1 + c2 + 1 + 1 = 0
image

Figure 3.21. Determination of the estimated sequence. For a color version of the figure, see www.iste.co.uk/leruyet/communications1.zip

After solving this system, we have a unique solution

[3.86] c1 = 1, c2 = 1

and the decoded word is ĉ = [1111111] as previously.

A third suboptimal method uses the Tanner graph of the code to iteratively recover the erasures. At each iteration, we build a vector e = [e0 e1 … eN−1] indicating the positions of the erasures:

[3.87] ei = 1 if ci is erased, ei = 0 otherwise

From the vector e, we can compute the number of erased bits ij involved in each parity check equation Tj. At each iteration, the decoding algorithm searches for a parity check equation with only one erasure. If such an equation exists, then the erased symbol can be simply corrected. This procedure is repeated after updating the number of erased bits ij for each equation Tj. The algorithm ends when there are no more erasures, or in case of failure, i.e. if no parity check equation contains exactly one erasure while erasures remain. This algorithm is described in Figure 3.22 considering the previous example.

image

Figure 3.22. Iterative decoding on the Tanner graph
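A sketch of this iterative procedure (illustrative only, using the parity check matrix [3.15] of the Hamming code (7, 4)):

import numpy as np

H = np.array([[1, 0, 1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

def erasure_decode(r):
    """r is a list of 0, 1 or None (None = erased symbol X)."""
    r = list(r)
    progress = True
    while progress and None in r:
        progress = False
        for row in H:
            erased = [i for i in np.flatnonzero(row) if r[i] is None]
            if len(erased) == 1:                       # single unknown in this check
                known = sum(r[i] for i in np.flatnonzero(row) if r[i] is not None)
                r[erased[0]] = known % 2               # value forced by the parity check
                progress = True
    return r                                           # may still contain None on failure

print(erasure_decode([1, None, None, 1, 1, 1, 1]))     # -> [1, 1, 1, 1, 1, 1, 1]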

We can define the erasure enumeration function T(x) as follows:

[3.88] T(x) = Σi Ti x^i

where Ti is the number of non-recoverable erasure patterns of weight i (i.e. containing i erasures). In an erasure channel, the probability of a given pattern of i erasures is equal to p^i (1 − p)^(N−i), where p is the erasure probability of the channel.

The word error probability can be easily obtained from T(x):

[3.89] Pe = Σi Ti p^i (1 − p)^(N−i)

For the Hamming code C3 (7, 4), the erasure enumeration function calculated using the first and second methods is equal to:

[3.90] T(x) = 7x^3 + 35x^4 + 21x^5 + 7x^6 + x^7

Consequently, the decoder can correct all the patterns composed of one or two erasures. Seven of the C(7, 3) = 35 patterns composed of three erasures are not corrected, as well as all the patterns with 4, 5, …, 7 erasures.

When using the suboptimal iterative method, the erasure enumeration function is slightly different:

[3.91] images

While the decoder is able to correct all the words containing one or two erasures, 10 of the 35 patterns composed of three erasures are not corrected. Schwartz and Vardy [SCH 06] have shown that by adding some complementary parity check equations in the Tanner graph, it is possible to correct more patterns and consequently to obtain results close to the optimal solution. These equations are linear combinations of the parity check equations of the matrix H. Another solution is to use the Viterbi algorithm.

3.5. Performances of linear block codes

3.5.1. Performances of linear block codes with hard input decoding

For the binary symmetric channel with error probability p, we have shown that the probability Pr(r|c) is given by:

[3.92] Pr(r|c) = p^dH(r,c) (1 − p)^(N − dH(r,c))

where dH(r, c) is the Hamming distance between the received sequence r and the transmitted codeword c. For a binary symmetric channel, the probability that a received sequence of length N bits contains i errors is the following:

[3.93] PN(i) = C(N, i) p^i (1 − p)^(N−i)

where p is the error probability of the binary symmetric channel.

Consequently, for a linear block code with an error correction capability of e errors, we can derive an upper bound on the word error rate after decoding by summing the probabilities PN(i) corresponding to the cases where the number of errors at the output of the channel is higher than the error correction capability of the code. We have:

[3.94] Pe ≤ Σi=e+1..N C(N, i) p^i (1 − p)^(N−i)

The inequality can be replaced by an equality only when a perfect code is used.
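The bound [3.94] is straightforward to evaluate. A minimal sketch:

from math import comb

def wer_hard_bound(N, e, p):
    # word error rate bound after hard decoding over a BSC with error probability p
    return sum(comb(N, i) * p**i * (1 - p)**(N - i) for i in range(e + 1, N + 1))

print(wer_hard_bound(7, 1, 0.01))    # Hamming (7, 4), p = 1e-2
print(wer_hard_bound(23, 3, 0.01))   # Golay (23, 12), equality since the code is perfect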

3.5.2. Union bound

The union bound is a simple technique to obtain an upper bound on the error probability.

We consider an ML decoder searching for the most probable codeword, i.e. the closest codeword according to the Euclidean distance:

[3.95] images

The probability of decoding a wrong codeword knowing that the codeword xi was transmitted can be upper bounded as follows:

[3.96] Pr(y ∉ Λi|xi) ≤ Σj≠i Pr(y ∈ Λij|xi)

where Λi is the decision region associated with the codeword xi and Λij is the decision region associated with the codeword xj when ignoring all the codewords except xi and xj.

The probability Pr(y ∈ Λij|xi) that y is closer to xj than to xi is called the pairwise error probability (PEP) and is denoted Pr(xi → xj):

[3.97] Pr(xi → xj) = Pr(y ∈ Λij|xi)

Knowing that xi has been transmitted, the probability that y is not in the decision region Λi is less than or equal to the sum of the pairwise error probabilities Pr(xi → xj) for all j ≠ i.

The word error rate (WER) is given by:

[3.98] Pe = Σi Pr(xi) Pr(y ∉ Λi|xi)

Finally, the word error rate can be upper bounded by:

[3.99] Pe ≤ (1/2^K) Σi Σj≠i Pr(xi → xj)

where Pr(xi → xj) is the pairwise error probability, i.e. the probability of decoding xj knowing that xi has been transmitted, ignoring the other codewords.

Generally speaking, the union bound given in [3.99] is a practical tool to evaluate the performance of coded transmission systems, but it is only accurate at high signal-to-noise ratio.

Example 3.19.– Let us consider the set of four codewords x1,x2,x3 and x4 and their associated decision regions Λ1, Λ2, Λ3 and Λ4 as shown in Figure 3.23.

image

Figure 3.23. Decision regions associated with the codewords

We can upper bound the word error probability assuming that the codeword x1 has been transmitted as follows:

Pr(error|x1) ≤ Pr(x1 → x2) + Pr(x1 → x3) + Pr(x1 → x4)

This example is illustrated in Figure 3.24.

3.5.3. Performances of linear block codes with soft input decoding

We now consider the case of a bipodal modulation transmitted over an AWGN channel. Since we have Es = REb, the amplitude of the coded samples can only be +√(REb) or −√(REb) (where Eb is the energy per bit and R = K/N is the code rate). At the output of the matched filter, we have:

yi = xi + ni
image

Figure 3.24. Example

For an AWGN channel, the pairwise error probability Pr(xi → xj) is given by:

[3.100] Pr(xi → xj) = ½ erfc(d(xi, xj)/(2√N0))

where d(xi, xj) is the Euclidean distance between the two vectors xi and xj. The proof of the relation [3.100] is given in the second volume of this book.

Let ci and cj be the two codewords associated with the vectors xi and xj. If the Hamming distance between ci and cj is equal to dH(ci, cj) = d, the Euclidean distance between xi and xj will be:

[3.101] d(xi, xj) = √(4dREb) = 2√(dREb)

Then we have:

[3.102] Pr(xi → xj) = ½ erfc(√(dREb/N0))
Example 3.20.– Parity check code (3, 2)

The Hamming distance between two distinct codewords is 2 and the code rate is R = 2/3. Consequently, we have:

[3.103] Pr(xi → xj) = ½ erfc(√(4Eb/3N0))

We will now use the union bound to obtain an upper bound on the word error rate and bit error rate of the ML decoding of a linear block code (N, K) over an AWGN channel. We have shown that the union bound is obtained by considering the codewords two by two. Assuming that the codewords are equiprobable, the word error rate can be bounded as follows:

[3.104] Pe ≤ Σd=dmin..N Ad ½ erfc(√(dREb/N0))
where Ad is the number of codewords with Hamming weight d and dmin is the minimum distance of the block code.
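The union bound [3.104] only requires the weight distribution {Ad} of the code. A minimal sketch (illustrative only):

from math import erfc, sqrt

def wer_union_bound(A, R, ebn0_db):
    """A maps Hamming weights d to A_d; R is the code rate; ebn0_db is Eb/N0 in dB."""
    ebn0 = 10 ** (ebn0_db / 10)
    return sum(Ad * 0.5 * erfc(sqrt(d * R * ebn0)) for d, Ad in A.items())

A_hamming74 = {3: 7, 4: 7, 7: 1}                  # weight distribution of the (7, 4) code
print(wer_union_bound(A_hamming74, 4 / 7, 7.0))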

Using the inequality:

[3.105] images

it is possible to obtain a first union-Bhattacharyya bound [BEN 99] on the bit error rate:

with:

[3.108] images

Bd can also be calculated using the IOWEF function:

[3.109] images

In equation [3.106], we can distinguish the contribution of the information words with weight w, while in equation [3.107] we have gathered the contributions of the codewords with weight d by introducing the coefficient Bd. The coefficient Bd is the average fraction of non-zero information bits associated with the codewords of weight d.

A tighter bound on the bit error rate [BEN 99] can be obtained from [3.102] and using the following inequality:

images

Then we have:

images

Finally, this upper bound on the bit error rate is given as follows:

[3.110] images

By keeping only the first terms, we obtain the following upper bound:

The minimum distance dmin is the main parameter at high signal-to-noise ratio Eb/N0. However, the term Bdmin also has a significant impact on the BER.

This bound can be derived using the union bound over the set of information words and taking into account the contribution of the weight wi of each information word:

[3.112] images

By gathering the information words generating codewords with the same weight, we find again the upper bound [3.111].

Example 3.21.– For the parity check code (3, 2), from the WEF A(W, Z) or B(W,D) we obtain:

[3.113] images

and

[3.114] images

Example 3.22.– For the Hamming code (7, 4), using the WEF A(W, Z) or B(W,D) we obtain:

[3.115] images

3.5.4. Coding gain

In Chapter 1, we saw that for an AWGN channel, it is theoretically possible to perform an error-free transmission at Eb/N0 = 0 dB using a code with rate 1/2. The performance difference between a transmission chain using uncoded bipodal modulation and this theoretical limit is 9.6 dB (considering a word error rate of 10^−5). By adding an error correcting code, we can get close to this theoretical limit.

Definition 3.14.– For a given (word, bit or symbol) error rate, we define the coding gain of an error correcting code as the Eb/N0 difference between the uncoded and coded transmission chain.

In Figure 3.25, we have shown the theoretical performance WER = f(Eb/N0) (obtained using equation [3.104]) of three different transmission chains using soft input decoding and the performance of the uncoded transmission chain (bipodal modulation). For a word error rate of 10–5, the coding gain of the parity check code (3, 2), the Hamming code (7, 4) and the Golay code (23, 12) are 1.3 dB, 2.3 dB and 3.8 dB, respectively.

image

Figure 3.25. Coding gain of different codes

In the previous section, we have obtained an upper bound on the word error rate:

[3.116] Pe ≤ Σd=dmin…N Ad ½ erfc(√(dREb/N0))

A reasonable approximation at a high signal-to-noise ratio consists of keeping only the contribution of the codewords with a weight equal to the minimum distance. Using this approximation, we have:

[3.117] Pe ≈ Admin ½ erfc(√(dminREb/N0))

If we do not take into account the term Admin the asymptotic coding gain can be given as follows:

[3.118] GC ≈ 10 log10(Rdmin)   (in dB)

If we take into account the number of codewords at the minimum distance Admin, we can approximate the coding gain as follows [FOR 98]:

[3.119] GC ≈ 10 log10(Rdmin) − 0.2 log2(Admin)   (in dB)

The term 0.2 log2 (Admin) corresponds to the impact on the coding gain of the number of codewords at the minimum distance. The 0.2 factor is related to the slope of the erfc(.) function in the region 10−4 − 10−5. This equation is only an approximation.

For example, for the Golay code (23, 12), we have R = 12/23 ≈ 0.52, dmin = 7 and Admin = 253, corresponding to a coding gain of about 4 dB.
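As a check, the approximations [3.118] and [3.119] can be evaluated with a few lines of Python (function names chosen for illustration); for the Golay (23, 12) code they give about 5.6 dB and 4 dB respectively.

import math

def asymptotic_gain(R, dmin):
    # Asymptotic coding gain [3.118]
    return 10 * math.log10(R * dmin)

def approximate_gain(R, dmin, Admin):
    # Approximation [3.119] taking the multiplicity Admin into account
    return 10 * math.log10(R * dmin) - 0.2 * math.log2(Admin)

# Golay (23, 12): R = 12/23, dmin = 7, Admin = 253
print(round(asymptotic_gain(12 / 23, 7), 2))        # about 5.6 dB
print(round(approximate_gain(12 / 23, 7, 253), 2))  # about 4 dB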

Instead of the WER, it is also possible to use the word error rate per information bit (WERB) to define the coding gain:

[3.120] WERB = WER/K

In that case, we have the following relation:

[3.121] GC,WERB ≈ 10 log10(Rdmin) − 0.2 log2(Admin/K)   (in dB)

For the previous example, the coding gain is now 4.75 dB.

In Table 3.16, we have shown the coding gain for different error correcting codes.

Table 3.16. Coding gain for different linear block codes

code (N, K)           R      dmin   Admin   GC asympt (dB)   GC WERB (dB)
Hamming (7, 4)        0.57   3      7       2.3              2.2
Reed–Muller (8, 4)    0.5    4      14      3                2.6
Hamming (15, 11)      0.733  3      15      3.4              3.3
Reed–Muller (16, 5)   0.31   8      30      4                3.5
Golay (23, 12)        0.52   7      253     5.6              4.7
Golay (24, 12)        0.5    8      759     6                4.8

Another approach is to use the BER but in that case, we have to compute Bdmin.

3.5.5. Performance comparison of hard and soft input decoders

In Figure 3.26, we show the performance curves WER = f(Eb/N0) of a transmission chain using a Hamming code (7, 4) and a Golay code (23, 12). The solid curves have been obtained with a hard input decoder (using equation [3.94]) and the dashed curves have been obtained with a soft input decoder (using equation [3.104]). For the Golay code, we can observe that soft input decoding brings a gain of 2 dB compared to hard input decoding.

3.6. Cyclic codes

3.6.1. Definition and properties

Note.– In this section, the most significant bit (MSB) of the vectors is on the right.

image

Figure 3.26. Performance comparison of hard and soft input decoding

The cyclic codes are a subset of the linear block codes. While a general linear block code requires K independent codewords (the rows of its generator matrix) to determine the set of 2K codewords, a cyclic code is entirely determined by a single codeword.

The most important linear block codes, such as the Hamming, Golay, BCH and RS codes, belong to this class. Due to their properties, the complexity of the coding and decoding tasks is reduced. In this section, we will present cyclic codes defined in the finite field image, but they can be extended to image.

The main property of the cyclic codes is the following: if c = [c0 c1cN−2 cN−1] is a codeword, then the word obtained by performing a right cyclic shift of one position c’ = [cN−1 c0cN−3 cN−2] is also a codeword.

In order to describe a cyclic code (N, K), it is convenient to associate a polynomial c(p) of degree lower or equal to N − 1 to each codeword c =[c0 c1 … cN−2 cN−1]:

c(p) = c0 + c1p + … + cN−2pN−2 + cN−1pN−1

We will show that the properties of the cyclic codes can be obtained easily using the algebra of the polynomials modulo pN − 1.

Let us compute the polynomial pc(p):

pc(p) = c0p + c1p2 + … + cN−2pN−1 + cN−1pN

No codeword is associated with this polynomial since its degree is higher than N − 1.

By adding cN−1 and subtracting cN−1, this expression can be rewritten as follows:

pc(p) = cN−1 + c0p + c1p2 + … + cN−2pN−1 + cN−1(pN − 1)

Since the polynomial c′(p) associated with the codeword c’ is equal to:

c′(p) = cN−1 + c0p + c1p2 + … + cN−3pN−2 + cN−2pN−1

we also have:

pc(p) = c′(p) + cN−1(pN − 1)

Let us compute pc(p) modulo pN − 1. We obtain:

pc(p) mod (pN − 1) = c′(p)

So, a right cyclic shift of one position is equivalent to the multiplication by p modulo pN − 1.

More generally, a right cyclic shift of i positions corresponds to a multiplication by pi modulo pN − 1.
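This equivalence can be checked with a few lines of Python, representing a GF(2) polynomial by its coefficient list [c0, c1, …, cN−1]; the helper names are chosen for illustration only.

def cyclic_shift(c, i):
    # Right cyclic shift of i positions
    N = len(c)
    return [c[(k - i) % N] for k in range(N)]

def multiply_by_p_i(c, i):
    # Multiplication by p^i modulo p^N - 1: p^(k+i) is reduced to p^((k+i) mod N)
    N = len(c)
    result = [0] * N
    for k, ck in enumerate(c):
        result[(k + i) % N] ^= ck
    return result

c = [1, 1, 0, 1, 0, 0, 0]   # codeword of the Hamming (7, 4) code built with g(p) = 1 + p + p^3
print(cyclic_shift(c, 1) == multiply_by_p_i(c, 1))   # True
print(cyclic_shift(c, 3) == multiply_by_p_i(c, 3))   # True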

3.6.2. Properties of the cyclic codes

Property.– If c(p) is a polynomial of degree N − 1 associated with a codeword of a cyclic code (N, K), then:

[3.122] pic(p) modulo (pN − 1)

is also a polynomial associated with a codeword.

Combined with the linearity of the code, this relation means that from a single polynomial c(p) it is possible to find the set of the 2K codewords of the cyclic code.

Property.– It is possible to build a cyclic code (N, K) from the generator polynomial denoted as g(p) of degree N − K:

g(p) = g0 + g1p + … + gN−K−1pN−K−1 + pN−K with g0 = 1

g(p) is the non-zero polynomial of minimum degree among the 2K codeword polynomials of the cyclic code.

Property.– The polynomial g(p) of degree N − K must be a factor of pN − 1.

Property.– The set of the 2K polynomials associated with the 2K codewords of a cyclic code (N, K) can be obtained by performing the multiplication of g(p) by the 2K polynomials of degree lower or equal to K − 1.

If we define the polynomial u(p) associated with the information word u = [u0 u1 … uK−2 uK−1] as:

u(p) = u0 + u1p + … + uK−2pK−2 + uK−1pK−1

We have the following relation between the polynomial u(p) of degree K − 1 and the polynomial c(p) of degree N − 1:

[3.123] c(p) = u(p)g(p)

Property.– Each polynomial factor of pN − 1 can generate a cyclic code.

In Table 3.3, we give the decomposition into a product of irreducible polynomials of the polynomials pN − 1 with N = 2m − 1 for m ≤ 5. The irreducible polynomials of this table are the main polynomials used for the construction of cyclic codes. For example, p7 − 1 can be decomposed into the product of three irreducible polynomials. We can build three cyclic codes of length N = 7 using these irreducible polynomials:

  • – a code (7, 6) with g(p) = 1 + p;
  • – a code (7, 4) with g(p) = 1 + p + p3;
  • – a code (7, 4) with g(p) = 1 + p2 + p3.

The two obtained cyclic codes (7, 4) are the Hamming codes considered previously.

Example 3.23.– Let us consider the Hamming code built from the generator polynomial g(p) = 1 + p + p3. The 16 codewords are obtained by multiplying the polynomials associated with the information word with g(p). The list of these 16 codewords is given in Table 3.17.

Table 3.17. List of codewords for the Hamming code (7, 4) built with g(p) = 1 + p + p3

u0 u1 u2 u3 c0 c1 c2 c3 c4 c5 c6
0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 1 0 1 0 0 0
0 1 0 0 0 1 1 0 1 0 0
1 1 0 0 1 0 1 1 1 0 0
0 0 1 0 0 0 1 1 0 1 0
1 0 1 0 1 1 1 0 0 1 0
0 1 1 0 0 1 0 1 1 1 0
1 1 1 0 1 0 0 0 1 1 0
0 0 0 1 0 0 0 1 1 0 1
1 0 0 1 1 1 0 0 1 0 1
0 1 0 1 0 1 1 1 0 0 1
1 1 0 1 1 0 1 0 0 0 1
0 0 1 1 0 0 1 0 1 1 1
1 0 1 1 1 1 1 1 1 1 1
0 1 1 1 0 1 0 0 0 1 1
1 1 1 1 1 0 0 1 0 1 1

Let u = [1 1 1 0] (notation MSB on the right) and its associated polynomial u(p) = 1 + p + p2. The polynomial c(p) is given by:

c(p) = u(p)g(p) = (1 + p + p2)(1 + p + p3) = 1 + p4 + p5

and the associated codeword c = [1000110].
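The encoding relation [3.123] can be reproduced with a short Python sketch (polynomials as coefficient lists, lowest degree first; helper names are ours); it returns the codeword of Example 3.23 and the rows of Table 3.17.

def gf2_poly_mul(a, b):
    # Product of two GF(2) polynomials given as coefficient lists (lowest degree first)
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[i + j] ^= ai & bj
    return result

def encode(u, g, N):
    # Non-systematic cyclic encoding c(p) = u(p) g(p), padded to length N
    c = gf2_poly_mul(u, g)
    return c + [0] * (N - len(c))

g = [1, 1, 0, 1]                      # g(p) = 1 + p + p^3
print(encode([1, 1, 1, 0], g, 7))     # [1, 0, 0, 0, 1, 1, 0], as in Example 3.23
print(encode([1, 0, 0, 0], g, 7))     # [1, 1, 0, 1, 0, 0, 0], second row of Table 3.17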

The construction of the generator matrix from the generator polynomial is straightforward. We have seen previously that the generator matrix of a linear block code can be obtained from K independent codewords. In the case of the cyclic codes, we can choose the codewords associated with the following polynomials:

g(p), pg(p), p2g(p), …, pK−1g(p)

For example, for the Hamming code (7, 4) with g(p) = 1 + p + p3, the generator matrix is the following:

[3.124]
     | 1 1 0 1 0 0 0 |
G =  | 0 1 1 0 1 0 0 |
     | 0 0 1 1 0 1 0 |
     | 0 0 0 1 1 0 1 |

This generator matrix is not systematic. From a practical point of view, it is advisable to have a systematic code. Let the information word be u = [u0 u1 … uK−2 uK−1] and u(p) = u0+u1p+…+uK−2pK−2+uK−1pK−1 its associated polynomial. We multiply u(p) by pN−K:

pN−Ku(p) = u0pN−K + u1pN−K+1 + … + uK−2pN−2 + uK−1pN−1

The polynomial c(p) associated with a codeword with a systematic form c = [c0 c1 … cN−1] = [c0 c1 … cN−K−1 u0 u1 … uK−2 uK−1] can be written as:

[3.125] images
[3.126] images

Let us divide pN−K u(p) by g(p). We obtain:

pN−Ku(p) = q(p)g(p) + t(p)

where q(p) is the quotient and t(p) is the remainder of the division of degree less than N − K.

To summarize, to obtain a systematic codeword c(p) = q(p)g(p) we have to:

  • – multiply the polynomial u(p) by pN − K;
  • – perform the division of pN − Ku(p) by g(p) to obtain the remainder t(p);
  • – add the remainder t(p) to pN − Ku(p):
c(p) = pN−Ku(p) + t(p) = q(p)g(p)

Example 3.24.– Let us consider the information word u = [1 1 1 0]. The polynomial associated with u is u(p) = 1 + 1p + 1p2 + 0p3 = 1 + p + p2. We have:

p3u(p) = p3 + p4 + p5

The remainder of the division of p3 + p4 + p5 by g(p) = 1 + p + p3 is p. So the polynomial c(p) is given by:

c(p) = p3u(p) + t(p) = p + p3 + p4 + p5

The associated codeword is the following:

c = [0 1 0 1 1 1 0]

This notation is slightly different from the systematic form that we have presented in section 3.3 but we find the same codeword as the one obtained using the systematic generator matrix [3.8].
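A minimal Python sketch of this systematic encoding procedure is given below (polynomials stored as coefficient lists, lowest degree first; helper names are chosen for illustration); it reproduces the codeword of Example 3.24.

def gf2_poly_mod(a, g):
    # Remainder of the division of a(p) by g(p) over GF(2)
    a = a[:]
    dg = len(g) - 1
    for i in range(len(a) - 1, dg - 1, -1):
        if a[i]:
            for j in range(len(g)):
                a[i - dg + j] ^= g[j]
    return a[:dg]

def systematic_encode(u, g, N):
    # c(p) = p^(N-K) u(p) + t(p), with t(p) the remainder of p^(N-K) u(p) divided by g(p)
    K = len(u)
    shifted = [0] * (N - K) + u
    t = gf2_poly_mod(shifted, g)
    return t + u                      # [t0 ... t(N-K-1) u0 ... u(K-1)]

g = [1, 1, 0, 1]                      # g(p) = 1 + p + p^3
print(systematic_encode([1, 1, 1, 0], g, 7))   # [0, 1, 0, 1, 1, 1, 0], as in Example 3.24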

3.6.3. Error detection using CRC and automatic repeat request

In order to obtain a highly reliable transmission, it is possible to implement an automatic repeat request (ARQ) mechanism using acknowledgment (ACK)/negative acknowledgment (NACK) messages, provided that a return channel is available.

To detect the presence of errors in the received word, most protocols use a cyclic redundancy check (CRC). This technique is based on a cyclic code (N, K).

Let g(p) be the generator polynomial of a CRC. From the information polynomial u(p), the polynomial associated with the codeword c(p) is built in a systematic form:

c(p) = pN−Ku(p) + t(p) where t(p) is the remainder of the division of pN−Ku(p) by g(p)

At the reception, we divide the received word by g(p). If the remainder of the division is zero, then the received word is a codeword. We will then assume that it is the codeword that has been transmitted. If the remainder of the division is non-zero, then the received word is not a codeword and there are one or more errors in the received word. The receiver sends back a message ACK in the event of success. In case of failure, the receiver sends back a message NACK and the codeword is retransmitted.
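The following Python sketch illustrates this mechanism with the CRC5 generator polynomial g(p) = p5 + p2 + 1 of Table 3.18; the frame content and the helper names are chosen for illustration only.

def gf2_poly_mod(a, g):
    # Remainder of the division of a(p) by g(p) over GF(2) (coefficient lists, lowest degree first)
    a = a[:]
    dg = len(g) - 1
    for i in range(len(a) - 1, dg - 1, -1):
        if a[i]:
            for j in range(len(g)):
                a[i - dg + j] ^= g[j]
    return a[:dg]

def crc_encode(u, g):
    # Systematic encoding: CRC bits (remainder of p^(N-K) u(p)) followed by the data bits
    r = len(g) - 1
    t = gf2_poly_mod([0] * r + u, g)
    return t + u

def crc_check(frame, g):
    # True if the received frame is divisible by g(p), i.e. if it is a codeword
    return not any(gf2_poly_mod(frame, g))

g = [1, 0, 1, 0, 0, 1]                    # g(p) = 1 + p^2 + p^5
frame = crc_encode([1, 1, 0, 1, 0, 0, 1, 0], g)
print(frame, crc_check(frame, g))         # the encoded frame passes the check (True)
frame[3] ^= 1                             # introduce a single transmission error
print(crc_check(frame, g))                # the error is detected (False)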

Table 3.18 gives some of the most common generator polynomials used by the different systems.

The two classical protocols in ARQ systems are the stop-and-wait protocol and the go-back-N protocol.

The stop-and-wait protocol is the simplest protocol but is not very efficient. The receiver sends an ACK after each frame and the transmitter waits for an ACK or NACK before transmitting a new frame or retransmitting the previous frame. The stop-and-wait ARQ protocol is illustrated in Figure 3.27(a).

Table 3.18. Generator polynomials for CRC

CRC name Generator polynomial System
CRC4 g(p) = p4 + p + 1 G.704
CRC5 g(p) = p5 + p2 + 1 USB
CRC6 g(p) = p6 + p + 1 G.704
CRC8-CCITT g(p) = p8 + p2 + p + 1 ATM, ISDN
CRC16 g(p) = p16 + p15 + p2 +1 USB, ANSI
CRC16-CCITT g(p) = p16 + p12 + p5 + 1 X25, SDLC, HDLC
CRC16-DECT g(p) = p16 + p10 + p8 + p7 + p3 + 1 DECT
CRC24 g(p) = p24 + p22 + p20 + p19 + p18 + p16 + p14 + p13 +p11 +p10 +p8 +p7 +p6 +p3 +p+1
CRC32 g(p) = p32 + p26 + p23 + p22 + p16 + p12 + p11 + p10 + p8 + p7 + p5 + p4 + p2 + p + 1 V.42, Ethernet, MPEG2

In the go-back-N protocol, the transmitter is allowed to consecutively transmit a maximum number of Nf frames gathered in a window without receiving an ACK. Whenever the transmitter receives a NACK associated with the frame i, it stops transmitting new frames and proceeds to the retransmission of the frame i and of the Nf − 1 next frames. At the receiving end, the receiver discards the erroneously received frame i and the Nf − 1 subsequently received frames, no matter whether they were correct or not. Retransmissions are repeated until the frame i is correctly received, and the process then resumes. The go-back-N protocol is illustrated in Figure 3.27(b) for Nf = 4.

To improve the reliability of the ARQ systems, these classical ARQ schemes can be applied in conjunction with error correcting codes. These schemes are called hybrid automatic repeat request (HARQ).

In a type I HARQ protocol, the error correcting code reduces the frequency of retransmissions by correcting the most frequent error patterns. When the receiver cannot decode the frame, it requests a retransmission and discards the corrupted frame.

image

Figure 3.27. a) Stop-and-wait protocol b) go-back-N protocol with Nf = 4

If the receiver can buffer the previously received frames, the optimal solution is to use maximum ratio combining (MRC) to combine these multiple frames. This version with MRC is known as the Chase combining (CC) scheme [CHA 73].

In the type II HARQ scheme, also called incremental redundancy (IR) [MAN 74], when a NACK is received the transmitter sends additional coded frames instead of repeating the same ones. The different received frames are exploited to provide a stronger error correction capability. The IR protocol is described as follows:

1) Compute the codeword from the information word using a linear block code (N, K);

2) Divide the codeword into M subcodewords of size Nk where N = N1 + N2 + … + NM, and store the complete codeword (systematic and parity parts) for potential transmissions;

3) Initialize m = 1;

4) Transmit the mth subcodeword;

5) At the reception, attempt to decode the code using all the symbols received so far. If the information is correctly decoded, send an ACK to the transmitter. Otherwise, set m = m + 1, send a NACK to the transmitter and return to step 4). If the information sequence is still not successfully decoded when m = M, the transmission fails.

The rate of the code after the mth transmission is K/(N1 + N2 + … + Nm) and the average rate is equal to:

[3.127] images

where Sk is the event “data correctly decoded” during the kth slot and S̄k is its complement.

The complexity of the IR protocol is higher than the other ARQ systems but its performance is significantly better.

It should be said that the implementation of ARQ systems is not always possible, in particular in broadcast communication systems (satellite, terrestrial television, etc.). Another drawback of ARQ systems is the delay due to the retransmissions.

3.6.4. Encoding of cyclic codes

In this section, we will study the hardware implementation of the cyclic encoders. We will show that the cyclic encoder is only composed of modulo 2 adders and shift registers.

3.6.4.1. Implementation of a division of polynomials

Let us consider the division of a dividend polynomial a(p) by a divisor polynomial g(p) of degree d in the finite field image. The remainder t(p) and the quotient q(p) of this division are given by the relation:

[3.128] a(p) = q(p)g(p) + t(p)

Example 3.25.– Division of 1 + p2 + p6 by 1 + p + p2.

images

When we analyze the different steps to compute this division, we can make the following comments:

  • – During the division, the degree of the dividend decreases: the modifications of the dividend are performed from the left to the right (MSBs toward LSBs).
  • – For a divisor of degree d, at each iteration, the modifications of the dividend concern the (d + 1) leftmost terms. The other terms can be temporarily ignored.
  • – When a modification of the dividend occurs, it is a term by term addition of the d + 1 coefficients of the divisor.

From these comments, we can deduce a hardware scheme composed of shift registers to implement this division. The number of shift registers is equal to the degree of the divisor. The hardware structure is given in Figure 3.28 for the considered example.

The bits enter the structure MSB first. At each positive transition of the clock, a new coefficient of the polynomial a(p) enters the structure. After d clock transitions, the first non-zero coefficient of the quotient appears at the output of the last shift register. This coefficient is multiplied by g(p) and then subtracted as in a classical division.

image

Figure 3.28. Hardware structure of the division by 1 +p + p2

The sequencing of the division calculated previously is given in Figure 3.29 and in Table 3.19. We can check that, at each step, the internal registers contain only the relevant part of the dividend. The quotient is obtained from the second clock transition onwards at the output of the register s2: q(p) = 1p4 + 1p3 + 0p2 + 1p + 0. At the last positive clock transition, the remainder of the division t(p) = 1p + 1 is available at the output of the shift registers.

image

Figure 3.29. Sequencing of the division by 1 + p + p2
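The following Python sketch simulates this circuit register by register and reproduces the register contents of Table 3.19; it assumes, as in Figure 3.28, that s1 holds the constant term of the remainder and s2 the coefficient of p (the function name is ours).

def divide_by_1_plus_p_plus_p2(bits_msb_first):
    # Division circuit for g(p) = 1 + p + p^2: two registers, dividend bits entering MSB first
    s1 = s2 = 0
    quotient = []
    print("input  s1  s2")
    for b in bits_msb_first:
        feedback = s2                            # quotient coefficient leaving the last register
        quotient.append(feedback)
        s1, s2 = b ^ feedback, s1 ^ feedback     # feedback taps on g0 = 1 and g1 = 1
        print(f"  {b}     {s1}   {s2}")
    return quotient, [s1, s2]                    # remainder coefficients [t0, t1]

a = [1, 0, 0, 0, 1, 0, 1]                        # coefficients of 1 + p^2 + p^6, MSB first
q, t = divide_by_1_plus_p_plus_p2(a)
print("quotient (MSB first):", q[2:])            # [1, 1, 0, 1, 0], i.e. q(p) = p^4 + p^3 + p
print("remainder [t0, t1]:", t)                  # [1, 1], i.e. t(p) = 1 + p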

3.6.4.2. Hardware structures of cyclic encoder

In the case of a cyclic code (N, K), we have seen that before calculating the remainder t(p) of the division, it is necessary to premultiply the polynomial associated with the information word u(p) by pN−K. The multiplication by pN−K is equivalent to adding pN−K zeros.

Table 3.19. Sequencing of the division

Input Clock transition s1 s2
0 0 0
MSB 1 1 1 0
0 2 0 1
0 3 1 1
0 4 1 0
1 5 1 1
0 6 1 0
LSB 1 7 1 1

Example 3.26.– Let us consider again the Hamming cyclic code (7, 4) with g(p) = 1 + p + p3.

The hardware structure of the encoder is given in Figure 3.30. As previously, the bits are introduced into the encoder MSB first.

image

Figure 3.30. Hardware structure of the Hamming encoder with g(p) = 1 + p + p3

We consider the information word u = [1 1 1 0] associated with the polynomial u(p) = 1 + p + p2. After premultiplication, we have p3u(p) = p3 + p4 + p5 and the word transmitted to the divisor is [0001110].

Table 3.20. Hamming encoding

Input Clock transition s1 s2 s3
0 0 0 0
MSB 0 1 0 0 0
1 2 1 0 0
1 3 1 1 0
1 4 1 1 1
0 5 1 0 1
0 6 1 0 0
LSB 0 7 0 1 0

We find the same remainder as the one previously computed t(p) = 0 + 1p + 0p2. Using this structure, we need seven clock transitions to obtain the three bits of the remainder [0 1 0] as shown in Table 3.20.

It is possible to reduce the number of clock transitions by exploiting the fact that the last N − K bits of the polynomial pN−Ku(p) are zeros. Instead of modifying the N − K + 1 leftmost coefficients at each iteration, we can decompose the division into K successive divisions.

Let us consider the previous example to illustrate this principle: u(p) = 1 + p + p2 and g(p) = 1 + p + p3.

The associated hardware structure is given in Figure 3.31 and the sequencing of the encoder is given in Table 3.21.

image

Figure 3.31. Hardware structure of the Hamming encoder with premultiplication for g(p) = 1 + p + p3

Table 3.21. Sequencing of the Hamming encoder after premultiplication

Input Clock transition s1 s2 s3
0 0 0 0
MSB 0 1 0 0 0
1 2 1 1 0
1 3 1 0 1
1 4 0 1 0

This structure allows us to avoid N − K clock transitions corresponding to the last N − K bits of the polynomial pN − Ku(p).

We can now draw the complete hardware structure of the encoder for the Hamming code (7, 4) defined by the polynomial g(p) = 1 + p + p3 as shown in Figure 3.32.

image

Figure 3.32. Complete hardware structure of the Hamming encoder for g (p) = 1 + p + p3

The encoding is composed of two phases:

  • – the switch P0 is closed and P1 is in position A. K = 4 clock transitions are applied to send the K first bits of the codeword and calculate the remainder t(p);
  • – the switch P0 is open and P1 is in position B. N − K = 3 clock transitions are applied to send the last N − K bits of the codeword (corresponding to the remainder).

3.6.4.3. Trellis diagram of the cyclic code

We saw in section 3.3.9 how to obtain the trellis diagram of a linear block code. From the hardware structure of the encoder, it is also possible to directly determine the trellis diagram of a cyclic code. The trellis diagram of the Hamming code defined by the polynomial g(p) = 1 + p + p3 is given in Figure 3.33.

image

Figure 3.33. Trellis diagram of the Hamming code defined by the polynomial g(p) = 1 + p + p3

The maximum number of states of this trellis diagram is equal to the number of states of the hardware structure i.e. 23 = 8 states. This trellis diagram is the same as the one obtained in section 3.3 if we reverse the order of the N bits. Indeed, the first three sections are related to the parity bits while the last four sections correspond to the information bits.

3.6.5. Decoding of cyclic codes

The decoding of cyclic codes is composed of two phases:

  • – syndrome calculation;
  • – error localization.

The syndrome calculation is performed by dividing the polynomial associated with the received word r(p) by g(p):

[3.129] r(p) = q(p)g(p) + s(p)

The syndrome is the remainder of this division:

  • – if s(p) = 0, then the received word r(p) is a codeword;
  • – the syndrome s(p) is a polynomial of degree less or equal to N − K − 1.

From the value of s(p) it is possible to estimate the error vector ê(p).

In Figure 3.34, we show the hardware structure of the decoder associated with the Hamming code (7, 4) defined by the polynomial g(p) = 1 + p + p3.

image

Figure 3.34. Hardware structure of the decoder for the Hamming code with g(p) = 1 + p + p3

The decoding logic associates an estimated error vector ê(p) to each of the values of the syndrome s(p). Then, we add this error vector to the received vector r(p):

[3.130] ĉ(p) = r(p) + ê(p)

The complexity of this decoder is related to this decoding logic. This solution is interesting when the number of errors to be corrected is small.

3.6.5.1. Correction of a single error

Let us consider a single error e(p) = pj. The associated syndrome has the following specific form:

[3.131] s(p) = pj mod g(p)

In Figure 3.35 we show the hardware decoding structure for the Hamming code (7, 4) defined by the generator polynomial g(p) = 1 + p + p3.

image

Figure 3.35. Hardware structure of the decoder for the Hamming code (7, 4)

For this example, the syndrome table between the error vectors and syndromes s(p) is given in Table 3.22.

Table 3.22. Syndrome table

Error vector e(p) s(p)
1000000 1 1
0100000 p p
0010000 p2 p2
0001000 p3 1+ p
0000100 p4 p + p2
0000010 p5 1+ p + p2
0000001 p6 1+ p2
LSB    MSB

The result of the multiplication of the syndrome s(p) by pN−j modulo g(p) is equal to 1:

[3.132] s(p)pN−j mod g(p) = pjpN−j mod g(p) = pN mod g(p) = 1

Consequently, to determine the position of the error, we just have to successively multiply s(p) by p until the result is equal to 1.

Example 3.27.– Let us consider the codeword c = [1 1 1 1 1 1 1] and the error vector e = [0 0 0 0 1 0 0] (e(p) = p4, i.e. a single error occurs on the fifth bit from the LSB):

[3.133] r(p) = c(p) + e(p) = 1 + p + p2 + p3 + p5 + p6

The content of the shift registers after each clock transition is given in Table 3.23.

Table 3.23. Content of the shift registers after each clock transition

Input Clock transition s1 s2 s3
0 0 0 0
MSB 1 1 1 0 0
1 2 1 1 0
0 3 0 1 1
1 4 0 1 1
1 5 0 1 1
1 6 0 1 1
LSB 1 7 0 1 1 → s(p) = p + p2
8 1 1 1 → s(p)p
9 1 0 1 → s(p)p2
10 1 0 0 → s(p)p3 = 1

We have j = N − 3 = 7 − 3 = 4.

As expected, we have been able to correct this single error after three clock transitions.
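The following Python sketch applies this procedure to the received word of Example 3.27 (polynomials as coefficient lists, lowest degree first; helper names are ours).

def gf2_poly_mod(a, g):
    # Remainder of the division of a(p) by g(p) over GF(2)
    a = a[:] + [0] * max(0, len(g) - len(a))
    dg = len(g) - 1
    for i in range(len(a) - 1, dg - 1, -1):
        if a[i]:
            for j in range(len(g)):
                a[i - dg + j] ^= g[j]
    return a[:dg]

def correct_single_error(r, g, N):
    # Multiply s(p) by p modulo g(p) until the result equals 1: the error is at position N - k
    s = gf2_poly_mod(r, g)
    if not any(s):
        return r                                  # the received word is already a codeword
    for k in range(N):
        if s == [1] + [0] * (len(g) - 2):         # s(p) p^k = 1 modulo g(p)
            corrected = r[:]
            corrected[(N - k) % N] ^= 1           # error position given by [3.132]
            return corrected
        s = gf2_poly_mod([0] + s, g)              # multiplication of s(p) by p modulo g(p)
    return r

g = [1, 1, 0, 1]                                  # g(p) = 1 + p + p^3
r = [1, 1, 1, 1, 0, 1, 1]                         # codeword [1 1 1 1 1 1 1] with an error in position 4
print(correct_single_error(r, g, 7))              # [1, 1, 1, 1, 1, 1, 1]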

3.6.6. BCH codes

3.6.6.1. Definition

The BCH codes are binary cyclic codes introduced by Hocquenghem [HOC 59] and independently by Bose and Ray–Chaudhuri [BOS 60]. These codes have the following parameters:

[3.134] N = 2m − 1, N − K ≤ me, dmin ≥ 2e + 1 (e being the number of errors the code can correct)

Since a BCH code is a cyclic code, it is defined by its generator polynomial g(p). The roots of g(p) are the 2e consecutive powers of α: α, α2, …, α2e. Consequently, g(p) is the product of the minimal polynomials associated with α, α2, α3, …, α2e without repetition:

[3.135] g(p) = LCM{m1(p), m2(p), …, m2e(p)}

where LCM is the least common multiple and mi(p) is the minimal polynomial of αi. Since in image with q = 2m, αi and α2i are roots of the same minimal polynomial, for the determination of g(p), we can only consider the odd powers of α:

[3.136] g(p) = LCM{m1(p), m3(p), …, m2e−1(p)}

Example 3.28.– Let us build a BCH code with N = 15 able to correct two errors (e = 2). Since N = 2m − 1, we obtain m = 4, and the minimum distance is greater than or equal to 5. Its generator polynomial g(p) is as follows:

g(p) = LCM{m1(p), m3(p)} = m1(p)m3(p)

From the list of minimal polynomials for image, we obtain:

m1(p) = 1 + p + p4 and m3(p) = 1 + p + p2 + p3 + p4, hence g(p) = (1 + p + p4)(1 + p + p2 + p3 + p4) = 1 + p4 + p6 + p7 + p8

Since the degree of g(p) is N − K = 8, the obtained code will be a (15, 7) BCH code. Its minimum distance is exactly equal to 5.

Table 3.24 gives the generator polynomials of the BCH codes correcting up to 3 errors and N ≤ 63.

Table 3.24. List of generator polynomials of BCH codes with N ≤ 63

image

We know that α, α2, …, α2e are roots of the generator polynomial g(p) for a BCH code. Consequently, each codeword will also have α, α2, α3, …, α2e as roots:

c(αi) = 0 for i = 1, 2, …, 2e

This property can also be written as follows:

[3.137] c0 + c1αi + c2α2i + … + cN−1α(N−1)i = 0, i = 1, 2, …, 2e

Using the matrix notation, this relation can be also written as:

[3.138] images

Since we have the relation cHT = 0 between the parity check matrix and the codewords, the parity check matrix is equal to:

It is possible to demonstrate that this code can correct e errors.

Example 3.29.– The rows of the parity check matrix [3.139] corresponding to the same minimal polynomial do not need to be repeated. Finally, we obtain for the BCH code (15, 7) defined with g(p) = 1 + p4 + p6 + p7 + p8 the following truncated parity check matrix by only keeping the first and third rows:

[3.140] images

In order to obtain a binary representation of the parity check matrix H, we have to replace each power of α by its binary representation as shown in section 3.2.4. We have:

[3.141] images

3.6.6.2. Decoding of BCH codes

The hard input decoding of BCH codes can be divided into three phases:

  • – calculation of the 2e components of the syndrome s = (s1s2 s2e);
  • – determination of the error-location polynomial σ(p);
  • – search of the roots of the polynomial σ(p).

Syndrome calculation

Let c(p) be the transmitted codeword, r(p) the received word and e(p) the error vector. Then we have:

[3.142] r(p) = c(p) + e(p)

The first phase consists in calculating the syndrome s. By definition, the syndrome is equal to the product of the received vector r by the transpose parity check matrix HT :

[3.143] s = rHT

The 2e elements of the syndrome can be computed as follows:

[3.144] si = r(αi) = c(αi) + e(αi) = e(αi), i = 1, 2, …, 2e

When there is no error (e(p) = 0), the 2e elements of the syndrome are zero.

Let us assume that the error vector is composed of v errors (with v ≤ e):

[3.145] e(p) = pj1 + pj2 + … + pjv

where 0 ≤ j1 < j2 < … < jv ≤ N − 1.

From [3.144], we obtain the following system of equations:

[3.146]
s1 = αj1 + αj2 + … + αjv
s2 = (αj1)2 + (αj2)2 + … + (αjv)2
…
s2e = (αj1)2e + (αj2)2e + … + (αjv)2e

By defining βk = αjk, 1 ≤ k ≤ v, the system of equations becomes:

[3.147] si = β1i + β2i + … + βvi, i = 1, 2, …, 2e

We define the following polynomial:

[3.148] σ(p) = (1 + β1p)(1 + β2p) … (1 + βvp) = 1 + σ1p + σ2p2 + … + σvpv

This polynomial σ(p) is called the error-location polynomial since the inverses of its roots are equal to βk = αjk, which allows us to localize the positions jk of the errors.

The coefficients σk of the polynomial σ(p) satisfy the following equations:

[3.149]
σ1 = β1 + β2 + … + βv
σ2 = β1β2 + β1β3 + … + βv−1βv
…
σv = β1β2 … βv

In order to determine the coefficients σk, we will use the so-called Newton relations as follows:

[3.150]
s1 + σ1 = 0
s2 + σ1s1 + 2σ2 = 0
s3 + σ1s2 + σ2s1 + 3σ3 = 0
…
sv + σ1sv−1 + … + σv−1s1 + vσv = 0

and for i > v:

si + σ1si−1 + σ2si−2 + … + σvsi−v = 0

It should be noted that in the binary case, the term iσi is equal to 0 if i is even and to σi if i is odd.

Determination of the error-location polynomial using the matrix approach

In the matrix approach, we solve directly the Newton system [3.150] to determine the error-location polynomial σ(p).

In the binary case, the Newton system [3.150] can be written as follows:

For example, if v = 2, we have the following system:

[3.152] images

and the solutions are:

σ1 = s1 and σ2 = (s3 + s13)/s1

The matrix approach can be summarized as follows. We first assume that v = e and we try to solve the Newton system [3.151] by replacing v with e. If v is equal to e or e − 1, a unique solution will be obtained. If v < e − 1, then the system will have no solution. In that case, we should assume that v = e − 2 and try to solve the system [3.151] again by replacing v with e − 2. We repeat this procedure until we obtain a solution. The matrix approach is interesting when the correction capability of the code is small (typically when e ≤ 3).

We will now study a more general algorithm: the Berlekamp–Massey algorithm.

Determination of the error-location polynomial using the Berlekamp-Massey algorithm

This algorithm has been proposed by Berlekamp [BER 68] and developed by Massey [MAS 69]. In this book, we will only describe the algorithm without providing the proofs.

The first phase is to determine a polynomial σ(1)(p) of minimal degree satisfying the first equation of the Newton system [3.150]. If the coefficients of the polynomial also satisfy the second equation of the system, we write σ(2)(p) = σ(1)(p). Otherwise, we add a correction term to σ(1)(p) to obtain σ(2)(p). The polynomial σ(2)(p) is then the polynomial with minimum degree satisfying the first two equations of the Newton system. We repeat this procedure to obtain σ(3)(p), σ(4)(p) until σ(2e)(p). This last polynomial is the error-location polynomial σ(p).

Let σ(μ)(p) be the polynomial of degree lμ determined at the μth phase and verifying the μ first equations. To determine σ(μ+1)(p), we will first calculate dμ as follows:

dμ = sμ+1 + σ1(μ)sμ + σ2(μ)sμ−1 + … + σlμ(μ)sμ+1−lμ

If dμ = 0, then σ(μ+1)(p) = σ(μ)(p). Otherwise, we go back to the iteration ρ such that dρ ≠ 0 and ρ − lρ is maximum (where lρ is the degree of the polynomial σ(ρ)(p)). We then obtain:

σ(μ+1)(p) = σ(μ)(p) + dμdρ−1pμ−ρσ(ρ)(p)

Example 3.30.– Let us consider the (15, 7) BCH code defined by the generator polynomial g(p) = 1 + p4 + p6 + p7 + p8 and that is able to correct up to two errors.

We assume that the null codeword c(p) = 0 has been transmitted and that the received word is r(p) = p4 + p9

We start by computing the elements of the syndrome:

[3.153]
s1 = r(α) = α4 + α9 = α14
s2 = r(α2) = α8 + α18 = α13
s3 = r(α3) = α12 + α27 = 0
s4 = r(α4) = α16 + α36 = α11

Since this example is rather simple, we can determine the error-location polynomial σ(p) using the matrix approach. We obtain:

[3.154] σ1 = s1 = α14 and σ2 = (s3 + s13)/s1 = s12 = α13, i.e. σ(p) = 1 + α14p + α13p2

We will now perform the Berlekamp–Massey algorithm.

– for μ = 0. We initialize σ(0) (p) = 1 and we calculate d0:

d0 = s1 = α14

We obtain the partial Table 3.25.

Table 3.25. Partial table for μ = 0

μ σ(μ) (p) dμ lμ μ− lμ
−1 1 1 0 −1
0 1 α14 0 0

Since d0 is non-zero, we must add a correcting term to σ(0) (p) to obtain σ(1)(p):

[3.155] σ(1)(p) = σ(0)(p) + d0d−1−1p0−(−1)σ(−1)(p) = 1 + α14p

– for μ = 1. We calculate d1.

[3.156] d1 = s2 + σ1(1)s1 = α13 + α14α14 = α13 + α13 = 0

We obtain the partial Table 3.26 for μ = 1.

Table 3.26. Partial table for μ = 1

μ σ(μ) (p) dμ lμ μ−lμ
−1 1 1 0 −1
0 1 α14 0 0
1 1+ α14p 0 1 0

Since d1 is zero we have:

σ(2)(p) = σ(1)(p) = 1 + α14p

– for μ = 2. We calculate d2:

[3.157] d2 = s3 + σ1(2)s2 = 0 + α14α13 = α12

We obtain the partial Table 3.27 for μ = 2.

Table 3.27. Partial table for μ = 2

μ σ(μ) (p) dμ lμ μ− lμ
−1 1 1 0 −1
0 1 α14 0 0
1 1+ α14p 0 1 0
2 1+ α14p α12 1 1

Since d2 is non-zero, we add a correcting term to σ(2) (p) to obtain σ(3) (p):

[3.158] σ(3)(p) = σ(2)(p) + d2d0−1p2σ(0)(p) = 1 + α14p + α13p2

– for μ = 3. We calculate d3:

[3.159] d3 = s4 + σ1(3)s3 + σ2(3)s2 = α11 + 0 + α13α13 = α11 + α11 = 0

The final table is given in Table 3.28.

Table 3.28. Final table

μ σ(μ) (p) dμ lμ μ− lμ
−1 1 1 0 −1
0 1 α14 0 0
1 1+ α14p 0 1 0
2 1+ α14p α12 1 1
3 1+ α14p + α13p2 0 2 1

Since d3 is zero, we have σ(4)(p) = σ(3)(p) and we finally obtain the following error-location polynomial σ(p):

σ(p) = σ(4)(p) = σ(3)(p) = 1 + α14p + α13p2

We must now determine the positions of the errors.

Search of the roots of the error-location polynomial

We have seen previously that the inverses of the roots of the error-location polynomial are equal to βk = αjk and consequently allow us to localize the positions jk of the errors.

If the degree of σ(p) is equal to 1 or 2, the roots can be computed directly. When the degree of σ(p) is higher than 2, we can test each power of α but it is not an efficient solution. A practical solution is to apply the Chien search method [CHI 64].

– Initialization: Qk = σk for k = 1, …, v;

– for i = 1, …, N: replace each Qk by Qkαk (k = 1, …, v).

If Q1 + Q2 + … + Qv = 1, then there is an error in position N − i.

The last line can be explained by the relation σ(αi) = 1 + Q1 + Q2 + … + Qv.

Example 3.31.– We can check that:

[3.160] σ(p) = 1 + α14p + α13p2 = (1 + α4p)(1 + α9p)

The roots of σ(p) are equal to α11 and α6. The inverses of the roots are β1 = α15−11 = α4 and β2 = α15−6 = α9. We deduce that the errors are located at positions j1 = 4 and j2 = 9.

We can now apply the Chien search method to localize the position of the errors. The results are given in Table 3.29.

We find again that the errors are located in positions j1 = 4 and j2 = 9.

Table 3.29. Finding the positions of the errors using the Chien search method

i Q1 Q2 Q1 + Q2
0 α14 α13 α2
1 1 1 0
2 α α2 α5
3 α2 α4 α10
4 α3 α6 α2
5 α4 α8 α5
6 α5 α10 1 → j1 =15 − 6=9
7 α6 α12 α4
8 α7 α14 α
9 α8 α α10
10 α9 α3 α
11 α10 α5 1 → j2 =15 − 11 = 4
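The following Python sketch can be used to check these results; it assumes the representation of GF(16) generated by the primitive polynomial p4 + p + 1 (log/antilog tables) and recomputes the syndromes of r(p) = p4 + p9 before running the Chien search on σ(p) = 1 + α14p + α13p2 (variable names are ours).

# Log/antilog tables of GF(16) built from the primitive polynomial p^4 + p + 1
EXP = [0] * 30
x = 1
for i in range(15):
    EXP[i] = EXP[i + 15] = x
    x <<= 1
    if x & 0x10:
        x ^= 0x13                                # reduction by p^4 + p + 1 (binary 10011)
LOG = {EXP[i]: i for i in range(15)}

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def alpha(i):
    return EXP[i % 15]

# Syndromes s_i = r(alpha^i) for r(p) = p^4 + p^9
for i in range(1, 5):
    s = alpha(4 * i) ^ alpha(9 * i)
    print("s%d =" % i, "0" if s == 0 else "alpha^%d" % LOG[s])

# Chien search on sigma(p) = 1 + alpha^14 p + alpha^13 p^2
sigma = {0: 1, 1: alpha(14), 2: alpha(13)}
for i in range(15):
    value = 0
    for degree, coeff in sigma.items():
        value ^= gf_mul(coeff, alpha(degree * i))
    if value == 0:
        print("alpha^%d is a root -> error in position %d" % (i, (15 - i) % 15))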

3.6.7. Reed–Solomon codes

RS codes are constructed over the finite field image with q = 2m where m is an integer. The 2m − 1 non-zero elements of the field image are the successive powers of the primitive element α, which satisfies αN = 1 with N = 2m − 1. Each minimal polynomial associated with a non-zero element αi (i = 0, 1, …, 2m − 2) can be written in the form mi(p) = p − αi.

The generator polynomial g(p) of a RS (N, K) code of length N = 2m − 1 and dimension K is the product of the N − K minimum polynomials mi(p) with i = 1,2,…, N − K:

[3.161] g(p) = (p − α)(p − α2) … (p − αN−K)

Like for the other cyclic codes, the degree of g(p) is N − K. The minimum Hamming distance dmin of these codes is equal to N − K + 1. Consequently, the RS codes reach the Singleton bound.

Compared to the binary codes, the correction capability of RS codes is e = ⌊(N − K)/2⌋ symbols composed of m bits, i.e. a maximum of me bits. For example, the (255, 239, 17) RS code in image can correct up to 8 octets independently of the number of erroneous bits per octet. This code will be able to correct up to 64 bits if the erroneous bits are localized inside only 8 octets. The generator polynomial of the (255, 239) RS code in image built from the primitive polynomial p8 + p4 + p3 + p2 + 1 is the following:

[3.162] images

The word error rate is equal to:

[3.163] images

where SERI is the symbol error rate at the input. Consequently, the symbol error rate SERO at the output of the hard input RS decoder is the following:

[3.164] images

The performance curve SERO = f(SERI) for the (255, 239,17) RS code is given in Figure 3.36.
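The curve can be reproduced numerically with the short Python sketch below; it uses the usual approximations, namely the probability of more than e erroneous symbols per block for the word error rate and the corresponding weighted sum for SERO, which may differ slightly from the exact relations [3.163] and [3.164].

from math import comb

def rs_hard_decoding(N, K, ser_in):
    # e = (N - K) / 2 correctable symbols; decoding fails when more than e symbols are erroneous
    e = (N - K) // 2
    terms = [comb(N, i) * ser_in**i * (1 - ser_in)**(N - i) for i in range(e + 1, N + 1)]
    wer = sum(terms)
    ser_out = sum(i * t for i, t in zip(range(e + 1, N + 1), terms)) / N
    return wer, ser_out

for ser_in in (1e-2, 2e-2, 5e-2):
    wer, ser_out = rs_hard_decoding(255, 239, ser_in)
    print(f"SERI = {ser_in:.0e}  WER = {wer:.2e}  SERO = {ser_out:.2e}")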

The RS codes are decoded using the same methodology as the BCH codes. However, we need to add one more operation consisting in evaluating the so-called amplitude of the errors, i.e. determining the positions of the erroneous bits within the erroneous symbols.

3.7. Applications

The first field of application of block codes was space communications. The (32, 6, 16) Reed–Muller block code with soft input decoding was used in 1969 and 1977 for the Mariner and Viking missions (Mars).

image

Figure 3.36. SERO = f(SERI) for the (255,239,17) Reed–Solomon code

Among the other applications of block codes, we can quote the (8, 4) extended Hamming code that was used for teletext and the (273, 191) difference set cyclic code used for digital audio broadcast (DAB).

The Reed–Solomon codes are well adapted to the correction of bursts of errors in communication systems. The most popular RS codes are the (255, 239, 17), (255, 223, 33) and (255, 251, 5) codes in image. This last code is part of the cross-interleaved Reed–Solomon code (CIRC) used for the compact disc (CD) application. The CIRC is composed of a (28, 24, 5) RS code, an interleaver and a (32, 28, 5) RS code [WIC 99]. The (28, 24) RS code is built from a (255, 251, 5) RS code. For that, we add 227 null octets to the 24 information octets in order to obtain a total of 251 octets, then apply the (255, 251, 5) RS code. After coding, we remove the 227 null octets to obtain a codeword composed of 28 octets including 4 redundancy octets. The same principle is applied for the (32, 28, 5) code and for the decoding of these codes.

For the CD-ROM, a (1170, 1032) product code is used to correct large bursts of errors. It is composed of row RS (26, 24) codes and column RS (45, 43) codes in image. We will come back to these concatenated codes in Chapter 5.

3.8. Exercises

3.8.1. Exercise 1: repetition code

We consider a binary equiprobable source and a BSC channel with p = 10−3. To limit the noise effect, we use a repetition code: a “0” is encoded by the codeword [0 0 0] and a “1” is encoded by the codeword [1 1 1].

1) Calculate the probabilities that 1, 2 or 3 errors occur during the transmission of a codeword.

2) We use the majority logic decoding (the decoded bit will be “1” if we receive at least two “1”). A received word is correctly decoded if two of the three bits of the word are without error. Compute the new error probability.

We generalize this principle: “0” is encoded by 2m + 1 “0”

“1“ is encoded by 2m + 1 “1”

3) What is the minimum Hamming distance between two codewords when m = 2, then for any m?

4) As previously, we use the majority logic decoding. Deduce the maximum number of errors that we can correct.

5) What is the word error probability Pe after decoding? We will write this probability as a function of m. We can show that when p << 1, the error probability can be approximated by Pe ≈ αpj. We will give the expressions of α and j as functions of m. Do the numerical application for m = 1, m = 2 and m = 3.

6) We want to achieve a word error probability after decoding and correction equal to or better than Pe = 5 × 10−7. What is the minimum value of m required to reach this performance?

3.8.2. Exercise 2: error probability on a binary symmetric channel

We consider a binary source and a binary symmetric channel with a transition probability p equal to 0.1. We have seen that the capacity of this channel is 0.52 Shannon/symb.

1) Determine the probability that a codeword of 23 bits is received without error at the output of the channel.

2) Perform the same calculation for 1, 2 and 3 errors.

3) We use a Golay code (23, 12) able to correct up to three transmission errors. Determine the word error probability after decoding.

4) Comment your results with respect to the capacity of the BSC channel.

3.8.3. Exercise 3: re-emission strategy

We transmit a binary message or word of length N with equiprobable bits over a binary symmetric channel.

Let p be the conditional probability p = Pr(Y = 0|X = 1) = Pr(Y = 1|X = 0).

1) Determine the probability to have no error during the transmission message.

2) Determine the probability to have exactly one error during the transmission message.

3) Determine the probability to have at least one error during the transmission message.

4) Perform the numerical application for p = 0.1 and N = 12.

If the message is erroneous, we repeat it until the message is received without errors.

5) What is the probability that the message will be correct without reemission?

6) What is the probability that the message will be correct after 1, 2, 3, …, n re-emissions?

The repetition of the message three times allows us to use the majority logic decoding (at each position, the decoded bit will be “1” if we receive at least two “1”). We can show that this strategy is equivalent to considering that the message has been transmitted over a binary symmetric channel with conditional probability p′ = 3p2.

7) Calculate the probability to decode the message of N bits without error.

Another solution is to use an error correcting code. We consider the Golay code (24, 12) with minimum distance 8.

8) Calculate the word error probability after decoding.

3.8.4. Exercise 4: simplex code (7, 3)

Let us define the generator matrix of the simplex code (7, 3):

images
  • 1) Transform G into a systematic generator matrix and determine the parity check matrix H.
  • 2) What is the link between this code and the Hamming code (7, 4)?
  • 3) Compute the WEF A(D) and deduce the minimum distance of this code.
  • 4) Build the syndrome table.
  • 5) Show that the codeword corresponding to the information word u = [1 0 1] is orthogonal to H.

3.8.5. Exercise 5: linear block code (6,3)

Let us consider the generator matrix of the following linear block code (6, 3):

images
  • 1) Transform G into a systematic generator matrix and determine the parity check matrix H.
  • 2) Compute the WEF A(D) and deduce the minimum distance of this code.
  • 3) Build the syndrome table.
  • 4) Let us consider the following information word u = [1 0 1]. Determine the corresponding codeword. We assume that e = [0 0 1 0 0 0]. Can the decoder correct this error?

3.8.6. Exercise 6: Hamming code (7,4)

Let us consider the generator matrix of the Hamming code (7, 4):

images
  • 1) Transform G into a systematic generator matrix and determine the parity check matrix H.
  • 2) Compute the WEF A(D) and deduce the minimum distance of this code.
  • 3) Build the syndrome table.
  • 4) Let us consider the following information word u = [1 1 1 1]. Determine the corresponding codeword. We assume that e = [1 0 0 0 0 0 0]. Deduce the received word r = c + e. Estimate the codeword using the syndrome table.

3.8.7. Exercise 7: Reed–Muller code (8,4)

Let us consider the generator matrix of the Hamming code (7, 4):

images
  • 1) Determine the systematic generator matrix and the parity check matrix H. We add a redundancy bit c7 satisfying the following parity check equation: c7 = c0 + c1 + c2 + c3 + c4 + c5 + c6. The new code is a first order Reed–Muller (8, 4) or extended Hamming code.
  • 2) Determine the systematic generator matrix G′ of this code.
  • 3) Show that this code is self-dual (G′ = H′).
  • 4) Determine the WEF A(D) of the code (8, 4) and deduce its minimum distance.
  • 5) Is it possible to link this code with the trellis diagram given in Figure 3.37?
image

Figure 3.37. Trellis diagram of the (8, 4) Reed–Muller code

3.8.8. Exercise 8: hard trellis based decoding of block code

Let us consider the trellis diagram given in Figure 3.38 and corresponding to the Hamming code defined by the following systematic generator matrix:

images

Let u = [1 1 1 1] be an information word

1) Determine the associated codeword.

We consider that an error occurs during the transmission e = [1 0 0 0 0 0 0].

image

Figure 3.38. Trellis diagram of the (7, 4) code

2) Determine the received word r = c + e.

3) Perform the hard decoding using the Viterbi algorithm.

3.8.9. Exercise 9: soft trellis based decoding of block code

Let us consider the trellis given in Figure 3.38 and corresponding to the Hamming code defined by the following systematic generator matrix:

images

Let u = [1 1 1 1] be an information word

  • 1) Determine the associated codeword.

    Before the transmission the bits are converted into a bipodal signal bit 0 ⇒ −1 and bit 1⇒ +1. At the output of the AWGN channel after filtering and sampling, the received vector is y = [−0.2 + 0.8 + 1.0 + 1.4 −0.5 + 2.5 + 0.7].

  • 2) Perform the soft decoding using the Viterbi algorithm.

3.8.10. Exercise 10: soft input decoding

We consider the systematic linear block code (3, 2) defined by its parity check equation u0 + u1 + c2 = 0 and a transmission over an additive white Gaussian channel with variance σ2 = N0/2.

  • 1) Give the systematic generator matrix, the list of codewords and the WEF. Deduce the minimum distance, the error correction capability and the error detection capability.
  • 2) Draw graphically the different words x = [x0, x1, x2] in a three-dimensional space. Determine the Euclidean distance between two words.

The pairwise error probability is given as:

images
  • 3) Determine the word error rate after soft decoding by using the union bound.

The trellis diagram associated with this code is given in Figure 3.39.

image

Figure 3.39. Trellis diagram of the (3, 2) code

We receive the word y = [+0.9 + 0.1 + 1.4].

  • 4) Apply the soft Viterbi decoding (we will assume that Es = +1).
  • 5) From the received word, compute the conditional probabilities Pr(u1 = 1|y) and Pr(u1 = 0|y) assuming that N0 = 4.

3.8.11. Exercise 11: comparison of hard and soft decoding

Let us consider a linear block code defined by the following generator matrix:

images

and its associated trellis diagram given in Figure 3.40.

image

Figure 3.40. Trellis diagram

  • 1) Determine the list of codewords and the WEF. Deduce the minimum distance, the error correction capability and the error detection capability.

We use this code for a transmission over an AWGN with variance σ2. Before the transmission we convert the bits into a bipodal signal bit 0 ⇒ −1 and bit 1⇒ +1.

We select the information word u = [0 0 0].

  • 2) Determine the associated codeword c and the transmitted vector x = [x0 x1 x2 x3 x4 x5 x6 x7].

The error vector is as follows:

images
  • 3) Determine the received vector y with and without thresholding.
  • 4) Perform a hard input decoding by applying the Viterbi algorithm on the trellis diagram. Conclusion?
  • 5) Prove that for an AWGN, we have:

log (∏p(yi|xi)) ∝ − Σ (yi − xi)2 where the sign ∝ means “proportional to”.

  • 6) Perform a soft input decoding (instead of using the Euclidean metric (yi − xi)2 we will use the metric |yi − xi| to simplify the calculations). Conclusion?

3.8.12. Exercise 12: Hamming code

We consider a transmission over a binary symmetric channel using a Hamming code (15, 11).

  • 1) The generator polynomial of this code is g(p) = 1 + p + p4. Justify this choice.
  • 2) The information word is u = [11010010100]. Determine the associated systematic codeword.
  • 3) Give the hardware structure of this coder with premultiplication. Determine the remainder of the polynomial division using the hardware structure.
  • 4) We receive the word r = [011011010101000]. Check that this word is not a codeword.
  • 5) Give the hardware structure of the decoder.

3.8.13. Exercise 13: cyclic code

Let us consider the following generator matrix G:

images
  • 1) Determine K and N.
  • 2) Transform G in a systematic generator matrix and deduce the parity check matrix H.
  • 3) The minimum distance of the code is 4. Determine the correction and detection error capabilities.

Let us consider the cyclic code defined by the generator polynomial g(p) = (1 + p)(1 + p + p3). The polynomials 1 + p and 1 + p + p3 are irreducible factors of 1 + p7.

  • 4) Determine K and N.
  • 5) Let u = [0 0 1] be an information word. Calculate the associated systematic codeword.
  • 6) Give the hardware structure of the encoder.
  • 7) We receive the following word [0 1 0 1 1 1 1]. Check if this word is a codeword.

For this code, each codeword must be divisible by 1 + p + p3 and 1 + p (and consequently by g(p)).

  • 8) Fill in the table by indicating the number of errors corresponding to each case situation.
1+ p + p3 1+ p Number of errors
divisible divisible
non-divisible non-divisible
non-divisible divisible

We now assume that the decoder only verifies the divisibility by 1 + p + p3.

  • 9) Calculate the remainder of the division of the polynomial associated with the received word by 1 + p + p3.
  • 10) Give the hardware structure of the associated polynomial divisor. Give the output of the shift registers after each clock transition.
  • 11) Deduce the position of the error.
  • 12) Make the link between the error correcting code defined by the systematic matrix generator and the code defined by g(p).

3.8.14. Exercise 14: Hamming code

We consider the Hamming code (7, 4) defined by the generator polynomial g(p) = 1 + p + p3.

  • 1) Let u = [0 1 1 1] be an information word. Calculate the associated codeword c.
  • 2) Draw the hardware structure of the coder with premultiplication.
  • 3) Verify the result obtained in the first question.

3.8.15. Exercise 15: performance of the Reed–Solomon code (255, 191)

Let us consider the (255, 191) RS code defined in image. This code is used for example in the DVB-H standard.

  • 1) Since this code is MDS (dmin = N − K + 1), deduce the error correction capability.
  • 2) What is the maximum and minimum number of bits that this code is able to correct?
  • 3) Give the relation between the word error rate at the output of the hard decoding decoder and the symbol error rate at the input (SERI).
  • 4) Deduce the relation between the symbol error rate at the output (SERO) of the hard decoding decoder and the symbol error rate at the input (SERI).
  • 5) Numerical application: for SERI = 2 × 10−2, determine the symbol error rate at the output (we will only consider the first term in the sum). We can use the following relations: image

3.8.16. Exercise 16: CRC

In the radio frequency identification standard (RFID), a CRC of length 5 bits is added at each frame to detect the frame error at the receiver side. The generator polynomial used in this standard is g(p) = p5 + p2 + 1.

  • 1) Let us consider the information frame of length K = 6, u = [101001]. Determine the five CRC bits to be added in order to build the coded frame. Check that the coded frame is a codeword.
  • 2) Draw the hardware structure of the encoder.
  • 3) Let 10110001001 be the received sequence. Is it a codeword?

3.8.17. Exercise 17: Reed-Solomon code

Let us consider a (255, 239) RS code in image.

  • 1) Give the minimum distance and error capability of this error correcting code.
  • 2) Give the relation between the symbol error rate at the output SERO of the hard decoding decoder and the symbol error rate at the input SERI.
  • 3) Plot the curve SERO = f(SERI).