All communication channels contain some degree of noise, namely interference caused by various sources such as neighboring channels, electric impulses, deterioration of the equipment, etc. This noise can interfere with data transmission. Just as holding a conversation in a noisy room becomes more difficult as the noise becomes louder, so too does data transmission become more difficult as the communication channel becomes noisier. In order to hold a conversation in a loud room, you either raise your voice, or you are forced to repeat yourself. The second method is the one that will concern us; namely, we need to add some redundancy to the transmission in order for the recipient to be able to reconstruct the message. In the following, we give several examples of techniques that can be used. In each case, the symbols in the original message are replaced by codewords that have some redundancy built into them.
Consider an alphabet of four letters, say A, B, C, D. We want to send a letter across a noisy channel in which each transmitted symbol has a 0.1 probability of error. If we want to send A, for example, then there is a 90% chance that the symbol received is A. This leaves too large a chance of error. Instead, we repeat the symbol three times, thus sending AAA. Suppose an error occurs and the received word is AAB. We take the symbol that occurs most frequently as the message, namely A. The probability of the correct message being found is the probability that all three letters are correct plus the probability that exactly one of the three letters is wrong:

(0.9)^3 + 3 (0.9)^2 (0.1) = 0.972,

which leaves a significantly smaller chance of error.
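The scheme can be sketched in a few lines of Python; the four-letter alphabet and the 0.1 symbol error probability follow the example above (the function names are illustrative):

```python
from collections import Counter


def repeat3(symbol):
    # Encode by repeating the symbol three times: "A" -> "AAA".
    return symbol * 3


def majority(word):
    # Decode by majority vote: the most frequent symbol wins.
    return Counter(word).most_common(1)[0][0]


# Probability that majority decoding succeeds when each of the three
# symbols is independently correct with probability 0.9:
p_ok = 0.9**3 + 3 * 0.9**2 * 0.1

print(repeat3("A"))   # AAA
print(majority("AAB"))  # A
print(round(p_ok, 3))   # 0.972
```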
Two of the most important concepts for codes are error detection and error correction. If there are at most two errors, this repetition code allows us to detect that errors have occurred. If the received message is AAB, then there could be either one error from AAA or two errors from BBB; we cannot tell which. If at most one error has occurred, then we can correct the error and deduce that the message was A. Note that if we used only two repetitions instead of three, we could detect the existence of one error, but we could not correct it (did AB come from AA or BB?).
This example was chosen to point out that error correcting codes can use arbitrary sets of symbols. Typically, however, the symbols that are used are mathematical objects such as integers mod a prime or binary strings. For example, we can replace the letters A, B, C, D by the 2-bit strings 00, 01, 10, 11. The preceding procedure (repeating three times) then gives us the codewords 000000, 010101, 101010, 111111.
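With the codewords written as bit strings, decoding can be done by finding the codeword at the smallest Hamming distance from the received word. A minimal sketch, assuming the natural assignment of A, B, C, D to the 2-bit strings 00, 01, 10, 11:

```python
letters = {"A": "00", "B": "01", "C": "10", "D": "11"}
# Repeat each 2-bit string three times to form the codeword.
codewords = {ch: bits * 3 for ch, bits in letters.items()}


def hamming(u, v):
    # Number of positions where the two strings differ.
    return sum(a != b for a, b in zip(u, v))


def nearest_codeword(received):
    # Decode to the letter whose codeword is closest to the received word.
    return min(codewords, key=lambda ch: hamming(codewords[ch], received))


print(codewords["B"])            # 010101
print(nearest_codeword("010001"))  # B (one bit error from 010101)
```

Since any two of these codewords differ in at least three positions, a single bit error is always corrected.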
Suppose we want to send a message of seven bits. Add an eighth bit so that the number of nonzero bits is even. For example, the message 0110010 becomes 01100101, and the message 1100110 becomes 11001100. An error of one bit during transmission is immediately discovered since the message received will have an odd number of nonzero bits. However, it is impossible to tell which bit is incorrect, since an error in any bit could have yielded the odd number of nonzero bits. When an error is detected, the best thing to do is resend the message.
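The parity bit computation above can be sketched as follows (function names are illustrative):

```python
def add_parity(bits):
    # Append a bit so that the total number of 1s is even.
    return bits + str(bits.count("1") % 2)


def parity_ok(bits):
    # An odd number of 1s reveals that a (single) bit error occurred.
    return bits.count("1") % 2 == 0


print(add_parity("0110010"))    # 01100101
print(add_parity("1100110"))    # 11001100
print(parity_ok("01100100"))    # False: a bit was flipped in transit
```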
The parity check code of the previous example can be used to design a code that can correct an error of one bit. The two-dimensional parity code arranges the data into a two-dimensional array, and then parity bits are computed along each row and column.
To demonstrate the code, suppose we want to encode a block of 20 data bits. We arrange the bits into a 4 x 5 matrix and calculate the parity bits along the rows and columns, giving an extended 5 x 6 matrix. We define the last bit in the lower right corner of the extended matrix by calculating the parity of the parity bits that were calculated along the columns.
Suppose that this extended matrix of bits is transmitted and that a bit error occurs at the bit in the third row and fourth column. The receiver arranges the received bits into a matrix and recomputes the parity of each row and column.
The parities of the third row and fourth column are odd, so this locates the error as occurring at the third row and fourth column.
If two errors occur, this code can detect their existence. For example, if bit errors occur at the second and third bits of the second row, then the parity checks of the second and third columns will indicate the existence of two bit errors. However, in this case it is not possible to correct the errors, since there are several possible locations for them. For example, if the second and third bits of the fifth row were incorrect instead, then the parity checks would be the same as when these errors occurred in the second row.
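The two-dimensional parity scheme can be sketched as follows. The original 20 data bits are not reproduced above, so an arbitrary 4 x 5 block is used here for illustration:

```python
def encode_2d(rows):
    # rows: a 4x5 array of 0/1 bits. Append a parity bit to each row,
    # then append a parity row (its last entry is the parity of the
    # column parity bits, i.e., the corner bit).
    ext = [r + [sum(r) % 2] for r in rows]
    ext.append([sum(col) % 2 for col in zip(*ext)])
    return ext


def locate_error(ext):
    # Return (row, col) of a single flipped bit, or None if all
    # row and column parities check out.
    bad_rows = [i for i, r in enumerate(ext) if sum(r) % 2]
    bad_cols = [j for j, c in enumerate(zip(*ext)) if sum(c) % 2]
    if bad_rows and bad_cols:
        return bad_rows[0], bad_cols[0]
    return None


data = [[1, 0, 1, 1, 0],
        [0, 1, 1, 0, 1],
        [1, 1, 0, 0, 1],
        [0, 0, 1, 1, 1]]

ext = encode_2d(data)
ext[2][3] ^= 1            # flip the bit in the third row, fourth column
print(locate_error(ext))  # (2, 3): third row, fourth column (0-indexed)
```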
The original message consists of blocks of four binary bits. These are replaced by codewords, which are blocks of seven bits, by multiplying (mod 2) on the right by a 4 x 7 matrix G. One standard choice of G, consistent with the computations below, is

        1 0 0 0 1 1 0
    G = 0 1 0 0 1 0 1
        0 0 1 0 0 1 1
        0 0 0 1 1 1 1
For example, the message 1100 becomes (1, 1, 0, 0) G = (1, 1, 0, 0, 0, 1, 1), that is, the codeword 1100011.
Since the first four columns of G form the identity matrix, the first four entries of the output are the original message. The remaining three bits provide the redundancy that allows error detection and correction. In fact, as we'll see, we can easily correct an error if it affects only one of the seven bits in a codeword.
Suppose, for example, that the codeword 1100011 is sent but is received as 1100001. How do we detect and correct the error? Write G in the form [I4, P], where I4 is the 4 x 4 identity matrix and P is a 4 x 3 matrix. Form the matrix H = [P^T, I3], where P^T is the transpose of P. Multiply the message received times the transpose of H:

(1, 1, 0, 0, 0, 0, 1) H^T = (0, 1, 0).
This is the 6th row of H^T, which means there was an error in the 6th bit of the message received. Therefore, the correct codeword was 1100011. The first four bits give the original message 1100. If there had been no errors, the result of multiplying by H^T would have been (0, 0, 0), so we would have recognized that no correction was needed. This rather mysterious procedure will be explained when we discuss Hamming codes in Section 24.5. For the moment, note that it allows us to correct errors of one bit fairly efficiently.
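The encode-and-correct procedure can be sketched in Python. The matrix P below is one standard choice consistent with the worked example (message 1100 giving codeword 1100011); other valid choices exist:

```python
# G = [I4 | P]; this P is one standard choice for the Hamming [7,4] code.
P = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
G = [I4[i] + P[i] for i in range(4)]
# The rows of H^T are the rows of P followed by the rows of I3.
HT = P + [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def ham_encode(msg):
    # msg: list of 4 bits; returns the 7-bit codeword msg * G (mod 2).
    return [sum(msg[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]


def ham_syndrome(word):
    # word: list of 7 bits; returns word * H^T (mod 2).
    return [sum(word[i] * HT[i][k] for i in range(7)) % 2 for k in range(3)]


cw = ham_encode([1, 1, 0, 0])
print(cw)                   # [1, 1, 0, 0, 0, 1, 1]

received = cw[:]
received[5] ^= 1            # corrupt the 6th bit
s = ham_syndrome(received)  # [0, 1, 0], the 6th row of H^T
received[HT.index(s)] ^= 1  # flip the bit the syndrome points to
print(received[:4])         # recovered message [1, 1, 0, 0]
```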
The Hamming [7, 4] code is a significant improvement over the repetition code. In the Hamming code, if we want to send four bits of information, we transmit seven bits. Up to two errors can be detected and up to one error can be corrected. For a repetition code to achieve this level of error detection and correction, we need to transmit 12 bits in order to send a 4-bit message. Later, we'll express this mathematically by saying that the code rate of this Hamming code is 4/7, while the code rate of the repetition code is 4/12 = 1/3. Generally, a higher code rate is better, as long as not too much error correcting capability is lost. For example, sending a 4-bit message as itself has a code rate of 1 but is unsatisfactory in most situations since there is no error correction capability.
The International Standard Book Number (ISBN) provides another example of an error detecting code. The ISBN is a 10-digit codeword a_1 a_2 ... a_10 that is assigned to each book when it is published. For example, the first edition of this book had ISBN number 0-13-061814-4. The first digit represents the language that is used; 0 indicates English. The next two digits represent the publisher. For example, 13 is associated with Pearson/Prentice Hall. The next six digits correspond to a book identity number that is assigned by the publisher. The tenth digit is chosen so that the ISBN number satisfies

1*a_1 + 2*a_2 + 3*a_3 + ... + 10*a_10 = 0 (mod 11).
Notice that the equation is done modulo 11. The first nine digits are taken from {0, 1, ..., 9}, but the tenth digit may be 10, in which case it is represented by the symbol X. Books published in 2007 or later use a 13-digit ISBN, which uses a slightly different sum and works mod 10.
Suppose that the ISBN number a_1 a_2 ... a_10 is sent over a noisy channel, or is written on a book order form, and is received as b_1 b_2 ... b_10. The ISBN code can detect a single error, or an error caused by the transposition of two digits. To accomplish this, the receiver calculates the weighted checksum

S = 1*b_1 + 2*b_2 + ... + 10*b_10 (mod 11).
If S = 0 (mod 11), then we do not detect any errors, though there is a small chance that an error occurred and was undetected. Otherwise, we have detected an error. However, we cannot correct it (see Exercise 2).
If b_1 ... b_10 is the same as a_1 ... a_10 except in one place j, we may write b_j = a_j + e, where e is not 0 (mod 11). Calculating S gives

S = (1*a_1 + ... + 10*a_10) + j*e = j*e (mod 11),

which is nonzero mod 11, since 11 is prime and neither j nor e is a multiple of 11.
Thus, if a single error occurs, we can detect it. The other type of error that can be reliably detected is when a_j and a_k have been transposed, with j not equal to k. This is one of the most common errors that occur when someone is copying numbers. In this case b_j = a_k and b_k = a_j. Calculating S gives

S = (1*a_1 + ... + 10*a_10) + j*(a_k - a_j) + k*(a_j - a_k) = (j - k)(a_k - a_j) (mod 11).
If a_j is not equal to a_k, then the checksum is not congruent to 0 mod 11 (since 11 is prime and both j - k and a_k - a_j are nonzero mod 11), and an error is detected.
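The checksum can be computed directly. A minimal sketch, using the ISBN from the example above (the function names are illustrative):

```python
def isbn10_digits(isbn):
    # Turn "0-13-061814-4" into [0,1,3,0,6,1,8,1,4,4]; "X" stands for 10.
    return [10 if c == "X" else int(c) for c in isbn if c != "-"]


def isbn10_checksum(isbn):
    # Weighted checksum 1*d1 + 2*d2 + ... + 10*d10 (mod 11).
    digits = isbn10_digits(isbn)
    return sum(j * d for j, d in enumerate(digits, start=1)) % 11


print(isbn10_checksum("0-13-061814-4"))  # 0: valid ISBN
print(isbn10_checksum("0-13-016814-4"))  # 5: transposed digits detected
```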
This code was used by the Mariner spacecraft in 1969 as it sent pictures back to Earth. There are 64 codewords; 32 are represented by the rows of a 32 x 32 matrix H whose entries are +1 and -1.
The matrix H is constructed as follows. Number the rows and columns from 0 to 31. To obtain the entry h_ij in the ith row and jth column, write i and j in binary with five bits: i = i4 i3 i2 i1 i0 and j = j4 j3 j2 j1 j0. Then

h_ij = (-1)^(i0*j0 + i1*j1 + i2*j2 + i3*j3 + i4*j4).
For example, when i = 3 and j = 5, we have 3 = 00011 and 5 = 00101, so the exponent is 1*1 + 1*0 + 0*1 + 0*0 + 0*0 = 1. Therefore, h_ij = -1.
The other 32 codewords are obtained by using the rows of -H. Note that the dot product of any two rows of H is 0, unless the two rows are equal, in which case the dot product is 32.
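The construction and the orthogonality of the rows are easy to check by computer. A short sketch:

```python
def hadamard_entry(i, j):
    # (-1)^(i0*j0 + i1*j1 + ... + i4*j4), with i and j written in binary.
    exponent = sum(((i >> k) & 1) * ((j >> k) & 1) for k in range(5))
    return (-1) ** exponent


H = [[hadamard_entry(i, j) for j in range(32)] for i in range(32)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


print(H[3][5])            # -1, as computed in the example above
print(dot(H[3], H[3]))    # 32: a row dotted with itself
print(dot(H[3], H[5]))    # 0: distinct rows are orthogonal
```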
When Mariner sent a picture, each pixel had a darkness given by a 6-bit number. This was changed to one of the 64 codewords and transmitted. A received message (that is, a string of +1s and -1s of length 32) can be decoded (that is, corrected to a codeword) as follows. Take the dot product of the message with each row of H. If the message is correct, it will have dot product 0 with all rows except one, and it will have dot product +32 or -32 with that row. If the dot product is 32, the codeword is that row of H. If it is -32, the codeword is the corresponding row of -H. If the message has one error, the dot products will all be +2 or -2, except for one, which will be +30 or -30. This again gives the correct row of H or -H. If there are two errors, the dot products will all be 0, +4, or -4, except for one, which will be +28 or -28. Continuing, we see that if there are seven errors, all the dot products will be between -14 and 14, except for one between 18 and 32 or between -32 and -18, which yields the correct codeword. With eight or more errors, the dot products start overlapping, so correction is not possible. However, detection is possible for up to 15 errors, since it takes 16 errors to change one codeword to another.
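This decoding rule can be sketched as follows; the rows of H are regenerated inline so the snippet is self-contained, and the function names are illustrative:

```python
def row(i):
    # Row i of the 32x32 matrix H, with entries +1 and -1.
    return [(-1) ** sum(((i >> k) & 1) * ((j >> k) & 1) for k in range(5))
            for j in range(32)]


H = [row(i) for i in range(32)]


def decode_hadamard(word):
    # Dot the received +/-1 word with every row of H; the row with the
    # largest |dot product| identifies the codeword, and the sign tells
    # whether it is a row of H (True) or of -H (False).
    dots = [sum(a * b for a, b in zip(word, r)) for r in H]
    i = max(range(32), key=lambda k: abs(dots[k]))
    return i, dots[i] > 0


sent = H[13][:]
for k in (0, 7, 20):        # introduce three bit errors
    sent[k] = -sent[k]
print(decode_hadamard(sent))  # (13, True): still decoded correctly
```

With three errors, the dot product with the correct row drops from 32 to 26, while all other dot products stay at most 6 in absolute value, so the maximum still picks out the right row.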
This code has a relatively low code rate of 6/32 = 3/16, since it uses 32 bits to send a 6-bit message. However, this is balanced by a high error correction rate. Since the signals from Mariner were fairly weak, the potential for errors was high, so high error correction capability was needed. The other option would have been to increase the strength of the signal and use a code with a higher code rate and less error correction. The transmission would then have taken less time and therefore potentially used less energy. However, in this case, it turned out that the energy saved by using a weaker signal more than offset the loss in speed. This trade-off (technically known as coding gain) is an important engineering consideration in the choice of which code to use in a given application.