Chapter 4
Information-Theoretic Elements

Information theory provides us with fundamental limits on the transmission rates supported by a channel. In this chapter, we analyze these limits for the coded modulation (CM) schemes presented in Chapter 2, paying special attention to bit-interleaved coded modulation (BICM).

This chapter is structured as follows. In Section 4.1, we introduce the concepts of mutual information (MI) and channel capacity; in Section 4.2, we study the maximum transmission rates for general CM systems; and in Section 4.3, we do the same for BICM systems. We review the relation between these rates and analyze how they are affected by the choice of the (discrete) constellation, paying special attention to the selection of the binary labeling and to the use of probabilistic shaping in BICM. We conclude the chapter in Section 4.5 with ready-to-use numerical quadrature formulas to efficiently compute MIs.

4.1 Mutual Information and Channel Capacity

We analyze the transmission model defined by (2.27), i.e., c04-math-0001, where c04-math-0002 is an c04-math-0003-dimensional zero-mean Gaussian vector with covariance matrix c04-math-0004. In this chapter, we are mostly interested in discrete constellations. However, we will initially assume that the input symbols c04-math-0005 are modeled as independent and identically distributed (i.i.d.) continuous random variables characterized by their probability density function (PDF) c04-math-0006. Considering constellations with continuous support (c04-math-0007) provides us with an upper limit on the performance of discrete constellations. As mentioned before, we also assume that the channel state c04-math-0008 is known at the receiver (through perfect channel estimation) and is not available at the transmitter. The assumption of the transmitter not knowing the channel state is important when analyzing the channel capacity (defined below) for fading channels.

The MI between the random vectors c04-math-0009 and c04-math-0010 conditioned on the channel state c04-math-0011 is denoted by c04-math-0012 and given by1

where c04-math-0015 is given by (2.28).

For future use, we also define the conditional MIs

Although c04-math-0018 is the most common notation for MI found in the literature (which we also used in Chapter 1), throughout this chapter, we also use an alternative notation

which shows that conditioning on c04-math-0020 in (4.1) is equivalent to conditioning on the instantaneous signal-to-noise ratio (SNR); this notation also emphasizes the dependence of the MI on the input PDF c04-math-0021. This notation allows us to express the MI for a fast fading channel (see Section 2.4) by averaging the MI in (4.5) over the SNR, i.e.,

4.6 \( \bar{I}(\overline{\mathsf{snr}}) = \mathbb{E}_{\mathsf{SNR}}\left[ I(\mathsf{SNR}) \right] \)

where c04-math-0023 is given by (4.5). Throughout this chapter, we use the notation c04-math-0024 and c04-math-0025 to denote MIs for the additive white Gaussian noise (AWGN) and fading channels, respectively.
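This averaging over the SNR can be sketched numerically. The snippet below assumes Gaussian inputs, for which the conditional MI given the channel state is log2(1 + snr·g) (cf. Example 4.2), and a unit-mean exponentially distributed channel gain g (Rayleigh fading); the function name and normalization are our illustrative choices, not the book's.

```python
import numpy as np

def fading_mi_gaussian(avg_snr, n=500_000, seed=0):
    """Monte Carlo estimate of the fading-channel MI for Gaussian inputs:
    average the per-state MI log2(1 + avg_snr * g) over the channel gain
    g = |h|^2, drawn here as Exp(1) (unit mean, i.e., Rayleigh fading)."""
    rng = np.random.default_rng(seed)
    g = rng.exponential(1.0, size=n)
    return np.mean(np.log2(1.0 + avg_snr * g))

# By Jensen's inequality, the result lies below the AWGN value log2(1 + avg_snr)
print(fading_mi_gaussian(10.0))
```

Averaging the MI (rather than the SNR) is what makes the fading curve fall below the AWGN curve at the same average SNR.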

The MIs above have units of c04-math-0026 (or, equivalently, c04-math-0027) and define the maximum transmission rates that can be used reliably2 when the codewords c04-math-0029 are sequences of symbols generated randomly according to the PDF c04-math-0030. More precisely, the converse of Shannon's channel coding theorem states that it is not possible to transmit information reliably above the MI, i.e.,

The channel capacity of a continuous-input continuous-output memoryless channel under average power constraint is defined as the maximum MI, i.e.,

where the optimization is carried out over all input distributions that satisfy the average energy constraint c04-math-0033. Furthermore, we note that c04-math-0034 is independent of c04-math-0035 because we assumed that the transmitter does not know the channel. Because of this, we do not consider the case when the transmitter, knowing the channel, adjusts the signal's energy to the channel state.

MI curves are typically plotted versus the SNR; however, to better appreciate their behavior at low SNR, it is preferable to plot the MI versus the average information bit energy-to-noise ratio c04-math-0053. To do so, first note that (4.7) can be expressed using (2.38) as

The notation c04-math-0055 emphasizes that the MI is a function of both c04-math-0056 and c04-math-0057, which will be useful throughout this chapter. In what follows, however, we discuss c04-math-0058 as a function of c04-math-0059 for a given input distribution, i.e., c04-math-0060.

The MI is an increasing function of the SNR, and thus it has an inverse, which we denote by c04-math-0061. By applying this inverse function (for any given c04-math-0062) to both sides of (4.14), we obtain

4.15 \( I^{-1}(R) \le R \cdot \frac{E_{\mathrm{b}}}{N_0} \)

Furthermore, by rearranging the terms, we obtain

which shows that c04-math-0065 is bounded from below by c04-math-0066, which is a function of the rate c04-math-0067.

In other words, the function c04-math-0068 in (4.16) gives, for a given input distribution c04-math-0069, a lower bound on the c04-math-0070 needed for reliable transmission at rate c04-math-0071. For example, for the AWGN channel in Example 4.1, and by using (4.9) in (4.16), we obtain

In this case, the function c04-math-0073 depends solely on c04-math-0074 (and not on c04-math-0075, as in (4.16)), which follows because c04-math-0076 in (4.9) depends only on c04-math-0077.
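For discrete constellations, no closed-form inverse of the MI curve is available, so c04-math-0061 must be computed numerically. Below is a minimal bisection sketch; the helper names are ours, and the Gaussian-input curve is used only because its exact inverse, 2^R − 1, allows the result to be checked.

```python
import numpy as np

def mi_inverse(mi_fn, rate, snr_hi=1e6, iters=100):
    # Bisection inverse of a monotonically increasing MI curve: returns the
    # snr at which mi_fn(snr) == rate (assumes rate < mi_fn(snr_hi)).
    lo, hi = 0.0, snr_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mi_fn(mid) < rate:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Check against the Gaussian-input AWGN curve, whose exact inverse is 2**R - 1
gauss_mi = lambda snr: np.log2(1.0 + snr)
R = 1.5
snr = mi_inverse(gauss_mi, R)
ebno_db = 10.0 * np.log10(snr / R)   # the lower bound of (4.16), in dB
print(snr, ebno_db)
```

The same routine applies unchanged to any numerically evaluated MI curve, since only monotonicity is used.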

The expressions in (4.17) allow us to find a lower bound on c04-math-0078 when c04-math-0079, or equivalently, when c04-math-0080, i.e., in the low-SNR regime. Using (4.17), we obtain

which we refer to as the Shannon limit (SL). The bound in (4.18) corresponds to the minimum average information bit energy-to-noise ratio c04-math-0082 needed to reliably transmit information when c04-math-0083.
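Under the assumption that (4.9) is the Gaussian-input AWGN capacity log2(1 + snr), the bound in (4.17) and its low-rate limit (4.18) take a simple closed form; the helper below is an illustrative sketch with our own naming.

```python
import math

def ebno_min_awgn_db(rate):
    # Eb/N0 lower bound (in dB) for Gaussian inputs on the AWGN channel:
    # invert R = log2(1 + snr) to snr = 2**R - 1, then Eb/N0 = snr / R.
    return 10.0 * math.log10((2.0 ** rate - 1.0) / rate)

# As the rate tends to 0, the bound approaches the Shannon limit
# 10*log10(ln 2), approximately -1.59 dB
for R in (2.0, 1.0, 0.1, 0.001):
    print(R, ebno_min_awgn_db(R))
```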

For notational simplicity, we considered nonfading channels in (4.14)–(4.16). It is important to note, however, that because c04-math-0084, exactly the same expressions apply to fading channels, i.e., when c04-math-0085 is replaced by c04-math-0086.

4.2 Coded Modulation

In this section, we focus on the practically relevant case of discrete constellations, and thus, we restrict our analysis to probability mass functions (PMFs) c04-math-0087 with c04-math-0088 nonzero mass points, i.e., c04-math-0089, c04-math-0090.

4.2.1 CM Mutual Information

We define the coded modulation mutual information (CM-MI) as the MI between the input and the output of the channel when a discrete constellation is used for transmission. As mentioned in Section 2.5.1, in this case, the transmitted symbols are fully determined by the PMF c04-math-0091, and thus, we use the matrix c04-math-0092 to denote the support of the PMF (the constellation c04-math-0093) and the vector c04-math-0094 to denote the probabilities associated with the symbols (the input distribution). The CM-MI can be expressed using (4.2), where the first integral is replaced by a sum and the PDF c04-math-0095 by the PMF c04-math-0096, i.e.,

4.19 equation
4.20 equation

where the notation c04-math-0100 emphasizes the dependence of the MI on the input PMF c04-math-0101 (via c04-math-0102 and c04-math-0103, see (2.41)).

For the AWGN channel with channel transition probability (2.31), the CM-MI in (4.21) can be expressed as

For a uniform input distribution c04-math-0106 in (2.42), the above expression particularizes to
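When no closed form is available, the CM-MI can be estimated by straightforward Monte Carlo averaging (the quadrature formulas of Section 4.5 are the more efficient alternative). The sketch below assumes a real AWGN model y = x + z with noise variance Es/snr; the function name and the normalization are our illustrative assumptions.

```python
import numpy as np

def cm_mi_mc(X, P, snr, n=200_000, seed=0):
    """Monte Carlo estimate of the CM-MI (bit/symbol). Model assumption:
    real AWGN channel y = x + z with z ~ N(0, Es/snr), for a constellation
    X with input PMF P (an illustrative sketch, not the book's quadrature)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, float)
    P = np.asarray(P, float)
    sigma2 = np.sum(P * X ** 2) / snr        # noise variance so Es/sigma2 = snr
    x = rng.choice(X, size=n, p=P)
    y = x + rng.normal(0.0, np.sqrt(sigma2), size=n)
    # log-likelihoods up to a common constant, which cancels in the ratio
    log_num = -(y - x) ** 2 / (2.0 * sigma2)                   # log p(y|x_sent)
    log_all = -(y[:, None] - X[None, :]) ** 2 / (2.0 * sigma2)
    log_den = np.log(np.sum(P * np.exp(log_all), axis=1))      # log p(y)
    return np.mean(log_num - log_den) / np.log(2.0)

X4 = np.array([-3.0, -1.0, 1.0, 3.0])        # 4PAM
P4 = np.full(4, 0.25)                        # uniform input distribution
print(cm_mi_mc(X4, P4, snr=10 ** 0.6))       # CM-MI at snr = 6 dB
```

At high SNR the estimate saturates at log2 of the constellation size, as expected for a discrete input.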

c04f001

Figure 4.1 c04-math-0108PSK constellations and AWGN channel: (a) CM-MI and (b) c04-math-0109 in (4.16). The AWGN capacity c04-math-0110 and the corresponding c04-math-0111 are shown for reference

c04f002

Figure 4.2 c04-math-0121PAM constellations and AWGN channel: (a) CM-MI and (b) c04-math-0122 in (4.16). The AWGN capacity c04-math-0123 and the corresponding c04-math-0124 are shown for reference

c04f003

Figure 4.3 CM-MI for 4PSK (4QAM) and for the c04-math-0136QAM constellations in Fig. 2.11 (c04-math-0137) over the AWGN channel. The AWGN capacity c04-math-0138 in (4.9) is also shown

c04f004

Figure 4.4 c04-math-0141PAM constellations and Rayleigh fading channel: (a) CM-MI and (b) c04-math-0142c04-math-0143 in (4.16). The average capacity c04-math-0144 and the corresponding c04-math-0145 are shown for reference

4.2.2 CM Capacity

The CM-MI c04-math-0146 corresponds to the maximum transmission rate when the codewords' symbols c04-math-0147 are taken from the constellation c04-math-0148 following the PMF c04-math-0149. In such cases, the role of the receiver is to find the transmitted codeword using c04-math-0150 by applying the maximum likelihood (ML) decoding rule we defined in Chapter 3. In practice, the CM encoder must be designed with the decoding complexity in mind. To ease the implementation of the ML decoder, particular encoding structures are adopted. This is the idea behind trellis-coded modulation (TCM) (see Fig. 2.4), where the convolutional encoder (CENC) generates symbols c04-math-0151 that are mapped directly to constellation symbols c04-math-0152. In this case, the code can be represented using a trellis, which means that the Viterbi algorithm can be used to implement the ML decoding rule, and thus, the decoding complexity remains manageable.
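The trellis search behind ML decoding can be illustrated with a toy example. The sketch below implements hard-decision Viterbi decoding for the rate-1/2, memory-2 convolutional code with generators (5,7) in octal; this code is chosen purely for illustration (it is not the TCM encoder of Fig. 2.4), and all names are ours.

```python
from itertools import product

# Toy hard-decision Viterbi decoder for the rate-1/2, memory-2 CENC (5,7)_8.
G = ((1, 0, 1), (1, 1, 1))                  # generator taps on (input, s1, s2)

def encode(bits):
    s = (0, 0)
    out = []
    for b in bits:
        reg = (b,) + s
        out += [sum(g * r for g, r in zip(gen, reg)) % 2 for gen in G]
        s = (b, s[0])
    return out

def viterbi(rx):
    # Survivor search over the 4 trellis states with a Hamming-distance metric.
    states = list(product((0, 1), repeat=2))
    INF = float("inf")
    metric = {st: (0 if st == (0, 0) else INF) for st in states}
    paths = {st: [] for st in states}
    for i in range(0, len(rx), 2):
        r = rx[i:i + 2]
        new_metric = {st: INF for st in states}
        new_paths = {st: [] for st in states}
        for st in states:
            if metric[st] == INF:
                continue
            for b in (0, 1):
                reg = (b,) + st
                out = [sum(g * x for g, x in zip(gen, reg)) % 2 for gen in G]
                nxt = (b, st[0])
                m = metric[st] + sum(o != x for o, x in zip(out, r))
                if m < new_metric[nxt]:      # keep the better survivor
                    new_metric[nxt] = m
                    new_paths[nxt] = paths[st] + [b]
        metric, paths = new_metric, new_paths
    return paths[min(states, key=lambda st: metric[st])]

msg = [1, 0, 1, 1, 0, 0, 1]
rx = encode(msg)
rx[3] ^= 1                                  # introduce one channel bit error
print(viterbi(rx))
```

The complexity per decoded bit is fixed by the number of states, which is what makes ML decoding of trellis codes practical.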

The most popular CM schemes are based on uniformly distributed constellation points. However, a uniform input distribution is not mandatory, and one may instead use an arbitrary PMF (and/or a nonequally spaced constellation) to increase the CM-MI. To formalize this, and in analogy with (4.8), we define the CM capacity for a given constellation size c04-math-0153 as

where the optimization over c04-math-0155 and c04-math-0156 is equivalent to the optimization over the PMF c04-math-0157.

The optimization problem in (4.25) is solved under the constraint c04-math-0158, where c04-math-0159 depends on both c04-math-0160 and c04-math-0161. In principle, an inequality constraint c04-math-0162 could be imposed; however, for the channel in (2.27) considered here, an increase in c04-math-0163 always results in a higher MI, and thus, the constraint is always active.

Again, the result of the optimization in (4.25) should be interpreted as the maximum number of bits per symbol that can be reliably transmitted using a fully optimized c04-math-0164-point constellation, i.e., when, for each SNR value, the constellation and the input distribution are selected to maximize the CM-MI. This is usually referred to as signal shaping.

The CM capacity for fading channels is defined as

We note that the optimal constellations c04-math-0166 and c04-math-0167, i.e., those solving (4.25) or (4.26), are not the same for the AWGN and fading channels. This is different from the case of continuous distributions c04-math-0168, where the Gaussian distribution is optimal for each value of c04-math-0169 over the AWGN channel and thus, the same distribution yields the capacity of fading channels, cf. Example 4.2.

The joint optimization over both c04-math-0170 and c04-math-0171 is a difficult problem, and thus, one might prefer to solve simpler ones:

The problem in (4.28) is typically called geometric shaping, as only the constellation symbols (the geometry) are optimized, while the problem in (4.27) is called probabilistic shaping, as the probabilities of the constellation symbols are optimized.

Finally, as a pragmatic compromise between the fully optimized signal shaping of (4.25) and the geometric/probabilistic solutions of (4.27) and (4.28), we may optimize the distribution c04-math-0174 while maintaining the “structure” of the constellation, i.e., allowing for a scaling of a predefined constellation by a factor c04-math-0175

where c04-math-0177 is the optimal value of c04-math-0178.
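To get a concrete feel for probabilistic shaping as in (4.27), a widely used (though generally suboptimal) heuristic is to restrict the PMF to the Maxwell–Boltzmann family, P(x) proportional to exp(−λ|x|²), and tune λ to meet an average-energy target. The sketch below does this by bisection; the helper names and the energy target are illustrative assumptions, not the solution of (4.27).

```python
import numpy as np

def maxwell_boltzmann(X, lam):
    # PMF P(x) proportional to exp(-lam * |x|^2) on the constellation X
    w = np.exp(-lam * np.asarray(X, float) ** 2)
    return w / w.sum()

def mb_for_energy(X, Es_target, iters=200):
    # Bisection on lam: the average energy of the MB PMF decreases as lam
    # grows, so we search lam in [0, 10] for sum(P * X^2) == Es_target.
    X = np.asarray(X, float)
    lo, hi = 0.0, 10.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        Es = np.sum(maxwell_boltzmann(X, mid) * X ** 2)
        if Es > Es_target:
            lo = mid          # energy still too high -> increase lam
        else:
            hi = mid
    return maxwell_boltzmann(X, 0.5 * (lo + hi))

X8 = np.arange(-7.0, 8.0, 2.0)               # equidistant 8PAM
P = mb_for_energy(X8, Es_target=10.0)        # uniform 8PAM has Es = 21
print(np.round(P, 4))
```

The resulting PMF places more mass on the low-energy symbols, qualitatively matching the shaped distributions shown in Fig. 4.5.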

c04f005

Figure 4.5 Solution of problem (4.27): optimal PMF for an equidistant 8PAM constellation in the AWGN channel

c04f006

Figure 4.6 Solution of problem (4.28): optimal input distribution for an c04-math-0188-ary constellation in the AWGN channel

c04f007

Figure 4.7 Solution of problem (4.29): optimal PMF for an equidistant 8PAM constellation in the AWGN channel

We conclude this section by quantifying the gains obtained by changing the input distribution. We show in Fig. 4.8 (a)3 the CM-MI c04-math-0196 (8PAM with a uniform input distribution), as well as c04-math-0197 in (4.27), c04-math-0198 in (4.28), and c04-math-0199 in (4.29). These results show that the gains offered by probabilistic and geometric shaping alone are quite small; however, when the mixed optimization in (4.29) is performed (i.e., when the probabilities and the gain c04-math-0200 are jointly optimized), the gap to the AWGN capacity is closed for any c04-math-0201. To observe this effect clearly, we show in Fig. 4.8 (b) the MI gains offered (with respect to c04-math-0202) by the two approaches, as well as the gain offered by using the optimal (Gaussian) distribution. This figure shows that the optimization in (4.29) gives considerably larger gains, which are close to the optimal ones for any c04-math-0203.

c04f008

Figure 4.8 (a) The CM-MI c04-math-0204, the capacities c04-math-0205 in (4.27), c04-math-0206 in (4.28), and c04-math-0207 in (4.29) and (b) their respective MI gains with respect to c04-math-0208. The AWGN capacity (a) and the corresponding gain (b) are also shown

4.3 Bit-Interleaved Coded Modulation

In this section, we are interested in finding the rates that can be reliably used by the BICM transceivers shown in Fig. 2.7. We start by defining achievable rates for BICM transceivers with arbitrary input distributions, and we then move to study the problem of optimizing the system's parameters to increase these rates.

4.3.1 BICM Generalized Mutual Information

For the purpose of the discussion below, we rewrite here (3.23) as

4.30 equation
4.31 equation

where the symbol-decoding metric

is defined via the bit decoding metrics, each given by

The symbol-decoding metric c04-math-0213 is not proportional to c04-math-0214, i.e., the decoder does not implement the optimal ML decoding rule. To find achievable rates in this case, we view the BICM decoder as a so-called mismatched decoder. In this context, for an arbitrary decoding metric c04-math-0215, reliable communication is possible at rates below the generalized mutual information (GMI) between c04-math-0216 and c04-math-0217, which is defined as

where

We immediately note that if c04-math-0220 and c04-math-0221, we obtain

that is, using symbol metrics matched to the conditional PDF of the channel output, we obtain the CM-MI c04-math-0224 in (4.23). This result is generalized in the following theorem.

The following theorem gives an expression for c04-math-0247 in (4.35) when the decoder uses a symbol metric c04-math-0248 given by (4.32) and an arbitrary bit metric c04-math-0249.

Corollary 4.12 shows that when the symbol metrics are constrained to follow (4.32), the best we can do is to use the bit metrics in (4.33). The resulting achievable rates lead to the following definitions of the BICM generalized mutual information (BICM-GMI) for fading and nonfading channels:

where the dependence on the constellation symbols c04-math-0266, their labeling c04-math-0267, and the bits' PMF c04-math-0268 is made explicit in the argument of c04-math-0269. The bitwise MIs needed to calculate the BICM-GMI in (4.54) and (4.55) are given by

4.57 equation

At this point, some observations are in order:

  • The MI in (4.22) is an achievable rate for any CM transmission, provided that the receiver implements the ML decoding rule. The BICM-GMI in (4.54) and (4.55) is an achievable rate for the (suboptimal) BICM decoder.
  • The BICM-GMI was originally derived without the notion of mismatched decoding, but rather based on an equivalent channel model in which the interface between the interleaver and the deinterleaver in Fig. 2.7 is replaced by c04-math-0272 parallel memoryless binary-input continuous-output (BICO) channels. Such a model requires the assumption of a quasi-random interleaver. The value of Theorem 4.11 is thus that it holds independently of the interleaver's presence.
  • No converse theorem exists for this setup, i.e., while the BICM-GMI defines an achievable rate, it has not been proved to be the largest achievable rate for a mismatched (BICM) decoder. This makes a rigorous comparison of BICM transceivers difficult, because we cannot guarantee that differences in the BICM-GMI translate into differences between the maximum achievable rates. However, practical coding schemes most often closely follow, and do not exceed, the BICM-GMI, which justifies its use as a design rule for BICM. We illustrate this in the following example.
c04f009

Figure 4.9 Throughputs c04-math-0286 obtained for the AWGN channel and a turbo encoder (TENC) formed as a parallel concatenation of CENCs (PCCENC) with 11 different code rates c04-math-0287 for (a) 4PAM and (b) 8PAM labeled by the BRGC. The CM-MI, BICM-GMI, and AWGN capacity c04-math-0288 in (4.9) are also shown

While the quantities in (4.54) and (4.55) were originally called the BICM capacity, we avoid the term “capacity” here to emphasize that no optimization over the input distribution is carried out. Using (4.56), the BICM-GMI in (4.54) can be expressed as

where to pass from (4.58) to (4.59) we used the law of total probability applied to expectations. Moreover, by using (2.75) and by expanding the expectation c04-math-0291 over c04-math-0292 and then over c04-math-0293, we can express (4.59) as

where to pass from (4.60) to (4.61), we used (2.75) and the fact that the value of c04-math-0296 does not affect the conditional channel transition probability, i.e., c04-math-0297. The dependence of the BICM-GMI on the labeling c04-math-0298 becomes evident because the sets c04-math-0299 appear in (4.61), and as we showed in Section 2.5.2, these sets define the subconstellations generated by the labeling.

For AWGN channels (c04-math-0300), the BICM-GMI in (4.61) can be expressed as

Furthermore, for uniformly distributed bits, c04-math-0302 for c04-math-0303, and thus, the BICM-GMI in (4.62) becomes

where we use c04-math-0305 to denote uniformly distributed bits, i.e., c04-math-0306. In what follows, we show examples of the BICM-GMI for different constellations c04-math-0307 and labelings c04-math-0308 with uniform input distributions.
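Such examples can also be reproduced numerically. The sketch below estimates the BICM-GMI as a sum of bitwise MIs by Monte Carlo, assuming a real AWGN model y = x + z with noise variance Es/snr and uniform bits; the 4PAM BRGC labeling order and all function names are our illustrative assumptions.

```python
import numpy as np

def bicm_gmi_mc(X, labels, snr, n=200_000, seed=0):
    """Monte Carlo estimate of the BICM-GMI as a sum of bitwise MIs for
    uniform bits. Model assumption: real AWGN y = x + z, z ~ N(0, Es/snr).
    labels[i] holds the m label bits of symbol X[i]."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, float)
    L = np.asarray(labels, int)                   # shape (M, m)
    M, m = L.shape
    sigma2 = np.mean(X ** 2) / snr
    idx = rng.integers(0, M, size=n)
    y = X[idx] + rng.normal(0.0, np.sqrt(sigma2), size=n)
    pyx = np.exp(-(y[:, None] - X[None, :]) ** 2 / (2.0 * sigma2))  # up to a constant
    py = pyx.mean(axis=1)                         # p(y), same constant
    gmi = 0.0
    for k in range(m):
        mask0 = L[:, k] == 0                      # subconstellation with b_k = 0
        p0 = pyx[:, mask0].mean(axis=1)           # p(y | b_k = 0)
        p1 = pyx[:, ~mask0].mean(axis=1)          # p(y | b_k = 1)
        pyb = np.where(L[idx, k] == 0, p0, p1)    # conditioned on the sent bit
        gmi += np.mean(np.log2(pyb / py))         # bitwise MI I(B_k; Y)
    return gmi

X4 = np.array([-3.0, -1.0, 1.0, 3.0])             # 4PAM
brgc = [(0, 0), (0, 1), (1, 1), (1, 0)]           # BRGC labeling (assumed order)
print(bicm_gmi_mc(X4, brgc, snr=10 ** 0.6))
```

Swapping in a different labeling (e.g., the NBC) changes the subconstellations and, hence, the estimated BICM-GMI, which is exactly the labeling dependence discussed above.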

c04f010

Figure 4.10 BICM-GMI and CM-MI for 8PSK and 16PSK (see Fig. 2.9) over the AWGN channel (a) and the corresponding functions c04-math-0314 in (4.16) (b). The AWGN capacity c04-math-0315 in (4.9) c04-math-0316 (a) and the function c04-math-0317 in (4.17) together with the SL in (4.18) (b) are also shown

c04f011

Figure 4.11 BICM-GMI and CM-MI for 8PAM over the AWGN channel (a) and the corresponding functions c04-math-0330 in (4.16) (b). The BICM-GMI is shown for the four binary labelings in Fig. 2.13. The shaded regions show the inequalities (4.7) (a) and (4.16) (b) for the BSGC. The AWGN capacity c04-math-0331 in (4.9) c04-math-0332 (a) and the function c04-math-0333 in (4.17) together with the SL in (4.18) (b) are also shown

c04f012

Figure 4.12 Functions c04-math-0339 in (4.16) for the BICM-GMI for 4PSK and 8PSK over the AWGN channel and for all binary labelings that produce different BICM-GMI curves. The function c04-math-0340 in (4.17) and the SL in (4.18) are also shown
