Chapter 2

Basics of communications

Abstract

This chapter provides a brief introduction to the basics of communications, for the benefit of readers without a background in communications. Information theory and different layers in communication networks, ranging from the physical layer to upper layers, will be introduced. Finally, several typical communication systems will be discussed as examples.

Keywords

Information theory; Entropy; Communication channel; Modulation; Coding; Layered structure; Networking

2.1 Introduction

In this section, we introduce the basics of communications, which will serve as a foundation for the topics covered in the subsequent chapters. A reader acquainted with the area of communications can skip this chapter, or take a quick glance to review the main concepts in communications. For readers interested in more detailed explanations of communications, guidance on further reading will be given at the end of this chapter.

This chapter is organized as follows:

1. Since a communication system essentially conveys information from source to destination, we first introduce the measure of information (i.e., how much information is carried by a message), which falls in the area of information theory.

2. Since the information must be transmitted through a certain channel, we then explain the models of typical communication channels and the corresponding channel capacity (i.e., the maximum amount of information that can be carried by the channel for reliable communications).

3. The first step of digital communications is to convert the original information source to bit sequences. Hence we will explain the basic concepts of source coding.

4. These bits are then channel coded in order to combat noise by using redundancies, and then modulated from coded bits to physical signals (such as sinusoidal signals). The schemes of typical modulation and coding will be detailed.

5. Once the point-to-point communications are understood, we turn to communication networks and explain the functionalities of different layers.

6. Finally, we provide an introduction to several typical communication systems, either wired or wireless.

2.2 Information measures

In this section, we introduce how to measure information.

2.2.1 Shannon Entropy

Information and communication arise from the uncertainty of the information source (which can be represented by a random variable or random process). When the random variable is realized and takes a deterministic value, the more uncertain the random variable was, the more information we obtain from observing it.

For a random variable X with finite alphabet \mathcal{X} and probability distribution P, its Shannon entropy is defined as

H(X) = -\sum_{x \in \mathcal{X}} P(x) \log P(x),  (2.1)

where the unit is bit (or nat) when the logarithm base is 2 (or e). In the special case of the uniform distribution, the entropy of X is given by \log |\mathcal{X}|, where |\mathcal{X}| is the cardinality of \mathcal{X}. For example, if X has eight equally likely values, then we obtain 3 bits when X is determined.
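To make this concrete, the entropy of Eq. (2.1) can be computed in a few lines of Python (a minimal sketch; the function name is our own):

```python
import math

def entropy(probs, base=2):
    """Shannon entropy of a discrete distribution: bits for base 2, nats for base e."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Uniform distribution over eight values: log2(8) = 3 bits.
print(entropy([1/8] * 8))          # -> 3.0
# A non-uniform distribution carries less information per symbol.
print(entropy([0.5, 0.25, 0.25]))  # -> 1.5
```

Terms with zero probability are skipped, following the convention 0 log 0 = 0.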

Similarly, for two random variables X and Y, we can also define the joint entropy as

H(X,Y) = -\sum_{x \in \mathcal{X},\, y \in \mathcal{Y}} P(x,y) \log P(x,y)  (2.2)

and the conditional entropy as

H(X|Y) = -\sum_{x \in \mathcal{X},\, y \in \mathcal{Y}} P(x,y) \log P(x|y).  (2.3)

It is easy to verify the following chain rule:

H(X,Y) = H(X|Y) + H(Y).  (2.4)

2.2.2 Mutual Information

For two random variables X and Y, the mutual information measures the amount of randomness they share. It is defined as

I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X).  (2.5)

When I(X;Y) = H(X), X and Y completely determine each other; on the other hand, if I(X;Y) = 0, X and Y are mutually independent. As we will see, the mutual information is of key importance in deriving the capacity of communication channels.
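Combining Eq. (2.5) with the chain rule (2.4) gives the equivalent form I(X;Y) = H(X) + H(Y) - H(X,Y), which is easy to check numerically. A small sketch (function names are our own) for a joint distribution given as a 2-D table:

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint pmf given as a 2-D list."""
    px = [sum(row) for row in joint]          # marginal of X (row sums)
    py = [sum(col) for col in zip(*joint)]    # marginal of Y (column sums)
    pxy = [p for row in joint for p in row]   # flattened joint pmf
    return entropy(px) + entropy(py) - entropy(pxy)

# Independent X and Y: I(X;Y) = 0.
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))  # -> 0.0
# X fully determines Y: I(X;Y) = H(X) = 1 bit.
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))      # -> 1.0
```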

2.3 Communication channels

A communication channel is the medium through which information is sent from the source to the destination, which is illustrated in Fig. 2.1. It is an abstraction of practical communication systems and can be modeled using mathematics. Once the mathematical model is set up for the communication channel, we can estimate the capacity of the channel.

Fig. 2.1 Model of communication systems.

2.3.1 Typical Channels

Below we introduce some typical communication channels and their corresponding characteristics:

 Wireless channels: The information is carried by electromagnetic waves in the open space. There are many typical wireless communication systems such as 4G cellular systems and WiFi (IEEE 802.11) systems, which enable mobile communications. However, wireless channels are often unreliable and have relatively small channel capacities.

 Optical fibers: The information can also be sent through optical fibers in wired communication networks. This communication channel is highly reliable and can support very high communication speeds. It is often used as the backbone network in a large area communication network such as the Internet.

 Underwater channels: Underwater acoustic waves can convey information with very low reliability and very low communication rates, which can be used in underwater communications such as submarine or underwater sensor communications.

2.3.2 Mathematical Model of Channels

A communication channel can be modeled using conditional probabilities. Given an input x, the channel output y is random due to random noise, and is characterized by the corresponding conditional probability P(y|x). Several typical models of communication channels are given below, as illustrated in Fig. 2.2:

f02-02-9780128019504
Fig. 2.2 Examples of channel models.

 Binary symmetric channel: Both the input and output are binary with alphabet {0, 1}. The input bit is flipped at the output with crossover probability p. Such a model can be used to describe the errors in demodulation and decoding, where p is the error probability of each information bit.

 Erasure channel: The input is binary. A new element ϵ, which denotes the loss of information, is added to the output, besides bits 0 and 1. Such a model can depict communication systems with possible packet losses, such as the Internet.

 Gaussian channel: Both the input and output of the channel are real numbers. The output is a superposition of the input and Gaussian noise, i.e.,

y(t) = x(t) + n(t),  (2.6)

where t is the time index. This can be used to model a physical layer signal with Gaussian noise. Denote by P_s and \sigma_n^2 the powers of the signal and the noise, respectively. The signal-to-noise ratio (SNR) is defined as

\mathrm{SNR} = \frac{P_s}{\sigma_n^2}.  (2.7)

As will be seen, the SNR is of key importance in determining the performance of Gaussian channels.

 Gaussian band-limited channel: This channel is continuous in time. The additive noise has power spectral density N0 and the signal is allowed to transmit within a frequency band with bandwidth W.

 Multiple access channel: When there are multiple (say K) transmitters and a single receiver, the transmitted signals will have an effect on each other. For the discrete case, the channel is characterized by the conditional probability given the input symbols X1, …, XK, namely

P(Y | X_1, \ldots, X_K),\quad Y \in \mathcal{Y},\ X_1, \ldots, X_K \in \mathcal{X}.  (2.8)

An important special case is the Gaussian multiple access channel, in which the received signal is given by

y = \sum_{k=1}^{K} x_k + n,  (2.9)

where n is Gaussian noise.

2.3.3 Channel Capacity

In his groundbreaking paper [32], Shannon proposed the concept of channel capacity C (bits/channel use) for communication channels; i.e., if the transmission rate R (also bits/channel use) is less than C, then there exists an encoding and decoding scheme to make the transmission error arbitrarily small; otherwise, the data cannot be reliably transmitted.

For the simplest case of discrete memoryless channels, the channel capacity is given by

C = \max_{P_X} I(X;Y),  (2.10)

where P_X is the distribution of the input.

The capacities of the channels introduced in the previous subsection are given below:

 The binary symmetric channel has a channel capacity of 1 − H(p), where H(p) = -p\log p - (1-p)\log(1-p) is the Shannon entropy of a binary distribution with probabilities p and 1 − p.

 The erasure channel has a channel capacity p, where p is the probability that the transmitted bit is not erased.

 The Gaussian channel has the following channel capacity:

C = \frac{1}{2}\log\left(1 + \frac{P_s}{\sigma_n^2}\right),  (2.11)

where P_s is the maximum average power, if the constraint is on the average power (instead of the peak power).

 The band-limited Gaussian channel has the following channel capacity:

C = W\log\left(1 + \frac{P_s}{W N_0}\right),  (2.12)

if the constraint is on the average power. The factor W arises because the band-limited channel supports 2W independent channel uses per second, each with the capacity given in Eq. (2.11).

 Multiple access channel: When there are K transmitters, the communication capability is described by the capacity region Ω, within which each point (R1, …, RK) is achievable. For the Gaussian multiple access channel, we have

\Omega = \left\{ (R_1,\ldots,R_K) \;\middle|\; \sum_{i\in\mathcal{I}} R_i \le \frac{1}{2}\log\left(1 + \frac{\sum_{k\in\mathcal{I}} P_k}{\sigma_n^2}\right),\ \forall\, \mathcal{I} \subseteq \{1,\ldots,K\} \right\}.  (2.13)

Although the concept of channel capacity has been widely used in the practical design of communication systems, the following points need to be noted in the context of communications in a CPS.

 The channel capacity defined in traditional information theory is based on asymptotic analysis and an infinite codeword length. However, in the context of CPS, due to the real-time requirement for controlling the dynamics, an infinite codeword length, which also implies an infinite communication delay, cannot be tolerated.

 The achievability of channel capacity is usually based on the argument of random coding, which is of theoretical importance but is not practical. Although low-density parity-check (LDPC) codes have almost attained the channel capacity for simple binary channels, practical capacity-achieving coding schemes have not been found for many typical channel models.

2.4 Source coding

Source coding means converting the original information, which could be either discrete or continuous, into a series of information bits. It can be categorized into lossless source coding, in which no information is lost during the source coding, and lossy source coding, in which a portion of the information is lost. For simplicity, we assume that the information symbols are discrete-time and i.i.d.

2.4.1 Lossless Source Coding

The information source of lossless source coding, which is represented by a random variable X, must be finite. Denote by \mathcal{X} = \{x_1, \ldots, x_n\} the alphabet of the information source symbols. It is assumed that the distribution is \{P(X = x_j)\}_j. One simple approach is to use \lceil \log_2 n \rceil bits to represent these n symbols. However, we can do better and use a variable-length code, in which the information symbols may be represented by different numbers of bits. To achieve a lower average code length, we should assign fewer bits to more frequent symbols. It has been shown that the minimum average number of bits is given by H(P) (i.e., the entropy of X) and can be approached by properly assigning the bits to the symbols with known probabilities.

The above discussion assumes known distributions of the information symbols. In practice, it is possible that the distribution is unknown and we only have a sequence of bits. In the situation of unknown statistics, a technique of universal source coding, such as the Lempel-Ziv coding scheme, can be applied. A surprising conclusion is that, even though the statistical information is lacking, the universal coding can still achieve the optimal compression rate asymptotically (as the length of sequence tends to infinity), given that the information source is stationary and ergodic.
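The idea of assigning fewer bits to more frequent symbols is realized, for instance, by Huffman coding. The following sketch (our own implementation, not from this chapter) builds a prefix code for a dyadic distribution, for which the average code length exactly equals the entropy H(P) = 1.75 bits:

```python
import heapq

def huffman_code(freqs):
    """Build a prefix code from {symbol: probability}; frequent symbols get shorter codewords."""
    # Heap entries: (weight, tie-breaker, partial codebook for this subtree).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)   # two least probable subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

code = huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
print(code)  # codeword lengths 1, 2, 3, 3: average = 1.75 bits = H(P)
```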

2.4.2 Lossy Source Coding

When the information source is continuous in value, it is impossible to encode the source information symbols using finitely many bits. Hence some information has to be lost when digital communications are used. We say that the source coding is lossy in this situation. Note that we can also apply lossy source coding for discrete information sources, if some information loss is tolerable.

Theoretically, lossy source coding can be studied using rate-distortion theory [33]. First a distortion function d(X, \hat{X}) is defined, where X is the information symbol and \hat{X} is the recovered one. For example, when X is real valued, the distortion can be defined as the mean square error, i.e., d(X, \hat{X}) = E[|X - \hat{X}|^2]. Then a rate-distortion function R(D) is defined for each given average distortion D, which means the minimum source coding rate ensuring that the average distortion is no larger than D. It is well known that R(D) can be obtained from an optimization of the mutual information between X and \hat{X}.

In practice, lossy source coding is carried out by a quantizer. Essentially a quantizer divides the scope of the information source symbol X into many regions, each of which corresponds to a bit string. Fig. 2.3 shows a simple scalar quantizer, in which the range of X is the interval [0, 1] and it is divided into four intervals. When X falls in one of the intervals, the quantizer outputs the corresponding two bits. When X is a vector, a vector quantizer is needed, which is much more complicated. A powerful design algorithm for vector quantizers is Lloyd’s algorithm [34].

Fig. 2.3 Example of a scalar quantizer.
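A scalar quantizer of this kind can be sketched as follows (a minimal uniform quantizer over [0, 1]; the function name is our own, and reconstructing at interval midpoints is one common design choice):

```python
def uniform_quantizer(x, levels=4):
    """Quantize x in [0, 1] into log2(levels) bits; return (bit string, reconstruction)."""
    idx = min(int(x * levels), levels - 1)              # which interval x falls in
    bits = format(idx, "0{}b".format(levels.bit_length() - 1))
    midpoint = (idx + 0.5) / levels                     # reconstruction value
    return bits, midpoint

print(uniform_quantizer(0.30))  # -> ('01', 0.375)
print(uniform_quantizer(0.95))  # -> ('11', 0.875)
```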

2.5 Modulation and coding

When receiving a series of information bits, the transmitter converts them to physical symbols ready for transmission. Usually two steps are needed for this conversion. First, the information bits are encoded by adding redundancy in order to combat the unreliability of the communication channel and the errors it may introduce. Then the coded bits are converted to physical symbols that are ready to be passed to the hardware for transmission.

2.5.1 Channel Coding

There are two types of channel coding used to protect the transmission reliability from the channel uncertainty. In error detection coding, mistakes during the transmission may be detected. In error correction coding, the errors may be corrected without retransmitting the message.

Error detection coding

In error detection coding schemes, typical errors during the transmission can be detected. Then the received codeword may be discarded and the failure of the transmission is announced; or the receiver will send back a request to the transmitter for retransmission. The simplest scheme of error detection is the parity check bit, namely the modulo-2 sum of the information bits is appended to the codeword. For example, when the original information bits are 1100111, the transmitted bit sequence is 11001111, where the last bit 1 is the parity check bit. If an error occurs in one of the bits and the receiver receives 11001110, then the error can be detected since the parity check bit does not equal the modulo-2 sum of the information bits. Although it is also possible that the parity check bit, or two information bits, are wrong, the corresponding probability is much smaller.
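The parity check example above can be sketched in a few lines (representing codewords as lists of bits is our own choice):

```python
def add_parity(bits):
    """Append an even-parity bit: the modulo-2 sum of the information bits."""
    return bits + [sum(bits) % 2]

def parity_ok(codeword):
    """Any single bit error makes the overall modulo-2 sum odd."""
    return sum(codeword) % 2 == 0

tx = add_parity([1, 1, 0, 0, 1, 1, 1])   # -> [1, 1, 0, 0, 1, 1, 1, 1]
rx = tx.copy(); rx[6] ^= 1               # flip one bit in transit
print(parity_ok(tx), parity_ok(rx))      # -> True False
```

Note that two bit errors cancel in the modulo-2 sum, so a single parity bit cannot detect them.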

Error correction coding

In the error correction coding scheme, typical errors in transmission can be corrected at the receiver without retransmission. The capability of error correction is assured by adding redundancies to the information bits. Take the simplest repetition code, for example. When we transmit bit 1, we can transmit it three times, thus sending out bit sequence 111. The decoder uses the majority rule; i.e., the decision is the most frequent bit in the received sequence. Hence if there is one transmission error and the decoder receives bits 011, the decoder can claim that the information bit is actually 1 and thus correct the error. The transmission rate, usually denoted by R, is defined as the ratio between the number of information bits and the number of coded bits. The above example of repetition code has a transmission rate 1/3. The smaller R is, the more redundancies are added and the more reliable the transmission is; however, a smaller R also uses more communication bandwidth since the transmitter needs to send out more coded bits in the same time. A tradeoff is needed between the reliability and resource consumption.

A typical error correction coding scheme used in practice is the linear block code, in which the coded bits c are given by

c = bG,  (2.14)

where b represents the information bits (as a row vector) and G is called the generator matrix. The receiver receives a senseword s = c + e, where e is the error vector. The decoder applies a matrix H, which satisfies GH = 0, to calculate d = sH = eH, which is called the syndrome. The error vector e can be found by looking up a table indexed by the syndrome.
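Syndrome decoding can be illustrated numerically with a (7,4) Hamming code as a hypothetical example (the matrices below are standard textbook choices, not taken from this chapter; arithmetic is over GF(2)):

```python
def matmul2(a, b):
    """Matrix product over GF(2); matrices are lists of 0/1 rows."""
    return [[sum(x * y for x, y in zip(row, col)) % 2 for col in zip(*b)] for row in a]

P = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]                       # parity part
G = [[1 if i == j else 0 for j in range(4)] + P[i] for i in range(4)]  # G = [I | P]
H = P + [[1 if i == j else 0 for j in range(3)] for i in range(3)]     # GH = 0 over GF(2)

b = [[1, 0, 1, 1]]
c = matmul2(b, G)[0]     # codeword c = bG
s = c.copy(); s[2] ^= 1  # senseword with a single bit error at position 2
d = matmul2([s], H)[0]   # syndrome d = sH = eH
# The syndrome equals row 2 of H, revealing the error position.
print(d, H[2])           # both are [0, 1, 1]
```

Since all rows of H are distinct and nonzero, every single-bit error produces a unique syndrome, so the lookup table mentioned above has one entry per row of H.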

A more powerful error correction code is the convolutional code, which does not have a fixed codeword length and is thus different from the linear block code. In the convolutional code, memory is introduced by streaming the input information bits through registers. Take the structure in Fig. 2.4, for example. There are two registers; thus the memory length is 2. For each input information bit b(t), the two coded bits, denoted by c1(t) and c2(t), are generated by b(t) and the previous two information bits b(t − 1) and b(t − 2), which are given by

c_1(t) = b(t) + b(t-1) + b(t-2),\qquad c_2(t) = b(t) + b(t-2) \pmod{2}.  (2.15)

Then the two registers are occupied by b(t − 1) and b(t). Since one input information bit triggers the generation of two coded bits, the transmission rate is 1/2. By changing the number of simultaneous output coded bits and the number of branches of the input bits, we can achieve any rational transmission rate. As we will see in later chapters, the structure of the convolutional code is actually similar to that of linear dynamical systems, thus facilitating the proposed turbo channel decoding and system state estimation in CPSs.

Fig. 2.4 Illustration of convolutional code.
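The encoder of Eq. (2.15) can be sketched directly (the function name is our own; the two registers start at zero):

```python
def conv_encode(bits):
    """Rate-1/2 convolutional encoder of Eq. (2.15), all additions modulo 2."""
    b1 = b2 = 0                         # the two shift registers: b(t-1), b(t-2)
    out = []
    for b in bits:
        out.append((b + b1 + b2) % 2)   # c1(t) = b(t) + b(t-1) + b(t-2)
        out.append((b + b2) % 2)        # c2(t) = b(t) + b(t-2)
        b1, b2 = b, b1                  # shift the registers
    return out

print(conv_encode([1, 0, 1, 1]))  # -> [1, 1, 1, 0, 0, 0, 0, 1]
```

Four input bits produce eight coded bits, matching the transmission rate of 1/2.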

A breakthrough in the area of error correction codes is the turbo code, which was proposed by Berrou et al. in 1993 [35] and achieves transmission rates very close to the Shannon limit of channel capacity (Fig. 2.5). For simplicity, we use the original form of turbo codes given in Ref. [35]. The encoder essentially uses a parallel convolutional code. Two convolutional encoders are used. The raw information bits, and a randomly permuted version of the bits (produced by an interleaver), are fed into the two convolutional encoders. Then the output of the turbo encoder consists of the original information bits and the output bits of the two convolutional encoders.

Fig. 2.5 Illustration of turbo code.

The key novelty in the turbo code is in its decoder, which is illustrated in Fig. 2.6. The received encoded bits at the decoder consist of three parts: m (the original information bits with possible errors), X1 (the output of convolutional encoder 1), and X2 (the output of convolutional encoder 2). m and X1 are fed into a standard decoder, which generates the soft decoding output (namely the probability distribution of each bit being 0 or 1), instead of hard decisions. The soft output is fed into decoder 2 as prior information. Decoder 2 combines m, X2 and the prior information from decoder 1, and also generates soft outputs. Then the outputs of decoder 2 are fed into decoder 1 as prior information. Such a procedure is repeated iteratively (thus named “turbo”) until convergence is achieved. The performance of decoding is improved through this iterative procedure. Such a “turbo” principle of information processing has been applied in many other tasks such as multiuser detection and equalization. As we will see, it can also be used for system state estimation in a CPS.

Fig. 2.6 Illustration of the turbo decoder.

2.5.2 Modulation

When the information bits are encoded into codewords (note that in some situations the channel coding procedure is not necessary), the procedure of modulation maps the abstract information to physical signals. In many typical communication systems, the information is conveyed by sinusoidal carriers, namely

s(t) = A\cos(2\pi f t + \phi) = \mathrm{Re}\left[A e^{j(2\pi f t + \phi)}\right],  (2.16)

where A, f, and ϕ are the amplitude, frequency, and phase of the sinusoidal signal, respectively. Hence the information can be carried in the amplitude, frequency or phase, thus resulting in the following types of modulation schemes:

 Pulse amplitude modulation (PAM): There are M possible signal amplitudes, which represent \log_2 M bits. The simplest case is M = 2, namely the transmitter either transmits or not (A = 0), which is called on-off keying (OOK).

 Frequency shift keying (FSK): The transmitter can choose one of M frequencies for the sinusoidal carrier signal. Hence each switch of the frequency represents \log_2 M bits.

 Phase shift keying (PSK): The transmitter switches among M possible phases for transmitting \log_2 M bits.

The above three typical modulation schemes can be combined; e.g., quadrature amplitude modulation (QAM) is a combination of PAM and PSK. To explain the mechanism of QAM, we consider the following signal:

s(t) = A\cos\phi\,\cos(2\pi f t) - A\sin\phi\,\sin(2\pi f t) = A_1\cos(2\pi f t) + A_2\sin(2\pi f t),  (2.17)

which is decomposed into the in-phase and quadrature terms (A_1 and A_2 are the corresponding amplitudes). We can consider the carriers \cos(2\pi f t) and \sin(2\pi f t) as two orthogonal signals, each of which carries one branch of independent information. Hence (A_1, A_2) can be considered as a point in the signal plane (called a constellation), which is used to convey information. When there are M possible points in the constellation, we call the modulation scheme M-QAM. Fig. 2.7 shows illustrations of 16QAM and 64QAM.

Fig. 2.7 Illustration of QAM.

When selecting a suitable modulation scheme, there is a tradeoff between the transmission rate and transmission reliability. Usually, the symbol rate (i.e., the number of modulation signals transmitted in one second) is fixed. Hence if a denser constellation is used, more coded bits can be transmitted in the same time, thus improving the data transmission rate. However, the points in a dense constellation are close to each other, and thus can be confused by noise, resulting in transmission errors. As we will see in subsequent chapters, this tradeoff is important for the modulation selection in CPSs.
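The effect of noise on constellation points can be sketched with nearest-point demodulation for a small constellation (a 4-QAM example; the bit mapping is hypothetical):

```python
# 4-QAM: two amplitude levels per quadrature branch carry 2 bits per symbol.
POINTS = {(-1, -1): "00", (-1, 1): "01", (1, -1): "10", (1, 1): "11"}

def demodulate(a1, a2):
    """Map a received (A1, A2) pair to the bits of the nearest constellation point."""
    nearest = min(POINTS, key=lambda p: (p[0] - a1) ** 2 + (p[1] - a2) ** 2)
    return POINTS[nearest]

print(demodulate(0.9, -1.2))   # mildly noisy version of (1, -1) -> '10'
print(demodulate(-0.2, 0.1))   # heavy noise: the nearest point may be the wrong one
```

In a denser constellation such as 64QAM, the decision regions shrink, so the same noise power causes errors more often; this is the reliability side of the tradeoff.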

2.6 Networking

We now discuss communication networks with an emphasis on the layered structure.

2.6.1 Graph Representation

A communication network can be represented by a graph, in which each vertex represents a communication node and each edge represents a communication link. Hence graph theory (e.g., traditional graph theory such as connectivity and coloring, algebraic graph theory and random graph theory) is of great importance in the research on communication networks.

2.6.2 Layered Structure

To facilitate the design of communication networks, the functionalities of networks can be organized into a stack of layers. Each layer takes charge of different tasks and communicates with adjacent layers. Certain protocols are also designed for the interfaces between two adjacent layers. The most popular definition of network layers is the Open Systems Interconnection (OSI) reference model developed by the International Organization for Standardization (ISO). Another typical model is the TCP/IP reference model. Both are illustrated in Fig. 2.8. The details of the models are explained as follows.

Fig. 2.8 Illustration of layers in OSI and TCP/IP.

We first introduce the OSI model, in which the network is divided into seven layers. Since the layers for session and presentation are not used in most network designs, we focus on the remaining five layers, which have been widely used in the design and analysis of communication networks.

 Physical layer: The physical layer is concerned with how to transmit information bits from a transmitter to a receiver. It mainly involves modulation/demodulation, coding/decoding, and signal processing for transmission and reception. Most of our previous discussions relate to the physical layer.

 Data link layer: This layer takes charge of tasks such as frame acknowledgment (ACK), flow regulation, and channel sharing. The latter, called medium access control (MAC), is the most important for wireless networks due to the broadcast nature of wireless transmissions, and is therefore usually considered as an independent layer. Essentially, the MAC layer addresses resource allocation, e.g., how to allocate different communication channels to different users, and scheduling, e.g., when there is competition among users, which user should obtain the priority to transmit.

 Network layer: This layer determines how to find a route in the network from the source to the destination. For example, we need to design an addressing mechanism for routing. Moreover, when the addresses of the source and destination are known, we need to design algorithms for the network to find a path with the minimum cost (e.g., the number of hops) to the destination. When a path is broken by an emergency, the network layer needs to find a new path for the data flow.

 Transport layer: This layer receives data from the application layer, splits it into smaller units if needed, and then passes it to the network layer. The main task of the transport layer is congestion control, i.e., how to control the source rate according to the congestion situation in the network.

 Application layer: This provides various protocols for different applications. For example, the HyperText Transfer Protocol (HTTP) is used for websites.

Note that the physical, data link, and network layers are concerned with the intermediate nodes in the network, while the transport and application layers involve only the two ends of data flow, namely the source and destination.

In the TCP/IP reference model, the physical and data link layers are not well specified. They are considered as the host-to-network layer. The Internet and TCP layers in the TCP/IP reference model roughly correspond to the network and transport layers in the OSI model. More details can be found in Ref. [36].

2.6.3 Cross-Layer Design

In traditional design of communication networks, the design and operation of different layers are carried out independently. Different layers are coupled only through the inter-layer interfaces. However, it was found that the isolated design of different layers may decrease the efficiency of a network. Hence a cross-layer design was proposed for communication networks. An excellent tutorial on cross-layer design can be found in Ref. [37]. Moreover, the theory of network utility maximization and optimization-based decomposition provides a unified framework for cross-layer designs in communication networks. Due to space limitations, we do not discuss this theory here. Readers can find a comprehensive introduction to this topic in Ref. [38].

A motivating example for cross-layer design is the opportunistic scheduling in cellular systems [39]. Consider a base station serving multiple users. Different users may have different channel gains. The base station needs to schedule the transmission of multiple users. Recall that scheduling is an MAC layer issue while channel gain is a physical layer quantity. In traditional layered design, the scheduling algorithm does not take the channel gains into account. However, it has been shown that, to maximize the sum capacity, it is optimal to schedule only the user having the largest channel gain. Hence we see that, if the sum capacity is the performance metric, it is more desirable to incorporate physical layer issues into the scheduling algorithm in the MAC layer, thus resulting in a cross-layer design.

In the subsequent discussions, we will focus more on the MAC layer, network layer, and transport layer, which are the main focuses of studies on networking. Since modulation and coding have been introduced, we skip the discussion of the physical layer.

2.6.4 MAC Layer

A major task of the MAC layer is to schedule the transmissions of different users. We first introduce multiple access schemes in which multiple transmitters access the same receiver, which is typical in cellular systems. Then we briefly introduce scheduling in generic networks.

Multiple access schemes

Consider a multiuser system in which multiple transmitters access the same receiver. The following multiple access schemes are used in typical wireless multiuser communication systems, illustrated in Fig. 2.9:

Fig. 2.9 Illustration of multiple access schemes, in which bars with the same colors represent the data packets of one user.

 Frequency division multiple access (FDMA): Different transmitters transmit over different frequency channels, thus avoiding possible collisions. In first-generation cellular systems, FDMA is used to separate different users. In 4G cellular systems, the separation in the frequency domain is achieved by discrete Fourier transform and is called orthogonal frequency division multiple access (OFDMA).

 Time division multiple access (TDMA): The transmitters use different time slots for transmission; this is used in the second generation of cellular systems.

 Code division multiple access (CDMA): The transmitters transmit at the same time and use the same frequency band, thus resulting in collisions on the air. However, each transmitter has a spreading code, say sk for user k, which can be considered as an N-dimensional vector. Hence the received signal is given by (noise is omitted here)

r = \sum_{i=1}^{K} x_i s_i,  (2.18)

where x_k is the information symbol of user k. When the spreading codes of the users are orthogonal to each other, i.e., s_i^T s_j = 0 when i \ne j, the receiver can detect x_k by correlating the received signal with the spreading code of user k:

s_k^T r = s_k^T \sum_{i=1}^{K} x_i s_i = x_k s_k^T s_k + s_k^T \sum_{i=1,\, i\ne k}^{K} x_i s_i = x_k \|s_k\|^2.  (2.19)

Note that CDMA is the fundamental signaling technique in 3G cellular systems.

 Random access: In contrast to the above multiple access schemes, the random access scheme allows collisions of transmissions. Usually the random access scheme is suitable for random data traffic with light loads, where the communication channel is idle for a large portion of the time, thus making the collision probability small. The earliest scheme of random access is the Aloha system. In an Aloha system, each transmitter begins to transmit whenever it has data to transmit. If there is a collision, each active transmitter randomly chooses a backoff time and retransmits the data packet after the backoff time. In this manner, two colliding transmitters will retransmit at different times with a large probability due to the random backoff time. Another typical random access scheme is carrier sensing multiple access (CSMA), in which the transmitter senses the channel before transmission and transmits only when it finds that the channel is idle.
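The despreading step of Eq. (2.19) can be illustrated with length-4 orthogonal (Walsh) codes; the user indices, codes, and symbols below are hypothetical:

```python
# Orthogonal spreading codes of length N = 4, one per user.
codes = {1: [1, 1, 1, 1], 2: [1, -1, 1, -1], 3: [1, 1, -1, -1]}
symbols = {1: 1.0, 2: -1.0, 3: 1.0}   # information symbol x_k of each user

# Superposed received signal r = sum_k x_k * s_k, as in Eq. (2.18), noise omitted.
r = [sum(symbols[k] * codes[k][n] for k in codes) for n in range(4)]

# Correlating with user k's code yields x_k * ||s_k||^2, as in Eq. (2.19).
for k in codes:
    corr = sum(a * b for a, b in zip(codes[k], r))
    print(k, corr / sum(c * c for c in codes[k]))   # recovers x_k exactly
```

With orthogonal codes, the cross terms vanish and each user's symbol is recovered despite the on-air collision; in practice, noise and imperfect orthogonality leave residual interference.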

Scheduling

Scheduling in the MAC layer allocates communication resources to different transmitters. For example, in 4G OFDMA systems, different subcarriers in the frequency spectrum are allocated to different users. In TDMA-based communication networks, the scheduler assigns different time slots to different users, thus avoiding collisions. Here we provide a brief introduction to TDMA-based networks with generic topology.

We consider a wireless communication network with N users. We write n \sim m if users n and m are neighbors in the network. We assume that there are a total of F data flows in the network. We denote by S_f and D_f the source node and destination node of flow f, respectively. We assume that the number of packets arriving at the source node of data flow f follows a Poisson distribution with expectation a_f. The routing paths of the F data flows can be represented by an F × N matrix R in which R_{fn} = 1 if data flow f passes through user n and R_{fn} = 0 otherwise. We denote by \mathcal{I}_n the set of flows passing through user n. When a packet cannot be transmitted immediately, it is placed in a queue, and different data flows at the same node have different queues. The queue length (i.e., the number of packets in the queue) is denoted by Q_n^f for node n and flow f. Since wireless communications are used for each link, there could be collisions of data packets. Usually the interference relationships among the network nodes are represented by a graph, in which each node represents a communication link in the network and each edge means that the two corresponding communication links interfere with each other if they transmit simultaneously.

The goal of the scheduler is to schedule the transmissions of the nodes according to the status of the queues and to stabilize the queuing dynamics. The groundbreaking paper by Tassiulas and Ephremides [40] identifies the region of stability, i.e., the set of arrival rates a = {a1, …, aF} that can be stabilized, and the corresponding scheduling algorithm. It is shown that the closure of the stability region is given by

$\bar{\mathcal{C}} = \left\{\, a \mid \exists\ \text{feasible}\ f\ \text{and}\ c\ \text{s.t.}\ M^{-1} f \preceq c \,\right\},$  (2.20)

where the vector f consists of the serving rate of each link, c is the vector consisting of the capacities of the links (as functions of the scheduling policy), and M is a diagonal matrix whose diagonal elements are the probabilities of service completion of each link.

The following scheduling policy is shown to stabilize every arrival rate vector in the stability region. It proceeds in the following steps (consider time slot t):

 Step 1. A weight $D_i(t)$ is assigned to link i in the following manner. For each flow j and link i, consider

$D_{ij}(t) = \begin{cases} \left(Q_{h(i)}^j(t-1) - Q_{q(i)}^j(t-1)\right) m_i, & \text{if } q(i) \text{ is not the destination of flow } j,\\ Q_{h(i)}^j(t-1)\, m_i, & \text{if } q(i) \text{ is the destination of flow } j, \end{cases}$  (2.21)

where q(i) is the receiver node of link i, h(i) is the transmitter node of link i, and $m_i$ is the success probability of link i. Then we set

$D_i(t) = \max_j D_{ij}(t).$  (2.22)

 Step 2. Find the maximum weight feasible activation vector $\hat{c}$:

$\hat{c} = \arg\max_{\text{feasible } c} D^T(t)\, c,$  (2.23)

where the elements of $\hat{c}$ indicate the active status of each link (1 if active and 0 otherwise) and $\hat{c}$ must satisfy the constraint of no collisions in the interference graph.

 Step 3. If link i is activated, the data flow j that maximizes $D_{ij}(t)$ is allowed to transmit on it.
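The three steps can be sketched in a few lines of Python. This is a toy illustration of max-weight scheduling, not the original algorithm of [40] in full generality: the queue values, success probabilities, and interference graph below are invented, and the search over feasible activations in Eq. (2.23) is done by brute force, which is only viable for tiny networks:

```python
from itertools import combinations

def max_weight_schedule(links, weights, conflicts):
    """Brute-force the max-weight activation: choose the subset of links
    with the largest total weight such that no two chosen links conflict.
    `conflicts` is a set of frozensets of conflicting link pairs."""
    best, best_w = (), 0.0
    for r in range(1, len(links) + 1):
        for subset in combinations(links, r):
            if any(frozenset(p) in conflicts
                   for p in combinations(subset, 2)):
                continue                      # violates the interference graph
            w = sum(weights[l] for l in subset)
            if w > best_w:
                best, best_w = subset, w
    return set(best)

# Illustrative 3-link example: links 0 and 1 interfere with each other.
links = [0, 1, 2]
queues = {0: (5, 1), 1: (3, 0), 2: (4, 4)}   # (transmitter, receiver) queues
m = {0: 0.9, 1: 0.8, 2: 0.7}                 # success probability per link
# Backpressure weight: queue differential times success probability
D = {i: (queues[i][0] - queues[i][1]) * m[i] for i in links}
conflicts = {frozenset({0, 1})}
print(max_weight_schedule(links, D, conflicts))   # activates link 0
```

Note how link 2 gets zero weight despite its long queues: its receiver is just as backlogged as its transmitter, so activating it would not relieve congestion.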

From the above procedure, we can see that the scheduling algorithm tends to activate links with a large difference between the two corresponding queue lengths. Hence a node holding long queues is more likely to be scheduled. However, although the above scheduling algorithm can stabilize the queuing dynamics, it faces the following severe challenges:

 Each communication link needs to send its weight to the scheduler; hence the two corresponding nodes need to exchange their queue information. This brings significant overhead to the network.

 The optimization in Eq. (2.23) is an integer programming problem and is usually NP-hard. For large-scale networks, it is infeasible to obtain the optimal solution.

Due to the above challenges in practice, there have been many studies on reducing the communication overheads (e.g., locally broadcasting the weights) or the computational cost (e.g., using the greedy algorithm). There have been no systematic studies on data traffic scheduling in the context of CPSs. Although it is reasonable to apply the existing scheduling algorithms to the communications for control, it may not be efficient since the goal of the existing algorithms is to guarantee the conveyance of all data packets, instead of controlling the physical dynamics.

2.6.5 Network Layer

As has been explained, the main task of the network layer is to find the route for each data flow. Usually it is desirable to find the shortest path for each data flow, where the term “shortest” usually means the minimum number of hops, which may be weighted, between the source node and the destination node.

A typical routing algorithm for finding the shortest path is distance vector routing, which is also called the Bellman-Ford algorithm or the Ford-Fulkerson algorithm. In this algorithm, each node maintains a routing table with one entry per possible destination node; the entry records the next-hop node and the total distance of the shortest path to that destination. The nodes exchange their routing tables with their neighbors and update them until the shortest paths are found. The updating procedure is given in Procedure 1.

Procedure 1

Procedure for Distance Vector Routing in an N-Node Network

1: for each node n, n = 1, …, N do

2: Initialization: set $D_{nn} = 0$ and $D_{nm} = \infty$ for m = 1, …, N, m ≠ n. Here $D_{nm}$ denotes the minimum distance between nodes n and m.

3: end for

4: while the routing tables have not converged do

5: for each node n, n = 1, …, N do

6: Exchange the routing table with neighbors.

7: Update the routing table: $D_{nm} = \min_{k \sim n} (w_{nk} + D_{km})$, where k ∼ n means that node k is adjacent to node n, and $w_{nk}$ is the weight of the hop between nodes n and k. Keep a record of the optimal next-hop node.

8: end for

9: end while

The key step in Procedure 1 is the updating rule of the routing table: each node compares the minimum distance in its current routing table with the minimum distances reported by its neighbors (plus the distances to those neighbors) and keeps the smallest.
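Procedure 1 can be sketched in Python as follows. The updates here are synchronous and centralized for simplicity (a real implementation runs distributedly at each node), and the 4-node topology and hop weights are illustrative:

```python
INF = float("inf")

def distance_vector(n, edges, rounds=None):
    """Synchronous distance-vector (distributed Bellman-Ford) routing.
    `edges` maps (u, v) pairs to hop weights; links are bidirectional."""
    w = {}
    for (u, v), wt in edges.items():
        w[(u, v)] = wt
        w[(v, u)] = wt
    neighbors = {u: set() for u in range(n)}
    for (u, v) in w:
        neighbors[u].add(v)
    # D[u][m]: current estimate of the shortest distance from u to m
    D = [[0 if u == m else INF for m in range(n)] for u in range(n)]
    next_hop = [[None] * n for _ in range(n)]
    for _ in range(rounds or n):       # converges within n - 1 rounds
        changed = False
        for u in range(n):
            for m in range(n):
                if m == u:
                    continue
                for k in neighbors[u]:  # D_nm = min_{k ~ n} (w_nk + D_km)
                    if w[(u, k)] + D[k][m] < D[u][m]:
                        D[u][m] = w[(u, k)] + D[k][m]
                        next_hop[u][m] = k
                        changed = True
        if not changed:
            break
    return D, next_hop

# Illustrative 4-node network: the direct link 0-3 is longer than the path 0-1-2-3
edges = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (0, 3): 5}
D, nh = distance_vector(4, edges)
print(D[0][3], nh[0][3])   # prints: 3 1
```

Node 0 first learns the direct 5-hop-weight route to node 3 and then, as its neighbors' tables improve, switches to the 3-hop path through node 1.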

There are also many other concerns in the design of routing algorithms, e.g., the possible breakdown of communication links and the corresponding response of the routing algorithm. The characteristics of the network (e.g., wired or wireless) should also be incorporated, resulting in many algorithms such as hierarchical routing and the OSPF routing algorithm.

Note that the above routing algorithms are for unicast routing; i.e., there is only one destination. In many situations, multicast routing is needed; i.e., a packet needs to be sent to more than one destination node. For example, in a CPS, if there are multiple controllers, a sensor may need to send its measurements to multiple destinations. Usually a tree is formed for the multicast, as illustrated in Fig. 2.10.

Fig. 2.10 Illustrations of unicasting and multicasting in a CPS.
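One simple way to form such a multicast tree, assuming unit hop weights, is to take the union of the shortest (BFS) paths from the source to each destination; the topology below, with a sensor at node 0 multicasting to controllers at nodes 3 and 4, is hypothetical:

```python
from collections import deque

def multicast_tree(adj, source, destinations):
    """Build a multicast tree as the union of BFS shortest paths from the
    source to each destination (a shortest-path tree restricted to the
    destinations). Assumes every destination is reachable."""
    parent = {source: None}
    q = deque([source])
    while q:                          # BFS from the source
        u = q.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                q.append(v)
    tree_edges = set()
    for d in destinations:            # walk back from each destination
        node = d
        while parent[node] is not None:
            tree_edges.add((parent[node], node))
            node = parent[node]
    return tree_edges

# Illustrative topology: sensor (node 0) multicasts to controllers 3 and 4
adj = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
print(sorted(multicast_tree(adj, 0, [3, 4])))
```

Because the paths to both destinations share the edge (0, 1), the packet traverses that edge only once; this sharing is what makes a tree more efficient than separate unicasts.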

2.6.6 Transport Layer

One of the major tasks of the transport layer is congestion control. In the Transmission Control Protocol (TCP), which is widely used on the Internet, a congestion window is used to control the data traffic volume: the sender may send out all the packets within the congestion window. At the beginning of the connection, the window size is set to one maximum segment size. Upon receiving ACKs from the receiver, which indicate that the corresponding packets have been correctly received, the transmitter doubles its congestion window every round-trip time (the slow start phase). Once the congestion window exceeds a certain threshold, the window grows linearly instead (the congestion avoidance phase). If an ACK has not been received before the expiration of the timer, the transmitter assumes that congestion in the network has caused the loss of a packet; it then resets the congestion window to the minimum size and reduces the threshold to half of the current window. This procedure is illustrated in Fig. 2.11.

Fig. 2.11 Illustration of congestion control.
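The window dynamics described above can be traced with a simplified model. The helper below is a hypothetical sketch, not real TCP: each "ack" event stands for one full round trip, and production TCP variants differ in many details (fast retransmit, fast recovery, etc.):

```python
def congestion_window_trace(mss=1, ssthresh=16,
                            events=("ack",) * 8 + ("timeout",) + ("ack",) * 4):
    """Trace the congestion window under a simplified TCP scheme:
    exponential growth (slow start) below ssthresh, linear growth
    (congestion avoidance) above it; on timeout, ssthresh is halved
    and the window resets to one segment."""
    cwnd = mss                    # start at one maximum segment size
    trace = [cwnd]
    for ev in events:
        if ev == "ack":
            if cwnd < ssthresh:
                cwnd *= 2         # slow start: double per round trip
            else:
                cwnd += mss       # congestion avoidance: linear increase
        elif ev == "timeout":
            ssthresh = max(cwnd // 2, mss)   # halve the threshold
            cwnd = mss            # back to the minimum window
        trace.append(cwnd)
    return trace

print(congestion_window_trace())
# [1, 2, 4, 8, 16, 17, 18, 19, 20, 1, 2, 4, 8, 16]
```

The trace shows the characteristic sawtooth of Fig. 2.11: exponential growth up to the threshold, linear growth above it, and a collapse to one segment after the timeout.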

2.7 Typical communication systems

In this section we list some typical communication systems. They are all potential candidates for communication networks in a CPS.

2.7.1 Wired Networks

In wired networks, information is sent through a guided medium such as optical fiber. Wired networks usually have higher throughput and much higher reliability than wireless ones. Their disadvantages include the lack of flexibility and mobility.

According to their geographical scope, wired networks can be categorized as follows:

 Local area networks (LANs): A LAN may cover a single house or a campus; its coverage is within a few kilometers. The communication agents are connected by cables such as optical fibers. Typical LAN systems include Ethernet (IEEE 802.3) and Token Ring (IEEE 802.5).

 Metropolitan area networks (MANs): The corresponding coverage is a city. A typical system is the cable television network, which can also provide an Internet service.

 Wide area networks (WANs): The coverage of a WAN could be a country or even a continent. The Internet can be considered as a WAN. The hosts (machines) are connected by subnets, which consist of transmission lines and switching elements (routers). Most modern WANs use packet switched subnets, in which the data packets are transmitted in a store-and-forward manner.

2.7.2 Wireless Networks

Wireless communication networks have been widely used in both civil and military communications. Many of them can also be adapted to controlling a CPS. Several of the most popular wireless networks are listed below:

 Cellular networks: In cellular networks, the space is divided into cells, and within each cell a base station is in charge of the communications. The users in the cell either transmit to the base station (the uplink) or receive data from the base station (the downlink). The base stations are connected by high-speed wired backbone networks. The most recently deployed cellular network is the 4G network, known as Long Term Evolution (LTE), which is mainly based on OFDMA signaling. A major challenge in cellular networks is power control within and across cells.

 WiFi networks: In contrast to cellular networks, WiFi networks, which are based on the IEEE 802.11 protocol, cover much smaller areas such as a house or a floor in a building. In contrast to cellular networks, whose deployment is carefully designed, WiFi networks are much more ad hoc and flexible. The cost of the access point (AP) in WiFi, which connects wireless users to the Internet and plays the role of the base station in cellular networks, is much less than that of a base station, thus making careful planning unnecessary.

With reference to the different coverages, wireless networks are usually categorized into the following types, whose coverages and data rates are illustrated in Fig. 2.12:

Fig. 2.12 Illustration of the coverages and data rates of different types of wireless networks.

 Wireless personal area networks (WPANs): The coverage of WPANs is usually within the reach of a person. Typical systems include Bluetooth and Zigbee.

 Wireless local area networks (WLANs): The coverage of WLANs is usually small, say tens of meters. WiFi is a typical WLAN.

 Wireless mesh networks: A mesh network usually covers a larger area than WLANs. It can be built using WLANs. For example, WiFi can be used to construct a mesh network.

 Wireless metropolitan area networks (WMANs): A WMAN may cover a few kilometers. A typical WMAN is WiMAX, which is based on IEEE 802.16.

 Wireless wide area networks (WWANs): A WWAN may cover a city and its suburbs. Cellular networks may be used to cover such a large area.

 Wireless regional area networks (WRANs): A WRAN may cover a large distance such as 100 kilometers. A typical standard for WRANs is IEEE 802.22, which may be operated in the TV band.

2.8 Conclusions

This chapter provides a brief introduction to modern communication theory and systems. Due to space limitations, we cannot provide a comprehensive introduction to all aspects of communications. Readers who are interested in more details are referred to the following literature:

 Excellent comprehensive introductions to communication theory can be found in Refs. [41, 42]. Note that these two books focus on wireless communications. A textbook on wired communications, particularly optical communications, is Ref. [43].

 More details on information theory can be found in Ref. [33].

 Ref. [44] provides numerous discussions on wireless channels.

 Introductions to source coding can be found in Refs. [45, 46], while introductions to channel coding can be found in Refs. [47, 48]. Many details on modulation schemes can be found in Ref. [49].

 There are many books on communication networks. One entry-level textbook is Ref. [36]. More theoretical analysis can be found in Ref. [50].

 A comprehensive introduction to the most recent wireless communication systems can be found in Ref. [51].

References

[32] Shannon C.E. A mathematical theory of communications. Bell Syst. Tech. J. 1948;27:379–423, 623–656.

[33] Cover T.M., Thomas J.A. Elements of Information Theory. second ed. Wiley; 2006.

[34] Lloyd S.P. Least squares quantization in PCM. IEEE Trans. Inform. Theory. 1982;28(2):129–137.

[35] Berrou C., Glavieux A., Thitimajshima P. Near Shannon limit error-correcting coding and decoding: turbo-codes. In: IEEE International Conference on Communications. 1993.

[36] Tanenbaum A.S., Wetherall D.J. Computer Networks. fifth ed. Prentice Hall; 2010.

[37] Palomar D.P., Chiang M. A tutorial on decomposition methods for network utility maximization. IEEE J. Sel. Areas Commun. 2006;24(8):1439–1451.

[38] Chiang M., Low S.H., Calderbank A.R., Doyle J.C. Layering as optimization decomposition: a mathematical theory of network architectures. Proc. IEEE. 2007;95(1):255–312.

[39] Knopp R., Humblet P. Information capacity and power control in single-cell multiuser communications. In: Proceedings of the IEEE International Conference on Communications (ICC). 1995.

[40] Tassiulas L., Ephremides A. Stability properties of constrained queuing systems and scheduling policies for maximum throughput in multihop radio networks. IEEE Trans. Automat. Control. 1992;37:1936–1948.

[41] Goldsmith A. Wireless Communications. Cambridge, UK: Cambridge University Press; 2005.

[42] Tse D., Viswanath P. Fundamentals of Wireless Communication. Cambridge, UK: Cambridge University Press; 2005.

[43] Nucci A., Papagiannaki K. Design, Measurement and Management of Large-Scale IP Networks: Bridging the Gap Between Theory and Practice. Cambridge, UK: Cambridge University Press; 2009.

[44] Rappaport T.S. Wireless Communications: Principles and Practice. second ed. Prentice Hall; 2002.

[45] Gersho A., Gray R.M. Vector Quantization and Signal Compression. New York: Springer; 1991.

[46] Berger T. Rate Distortion Theory: Mathematical Basis for Data Compression. Prentice Hall; 1971.

[47] Lin S., Costello D.J. Error Control Coding. Prentice Hall; 2004.

[48] Blahut R.E. Algebraic Codes for Data Transmission. Cambridge, UK: Cambridge University Press; 2003.

[49] Proakis J., Salehi M. Digital Communications. fifth ed. McGraw-Hill Education; 2007.

[50] Srikant R., Ying L. Communication Networks: An Optimization, Control, and Stochastic Networks Perspective. Cambridge, UK: Cambridge University Press; 2014.

[51] Li J., Wu X., Laroia R. OFDMA Mobile Broadband Communications: A Systems Approach. Cambridge, UK: Cambridge University Press; 2013.



1 In the original definition of the OSI model, coding and decoding are issues in the data link layer; however, in practice, they are usually considered as physical layer issues.
