In this section, we use existing coding schemes and outline a decoding procedure that considers the physical dynamics in a CPS, following the study in Ref. [165].
Since the reliability of communication is usually very high in CPSs (e.g., in the wide area monitoring system in smart grids, the packet error rate should be less than 10⁻⁵ [166]) while the communication channel may experience various degradations (e.g., the harsh environments for communications in smart grids), it is important to use channel coding to protect the information transmission.
A straightforward approach for coding in a CPS is to follow the traditional procedure of separate quantization, coding, transmission, decoding, and further processing, which is adopted in conventional communication systems (Fig. 8.7). However, separate channel decoding and system state estimation may not be optimal when there exist redundancies in the transmitted messages. For example, the transmitted codeword at time t, b(t), is generated from the observation of the physical dynamics, y(t). Due to the time correlation of system states, y(t) is correlated with y(t + 1), and can thus provide information for decoding the codeword in the next time slot t + 1. Hence, if the decoding procedure is independent of the system state estimation, the performance will be degraded due to the loss of this redundancy. One may argue that the redundancy can be removed in the source coding procedure (e.g., by encoding only the innovation vector in Kalman filtering if the physical dynamics is linear with Gaussian noise [167]). However, this requires substantial computation, which a sensor may not be able to carry out. Moreover, the sensor may not have the overall system parameters needed for the extraction of the innovation information. Hence it is more desirable to shift the burden to the controller, which has more computational power and more system information; the inexpensive sensor then focuses on transmitting the raw data.
In this section, we propose an algorithm for joint channel decoding and state estimation in a CPS. We consider traditional channel codes with memory (e.g., convolutional codes). Then the physical dynamics system state evolution and the channel codes can be considered as an equivalent concatenated code, as illustrated in Fig. 8.8, in which the outer code is the generation of system observation (we call it a “nature encoded” codeword) while the inner code is the traditional channel coding. The outer code is generated by the physical dynamics in an analog manner (since the value is continuous) while the inner code is carried out by a digital encoder. We will use belief propagation (BP) [168], which has been widely applied in decoding turbo codes and low-density parity-check (LDPC) codes, to decode this equivalent concatenated code. The major challenge is that the Bayesian inference is over a hybrid system which has both continuous- and discrete-valued nodes.
We consider a discrete-time dynamical system with N-dimensional system state x and K-dimensional observation y. The dynamics are described by
where u is the control action taken by the controller, and n and w are both random perturbations, whose probability distributions are assumed to be known.
A special case is linear system dynamics, which is described by
where n and w are both Gaussian noises with zero means and covariance matrices Γn and Γw, respectively.
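The linear special case can be simulated directly. The following sketch uses hypothetical matrices F, G, H and noise covariances (illustrative placeholders, not taken from any system in the text) to step the dynamics and produce sensor observations:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear CPS dynamics (illustrative values only):
#   x(t+1) = F x(t) + G u(t) + n(t),  n ~ N(0, Gamma_n)
#   y(t)   = H x(t) + w(t),           w ~ N(0, Gamma_w)
F = np.array([[0.9, 0.1],
              [0.0, 0.8]])
G = np.array([[1.0],
              [0.5]])
H = np.array([[1.0, 0.0]])
Gamma_n = 0.01 * np.eye(2)
Gamma_w = 0.04 * np.eye(1)

def step(x, u):
    """Advance the state one slot and produce the sensor observation."""
    n = rng.multivariate_normal(np.zeros(2), Gamma_n)
    w = rng.multivariate_normal(np.zeros(1), Gamma_w)
    x_next = F @ x + G @ u + n
    y = H @ x + w
    return x_next, y

x = np.zeros(2)
u = np.array([0.1])
for t in range(10):
    x, y = step(x, u)
```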
We assume that y(t) is observed by a sensor. It quantizes each dimension of the observation using B bits, thus forming a KB-dimensional binary vector which is given by
We assume that the sensor simply carries out a scalar quantization for each dimension and does not use compression to remove the redundancy between y(t) and y(t − 1). It is straightforward to extend this to the case of vector quantization. However, it is nontrivial to extend it to the case of partial compression of the redundancy, which is beyond the scope of this study.
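A minimal sketch of such a per-dimension scalar quantizer, assuming a uniform quantizer over a known bounded range (the range and bit width below are illustrative assumptions):

```python
import numpy as np

def scalar_quantize(y, B, lo=-1.0, hi=1.0):
    """Quantize each dimension of y to B bits with a uniform scalar
    quantizer over [lo, hi], returning a K*B binary vector (MSB first)."""
    levels = 2 ** B
    cell = ((y - lo) / (hi - lo) * levels).astype(int)
    cell = np.clip(cell, 0, levels - 1)        # saturate out-of-range values
    # Expand each cell index into its B-bit binary representation.
    return ((cell[:, None] >> np.arange(B - 1, -1, -1)) & 1).ravel()

b = scalar_quantize(np.array([0.3, -0.7]), B=4)  # K = 2, so KB = 8 bits
```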
The information bits b(t) are put into an encoder to generate a codeword c(t). We assume binary phase shift keying (BPSK) modulation for the communication from the sensor to the controller. The received signal is given by
where 2c(t) − 1 converts the alphabet {0, 1} to the antipodal signal {−1, +1}, and e(t) is additive white Gaussian noise with zero mean and normalized variance σe². In this chapter, we do not consider fading, since the channel is usually stationary within one codeword and the channel gain can be incorporated (together with the transmit power) into the noise variance σe².
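The BPSK mapping and additive noise can be sketched as follows; the codeword and noise level are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)

def bpsk_awgn(c, sigma_e):
    """Transmit codeword bits c over BPSK with AWGN:
    r = (2c - 1) + e, where e ~ N(0, sigma_e^2)."""
    s = 2 * c - 1                     # {0, 1} -> {-1, +1}
    return s + sigma_e * rng.standard_normal(c.shape)

c = np.array([0, 1, 1, 0, 1])
r = bpsk_awgn(c, sigma_e=0.3)
c_hat = (r > 0).astype(int)           # hard-decision demodulation
```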
As we have explained, there exists information redundancy between the system state in different time slots, as well as the received bits in the communications. Hence we can apply the framework of BP for the procedure of joint decoding and system state estimation.
We first provide a brief introduction to the BP algorithm, in particular a version of Pearl’s BP. We use an example to illustrate the procedure. The details and formal description can be found in Ref. [168].
Take Fig. 8.9 for instance, where a random variable X has parents U1, U2, …, UM and children Y1, Y2, …, YN. Here the parent nodes are characterized by the following equality of conditional probability:
that is, given the random variables in the parent nodes, X is independent of all nonparent nodes. As the name indicates, a node is a child of X if X is one of its parents. The message passing of Pearl’s BP is illustrated by the dotted and dashed arrows in the figure. The dashed arrows carry π-messages from a parent to its children. For instance, the message passed from Um to X is , which is the prior information of Um given all the information that Um has received. The dotted arrows carry λ-messages from a child to its parent. For instance, the message passed from Yn to X is , which is the likelihood of X given the information that Yn has received. After X receives all π-messages (i.e., from its parents U1, U2, …, UM) and all λ-messages (i.e., from its children Y1, Y2, …, YN), X updates its local belief BELX(x) and transmits λ-messages to its parents and π-messages to its children. The expressions for these messages are given by
where U = (U1, U2, …, UM) and Y = (Y1, Y2, …, YN).
In the initialization stage of Pearl’s BP, the initial values are defined as
and
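To make the message combination concrete, the following sketch performs one Pearl-BP update at a single binary node X with one parent U and one child Y; all the probability tables are hypothetical values chosen only for illustration:

```python
import numpy as np

# Conditional distribution P(X | U): rows indexed by u, columns by x.
P_x_given_u = np.array([[0.9, 0.1],   # P(X | U = 0)
                        [0.2, 0.8]])  # P(X | U = 1)
pi_u  = np.array([0.6, 0.4])          # pi-message arriving from parent U
lam_y = np.array([0.3, 0.7])          # lambda-message arriving from child Y

# pi(x) = sum_u P(x | u) pi(u); lambda(x) is the product of child messages.
pi_x = pi_u @ P_x_given_u
bel = lam_y * pi_x                    # unnormalized BEL_X(x)
bel = bel / bel.sum()                 # normalized local belief
```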
Based on the tools of Bayesian networks, we can derive the iterative decoding procedure. Fig. 8.10 shows the procedure of message passing in the decoding of a CPS: xt−2 summarizes all the information obtained from previous time slots and transmits the π-message πxt−2, xt−1(xt−2) to xt−1. The BP procedure can be implemented in either a synchronous or an asynchronous manner. To accelerate the procedure, we implement asynchronous Pearl’s BP. The updating order and message passing in one iteration are given in Procedure 9.
The algorithm has been tested in the context of voltage control in smart grids [165]. Numerical results have shown that the performance of both decoding and system state estimation is substantially improved, when compared with traditional separate decoding and estimation. One potential challenge in joint decoding and estimation is the possibility of error propagation; i.e., the decoding error or estimation error will be propagated to the next time slot, since the decoding and estimation procedures in different time slots are correlated. An effective approach to handle error propagation is to monitor the performance and restart the decoding/estimation procedure once performance degradation is detected.
Most channel coding schemes are designed for general-purpose data communications. Although they can also be used in the context of controlling a CPS, it is not clear whether coding schemes designed for pure data communications are optimal for the purpose of control. There have been some studies on designing channel codes specifically for control systems [169, 170]. In this section we give a brief introduction to these efforts, which may provide insights for further communication system designs in CPSs, although they have not been implemented in real systems.
Trajectory codes for the purpose of automatic control were proposed in Ref. [169]. The main feature is the online error correction capability for real-time estimation and control.
We assume that the communication channel has binary input and output. The transmitter has a time-varying state xt, which is represented by the vertex of a graph G. The edges in the graph G show the possibility of state transitions. It is assumed that G is known to both the transmitter and receiver. This setup is similar to the random walk on graphs.
The time is divided into rounds and channel use times. In each round, the state of the transmitter makes one move in G, and then the transmitter can use the communication channel M times. The encoded bits are based on all the history of the transmitter state. The goal of the design is to find a coding scheme that enables the receiver to obtain accurate estimation of the transmitter’s state with a large probability. Hence the transmitter can be considered as the combination of physical dynamics and communication transmitter. The coding rate is given by
where Δ is the maximum out-degree or in-degree of nodes in the graph G.
The receiver outputs an estimate of xt, which is denoted by , based on all the received coded bits. The decoding error is measured by the shortest-path distance between xt and in the graph G, which is denoted by . The communication scheme has an error exponent κ if
It was shown in Ref. [169] that the communication is online efficient if the time and space complexities of coding and decoding are of the order of . If the coding scheme has a positive rate, positive error exponent and is online efficient, then we say that the code is asymptotically good.
Note that the problem considered above is more like a system estimation problem. The control procedure is not considered; i.e., the system state xt evolves independently of the communication performance. However, the problem is closely related to control, since the error of system state estimation substantially determines the performance of control; meanwhile, xt can be considered as the system state with the impact of control removed.
The trajectory of the transmitter’s state can be represented by a path in the corresponding graph G, whose initial time is denoted by t0. If two trajectories, γ and γ′, have the same length, the same starting time and the same initial vertex, we denote this by γ ∼ γ′. The distance between two trajectories indicates the number of different states in the two trajectories and is denoted by τ(γ, γ′).
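The distance τ(γ, γ′) can be computed directly; the vertex names below are hypothetical:

```python
def trajectory_distance(gamma, gamma_prime):
    """tau(gamma, gamma'): the number of positions at which two
    equal-length trajectories visit different vertices."""
    assert len(gamma) == len(gamma_prime)
    return sum(a != b for a, b in zip(gamma, gamma_prime))

# Two trajectories with the same length, start time, and initial vertex
# (so gamma ~ gamma'); they differ in two intermediate positions.
gamma       = ['v0', 'v1', 'v2', 'v3', 'v4']
gamma_prime = ['v0', 'v1', 'v5', 'v6', 'v4']
tau = trajectory_distance(gamma, gamma_prime)
```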
A trajectory code, denoted by χ, maps from a trajectory to a certain alphabet in a concatenation manner:
The Hamming distance between two equal-length codewords is denoted by h. The relative distance of a trajectory code is denoted by
which implies the capability of distinguishing two different trajectories of the transmitter’s state. A trajectory code is said to be asymptotically good if it has a positive rate and provides a positive relative distance.
Obviously, the larger the relative distance δ, the more capable the receiver is of estimating the transmitter’s state. When δ is large, even if there are some transmission errors, the remaining discrepancy between two codewords can still be used to distinguish the two different state trajectories. The major challenge is then the existence and construction of an asymptotically good trajectory code. The following theorem guarantees the existence of such a code:
The details of the proof are given in Ref. [169]. Here we provide a sketch of the proof, which is based on the probabilistic approach and random coding, similarly to Shannon’s random coding argument [33]. First we consider a random coding scheme; i.e., the transmitter randomly selects an output from the output alphabet for each possible input (i.e., a state, or equivalently a vertex in G). Consider two trajectories γ1 and γ2, which satisfy γ1 ∼ γ2 but do not have common vertices except for the initial one. We call these two trajectories twins. It is easy to verify that
Then we define the event
If we can prove
then there exists at least one asymptotically good trajectory code. In Ref. [169], Eq. (8.129) is proved by invoking the Lovász Local Lemma.
Although the existence of trajectory codes has been proved, it is still not clear how to construct these codes. In Ref. [169], two approaches are proposed to construct trajectory codes for the special case of d-dimensional grids. In this book, we provide a brief introduction to the first method.
Note that the grid graph is determined by the set of vertices Vn, d to be
Two nodes with distance 1 have a connecting edge. The goal of the construction procedures is to design a trajectory code
which is asymptotically good.
The first approach of construction is based on two codes, where and k is the least even integer no less than .
• Block code: Consider a block code , where R1 is a particular finite alphabet. Intuitively, η encodes each vertex in Vn, d to a block code with codeword length n1. We assume that the block code η has a positive rate and a relative distance (1 + δ)/2. The coding and decoding procedures can be computed in a time of order . η can be rewritten as
where .
• Recursive code: We assume that is a trajectory code with relative distance (1 + δ)/2, where S1 is a finite alphabet. The construction of this code will be provided later.
Now we cover the space Vn, d by overlapping tiles. The tile placed at vertex x = (x1, …, xd, xd+1) ∈ Vn, d is the mapping defined as
which maps from the domain
to S1 × R1.
We assume that n is divisible by kn1. Then we pick out the tiles placed at the locations x having the following form:
where {zi}i satisfy
and
The set of these tiles, which actually forms a covering of Vn, d, is labeled by (a1, …, ad+1). Hence there are a total of kᵈ⁺¹ such sets. This procedure is illustrated in Fig. 8.11, in which we set n1 = 4 and k = 2.
The trajectory code is then given by
that is, for each state y, the coder output is the concatenation of all the coder outputs at the tiles satisfying Eq. (8.135).
The intuition behind the complicated procedure of construction is given below:
• The coder for long trajectories (of the order of n) is based on the coders for short trajectories (of the order of ).
• When the transmitter state is at a particular point of Rd, the coder output is determined by the output of the tiles close to it.
• The coder output of each tile in Eq. (8.133) consists of the recursive coder output, which encodes the transmitter state with respect to the center point of the tile, and the output of the block coder, which encodes the location of the center point of the tile.
It was shown in Ref. [169] that the above construction of trajectory codes achieves relative distance δ < 1; the proof is involved. The only remaining problem is then the construction of the coder χ1 for each tile. Since the problem size is much smaller, the coder can be obtained by exhaustive search, thus solving the whole problem.
In this section, we follow the discussion in Ref. [170] to show that it is possible to design the channel codes directly for control.
We consider a linear system with system state x and observations y, which are given by
and
where u(t) is the control action, and w(t) and v(t) are bounded noise processes. It is assumed that (F, G) is controllable and (F, H) is observable.
At time t, the sensor observes y(t) and generates a k-bit message, which is a function of all existing observations. Suppose that n bits can be transmitted for the k information bits. Hence the data transmission rate is k/n.
According to the discussion in Chapter 5, the communication should satisfy the anytime reliability in order to stabilize the linear system. We say that an encoder-decoder pair can achieve (R, β, d0)-reliability over a particular communication channel if
where denotes the estimate of bτ (the message transmitted at time τ) given the messages received up to time t, and η is a constant. Intuitively, Eq. (8.141) indicates that the probability of decoding error decreases exponentially with the elapsed time.
Note that this definition of anytime reliability is not literally the same as the original one in Ref. [13]. In Ref. [13], the encoder-decoder pair can achieve α-anytime reliability if
for any t and τ. However, it is easy to see that these two definitions are equivalent. First, if Eq. (8.141) holds, then we have
which implies β-anytime reliability. The first inequality arises because the event that bτ is incorrectly decoded belongs to the event that the first incorrectly decoded message is bs, where s ≤ τ.
On the other hand, if the encoder-decoder pair is α-anytime reliable according to Eq. (8.142), we have
where the first inequality arises because the event that the first decoding error happens for bt−d belongs to the event that bt−d is incorrectly decoded. According to the conclusion in Ref. [13, Theorem 5.2], the anytime reliability can guarantee the stability of linear systems with sufficiently small outage probability.
Ref. [170] proposed the use of linear tree codes to design the anytime reliable code. Note that the concept of tree codes will be explained in the next section. Due to the requirement of causal coding, the output of linear tree codes cr at time slot r is given by
where bk is the k-bit information symbol generated at time k, the subscript r is the current time index, and {Grk} are the n × k generating matrices. This coding procedure is illustrated in Fig. 8.12.
Hence the overall generating matrix can be written as
The parity check matrix of Gn, R is denoted by Hn, R, which satisfies
It is easy to verify that Hn, R also has a lower-triangular form.
We assume that maximum likelihood (ML) decoding is used. Since we want to test whether the communication is anytime reliable, we focus on the probability of the event that the earliest wrongly decoded message is d time slots ago. Denoting this probability by , we use the union bound to upper bound :
where is the set of codewords whose earliest nonzero symbol occurs at time t − d, and we assume that an all-zero sequence is transmitted.
It was shown in Ref. [171] that the error probability of ML decoding is bounded by
where ζ is the Bhattacharyya parameter, which is defined as (here we assume a discrete channel output alphabet )
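As a sketch, the Bhattacharyya parameter of a discrete-output channel sums √(P(y|0)P(y|1)) over the output alphabet; for a binary symmetric channel with crossover probability p this evaluates to 2√(p(1 − p)):

```python
import math

def bhattacharyya(p_y_given_0, p_y_given_1):
    """zeta = sum over the output alphabet of sqrt(P(y|0) * P(y|1))."""
    return sum(math.sqrt(a * b) for a, b in zip(p_y_given_0, p_y_given_1))

# Binary symmetric channel with crossover probability p:
p = 0.1
zeta = bhattacharyya([1 - p, p], [p, 1 - p])   # equals 2 * sqrt(p * (1 - p))
```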
Hence according to the union bound in Eq. (8.148), we have
where w is the weight of the decoder output c, is the number of codewords in having weight w, and is the minimum weight of the codewords in . The reason why w ≤ nd is that there is no error before the (t − d)th message. This motivates the following definition:
The first requirement on the full rank of Hn, R is to guarantee that the encoding procedure is invertible. The second condition implies
where
Based on the above discussion, we have the following proposition:
Now the challenge is how to construct a linear code satisfying the requirement of (α, θ, d0)-anytime distance. In Ref. [170], the following Toeplitz structure was considered:
which assures the causality of the coding procedure.
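A minimal sketch of such a causal Toeplitz encoder over GF(2); the block matrices G0, G1, G2 below are arbitrary illustrative choices, not a code from Ref. [170]:

```python
import numpy as np

def toeplitz_generator(blocks, T):
    """Build the block-lower-triangular Toeplitz generator matrix with
    (r, j) block equal to blocks[r - j] for r >= j, so that the output
    at time r depends only on b_0, ..., b_r (causality)."""
    n, k = blocks[0].shape
    G = np.zeros((T * n, T * k), dtype=int)
    for r in range(T):
        for j in range(r + 1):
            G[r * n:(r + 1) * n, j * k:(j + 1) * k] = blocks[r - j]
    return G

# n = 2 coded bits per k = 1 information bit, over T = 3 time slots.
G0 = np.array([[1], [1]])
G1 = np.array([[0], [1]])
G2 = np.array([[1], [0]])
G = toeplitz_generator([G0, G1, G2], T=3)
b = np.array([1, 0, 1])          # information bits b_0, b_1, b_2
c = (G @ b) % 2                  # causal encoding over GF(2)
```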
Then a random construction approach for is proposed in Procedure 10.
The main theorem in Ref. [170] shows that the above construction results in an anytime reliable code with a large probability.
The detailed proof of this theorem in Ref. [170] is quite involved. In this book, we provide an intuitive sketch of the proof. Consider time slot t. Suppose that an all-zero codeword is sent and consider a codeword c≠0. If , then c may be confused with the correct one, which results in a decoding error. Due to the requirement of anytime reliability, we are interested in the locations of errors (or equivalently the nonzero elements of c). Since we require that the messages before time t − d be correctly decoded, we have
where c = (c0, …, ct). This is equivalent to
which can be rewritten as
where and . Since H1 has been fixed in the construction procedure, we have
and Ct−d+1h1 = 0. We abbreviate Eq. (8.162) to
Hence the higher dimensional vector h cannot be arbitrarily chosen due to the linear constraint in Eq. (8.163). So h must be selected within a lower dimensional subspace. Since {hi}, i = 2, …, t, are randomly chosen, the corresponding probability that h falls in a lower dimensional subspace can be bounded by the following lemma:
The detailed evaluation of the upper bound will lead to the desired conclusion.
In the previous subsection, it is rigorously shown that the proposed coding scheme can stabilize the physical dynamics in a CPS. However, the following two issues in the coding scheme hinder its application in real practice:
• The parity check matrices are generated randomly, so it is difficult to control the decoding error probability.
• With a large probability, the matrix Hr is nonzero, which means that even information bits from the distant past still affect the current coding procedure; hence all previous information bits need to be stored, requiring infinite memory.
To handle these two challenges, some practical coding schemes have been proposed. In this book we briefly introduce the convolutional LDPC codes [172, 173].
It was proposed in Ref. [172] that the LDPC convolutional codes can be used for assuring anytime reliability for the purpose of controlling dynamics in a CPS. Originally LDPC convolutional codes were proposed for pure data communications, rather than for CPSs. Essentially, the parity check matrix in the LDPC convolutional codes has the following form:
where ms is the memory length. It is easy to verify that the generating matrix G has a similar structure; hence the output of the tth message is given by
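The banded parity-check structure with memory ms can be sketched as follows; the block entries are illustrative placeholders rather than an optimized LDPC convolutional code:

```python
import numpy as np

def banded_parity_check(H_blocks, T):
    """Assemble the parity-check matrix of a convolutional code with
    memory ms = len(H_blocks) - 1: block row t applies H_blocks[i] to
    the symbols of time t - i, so each check spans at most ms + 1 slots."""
    nk, n = H_blocks[0].shape
    ms = len(H_blocks) - 1
    H = np.zeros((T * nk, T * n), dtype=int)
    for t in range(T):
        for i in range(ms + 1):
            j = t - i
            if j >= 0:
                H[t * nk:(t + 1) * nk, j * n:(j + 1) * n] = H_blocks[i]
    return H

# Memory ms = 1, one check per slot, two coded bits per slot.
H0 = np.array([[1, 1]])
H1 = np.array([[0, 1]])
H = banded_parity_check([H0, H1], T=3)
```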
Such a scheme handles the above two challenges in the following manner:
• There have been plenty of studies on how to optimize the LDPC codes; moreover, efficient decoding algorithms have been found for the LDPC codes, which makes the LDPC convolutional codes fairly practical.
• The convolutional structure makes the encoding procedure causal. Moreover, the finite memory length ms means that only a bounded amount of past information needs to be stored.
It was shown in Ref. [172] that the proposed convolutional LDPC code can achieve anytime reliability. The details can be found therein.
The channel coding schemes in the previous sections are dedicated to system state estimation and control in a CPS. We can also consider state estimation and control as a problem of computing: the agents (sensors or controllers) in a CPS have local parameters or observations on the physical dynamics and want to compute the control actions as functions of the local numbers. Hence the agents can communicate by using an existing protocol of interactive communications (e.g., a sensor sends an observation to a controller; or two controllers exchange their control actions and observations). Such a protocol can always be designed when there is no communication error. Since a noisy communication channel can incur transmission errors, it is important to study how to carry out, over a noisy channel, an interactive communication protocol designed for noiseless channels. Although the study of coding for interactive communication in computing is not confined to the purpose of control, it is of significant importance for the design of channel coding schemes in a CPS. Hence in this section, we follow the seminal work of Schulman [174] to introduce the studies on this topic.
For simplicity, we consider two agents A and B, which have local discrete arguments zA and zB, respectively. They are required to compute a function f(zA, zB). The two agents can communicate with each other, one bit per transmission. The simplest approach is for A to transmit zA to B and then have B calculate f(zA, zB). However, in many situations, fewer bits suffice to obtain f(zA, zB), which is the subject of research on communication complexity [94, 95].
We assume that the protocol for computing f(zA, zB) has been designed under the assumption that there is no communication error. The challenge is then how to carry out the communication for computing when the communication channel is noisy, i.e., when the probability of transmission error is nonzero.
The construction of a communication protocol for computing the function f(zA, zB) is based on the concept of tree codes. A d-ary tree with depth n is a tree in which each nonleaf node has d child nodes and the number of levels is n. Then a d-ary tree code over alphabet S with distance parameter α and depth n can be represented by a d-ary tree with depth n, in which each edge corresponds to an element in S. A codeword is generated by tracing from the root to a certain node in the tree and concatenating the outputs of the edges. We denote by W(v) the codeword generated by node v in the tree. Consider two nodes v1 and v2 with the same depth h in the tree, whose least common ancestor has depth h − l. Then Δ(W(v1), W(v2)) ≥ αl, where Δ(⋅, ⋅) is the Hamming distance. An illustration of the tree code is given in Fig. 8.13, where h = 5 and l = 3. Hence if α = 0.5, we have Δ(W(v1), W(v2)) ≥ 2.
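The distance property can be checked directly on a toy tree code; the edge labels below are hand-picked for illustration and are not guaranteed to satisfy the α-bound for every pair of nodes:

```python
def codeword(labels, path):
    """W(v): the concatenation of edge outputs along the root-to-v path."""
    return [labels[tuple(path[:i + 1])] for i in range(len(path))]

def suffix_distance(w1, w2, l):
    """Hamming distance between the last l symbols of two codewords."""
    return sum(a != b for a, b in zip(w1[-l:], w2[-l:]))

# Toy binary tree code of depth 3 over alphabet {a, b, c, d}; each key
# is a path from the root, each value the label of its last edge.
labels = {
    (0,): 'a', (1,): 'b',
    (0, 0): 'c', (0, 1): 'd', (1, 0): 'a', (1, 1): 'c',
    (0, 0, 0): 'b', (0, 0, 1): 'a', (0, 1, 0): 'c', (0, 1, 1): 'b',
    (1, 0, 0): 'd', (1, 0, 1): 'c', (1, 1, 0): 'a', (1, 1, 1): 'd',
}
# v1 and v2 have depth h = 3; their least common ancestor (0,) has
# depth h - l with l = 2, so only the last l symbols can differ.
v1, v2 = (0, 0, 1), (0, 1, 1)
d = suffix_distance(codeword(labels, v1), codeword(labels, v2), l=2)
```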
In Ref. [174], the existence of a good tree code with a large relative Hamming distance is only assumed; the detailed construction of the code is not discussed. It was not until recent years that the explicit construction of tree codes was discussed [175], in which some intuition was provided at the beginning. Since random coding has achieved great success in traditional information theory, the first idea for code construction is to try a random coding scheme; i.e., we randomly assign output alphabets to each edge in the tree. This looks reasonable, since we require two diverging paths in the tree to have a large discrepancy (as illustrated in Fig. 8.14A); otherwise it is hard to distinguish the two trajectories V P′ and V P. Since the outputs are randomly assigned, the output sequences will have a large Hamming distance with a large probability.
However, this is not the only requirement. We should let short divergent paths in the tree have sufficiently large Hamming distances. This nontrivial requirement is illustrated in Fig. 8.14B. Suppose that the transmission errors make the pebbles diverge from the correct path to wrong locations A, A′, B, B′. Assume that these short divergent paths have small relative Hamming distances. Then it is possible that the agents keep making mistakes due to the small relative Hamming distances. For example, the agent may be distracted to the wrong location A and return to the correct path after recognizing the mistake; however, it makes a mistake soon after and diverges to A′. In the subsequent procedure, the agent keeps falling into wrong locations such as B and B′. Hence in Ref. [174] it is required that all paths have large Hamming distances.
However, it is not necessary to require all paths to have large Hamming distances, which incurs too many constraints. It is argued in Ref. [175] that we can allow some branches in the tree to have small relative Hamming distances. As illustrated in Fig. 8.14C, each path from the root to a leaf is allowed to have some small diverged branches that have small relative Hamming distances. It is shown in Ref. [175] that this requirement can assure the success of protocol simulation.
To quantify the above requirement on the tree code, we need the following definition of a potent tree code:
For a detailed construction of potent tree code, it is proposed in Ref. [175] to use the ϵ-biased sample space, which is defined as follows:
It is shown in Ref. [175] that an ϵ-biased sample space can be constructed as follows: consider a prime p > (n/ϵ)²; a point in S is given by a number x in {0, 1, …, p − 1} mapped to the n-bit sequence (r0(x), …, rn−1(x)), where
and χp(x) denotes the quadratic character of x (mod p).
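The quadratic character itself can be computed via Euler's criterion; since the exact form of the offsets inside r_i(x) is elided in the text, the sketch below shows only the character computation and one possible bit mapping (the prime p is an arbitrary illustrative choice):

```python
def quadratic_character(x, p):
    """chi_p(x): +1 if x is a nonzero quadratic residue mod p,
    -1 if it is a nonresidue, and 0 if p divides x.
    Euler's criterion: x^((p-1)/2) mod p is 1 for residues."""
    x %= p
    if x == 0:
        return 0
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1

p = 103      # any odd prime; the text requires p > (n/eps)^2
# Map +1 characters to bit 0 and -1 characters to bit 1 (one possible
# convention; the text's exact mapping is not shown).
bits = [0 if quadratic_character(x, p) == 1 else 1 for x in range(1, 11)]
```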
Once an ϵ-biased sample space S is constructed, we can use it to build an (ϵ, α)-potent tree code. For simplicity, we assume that the output of each edge in the tree is binary. Since there are a total of dᴺ edges in a d-ary tree, the outputs of all edges in the tree can be described by a dᴺ-bit sequence: we sort the edges in the tree in a predetermined order and then assign each element in the sequence to an edge. This forms a d-ary small-biased tree code.
Small biased tree codes have a useful property defined as follows:
The intuitive explanation of the property of (ϵ, k)-independence is that the corresponding sequence is very close to uniformly distributed for any k-subset. It is shown in Ref. [175] that if a sample space is ϵ-biased then it is also ((1 − 2−k)ϵ, k)-independent, for any k. This property guarantees that, with a large probability, the small divergent branches in a small biased tree code have reasonable relative Hamming distances, since their bit assignments are “very random.”
In Ref. [175], the following theorem guarantees that the construction of small biased tree codes leads to potent tree codes, with a large probability:
The detailed proof is given in Ref. [175]. Here we provide a brief summary of the basic idea of the proof. Consider a node v. It is possible that there exists another node u having the same depth such that u forms a bad interval of length l. To that end, we need to check the last l bits leading to v and u, denoted by Wl(v) and Wl(u), respectively. Due to the small-biasedness property in Definition 26, the Hamming distance between Wl(v) and Wl(u) is large with a large probability. As will be shown later, potent tree codes can accomplish the task of interactive communications.
In the above discussion, the construction of potent tree codes is based on a random coding scheme. It is of substantial theoretical importance since it proves the existence of potent tree codes. However, it is of little use in practice due to the complexity caused by the lack of efficient encoding and decoding schemes.
A breakthrough in the deterministic and efficient construction of tree codes was made by Braverman [176]. The corresponding computation time is subexponential; i.e., the time costs of code construction, encoding, and decoding are all for a tree code of size n, where ϵ is a small number.
In this book, we focus on the deterministic construction of tree code. The corresponding coding and decoding complexities can be found in Ref. [176]. The basic idea of tree code construction is to combine multiple small-sized tree codes and then obtain a large-scale tree code. To that end, we introduce an operation called tree code product, i.e., we construct a tree code with depth d1 × d2 from two tree codes having depths d1 and d2.
First we assume that a tree code with depth d has been constructed. Here we assume that d is small such that we can use exhaustive search to find the tree code. Then we convert it to a local tree code of depth D ≫ d and locality l. Note that a local tree code with depth D and locality l, as well as other parameters, is defined as follows:
The construction of the local tree code with a much larger depth D is carried out by repeating the mapping provided by the original tree code with a much smaller depth d. It is shown in Ref. [176] that this construction satisfies the requirement for local tree code.
Based on the conversion from a tree code to a local tree code with much larger depth, we can combine two tree codes with smaller depths to obtain a new tree code with a larger depth, which is stated in the following theorem:
The procedures of code construction and encoding are illustrated in Fig. 8.15. Essentially, the encoding is a concatenation of an inner code and an outer code To′. The inner code is a local tree code, which is obtained from . It takes care of the divergent paths whose divergence is short. Then the coding output is fed into the outer code To′, which takes care of longer paths. is obtained from an intermediate code . Simply speaking, is obtained by spreading over different blocks. Then the output of is protected by traditional error correction codes, thus forming . The details of the construction can be found in Ref. [176].
We now explain the protocol based on tree codes, as proposed in Ref. [174].
The protocol for computing the function f(zA, zB), denoted by π, can be represented by a 4-ary tree, in which the trajectory of a communication session for computing f is a path from the root to a leaf node. Each level of edges corresponds to one round of communication. Note that agents A and B transmit simultaneously within one round. For each node in the tree, the four child nodes, reached via the four edges 00, 01, 10, and 11, represent the possible states after this round of communication. For example, the node reached via edge 00 is the state when the bits transmitted in this round are both zero. This procedure is illustrated by the first two rounds:
• In the first round, agents A and B transmit the bits πA(zA, ϕ) and πB(zB, ϕ), respectively. The two bits are denoted by m1 = π(z, ϕ); they select one of the four child nodes.
• In the second round, agents A and B transmit the bits πA(zA, m1) and πB(zB, m1), respectively, where m1 = π(z, ϕ) as above. The two transmitted bits are denoted by m2.
In a generic round, say round i, agents A and B transmit the bits πA(zA, {mj}j=1, …, i−1) and πB(zB, {mj}j=1, …, i−1), respectively. This procedure can be represented by a trajectory in the protocol tree. At node v of the protocol tree, agent A (or B) transmits the bit πA(zA, v) (or πB(zB, v)).
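The round structure above can be sketched in code. The following is a minimal illustration of running a noiseless protocol π = (πA, πB) for T rounds; the function names and the toy protocol (each agent simply reveals one bit of its input per round) are our own illustrative choices, not part of Ref. [174]:

```python
def run_protocol(pi_A, pi_B, z_A, z_B, T):
    """Simulate T rounds of a noiseless two-party protocol.

    In round i, each agent computes its bit from its private input and
    the transcript m_1, ..., m_{i-1}; the resulting bit pair labels one
    of the four edges 00, 01, 10, 11 in the 4-ary protocol tree.
    """
    transcript = []  # the path from the root: one bit pair per round
    for _ in range(T):
        a = pi_A(z_A, tuple(transcript))
        b = pi_B(z_B, tuple(transcript))
        transcript.append((a, b))
    return transcript

# Toy protocol: in round i each agent sends the i-th bit of its input,
# so the transcript traces the root-to-leaf path determined by (z_A, z_B).
pi_A = lambda z, hist: z[len(hist)]
pi_B = lambda z, hist: z[len(hist)]
path = run_protocol(pi_A, pi_B, (1, 0), (0, 1), T=2)  # [(1, 0), (0, 1)]
```

After T rounds the transcript identifies a unique node of depth T, which is exactly the trajectory interpretation described above.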
Given a tree code and the protocol π designed for noiseless communication channels, we can design a protocol for noisy channels by simulating π. The basic idea is that each agent estimates the other's current state from the received bits. If the two agents hold different estimates of the current location in the protocol tree, due to transmission errors, they will eventually find that the received bits differ from what they expect. At that point, they trace back in the tree and try to resynchronize their estimated locations in the protocol tree.
To that end, the following structures are needed at both agents A and B:
• Pebble: Each agent has a pebble to indicate its conjecture of the current location in the protocol tree, namely which node has been reached in the protocol π. In each round of communication, the pebble of an agent can be moved upstream (a backtracking action), moved downstream (along one of the four possible edges, 00, 01, 10, and 11), or kept at its current location (a hold action). The agent also transmits a bit, 0 or 1, determined by its current location in the protocol tree.
• Agent state: The state of each agent can be represented by another tree, which is 12-ary: each node has 12 child nodes, indexed by the movement of the pebble (six possibilities) and the transmitted bit (two possibilities).
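The 12-ary branching can be verified by simply enumerating the move alphabet; the labels below ("up" and "stay" for the backtracking and hold actions) are our own illustrative names:

```python
from itertools import product

# The six possible pebble movements: descend along one of the four edges,
# move upstream ("up"), or hold the current location ("stay").
pebble_actions = ["00", "01", "10", "11", "up", "stay"]
bits = [0, 1]  # the bit transmitted in this round

# Each node of the agent-state tree has one child per (action, bit) pair,
# giving 6 x 2 = 12 children.
children = list(product(pebble_actions, bits))
```

This enumeration makes explicit why the agent-state tree is 12-ary.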
With the two structures explained above, we can describe the protocol for simulating π with noisy communication channels, which is detailed in Procedure 11.
Note that we fix the total number of rounds at 5T: if the simulation is completed before time 5T, the agents send the bit 0 until time 5T; if the simulation has not been completed by time 5T, we stop anyway and declare that the simulation has failed.
Note that, for the potent tree code proposed in Ref. [175], the interactive protocol is still the same.
We first focus on the analysis in Ref. [174], where it is shown that Procedure 11 can simulate the original protocol π with an exponentially small failure probability. This is summarized in the following theorem:
The rigorous proof of this important theorem is given in Ref. [174]. In this book, we provide an intuitive explanation. The key step is to quantify the level of success in the simulation procedure by defining a concept called mark:
It is important to realize that π is successfully simulated if the mark at termination (time 5T) is at least T: in that case, the least common ancestor has depth at least T, which means that the two pebbles have simultaneously reached a node of depth T and thus implies the success of the simulation. This is rigorously proved in Lemma 4 of Ref. [174]. Hence the estimation of the error probability reduces to evaluating the probability that the mark is less than T.
The second important point is to realize the impact of a good move (when both agents estimate the system states correctly) or a bad move (when a state estimation error occurs):
• A good move increases the mark by 1.
• A bad move decreases the mark by no more than 3.
Then a sufficient condition for a successful simulation of π is that the proportion of good moves is at least 4/5, since at the termination time 5T we have

mark ≥ (4/5) × 5T × 1 − (1/5) × 5T × 3 = 4T − 3T = T.
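This sufficient condition can be checked by direct arithmetic. The sketch below is a sanity check of the counting argument (each good move adds 1 to the mark, each bad move is charged the worst-case loss of 3), not part of the proof in Ref. [174]:

```python
def mark_lower_bound(total_rounds, good_fraction):
    """Lower bound on the mark after total_rounds moves, assuming every
    good move gains 1 and every bad move suffers the worst-case loss of 3."""
    good = good_fraction * total_rounds
    bad = total_rounds - good
    return good * 1 - bad * 3

T = 100
bound = mark_lower_bound(5 * T, 4 / 5)  # (4/5)*5T - 3*(1/5)*5T = 4T - 3T = T
```

With a good-move proportion of exactly 4/5 over 5T rounds, the bound equals T, matching the threshold required for successful simulation.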
Therefore the problem becomes one of bounding the proportion of bad moves, given that the tree code has relative distance 1/2. Denote by l(t) the larger of the two pebbles' location errors at time t. If there is a bad move at time t, we define the error interval as {t − l(t) + 1, …, t}, since there must be transmission errors within this interval. Each bad move lies within an error interval; hence the number of error intervals can be used to bound the number of bad moves. A more detailed analysis leads to the conclusion in Theorem 59.
In Ref. [175], the requirement of the existence of tree code is relaxed to that of the potent tree code, while the interactive protocol is still the same as that in Ref. [174]. The corresponding performance is summarized in the following theorem:
In this chapter, we have discussed the physical dynamics-aware design of the physical layer in the communication network of a CPS. Various approaches have been discussed, ranging from modulation to channel coding. These approaches are still the subject of academic research: almost no existing physical layer protocol in a CPS incorporates awareness of the physical dynamics. The main challenges lie in the following aspects:
• The computational complexity of these algorithms is still too high. For example, in adaptive modulation the scheduler needs to solve an LMI. Although efficient algorithms from convex optimization can be used, this remains challenging for real-time control in CPSs.
• These algorithms require detailed models and parameters of the physical dynamics, which may be unavailable in many situations; the lack of such information can render these algorithms inapplicable. Even if the models and parameters can be obtained via system identification or machine learning, it remains unclear whether these algorithms are robust to errors in the models and parameters.
Hence there is still a long way to go before the physical dynamics-aware design of the physical layer becomes useful in practice.