In the previous section, we studied the communication requirements for controllers with separated state estimation and control. However, there are three limitations to this setup:
• For generic cases, it is not clear whether separated state estimation and control are optimal.
• For the communication channel, we used the channel capacity to measure the capability of conveying information. However, the concept of channel capacity is suitable for communications that can tolerate long delays, not for CPSs with stringent delay constraints.
• The possible transmission errors, due to the noisy channel, are not considered.
Hence in this section, we will study a novel metric for the communication channel in the context of stabilizing stochastic linear physical dynamics, namely the anytime capacity, which was proposed by Sahai and Mitter in 2006 [13].
For simplicity, we consider a scalar system state in a discrete time framework, whose evolution law is given by
where x is the system state, u is the control action, and w is a random perturbation, independent across time, with E[w] = 0 and bounded amplitude (i.e., |w(t)| < Ω/2). We assume that A > 1; otherwise, the system is inherently stable and there is no need for communications.
The overall system under study is illustrated in Fig. 5.18. A sensor observes the system state of the physical dynamics and then sends messages through a noisy communication channel. There could be a feedback channel for the forward communication channel, through which the sensor knows the output of the communication channel. It is also possible for the sensor to know the control action taken by the controller, if the communication channel output is fed back to the sensor and the sensor shares the strategy of the controller.
The following two concepts of system stability will be used in the subsequent analysis:
The η-stability is looser than the f-stability, since the f-stability places a direct constraint on large values of the system state.
For simplicity, we assume that the noisy communication channel is memoryless and discrete-time. It is characterized by the conditional probability P(r|s), where s (send) and r (receive) are the input and output symbols.
For a communication channel, the traditional channel capacity is given by
which is optimized over the possible input symbol distribution P(S). As explained in chapter 2, the channel capacity measures the capability of conveying information with asymptotically long codeword lengths and arbitrarily small error rates. However, in the context of controlling physical dynamics, the infinite delay of codewords in the traditional information-theoretic setup is intolerable, thus making the traditional concept of channel capacity inadequate.
Below is an example proposed in Ref. [13] showing the inadequacy of the traditional concept of channel capacity. Consider the binary erasure channel illustrated in Fig. 5.19. The probability that the transmitted symbol, 0 or 1, is correctly received is 1 − δ, while the probability of the symbol being erased (thus receiving a common erasure symbol e) is δ. Similarly, we can define an L-bit erasure channel, in which the input and output alphabets are both {0, 1}^L (hence the input and output are L-dimensional binary vectors, or L-bit packets). The correct transmission probability is p(s|s) = 1 − δ, while the erasure probability p(e|s) is δ. It has been shown that the channel capacity of an L-bit erasure channel is given by (1 − δ)L. Furthermore, we can define a real erasure channel, in which the input and output alphabets are the real numbers R, and the conditional probabilities are p(x|x) = 1 − δ and p(0|x) = δ.
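As a quick numerical check, the capacity statement for the binary erasure channel can be verified by computing I(X;Y) directly from the channel's conditional probabilities. The sketch below is illustrative (it is not from Ref. [13]); it uses a uniform input distribution, which is capacity achieving for this channel.

```python
import math

def mutual_information(p_x, channel):
    """Compute I(X;Y) in bits from an input distribution p_x
    and a channel given as conditional probabilities p(y|x)."""
    p_y = {}
    for x, px in p_x.items():
        for y, pyx in channel[x].items():
            p_y[y] = p_y.get(y, 0.0) + px * pyx
    info = 0.0
    for x, px in p_x.items():
        for y, pyx in channel[x].items():
            if px * pyx > 0:
                info += px * pyx * math.log2(pyx / p_y[y])
    return info

delta = 0.5
# Binary erasure channel: each input symbol is erased (output 'e') w.p. delta.
bec = {0: {0: 1 - delta, 'e': delta},
       1: {1: 1 - delta, 'e': delta}}
print(mutual_information({0: 0.5, 1: 0.5}, bec))  # 0.5, i.e., 1 - delta
```

The same routine applied to an L-bit packet alphabet reproduces the stated capacity (1 − δ)L.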
As a counterexample to the traditional channel capacity, we consider the system in Eq. (5.107) with A = 1.5 and Ω = 1 (hence the perturbation satisfies |w(t)| ≤ 0.5). We assume that the communication channel is the real erasure channel with δ = 0.5. Hence with probability 0.5 the system state x(t) is perfectly conveyed to the controller, while with probability 0.5 the transmission is blocked. The optimal control strategy is to transmit the system state measured by the sensor directly, so that s(t) = x(t), with control action u(t) = −Ar(t). Hence when the transmitted symbol passes through the communication channel successfully, the previous system state is completely canceled by the control action and only the random perturbation remains. However, if the transmitted symbol is erased, no control action is carried out (since u(t) = 0).
In the event that all the transmissions after time t − i − 1 are erased while the first t − i transmissions are correct, the system state at time t + 1 is given by
Since the set of events (denoted by ) in which all the transmissions after time t − i − 1 (for i from 0 to t) are erased, while the first t − i transmissions are correct, is only a subset of all possible events (e.g., erasures and successful transmissions could alternate), we have
where the first inequality is due to the fact that is only a subset of the events, and the third equality is due to the assumption that the noise is white. It is easy to verify that E[x²(t + 1)] diverges as t → ∞.
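The divergence can also be checked numerically. Under the strategy above, a success resets the state to w(t), while an erasure lets it grow, so the second moment obeys the recursion E[x²(t + 1)] = δA²E[x²(t)] + σ_w², which is unbounded because δA² = 0.5 × 1.5² = 1.125 > 1. A minimal sketch, assuming w uniform on [−0.5, 0.5] (so σ_w² = 1/12):

```python
# Second-moment recursion for the real erasure channel example:
# on success (prob 1-delta) the control cancels the state, so x(t+1) = w(t);
# on erasure (prob delta), no control is applied and x(t+1) = A*x(t) + w(t).
# Hence E[x^2(t+1)] = delta * A^2 * E[x^2(t)] + sigma_w^2.
A, delta = 1.5, 0.5
sigma_w2 = 1.0 / 12.0      # variance of w uniform on [-0.5, 0.5]
m = 0.0                    # E[x^2(0)], starting from x(0) = 0
history = []
for t in range(101):
    history.append(m)
    m = delta * A**2 * m + sigma_w2
print(history[50], history[100])  # grows like (delta*A^2)^t = 1.125^t
```

The second moment grows geometrically, even though the channel capacity of this real erasure channel is infinite.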
According to the above argument, the communication channel cannot stabilize the physical dynamics. Meanwhile, the real erasure channel has an infinite channel capacity, since one successfully transmitted real-valued symbol can convey infinite information. Hence the channel capacity in the traditional sense does not necessarily characterize the capability of real-time transmission and system stabilization. Ref. [13] realized this deficiency and thus proposed the concept of anytime capacity, which will be explained subsequently.
The focus of anytime capacity is “on the maximum rate achievable for a given sense of reliability rather than the maximum reliability possible at a given rate” [13]. The structure of the anytime capacity setup is illustrated in Fig. 5.20.
We denote by Mτ the R-bit observation received by the sensor’s encoder at time τ. Based on the output of the communication channel, the decoder obtains a reconstructed message (τ ≤ t), namely the estimation of Mτ at time t. The error probability is , namely the probability that not all the messages M1, …, Mt−d are correctly decoded at time t. The procedure is illustrated in Fig. 5.21.
Based on this context, the anytime capacity of a communication channel is defined as follows [13]:
It is easy to verify the following inequality [13]:
where Er(R) is the traditional error exponent for the transmission rate R of the same communication channel [75]. Intuitively, the anytime capacity has a higher requirement than the traditional channel capacity.
The following theorem shows the necessity of the anytime capacity for the purpose of stabilizing the linear physical dynamics in a CPS.
The rigorous proof of the theorem is given in Ref. [13]. Here, we provide a sketch of the proof. The key point is the simulated plant as shown in Fig. 5.22. Due to the perfect feedback channel, the sensor can simulate the plant observer and controller. The idea is to link communication reliability with control stability.
Below is the basic setup of the proof:
• We can consider the system illustrated in Fig. 5.22 as a special communication system. The information source is a sequence of i.i.d. bits {si(t)}i, t at the sensor. The bit sequence will be regenerated at the controller.
• The bit sequence {si(t)}i, t is encoded as follows. The sensor uses the bit sequence to generate time-independent random perturbations {w(t)}, which correspond to the random perturbations in real physical dynamics. Hence the sensor can generate a simulated plant with simulated random perturbation and simulate control actions. The system state in a simulated plant x consists of two parts , which is driven by the generated random perturbation w, and , which is driven by the control action.
• We assume that the simulated physical dynamics can be stabilized by the control action. As we will see, the simulated dynamics x(t), as the sum of the simulated sources and , can be reconstructed at the controller with an error of a particular magnitude. The simulated dynamics in Fig. 5.22 can also be simulated at the controller due to the shared initial state and the control actions at both the sensor and the controller. Then we can use the bit sequence s(t) to convey information about . Since x is small due to the stabilization, can be approximated by . Thus can be reconstructed by the controller with small error (since can be generated by the controller), and the bit sequence s(t) can be decoded with a particular error rate, which can be controlled by the system stability characterized by η.
• Essentially, this communication system conveys information via the simulated physical dynamics, thus linking the communication capacity and system stability.
The details are given below. The simulated system state has the following dynamics:
where u(t) can be perfectly simulated by the sensor since it knows the output of the communication channel. In the other simulated dynamics, the system state is driven by the simulated noise, namely
Then the sum of and is a simulated version of the true dynamics, since
Since the original physical dynamics can be stabilized and |x(t)| will be sufficiently small, then will be sufficiently close to .
In Fig. 5.22, we assume that the observer transmits bit i at time where R is the data rate. It is easy to verify that
The sum is similar to the representation of a fractional number with base A. Then, using induction, we can prove that can be written as
where Sk is the kth bit, and γ and ϵ1 are constants whose expressions can be found in Ref. [13]. This can be achieved by representing the tth random perturbation w(t) as
Assuming that the rate R is smaller than , it was shown in Ref. [13] that bits can be encoded into the simulated plant. At the output of the communication channel, estimates can be extracted from the ith bit such that
Here the key idea is that, since we have assumed that the physical dynamics are stabilized and thus x(t) is very small, is very close to and thus can be estimated from known at the controller (since the initial state and control actions are shared by both the controller and sensor).
Due to the assumption that the system is η-stable, we have
where the first inequality is due to the Markov inequality, which measures the probability of deviation from the expectation, and the second one is due to the assumption of η-stability.
Substituting Eq. (5.123) into Eq. (5.122), we have
Notice that is the time that the ith bit is sent and thus is the (continuous-time) delay of decoding. This concludes the proof by checking the definition of anytime capacity.
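As a side check, the Markov-inequality step used above can be verified numerically: for any η > 0 and threshold a, P(|x| ≥ a) ≤ E[|x|^η]/a^η. The sketch below uses standard Gaussian samples purely for illustration; the distribution is an assumption, not part of the proof.

```python
import random

random.seed(0)
eta, a = 2.0, 3.0
samples = [random.gauss(0, 1) for _ in range(200_000)]
moment = sum(abs(x)**eta for x in samples) / len(samples)   # ~ E[|x|^eta]
tail = sum(abs(x) >= a for x in samples) / len(samples)     # ~ P(|x| >= a)
bound = moment / a**eta
print(tail, bound)   # the empirical tail never exceeds the Markov bound
assert tail <= bound
```

In the proof, η-stability supplies the bound on E[|x|^η], so the decoding error probability inherits the required decay with the decoding delay.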
We first extend the anytime capacity to a broader meaning:
Similarly to C_any in the previous discussion, we can define C_g-any. Note that the definition of anytime capacity in Definition 14 is simply a special case of the g-anytime capacity when the function g is exponential; i.e.,
Now we discuss the following sufficient condition for the stability of physical dynamics:
Due to the assumption of perfect feedback of communication channels, the sensor can perfectly know the control action taken by the controller, namely u(t). On the other hand, the sensor also perfectly knows the system state x(t). Hence it also has perfect estimation of the random perturbation by using
Hence one natural idea is to encode w(t) and send it to the controller, since w is the only randomness in the system. However, Ref. [13] proposed a more effective approach in which “the observer will act as though it is working with a virtual controller through a noiseless channel of finite rate R…” The detailed encoding procedure is given as follows.
The sensor simulates a virtual process whose dynamics are given by
where is the computed action of a virtual controller simulated at the sensor. It also keeps updating the following two virtual dynamics:
and
which results in . Since the sensor can stabilize the simulated dynamics within an interval, − xu(t) should be very close to . Hence the actual controller will try to take actions close to xu(t). This can be accomplished by the sensor sending an approximation of the computed control action to the actual controller, which is illustrated in Fig. 5.23.
In summary, the whole system works as follows. Since the sensor knows all the information of the physical dynamics, it simulates the stabilization of the dynamics by using computed control actions; the sensor sends its computed control actions to the controller after quantization; the controller estimates all the previous control actions with an exponentially decreasing error rate (guaranteed by the anytime capacity); then the controller estimates the current system state of the simulated dynamics at the sensor and takes control actions to force the true system state to be close to the simulated one. Note that, although the sensor sends its computed control actions to the controller, it does not directly tell the controller which control action to take; it simply informs the controller where the simulated system state is, so that the controller can try to catch up with the simulated system state, which is destined to be stable (Fig. 5.24).
Due to the communication constraint R, the sensor can take one of 2^R values (where we assume that 2^R is an integer). If is within the interval , lies in . The sensor can choose one of the possible control actions uniformly aligned in the interval , which is within the distance of , such that
After the perturbation by the random noise, we have
To make always within , we require
which can be achieved if
when .
Then the sensor can simply send out R bits to indicate the virtual control it takes to limit the virtual state within . Upon receiving the bits from the sensor, the controller chooses a control to make the true system state x(t) as close to the virtually simulated state as possible.
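The containment argument above can be sketched in simulation. The code below keeps the virtual state within [−Δ/2, Δ/2] by choosing one of 2^R uniformly spaced control values per slot; the particular choice of Δ and the quantizer are illustrative assumptions (not taken from Ref. [13]), and containment requires 2^R > A, i.e., a rate exceeding log₂ A.

```python
import random

random.seed(1)
A, Omega, R = 1.5, 1.0, 1              # 2^R = 2 > A, so rate R exceeds log2(A)
Delta = Omega / (1 - A / 2**R) * 1.01  # interval width, slightly above the minimum
levels = 2**R
xbar = 0.0                             # virtual system state
for t in range(10_000):
    w = random.uniform(-Omega / 2, Omega / 2)
    # quantize A*xbar with 2^R uniformly spaced levels on [-A*Delta/2, A*Delta/2];
    # the chosen control cancels A*xbar down to half a quantization cell.
    cell = A * Delta / levels
    idx = min(levels - 1, max(0, int((A * xbar + A * Delta / 2) / cell)))
    u = -((idx + 0.5) * cell - A * Delta / 2)
    xbar = A * xbar + u + w
    assert abs(xbar) <= Delta / 2 + 1e-9
print("virtual state stayed within +/-", Delta / 2)
```

The residual after quantized cancellation is at most AΔ/2^(R+1), and adding the perturbation Ω/2 still fits inside Δ/2 precisely when Δ ≥ Ω/(1 − A/2^R).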
The strategy of the control action computed at the controller is illustrated in Fig. 5.25. The controller obtains the estimate of the system state using
where is the estimation of the computed control action at time t − i (i.e., ) given all the received channel outputs before time t + 1.
Then the controller chooses the control action u(t) such that equals ; i.e.,
Given the above strategies of coding, decoding, and control action, the stability of the dynamics is rigorously proved in Ref. [13].
In this section, we analyze the reduction of Shannon entropy for the assessment of communication capacity requirements. As has been explained in chapter 2, entropy indicates the uncertainty of a system. Hence it is desirable to decrease the entropy such that the system state is more concentrated around the desired operation state. However, in stochastic systems, there is a high probability that external random perturbations will increase the entropy, unless the system state entropy has already been very large.2 Therefore communication is needed to provide “negative entropy” to compensate for the entropy generated by random perturbations (entropy reduction will eventually be carried out by the controller), thus acting as a bridge linking the analyses of communications and stochastic system dynamics, as illustrated in Fig. 5.26.
Note that the concept of entropy is also used for the analysis in Section 5.2, where the system is deterministic and the only uncertainty stems from the unknown initial state. In this section, the system is stochastic; hence the Shannon entropy, instead of the topological entropy, is used to measure the system uncertainty. The relationship between the topological entropy and the Shannon-type entropy (more precisely, the Shannon-Sinai entropy of dynamical systems [12]) can be described by the Variational Principle (Theorem 6.8.1 in Ref. [12]), which will not be explained in detail in this book.
In this section, we will provide a comprehensive introduction to communication capacity analysis based on the reduction of Shannon entropy in physical dynamics. We will first provide a qualitative explanation of entropy reduction from the viewpoint of cybernetics. Then we begin from discrete state systems and extend to continuous state systems.
We first provide an analysis of Shannon entropy in dynamics, using the arguments of cybernetics [77, 78]. The analysis does not explicitly concern communications; however, it provides insights for future discussions.
The relationship between communications and control was originally considered by Wiener [78]. Then Ashby [77] proposed the Law of Requisite Variety in his celebrated book. Here the term "variety" means the number of states of the corresponding system, which is closely related to the concept of entropy (if the system state is uniformly distributed, then the entropy is equal to log(variety)). Based on the concept of variety, Ashby proposed the following law of requisite variety:
Without a rigorous mathematical formulation and proof, Ashby claimed “only variety in the regulator can force down the variety due to the disturbances; only variety can destroy variety.” This is illustrated by the example in Fig. 5.27. We consider two players, R (regulator) and D (disturbance). The varieties of their actions are denoted by VR and VD, respectively. We assume that player D takes action first and then R follows. This is similar to practical control systems: the system state is first randomly perturbed and then controlled by the controller. In Fig. 5.27A, each player has three possible actions (α, β and γ for R, and 1, 2, 3 for D). The outcomes of different action pairs are given in the table. The goal of the game is: if the outcome is a, R wins; otherwise, R loses. It is obvious that, regardless of the action taken by player D, player R can always choose its action adaptively and achieve the goal a. In this case, we claim that player R can control the game. However, in Fig. 5.27B, player D has more options than player R, thus making R unable to control the game.
We now consider a more generic case. If two elements in the same column are identical, then player R need not distinguish the corresponding actions of player D, which is too favorable to R. Hence we assume that no two elements in the same column are identical. Then it is easy to prove that the variety of the outcome (denoted by VO), given the strategy of R, cannot be less than VD/VR.
Then if we use a logarithmic scale to measure the variety, the above conclusion can be translated into the following inequality:
If the distributions of O, R, and D are all uniform, then we have
This inequality implies that a larger H(R) (i.e., having more states of the controller) helps to better reduce the entropy of the output; otherwise, if H(R) is small, the output may have a larger uncertainty than the disturbance. Although communication is not explicitly mentioned in the argument, we can consider the communication network as part of the controller. If the communication capacity is too small, then the controller cannot have much variety (since it does not have many options due to the limited number of reliable messages sent from the sensor), thus causing large entropy (or uncertainty) at the system output. Note that these arguments are merely qualitative. More rigorous analysis will be provided subsequently.
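The game of Fig. 5.27A can be illustrated with a small numerical sketch. The outcome table below is hypothetical (the actual table is not reproduced here), but it follows the structure described in the text: a 3 × 3 game in which R observes D's action and can always force the outcome a.

```python
import math

# Hypothetical outcome table (rows: D's actions 1-3; columns: R's actions),
# arranged so that no two entries in a column coincide and every row
# contains the winning outcome 'a'.
table = {1: {'alpha': 'a', 'beta': 'b', 'gamma': 'c'},
         2: {'alpha': 'c', 'beta': 'a', 'gamma': 'b'},
         3: {'alpha': 'b', 'beta': 'c', 'gamma': 'a'}}

# R observes D's move and picks the column whose entry is 'a'.
best = {d: min(r for r in row if row[r] == 'a') for d, row in table.items()}
outcomes = {table[d][best[d]] for d in table}
print(outcomes)   # R forces a single outcome: variety V_O = 1

# Lower bound on the outcome variety when D has more options than R:
# V_O >= V_D / V_R, i.e., log V_O >= log V_D - log V_R.
V_D, V_R = 9, 3
print(math.log2(V_D) - math.log2(V_R))  # outcome entropy at least log2(3)
```

With equal varieties R fully controls the game; once V_D exceeds V_R, some residual variety necessarily survives in the outcome.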
Based on the qualitative argument in the law of requisite variety, we provide a detailed analysis of the entropy change in discrete physical dynamics by following the argument of Conant [79]. Consider a controller R (called a regulator in Ref. [79]) regulating a variable Z (which can be considered as the system state) subject to random perturbation S. R and S jointly determine the output Z. It is assumed that R, S, and Z have finite alphabets and the system evolves in discrete time, given by
where ϕ is the evolution law.
We further assume that S is independent of R. Notice that the system dynamics is memoryless. Two types of control strategies are considered, as illustrated in Fig. 5.28:
• Point regulation: The goal is to minimize the changes of output Z (i.e., to make the regulated variable Z as constant as possible).
• Path regulation: The goal is to minimize the unpredictability of the outcomes. Hence the outcome Z could change; however, the change should be as predictable as possible.
We analyze the point regulation first. Since the regulator wants to minimize the change (or the uncertainty) of Z, it targets minimizing the entropy of Z. When R is active, the action selected by R is a function of Z (i.e., closed-loop control). The corresponding entropy of Z is denoted by Hc(Z), where the subscript c means closed loop. When R is idle and uses a fixed action i (i.e., open-loop control), the corresponding entropy of Z is denoted by , where the subscript o means open loop. The minimum entropy subject to the open-loop control is given by
Then we obtain the following theorem on the difference between and Hc(Z), i.e., the entropy reduced by the controller R when compared with the optimal open-loop control.
From the definition of K, we have the following corollary:
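The point-regulation comparison can be made concrete with a toy memoryless system Z = ϕ(R, S). The evolution law below is an illustrative assumption (not taken from Ref. [79]): the best open-loop (fixed) action leaves Z maximally uncertain, while a closed-loop regulator that observes S cancels the perturbation and drives the entropy of Z to zero.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of an empirical distribution."""
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values() if c)

S = [0, 1, 2] * 1000                  # uniform perturbation samples
phi = lambda r, s: (s + r) % 3        # memoryless evolution law Z = phi(R, S)

# Open loop: the best fixed action still leaves Z uniform over three values.
H_open = min(entropy(Counter(phi(r, s) for s in S)) for r in range(3))

# Closed loop: R observes S and cancels it, so Z is constant.
H_closed = entropy(Counter(phi((-s) % 3, s) for s in S))

print(H_open, H_closed)  # log2(3) ~ 1.585 versus 0.0
```

The entropy reduced by the regulator here equals log₂3, the entropy of the perturbation, which is the best a closed-loop controller can achieve in this toy example.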
We then consider path regulation, in which we have a series of perturbations S(1 : T), control actions R(1 : T), and outputs Z(1 : T). The argument is similar. Again, we consider the open-loop and closed-loop cases. When the controller R is active (i.e., closed-loop control), we add a subscript c to the quantities; when open-loop control is used, we use the subscript o.
Using the same argument as in the point regulation, we can prove a similar conclusion:
which also implies
where h is the entropy rate, defined as , and the average mutual information i is similarly defined.
In the previous discussion, we considered discrete systems with very simple architectures (in particular, they are memoryless). We now provide a detailed analysis of the entropy change in continuous physical dynamics having more detailed and practical structures by following the argument in Ref. [80].
First we consider the entropy in the procedure of system state estimation, since in many situations (such as the optimal control of linear systems), the controller estimates the system state first and then computes the corresponding control action. The estimation procedure is illustrated in Fig. 5.29, where n is the random perturbation, x is the system state, and e is the estimation error. The sensor observes the random perturbation directly and the filter provides an estimation about x. We assume
The entropy relationships in the estimation problem are disclosed in the following theorem:
We then consider the entropy change in the disturbance rejection feedback control, whose architecture is illustrated in Fig. 5.30. Here, n(t) is the random perturbation and x(t) is the system state. The feedback consists of a linear prefilter B, a measurement error w, and a post filter C. The output of the sensor is given by
Then the feedback signal is given by
where F is a generic function which can be nonlinear and time varying.
The input of the plant is the error between the perturbation n and the estimation v; i.e.,
The linear output of the plant x is then given by
where D is a fixed matrix.
Our goal is to compare the entropy of the system state in closed-loop and open-loop control systems. In open-loop control, where we set F = 0, we denote the output by xo(t), given by xo(t) = Dn(t). The following theorem shows the relationship between h(x) and h(xo):
Immediate conclusions can be drawn in the following corollary:
Since entropy measures the uncertainty of system dynamics, it can be used as the criterion of controller synthesis. Here, we briefly introduce the formulation of the entropy-based control system [82]. We assume that y(t) ∈ [a, b] is the output of a stochastic system at time slot t. The control action is denoted by u(t). The distribution of y and u is denoted by γ. The B-spline expansion of the distribution is given in [83]
where Bi is the ith basis function and wi is the expansion coefficient. The cost function is defined as
which is a combination of the output entropy and the cost of action. The optimal control action can then be obtained by taking the derivative of J with respect to u. The details can be found in Ref. [82].
Although entropy provides a good measure of the uncertainty of dynamics and low entropy is a necessary condition for a good operation status of a system, there have been criticisms of entropy-based control [84]:
• Entropy-based control approaches do not solve the problems that traditional control theory cannot solve. Moreover, entropy is only valid for stochastic systems and cannot handle deterministic systems.
• The criterion of entropy minimization is questionable. Entropy depends on the shape of the distribution and is independent of its location. For example, the two distributions illustrated in Fig. 5.31 have the same entropy since they have the same shape. However, the one on the left is centered around the desired operation status while the other is not. Hence it should be desirable to achieve the distribution on the left; however, entropy-based control cannot distinguish between the two distributions and may provide the undesired one. A remedy is to add a constraint on the expectation of the distribution when minimizing the entropy.
• The entropy of the output in the sense of an alphabet without the definition of distances may not represent the variance of the output in its numerical meaning. This can be well illustrated in the following example [84]. Suppose that the controller and perturbation each have three possible actions (denoted by (q, r, p) and (a, b, c), respectively). The outcomes of the system are given in the following matrix, where the columns and rows represent the actions of controller and perturbation:
Consider two strategies of the controller: (A) fix the action p; (B) adaptive control action: , , and . It is easy to check that the first strategy gives the outputs {1, 4, 9} while the second strategy results in {4, 5, 6}. If we interpret the outputs as abstract alphabets, the outputs of the two strategies have the same entropy. However, if the outputs are explained as numbers, obviously the second strategy results in much less numerical variance.
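This example can be reproduced numerically. The outcome matrix below is a hypothetical reconstruction (the original matrix is not shown here), chosen so that the two strategies yield exactly the stated outputs {1, 4, 9} and {4, 5, 6}; the adaptive mapping is likewise an illustrative assumption.

```python
import math
import statistics
from collections import Counter

# Hypothetical outcome matrix (rows: perturbations a, b, c; columns:
# controller actions q, r, p), consistent with the outputs in the text.
outcome = {'a': {'q': 4, 'r': 7, 'p': 1},
           'b': {'q': 2, 'r': 5, 'p': 4},
           'c': {'q': 6, 'r': 8, 'p': 9}}

strategy_A = lambda s: 'p'                       # (A) fix the action p
strategy_B = {'a': 'q', 'b': 'r', 'c': 'q'}.get  # (B) adaptive action

results = []
for strategy in (strategy_A, strategy_B):
    outs = [outcome[s][strategy(s)] for s in 'abc']
    H = -sum(c / 3 * math.log2(c / 3) for c in Counter(outs).values())
    results.append((sorted(outs), H, statistics.pvariance(outs)))
    print(sorted(outs), "entropy:", round(H, 3),
          "variance:", round(statistics.pvariance(outs), 3))
```

Both strategies produce three equiprobable symbols and hence identical entropy log₂3, yet the numerical variance of {4, 5, 6} is far smaller than that of {1, 4, 9}, which is exactly the criticism.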
Many existing control strategies are not based on the performance metric of entropy. However, they can also indirectly reduce the entropy, since uncertainty is usually undesirable. In this chapter, we take LQG control [72] as an example. The system dynamics are linear with Gaussian noise (both in the system state evolution and observation). The cost of the control is given by
where Q and R are both nonnegative definite matrices, and T is the final time under consideration. Intuitively, when R = 0 and Q = I, all the efforts are used to reduce the expected square norm of x. When E[∥x∥2] is small, the uncertainty is small, thus making the entropy small. Note that reduction in E[∥x∥2] does not necessarily imply reduction in the entropy; however, if E[∥x∥2] is substantially reduced, there is a high probability that the entropy will also decrease.
Here, we use numerical results to demonstrate the entropy reduction. We consider a power network with Ng generators. For generator n, its dynamics are described by the following swing equation [4]:
where δ is the phase, Pmn is the mechanical power, and Pen is the electric power. Mn is the rotor inertia constant and Dn is the mechanical damping constant. We denote , which is the frequency of rotation.
Similarly to the seminal work by Thorp [56], we ignore the connection of loads. We assume that the system state is close to an equilibrium point. The standard frequency is denoted by f0 (e.g., 60 Hz in the United States) and the frequency deviation of generator i is denoted by Δfi. The angle deviation δi − f0t − θi (where θi is the initial phase of generator i) is denoted by Δδi. Then when Δfi and Δδi, i = 1, …, Ng, are both sufficiently small, the dynamics can be linearized to
where ΔPmi is the difference between mechanical power and stable power, which is assumed to be the control action. The coefficients {cik}ik can be obtained from the analysis of real power. The details are omitted due to limitations of space, and can be found in Ref. [56]. Obviously, the state of each node is two-dimensional (Δδi, Δfi).
We use the IEEE New England 39-bus model, which is illustrated in Fig. 5.32. The parameters of the transmission lines are obtained from the model. We assume that all generators have the same parameters: momentum M = 6 and damping D = 0 (i.e., we ignore damping). The feedback gain matrix K is obtained from the linear quadratic regulation (LQR) controller synthesis. The state of each bus is given by (f, δ), where f is the frequency and δ is the phase.
First we assume that there is no observation noise and the system state can be observed directly (i.e., y(t) = x(t)). We use LQG control by assuming Q = rI, R = I. We also add noise to the system state evolution with variance σ.
We consider the dynamics of Bus 1 by severing the connection to all other buses. We choose 10 random starting points. The 10 corresponding traces are illustrated in Fig. 5.33. We observe that the uncertainty is eliminated with time.
We then consider all the 39 buses. The traces of system entropy in four cases, where r is the ratio between Q and R, are shown in Fig. 5.34. We observe that, in all these four cases, the entropy is a monotonically decreasing function of time. Moreover, a larger noise power or a smaller r will decrease the rate of entropy reduction.
We then consider the observation noise and assume σ = 1. Traces of the two-dimensional dynamics are shown in Fig. 5.35. We observe that the entropy still tends to decrease.
In these simulation results, we observe that the entropy is reduced by the LQG controller, although it is not designed to reduce the entropy. Hence the entropy approach for analyzing the communication capacity requirements is valid.
To further validate the entropy approach, we provide analytic results on the entropy reduction in control systems. We consider the following standard linear dynamics:
where x is the system state with dimension N, u is the control action, y is the observation, and w and n are the noise in the dynamics and observation, respectively. We assume that the noises w(t) and n(t) are both Gaussian with covariance matrices Σw and Σn.
For the LQG control, the following theorem shows the exact evolution of the covariance matrix:
Based on the evolution law of the covariance matrix in LQG control, we obtain the following sufficient condition of the temporal reduction of entropy:
In the previous subsections, we discussed the entropy change in the control system, which does not explicitly involve communications; hence it is still not applicable to a CPS. In this subsection, we will study a CPS with communications and discrete-state physical dynamics using the arguments in Ref. [15].
We consider finite state physical dynamics with discrete timing structure. The N states of the physical dynamics are denoted by x1, …, xN. In each time slot, the CPS proceeds in the following three stages, as illustrated in Fig. 5.36:
• Entropy increase stage: At the beginning of the time slot (e.g., the tth time slot), the state of the CPS is denoted by X(t − 1) with distribution pt−1, and the corresponding entropy is denoted by H(t − 1). The state is perturbed by random perturbations and is changed to X0(t) with conditional probability Qmn = P(X0 = n|X = m) and entropy H0(t)(> H(t − 1)). Here the impact of random perturbation is represented by the transition probability Q.
• Observation stage: An external sensor makes an observation on the system state and sends a message M(t) to the controller. We assume that the observation is noiseless. The coding scheme and the communication channel are not specified. For notational simplicity, we define P(xi|mj) = P(X(t) = xi|M(t) = mj) and P0(xi|mj) = P(X0(t) = xi|M(t) = mj). We denote by R the number of possible messages.
• Entropy decrease stage: A control action A(t) is computed by the controller and then actuated. The system state is then changed to X(t) with distribution pt and entropy H(t). Then the physical dynamics proceeds to the next time slot.
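The entropy increase stage can be made concrete with a small numerical example. The four-state chain and the symmetric kernel Q below are hypothetical; a symmetric (hence doubly stochastic) Q is one standard sufficient condition under which the perturbation stage cannot decrease entropy, consistent with H0(t) > H(t − 1) above:

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Hypothetical 4-state chain; Q is symmetric (hence doubly stochastic),
# which guarantees the perturbation stage cannot decrease entropy.
Q = np.array([[0.7, 0.1, 0.1, 0.1],
              [0.1, 0.7, 0.1, 0.1],
              [0.1, 0.1, 0.7, 0.1],
              [0.1, 0.1, 0.1, 0.7]])
p = np.array([0.85, 0.05, 0.05, 0.05])  # distribution of X(t-1)

p0 = p @ Q                # distribution of X0(t) after the perturbation stage
assert entropy(p0) >= entropy(p)   # H0(t) >= H(t-1)
```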
For simplicity, we assume that the sensor is an external device and thus do not consider the entropy at the sensor itself. We do not consider the cost of computing.
We assume that there are D possible control actions that can be taken by the controller, denoted by a1, …, aD. Each control action maps X0(t) to X(t). We assume that the controlled dynamics is deterministic; i.e., given the action A and the state X0 after the entropy increase stage, the system state X is uniquely determined. We denote the mapping indexed by the control action by fi if A = ai; i.e., X(t) = fi(X0(t)). The mapping from the received messages at the controller to the actions is called the control strategy.
To simplify the analysis, for the physical dynamics and control action, we have the following assumptions:
• Each control action provides an injective mapping for the system states.
• For any pair of states i and j, there exists a unique control action that maps i to j.
Note that, for the first assumption, it is easy to prove that a many-to-one mapping can help to decrease the entropy. Hence the assumption implies that we are dealing with the most difficult situation; i.e., open-loop control cannot decrease the entropy. The second assumption also simplifies the analysis, although it remains unclear whether the result extends easily to more generic cases.
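The claim behind the first assumption can be verified directly: on a finite state set, an injective action is a permutation, so applying it open loop leaves the entropy unchanged, whereas a many-to-one map merges probability mass and can only decrease entropy. The distribution and the two maps below are illustrative:

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

p0 = np.array([0.5, 0.3, 0.1, 0.1])  # distribution of X0 (hypothetical)

# Injective action on a finite state set = permutation: entropy is unchanged.
perm = [2, 0, 3, 1]                  # f: state i -> perm[i]
p_perm = np.zeros(4)
for i, j in enumerate(perm):
    p_perm[j] = p0[i]
assert np.isclose(entropy(p_perm), entropy(p0))

# Many-to-one map: probability mass merges, so entropy can only drop.
merge = [0, 0, 1, 1]                 # states {0,1} -> 0, states {2,3} -> 1
p_merge = np.zeros(4)
for i, j in enumerate(merge):
    p_merge[j] += p0[i]
assert entropy(p_merge) <= entropy(p0)
```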
Note that all these assumptions are reasonable in practical cases. Take linear dynamics with continuous-valued state x(t) and dynamics x(t + 1) = Ax(t) + Bu(t), for instance. If B is a square matrix and invertible (hence dim(u) = dim(x)), then the mapping between x(t + 1) and u(t) is one-to-one, given the current system state x(t). Meanwhile, if A is invertible, the mapping between x(t + 1) and x(t) is also one-to-one, given u(t). Note that here the system state is continuously valued, which is different from the assumption in this subsection that the system state is discrete; however, if we partition the state space with sufficient precision, the two assumptions hold approximately.
For the communication and control strategies, we make the following assumptions:
• The message M is determined by the current observation; i.e., M(t) = h(X0(t)), where h is the mapping mechanism.
• The action depends only on the currently received message, which makes the control Markovian. A non-Markovian strategy, which may improve the performance, will be studied in the future.
• The strategy is deterministic and one-to-one. This is reasonable since a randomized strategy may increase the output entropy. We denote by g the mapping from M to A; i.e., A = g(M).
We first analyze the entropy reduction within a single time slot. For notational simplicity, we omit the time indices in the notation, since the time slot is fixed. The following lemma is key to bridging the communication requirement and entropy reduction.
Based on the conclusion in Lemma 4, the following corollary can be obtained, which states that the information sent out by the sensor may not be fully used to reduce the entropy.
Based on Lemma 4, we obtain one of the main conclusions of this section in the following theorem, which provides upper and lower bounds for the entropy reduction in one cycle of the CPS. These bounds are independent of the controller design (or equivalently, valid for all possible controllers).
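Although the exact bounds of Theorem 18 are not reproduced here, one simple ceiling implied by the setup can be checked numerically: with a noiseless observation and injective actions, the entropy reduction in one slot cannot exceed log2 R bits, where R is the number of possible messages (since H(X) ≥ H(X0|M) = H0 − I(X0; M) ≥ H0 − log2 R). The states, messages, and cyclic-shift actions below are purely illustrative; cyclic shifts satisfy both structural assumptions above:

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

N, R = 4, 2
p0 = np.array([0.4, 0.3, 0.2, 0.1])   # distribution of X0 (hypothetical)

h = lambda x: 0 if x < 2 else 1       # sensor mapping: R = 2 messages
g = lambda m: 0 if m == 0 else 2      # control strategy: message -> shift amount
f = lambda a, x: (x + a) % N          # cyclic shifts: injective, and a unique
                                      # action maps any state i to any state j

p = np.zeros(N)                       # distribution of X after control
for x0 in range(N):
    p[f(g(h(x0)), x0)] += p0[x0]

# The entropy reduction in one cycle cannot exceed log2(R) bits.
assert entropy(p0) - entropy(p) <= np.log2(R) + 1e-12
```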
Both bounds in Theorem 18 are tight under certain conditions, as stated in the following corollary:
Since we assume that the number of states is finite and the system evolution (including the control policy, communication mechanism, and entropy increase mechanism) is time invariant, the system will converge to a stationary distribution of system states. This implies that the entropy will converge to a deterministic value H*. It is an open problem to obtain an exact explicit expression for H*, although we can compute the stationary distribution numerically. A lower bound of H* is provided for a two-state system, when the communication channel is sufficiently good and the perturbation in the entropy increase stage is symmetric.
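The numerical computation of the stationary distribution mentioned above can be sketched as follows. The four-state closed loop below (symmetric perturbation kernel, two messages, cyclic-shift actions) is a hypothetical toy example; the closed-loop one-step transition matrix is assembled from Q and the sensing/control mappings, and power iteration yields the stationary distribution and hence H*:

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

N = 4
Q = np.full((N, N), 0.1) + 0.6 * np.eye(N)   # symmetric perturbation kernel
h = lambda x: 0 if x < 2 else 1              # 2 messages (hypothetical setup)
g = lambda m: 0 if m == 0 else 2             # message -> shift amount
f = lambda a, x: (x + a) % N                 # cyclic-shift control actions

# Closed-loop one-step transition matrix T[m, n] = P(X(t) = n | X(t-1) = m):
# perturbation by Q, then the deterministic observe-and-control mapping.
T = np.zeros((N, N))
for m in range(N):
    for x0 in range(N):
        T[m, f(g(h(x0)), x0)] += Q[m, x0]

# Power iteration for the stationary distribution; H* is its entropy.
p = np.full(N, 1.0 / N)
for _ in range(200):
    p = p @ T
H_star = entropy(p)
# For this toy loop the stationary distribution is uniform on two states,
# so H* = 1 bit.
```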
In the previous discussion, we assumed that the dynamics is discretely valued. However, in practice, many systems are continuously valued. Here we follow Ref. [89] to study the relationship between entropy and communications in continuously valued dynamics.
We consider the system illustrated in Fig. 5.38. Here d is the random disturbance. x is the state of the plant, while y is the output. y is fed back to a causal controller, which is also subject to noise c. The plant (physical dynamics) is described as follows:
Here the feedback control can be considered as a noisy communication channel, which is continuous in both values and time. This is different from modern digital communication systems. However, the equivalent analog communication system can provide insight into digital ones.
The following assumptions are made:
• The noises in the observation c, the system disturbance d, and the initial state x(0) are mutually independent.
• The control action is given by u(t) = K(t, y(0 : t), c(0 : t)), where K can be a time-varying deterministic function that depends on the history of observations and observation noises, and is thus causal.
The fundamental limit of the control system in Fig. 5.38 can be characterized by Bode’s law, which will be explained subsequently. First we define the sensitivity function S(z) as the transfer function from the disturbance d to the error e. Since we expect small errors for a given disturbance, we desire a small S(z). However, Bode’s law states that, for a strictly proper loop gain, we have
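In its standard discrete-time form (assuming a strictly proper loop gain and a stable closed loop), Bode’s sensitivity integral reads:

```latex
\frac{1}{2\pi}\int_{-\pi}^{\pi} \ln\bigl|S(e^{j\omega})\bigr|\, d\omega
\;=\; \sum_{p_k \in \Omega} \ln |p_k| \;\ge\; 0,
```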
where Ω is the set of unstable open-loop poles. Obviously, the right-hand side is independent of the feedback control scheme. Hence the sensitivity function S cannot be arbitrarily reduced over all frequencies. Note that here we consider the system as deterministic (the perturbation is also deterministic). In the context of stochastic systems, we need to use the power spectral density instead of the signal spectrum. The extension to stochastic systems has been carried out in Ref. [89] and will be introduced subsequently.
In Ref. [89], the fundamental limit of the control system is studied from the viewpoint of entropy. The following theorem is of key importance in the analysis:
A rigorous proof of Theorem 19 is given in Ref. [89].
Based on the inequality in Theorem 19, Ref. [89] proved the following extension of the Bode-like performance limitation.
The following lemma was shown in Ref. [89] to illustrate the requirement of information flow in the feedback, or equivalently the required amount of communication in the feedback control: