11.2 Discrete-Time Mixed H2/H∞ Nonlinear Control
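In this section, we discuss the state-feedback mixed H2/H∞-control problem for discrete-time nonlinear systems. We consider the affine discrete-time state-space system Σda:
$$\Sigma^{da}: \begin{cases} x_{k+1} = f(x_k) + g_1(x_k)w_k + g_2(x_k)u_k; \quad x(k_0) = x_0 \\ z_k = h_1(x_k) + k_{12}(x_k)u_k \\ y_k = x_k, \end{cases} \tag{11.29}$$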
where all the variables and system matrices have their previous meanings and dimensions. We similarly assume that the system has a unique equilibrium at x = 0, with f(0) = 0 and h1(0) = 0. For simplicity, we also assume that the following conditions hold for the system matrices.
Assumption 11.2.1 The system matrices are such that
$$h_1^T(x)k_{12}(x) = 0, \quad k_{12}^T(x)k_{12}(x) = I. \tag{11.30}$$
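Again, as in the continuous-time case, the standard problem is to design a static state-feedback controller, Kda, such that the H2-norm of the closed-loop system, defined as
$$\|K^{da} \circ \Sigma^{da}\|_{\ell_2} \triangleq \sup_{0 \neq w_0 \in \mathcal{S}'} \frac{\|z\|_{\mathcal{P}'}}{\|w_0\|_{\mathcal{S}'}},$$
and the H∞-norm of the closed-loop system, defined as
$$\|K^{da} \circ \Sigma^{da}\|_{\ell_\infty} \triangleq \sup_{0 \neq w_1 \in \mathcal{P}'} \frac{\|z\|_2}{\|w_1\|_2},$$
are minimized over a time horizon [k0, K] ⊂ Z, where
$$\mathcal{P}' \triangleq \{w : w \in \ell_\infty,\ R_{ww}(k),\ S_{ww}(j\omega) \text{ exist for all } k \text{ and all } \omega \text{ resp.},\ \|w\|_{\mathcal{P}'} < \infty\},$$
$$\mathcal{S}' \triangleq \{w : w \in \ell_\infty,\ R_{ww}(k),\ S_{ww}(j\omega) \text{ exist for all } k \text{ and all } \omega \text{ resp.},\ \|S_{ww}(j\omega)\|_\infty < \infty\},$$
$$\|z\|^2_{\mathcal{P}'} \triangleq \lim_{K \to \infty} \frac{1}{2K} \sum_{k=-K}^{K} \|z_k\|^2, \qquad \|w_0\|^2_{\mathcal{S}'} = \|S_{w_0 w_0}(j\omega)\|_\infty,$$
and Rww(k), Sww(jω) are the autocorrelation and power spectral-density matrices of w, respectively.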
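However, we do not solve the above standard problem in this section. Instead, we solve an associated suboptimal problem in which w ∈ ℓ2[k0, ∞) is a single disturbance, and the objective is to minimize the output energy ‖z‖ℓ2 subject to the constraint that the ℓ2-gain of the closed-loop system from w to z does not exceed a given number γ⋆ > 0.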
Definition 11.2.1 (Discrete-Time State-Feedback Mixed H2/H∞ Nonlinear Control Problem (DSFBMH2HINLCP)).
(A) Finite-Horizon Problem (K < ∞): Find (if possible!) a time-varying static state-feedback control law of the form:
$$u = \tilde{\alpha}_{2d}(k, x_k), \quad \tilde{\alpha}_{2d}(k, 0) = 0, \quad k \in \mathbf{Z}$$
such that:
(a) the closed-loop system
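$$\Sigma^{da}_{cl}: \begin{cases} x_{k+1} = f(x_k) + g_1(x_k)w_k + g_2(x_k)\tilde{\alpha}_{2d}(k, x_k) \\ z_k = h_1(x_k) + k_{12}(x_k)\tilde{\alpha}_{2d}(k, x_k) \end{cases} \tag{11.31}$$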
is stable with w = 0 and has locally finite ℓ2-gain from w to z less than or equal to γ⋆, starting from x0 = 0, for all k ∈ [k0, K] and for a given number γ⋆ > 0;
(b) the output energy ‖z‖ℓ2 of the closed-loop system is minimized.
(B) Infinite-Horizon Problem (K → ∞ ): In addition to the items (a) and (b) above, it is also required that
(c) the closed-loop system Σclda defined above with w ≡ 0 is locally asymptotically-stable about the equilibrium-point x = 0.
The problem is similarly formulated as a two-player nonzero-sum dynamic game with two cost functionals:
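$$\min_{u \in \mathcal{U},\, w \in \mathcal{W}} J_{1d}(u, w) = \frac{1}{2} \sum_{k=k_0}^{K} \left(\gamma^2 \|w_k\|^2 - \|z_k\|^2\right), \tag{11.32}$$
$$\min_{u \in \mathcal{U},\, w \in \mathcal{W}} J_{2d}(u, w) = \frac{1}{2} \sum_{k=k_0}^{K} \|z_k\|^2. \tag{11.33}$$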
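The two cost functionals can be evaluated along a simulated closed-loop trajectory. The following is a minimal sketch, assuming NumPy; the function name `evaluate_costs` and the scalar example system (`f`, `g1`, `g2`, `h1`, `k12`, the feedback law and the disturbance sequence) are hypothetical choices made only for illustration, picked so that the conditions (11.30) hold. The feedback law used here is not claimed to be optimal.

```python
import numpy as np

def evaluate_costs(f, g1, g2, h1, k12, u_law, w_seq, x0, gamma):
    """Simulate x_{k+1} = f(x) + g1(x) w + g2(x) u with u = u_law(k, x)
    and accumulate J1d (11.32) and J2d (11.33) over the horizon."""
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    J1, J2 = 0.0, 0.0
    for k, w in enumerate(w_seq):
        w = np.atleast_1d(np.asarray(w, dtype=float))
        u = np.atleast_1d(u_law(k, x))
        z = h1(x) + k12(x) @ u                      # penalty output z_k
        J1 += 0.5 * (gamma**2 * float(w @ w) - float(z @ z))
        J2 += 0.5 * float(z @ z)
        x = f(x) + g1(x) @ w + g2(x) @ u            # state update
    return J1, J2

# Hypothetical example: h1' k12 = 0 and k12' k12 = 1, as in (11.30).
f = lambda x: 0.5 * np.sin(x)
g1 = lambda x: np.array([[1.0]])
g2 = lambda x: np.array([[1.0]])
h1 = lambda x: np.array([x[0], 0.0])
k12 = lambda x: np.array([[0.0], [1.0]])
u_law = lambda k, x: -0.2 * x                       # arbitrary, not optimal
w_seq = [0.1, -0.05, 0.0, 0.0, 0.0]
gamma = 2.0

J1, J2 = evaluate_costs(f, g1, g2, h1, k12, u_law, w_seq, [0.5], gamma)
print(J1, J2)
```

Note that, by construction, J1d + J2d equals half of γ² times the disturbance energy over the horizon, which gives a quick sanity check on any implementation.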
Again, sufficient conditions for the solvability of the above dynamic game (11.32), (11.33), (11.29), and the existence of Nash-equilibrium strategies are given by the following pair of discrete-time Hamilton-Jacobi-Isaacs difference equations (DHJIEs):
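$$W(x, k) = \inf_{w} \left\{ W(f(x) + g_1(x)w + g_2(x)u^\star, k+1) + \frac{1}{2}\left(\gamma^2 \|w\|^2 - \|z^\star_k\|^2\right) \right\}, \quad W(x, K+1) = 0, \tag{11.34}$$
$$U(x, k) = \inf_{u} \left\{ U(f(x) + g_1(x)w^\star + g_2(x)u, k+1) + \frac{1}{2}\|z^\star_k\|^2 \right\}, \quad U(x, K+1) = 0, \tag{11.35}$$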
for some smooth negative and positive-definite functions W,U: X×Z→ℜ respectively, and where z⋆k = h1(x)+k12(x)u⋆(x).
To solve the problem, we define the Hamiltonian functions Hi: X×W×U×ℜ→ℜ, i = 1,2 corresponding to the cost functionals (11.32), (11.33) respectively and the system equations (11.29):
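$$H_1(x, w, u, W) = W(f(x) + g_1(x)w + g_2(x)u, k+1) - W(x, k) + \frac{1}{2}\left(\gamma^2 \|w\|^2 - \|z_k\|^2\right), \tag{11.36}$$
$$H_2(x, w, u, U) = U(f(x) + g_1(x)w + g_2(x)u, k+1) - U(x, k) + \frac{1}{2}\|z_k\|^2. \tag{11.37}$$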
Similarly, as in Chapter 8, let
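$$\partial^2 H(x) \triangleq \begin{bmatrix} \dfrac{\partial^2 H_2}{\partial u^2} & \dfrac{\partial^2 H_2}{\partial w \partial u} \\ \dfrac{\partial^2 H_1}{\partial u \partial w} & \dfrac{\partial^2 H_1}{\partial w^2} \end{bmatrix}(x) \triangleq \begin{bmatrix} r_{11}(x) & r_{12}(x) \\ r_{21}(x) & r_{22}(x) \end{bmatrix}, \qquad F^\star(x) \triangleq f(x) + g_1(x)w^\star(x) + g_2(x)u^\star(x), \tag{11.38}$$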
and therefore
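$$\left. \begin{aligned} r_{11}(x) &= g_2^T(x) \left.\frac{\partial^2 U}{\partial \lambda^2}\right|_{\lambda = F^\star(x)} g_2(x) + I \\ r_{12}(x) &= g_2^T(x) \left.\frac{\partial^2 U}{\partial \lambda^2}\right|_{\lambda = F^\star(x)} g_1(x) \\ r_{21}(x) &= g_1^T(x) \left.\frac{\partial^2 W}{\partial \lambda^2}\right|_{\lambda = F^\star(x)} g_2(x) \\ r_{22}(x) &= \gamma^2 I + g_1^T(x) \left.\frac{\partial^2 W}{\partial \lambda^2}\right|_{\lambda = F^\star(x)} g_1(x). \end{aligned} \right\} \tag{11.39}$$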
The following theorem then presents sufficient conditions for the solvability of the finite-horizon problem.
Theorem 11.2.1 Consider the discrete-time nonlinear system (11.29), and the finite-horizon DSFBMH2HINLCP with cost functionals (11.32), (11.33). Suppose there exists a pair of negative and positive-definite C2 (with respect to the first argument)-functions W,U: M×Z→ℜ locally defined in a neighborhood M of the origin x = 0, such that W(0,k) = 0 and U(0,k) = 0, and satisfying the coupled DHJIEs:
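$$W(x, k) = W(f(x) + g_1(x)w^\star + g_2(x)u^\star, k+1) + \frac{1}{2}\left(\gamma^2 \|w^\star\|^2 - \|z^\star_k\|^2\right), \quad W(x, K+1) = 0, \tag{11.40}$$
$$U(x, k) = U(f(x) + g_1(x)w^\star + g_2(x)u^\star, k+1) + \frac{1}{2}\|z^\star_k\|^2, \quad U(x, K+1) = 0, \tag{11.41}$$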
together with the conditions
$$r_{22}(0) > 0, \quad \det\left[r_{11}(0) - r_{12}(0) r_{22}^{-1}(0) r_{21}(0)\right] \neq 0. \tag{11.42}$$
Then the state-feedback controls defined implicitly by
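$$w^\star = -\frac{1}{\gamma^2}\, g_1^T(x) \left.\left(\frac{\partial W(\lambda, k+1)}{\partial \lambda}\right)^T\right|_{\lambda = f(x) + g_1(x)w^\star + g_2(x)u^\star}, \tag{11.43}$$
$$u^\star = -g_2^T(x) \left.\left(\frac{\partial U(\lambda, k+1)}{\partial \lambda}\right)^T\right|_{\lambda = f(x) + g_1(x)w^\star + g_2(x)u^\star}, \tag{11.44}$$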
solve the finite-horizon DSFBMH2HINLCP for the system. Moreover, the optimal costs are given by
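$$J^\star_{1d}(u^\star, w^\star) = W(x_0, k_0), \tag{11.45}$$
$$J^\star_{2d}(u^\star, w^\star) = U(x_0, k_0). \tag{11.46}$$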
Proof: We prove item (a) in Definition 11.2.1 first. Assume there exist solutions W < 0, U > 0 of the DHJIEs (11.40), (11.41), and consider the Hamiltonian functions H1(.,.,.), H2(.,.,.). Applying the necessary conditions for optimality
$$\frac{\partial H_1}{\partial w}(x, u, w) = 0, \quad \frac{\partial H_2}{\partial u}(x, u, w) = 0,$$
and solving these for w⋆, u* respectively, we get the Nash-equilibrium strategies (11.43), (11.44). Moreover, if the conditions (11.42) are satisfied, then the matrix
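$$\partial^2 H\big|_{w=0,\,u=0}(0) = \begin{bmatrix} r_{11}(0) & r_{12}(0) \\ r_{21}(0) & r_{22}(0) \end{bmatrix} = \begin{bmatrix} I & r_{12}(0) r_{22}^{-1}(0) \\ 0 & I \end{bmatrix} \begin{bmatrix} r_{11}(0) - r_{12}(0) r_{22}^{-1}(0) r_{21}(0) & 0 \\ 0 & r_{22}(0) \end{bmatrix} \begin{bmatrix} I & 0 \\ r_{22}^{-1}(0) r_{21}(0) & I \end{bmatrix}$$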
is nonsingular. Therefore, by the Implicit-function Theorem, there exist open neighborhoods X1 of x = 0, W1 of w = 0 and U1 of u = 0, such that the equations (11.43), (11.44) have unique solutions.
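The block factorization used in this step can be checked numerically. The sketch below, assuming NumPy, builds the three factors for given blocks and confirms that their product reproduces the Hessian, so that the Hessian is nonsingular whenever r22 and the Schur complement r11 − r12 r22⁻¹ r21 are. The function name `block_factors` and the numerical values of the blocks are hypothetical and are not taken from any particular system.

```python
import numpy as np

def block_factors(r11, r12, r21, r22):
    """Return the upper, middle and lower factors of [[r11, r12], [r21, r22]]
    together with the Schur complement of r22."""
    n1, n2 = r11.shape[0], r22.shape[0]
    r22_inv = np.linalg.inv(r22)
    schur = r11 - r12 @ r22_inv @ r21
    upper = np.block([[np.eye(n1), r12 @ r22_inv],
                      [np.zeros((n2, n1)), np.eye(n2)]])
    middle = np.block([[schur, np.zeros((n1, n2))],
                       [np.zeros((n2, n1)), r22]])
    lower = np.block([[np.eye(n1), np.zeros((n1, n2))],
                      [r22_inv @ r21, np.eye(n2)]])
    return upper, middle, lower, schur

# Hypothetical blocks (two controls, one disturbance).
r11 = np.array([[2.0, 0.3], [0.3, 1.5]])
r12 = np.array([[0.4], [-0.2]])
r21 = np.array([[0.1, 0.5]])
r22 = np.array([[3.0]])

hessian = np.block([[r11, r12], [r21, r22]])
upper, middle, lower, schur = block_factors(r11, r12, r21, r22)

# The product of the factors reproduces the Hessian, and its determinant
# splits into det(Schur complement) * det(r22).
print(np.allclose(upper @ middle @ lower, hessian))
print(np.linalg.det(hessian), np.linalg.det(schur) * np.linalg.det(r22))
```

Since the outer factors are unit triangular, the determinant of the Hessian is the product of the determinants of the two diagonal blocks of the middle factor.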
Now suppose (u⋆, w⋆) have been obtained from (11.43), (11.44); then substituting them in the DHJIEs (11.34), (11.35) yields the DHJIEs (11.40), (11.41). Moreover, by Taylor-series expansion, we can write
$$H_1(x, w, u^\star(x), W) = H_1(x, w^\star(x), u^\star(x), W) + \frac{1}{2}(w - w^\star(x))^T \left[r_{22}(x) + O(\|w - w^\star(x)\|)\right](w - w^\star(x)).$$
In addition, since r22(0) > 0 implies r22(x) > 0 for all x in a neighborhood X2 of x = 0 by the Inverse-function Theorem [234], it then follows from above that there exists also a neighborhood W2 of w = 0 such that
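$$H_1(x, w, u^\star(x), W) \geq H_1(x, w^\star(x), u^\star(x), W) = 0 \quad \forall x \in X_2,\ \forall w \in W_2,$$
$$\Leftrightarrow\ W(f(x) + g_1(x)w + g_2(x)u^\star(x), k+1) - W(x, k) + \frac{1}{2}\left(\gamma^2 \|w\|^2 - \|u^\star(x)\|^2 - \|h_1(x)\|^2\right) \geq 0 \quad \forall x \in X_2,\ \forall w \in W_2.$$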
Setting now w = 0, we have
$$\tilde{W}(f(x) + g_2(x)u^\star(x), k+1) - \tilde{W}(x, k) \leq -\frac{1}{2}\|u^\star(x)\|^2 - \frac{1}{2}\|h_1(x)\|^2 \leq 0$$
for some function W̃ = −W > 0. Hence, the closed-loop system is Lyapunov-stable. To prove item (b), consider the Hamiltonian function H2(., w⋆, ., .) and expand it in a Taylor series:
$$H_2(x, w^\star, u, U) = H_2(x, w^\star, u^\star, U) + \frac{1}{2}(u - u^\star)^T \left[r_{11}(x) + O(\|u - u^\star\|)\right](u - u^\star).$$
Since
$$r_{11}(0) = I + g_2^T(0) \frac{\partial^2 U}{\partial \lambda^2}(0)\, g_2(0) \geq I,$$
again there exists a neighborhood ˜X2 of x = 0 such that r11(x) > 0 by the Inverse-function Theorem. Therefore,
$$H_2(x, w^\star, u, U) \geq H_2(x, w^\star, u^\star, U) = 0 \quad \forall u \in U$$
and the H2-cost is minimized.
Finally, we determine the optimal costs of the strategies. For this, consider the cost functional J1d(u*, w⋆ ) and write it as
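$$J_{1d}(u^\star, w^\star) + W(x_{K+1}, K+1) - W(x_{k_0}, k_0) = \sum_{k=k_0}^{K} \left\{ \frac{1}{2}\left(\gamma^2 \|w^\star_k\|^2 - \|z^\star_k\|^2\right) + W(x_{k+1}, k+1) - W(x_k, k) \right\} = \sum_{k=k_0}^{K} H_1(x_k, w^\star_k, u^\star_k, W) = 0.$$
Since W(x_{K+1}, K+1) = 0, we have the result. Similarly, for J2d(u⋆, w⋆), we have
$$J_{2d}(u^\star, w^\star) + U(x_{K+1}, K+1) - U(x_{k_0}, k_0) = \sum_{k=k_0}^{K} \left\{ \frac{1}{2}\|z^\star_k\|^2 + U(x_{k+1}, k+1) - U(x_k, k) \right\} = \sum_{k=k_0}^{K} H_2(x_k, w^\star_k, u^\star_k, U) = 0,$$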
and since U(xK+1, K + 1) = 0, the result also follows. □
The above result can be specialized to the linear discrete-time system
(11.47) |
where all the variables and matrices have their previous meanings and dimensions. Then we have the following corollary to Theorem 11.2.1.
Corollary 11.2.1 Consider the linear system Σdl under the Assumption 11.1.2. Suppose there exist P1,k < 0 and P2,k > 0 symmetric solutions of the cross-coupled discrete-Riccati difference-equations (DRDEs):
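$$P_{2,k} = A^T \Big\{ P_{2,k} - 2P_{2,k}B_1 B_{\gamma,k}^{-1}\Gamma_{1,k} - 2P_{2,k}B_2\Lambda_k^{-1}B_2^T P_{2,k}\Gamma_{2,k} + 2\Gamma_{1,k}^T B_{\gamma,k}^{-T} B_1^T P_{2,k}B_2\Lambda_k^{-1}B_2^T P_{2,k}\Gamma_{2,k} + \Gamma_{2,k}^T P_{2,k}B_2\Lambda_k^{-T}\Lambda_k^{-1}B_2^T P_{2,k}\Gamma_{2,k} \Big\} A + C_1^T C_1, \quad P_{2,K} = 0 \tag{11.49}$$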
(11.50) |
for all k in [k0, K]. Then, the Nash-equilibrium strategies uniquely specified by
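$$w^\star_{l,k} = -B_{\gamma,k}^{-1}\Gamma_{1,k} A x_k, \quad k \in [k_0, K], \tag{11.51}$$
$$u^\star_{l,k} = -\Lambda_k^{-1} B_2^T P_{2,k} \Gamma_{2,k} A x_k, \quad k \in [k_0, K], \tag{11.52}$$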
where
$$\Lambda_k := I + B_2^T P_{2,k} B_2, \quad \Gamma_{1,k} := B_1^T P_{1,k}\left[I - B_2\Lambda_k^{-1}B_2^T P_{2,k}\right], \quad \Gamma_{2,k} := I - B_1 B_{\gamma,k}^{-1}\Gamma_{1,k}, \quad \forall k,$$
solve the finite-horizon DSFBMH2HINLCP for the system. Moreover, the optimal costs for the game are given by
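$$J^\star_{1d}(u^\star_l, w^\star_l) = \frac{1}{2} x_0^T P_{1,k_0} x_0, \tag{11.53}$$
$$J^\star_{2d}(u^\star_l, w^\star_l) = \frac{1}{2} x_0^T P_{2,k_0} x_0. \tag{11.54}$$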
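Proof: Assume the solutions to the coupled DHJIEs are of the form
$$W(x, k) = \frac{1}{2} x^T P_{1,k} x, \quad P_{1,k} < 0, \tag{11.55}$$
$$U(x, k) = \frac{1}{2} x^T P_{2,k} x, \quad P_{2,k} > 0. \tag{11.56}$$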
Then, the Hamiltonians H1(., ., ., .), H2(., ., ., .) are given by
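$$H_{1,l}(x, w, u, W) = \frac{1}{2}(Ax + B_1 w + B_2 u)^T P_{1,k}(Ax + B_1 w + B_2 u) - \frac{1}{2}x^T P_{1,k} x + \frac{1}{2}\gamma^2 \|w\|^2 - \frac{1}{2}\|z\|^2,$$
$$H_{2,l}(x, w, u, U) = \frac{1}{2}(Ax + B_1 w + B_2 u)^T P_{2,k}(Ax + B_1 w + B_2 u) - \frac{1}{2}x^T P_{2,k} x + \frac{1}{2}\|z\|^2.$$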
Applying the necessary conditions for optimality, we get
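$$\frac{\partial H_{1,l}}{\partial w} = B_1^T P_{1,k}(Ax + B_1 w + B_2 u) + \gamma^2 w = 0, \tag{11.57}$$
$$\frac{\partial H_{2,l}}{\partial u} = B_2^T P_{2,k}(Ax + B_1 w + B_2 u) + u = 0. \tag{11.58}$$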
Solving the last equation for u we have
$$u = -(I + B_2^T P_{2,k} B_2)^{-1} B_2^T P_{2,k}(Ax + B_1 w),$$
which upon substitution in the first equation gives
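$$B_1^T P_{1,k}\left\{Ax + B_1 w - B_2(I + B_2^T P_{2,k} B_2)^{-1} B_2^T P_{2,k}(Ax + B_1 w)\right\} + \gamma^2 w = 0$$
$$\Leftrightarrow\ w^\star_{l,k} = -B_{\gamma,k}^{-1} B_1^T P_{1,k}\left[I - B_2\Lambda_k^{-1}B_2^T P_{2,k}\right] A x_k = -B_{\gamma,k}^{-1}\Gamma_{1,k} A x_k, \quad k \in [k_0, K], \tag{11.59}$$
if and only if
$$B_{\gamma,k} := \gamma^2 I + B_1^T P_{1,k} B_1 - B_1^T P_{1,k} B_2\Lambda_k^{-1}B_2^T P_{2,k} B_1 > 0 \quad \forall k,$$
where
$$\Lambda_k := I + B_2^T P_{2,k} B_2, \quad \Gamma_{1,k} := B_1^T P_{1,k}\left[I - B_2\Lambda_k^{-1}B_2^T P_{2,k}\right], \quad \forall k.$$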
Notice that Λk is nonsingular for all k since P2,k is positive-definite. Now, substitute w⋆ in the expression for u to get
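$$u^\star_{l,k} = -\Lambda_k^{-1} B_2^T P_{2,k}\left[I - B_1 B_{\gamma,k}^{-1}\Gamma_{1,k}\right] A x_k = -\Lambda_k^{-1} B_2^T P_{2,k}\Gamma_{2,k} A x_k, \quad k \in [k_0, K], \tag{11.60}$$
where
$$\Gamma_{2,k} := I - B_1 B_{\gamma,k}^{-1}\Gamma_{1,k}.$$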
Finally, substituting (u⋆l,k , w⋆l,k) in the DHJIEs (11.40), (11.41), we get the DRDEs (11.48), (11.49). The optimal costs are also obtained by substitution in (11.45), (11.46). □
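The elimination carried out in the proof can be cross-checked numerically by solving the two stationarity conditions jointly as one linear system. The sketch below, assuming NumPy, takes the conditions in the form B1ᵀP1(Ax + B1w + B2u) + γ²w = 0 and B2ᵀP2(Ax + B1w + B2u) + u = 0, which follow from differentiating H1,l and H2,l under the assumption that ‖z‖² = ‖C1x‖² + ‖u‖². The function name `linear_nash_gains` and all numerical matrices are hypothetical; in particular, P1 and P2 below are arbitrary symmetric matrices of the required signs, not solutions of the DRDEs.

```python
import numpy as np

def linear_nash_gains(A, B1, B2, P1, P2, gamma):
    """Return (Kw, Ku), with w = Kw @ x and u = Ku @ x, that satisfy both
    stationarity conditions for every x (P1, P2 held fixed)."""
    mw, mu = B1.shape[1], B2.shape[1]
    # Coefficient matrix of the joint linear system in (w, u).
    M = np.block([
        [gamma**2 * np.eye(mw) + B1.T @ P1 @ B1, B1.T @ P1 @ B2],
        [B2.T @ P2 @ B1, np.eye(mu) + B2.T @ P2 @ B2],
    ])
    rhs = -np.vstack([B1.T @ P1 @ A, B2.T @ P2 @ A])
    K = np.linalg.solve(M, rhs)
    return K[:mw, :], K[mw:, :]

# Hypothetical data: two states, one disturbance, one control.
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B1 = np.array([[0.1], [0.3]])
B2 = np.array([[0.0], [1.0]])
P1 = -0.5 * np.eye(2)      # negative-definite, illustrative only
P2 = np.eye(2)             # positive-definite, illustrative only
gamma = 2.0

Kw, Ku = linear_nash_gains(A, B1, B2, P1, P2, gamma)

x = np.array([1.0, -2.0])
w, u = Kw @ x, Ku @ x
x_next = A @ x + B1 @ w + B2 @ u
res_w = B1.T @ P1 @ x_next + gamma**2 * w     # residual of the w-condition
res_u = B2.T @ P2 @ x_next + u                # residual of the u-condition

# The control also agrees with the elimination formula for u given w.
Lam = np.eye(1) + B2.T @ P2 @ B2
u_elim = -np.linalg.solve(Lam, B2.T @ P2 @ (A @ x + B1 @ w))
print(res_w, res_u, u, u_elim)
```

Solving the joint system avoids forming the intermediate quantities one at a time, which makes it a convenient independent check on hand-derived gain formulas.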
Remark 11.2.1 Note that, in Corollary 11.2.1 for the solution of the linear discrete-time problem, it is better to consider strictly positive-definite solutions of the DRDEs (11.48), (11.49), because the condition Bγ,k > 0 must be respected for all k.
11.2.1 The Infinite-Horizon Problem
In this subsection, we consider similarly the infinite-horizon DSFBMH2HINLCP for the affine discrete-time nonlinear system Σda. We let K → ∞, and seek time-invariant functions and feedback gains that solve the DSFBMH2HINLCP. Again we require that the closed-loop system be locally asymptotically-stable, and for this, we need the following definition of detectability for the discrete-time system Σda.
Definition 11.2.2 The pair {f,h} is said to be locally zero-state detectable if there exists a neighborhood ˜O of x = 0 such that, if xk is a trajectory of xk+1 = f(xk) satisfying x(k0) ∈˜O, then h(xk) is defined for all k ≥ k0, and h(xk) = 0 for all k ≥ ks, implies limk→∞ xk =0. Moreover {f,h} is said to be zero-state detectable if ˜O= X.
Theorem 11.2.2 Consider the nonlinear system Σda defined by (11.29) and the infinite-horizon DSFBMH2HINLCP with cost functionals (11.32), (11.33). Suppose
(H1) the pair {f, h1 } is zero-state detectable;
(H2) there exists a pair of negative and positive-definite C2-functions ˜W,˜U: ˜M×Z→ℜ locally defined in a neighborhood ˜M of the origin x = 0, such that ˜W(0) = 0 and ˜U(0) = 0, and satisfying the coupled DHJIEs:
(11.61) |
(11.62) |
together with the conditions
(11.63) |
Then, the state-feedback controls defined implicitly by
(11.64) |
(11.65) |
solve the infinite-horizon DSFBMH2HINLCP for the system. Moreover, the optimal costs are given by
(11.66) |
(11.67) |
Proof: We only prove item (c) in the definition, since the proofs of items (a) and (b) are exactly similar to the finite-horizon problem. Accordingly, using similar manipulations as in the proof of item (a) of Theorem 11.2.1, it can be shown that with w ≡ 0,
$$\tilde{W}(f(x) + g_2(x)u^\star(x)) - \tilde{W}(x) \leq -\frac{1}{2}\|z\|^2.$$
Therefore, the closed-loop system is Lyapunov-stable. Further, the condition W̃(f(x) + g2(x)u⋆(x)) ≡ W̃(x) ∀k ≥ kc, for some kc ≥ k0, implies that u⋆ ≡ 0, h1(x) ≡ 0 ∀k ≥ kc. By hypothesis (H1), this implies that xk → 0 as k → ∞, and by LaSalle's invariance-principle, we conclude asymptotic-stability. □
The above theorem can again be specialized to the linear system Σdl in the following corollary.
Corollary 11.2.2 Consider the discrete linear system Σdl under the Assumption 11.1.2. Suppose there exist P̄1 < 0 and P̄2 > 0 symmetric solutions of the cross-coupled discrete algebraic Riccati equations (DAREs):
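$$\bar{P}_1 = A^T \Big\{ \bar{P}_1 - 2\bar{P}_1 B_1 B_\gamma^{-1}\Gamma_1 - 2\bar{P}_1 B_2\Lambda^{-1}B_2^T\bar{P}_2\Gamma_2 + 2\Gamma_1^T B_\gamma^{-T} B_1^T \bar{P}_1 B_2\Lambda^{-1}B_2^T\bar{P}_2\Gamma_2 + \Gamma_1^T B_\gamma^{-T} B_1^T \bar{P}_1 B_1 B_\gamma^{-1}\Gamma_1 + \Gamma_2^T\bar{P}_2 B_2\Lambda^{-T}B_2^T\bar{P}_1 B_2\Lambda^{-1}B_2^T\bar{P}_2\Gamma_2 + \gamma^2\Gamma_1^T B_\gamma^{-T}B_\gamma^{-1}\Gamma_1 - \Gamma_2^T\bar{P}_2 B_2\Lambda^{-T}\Lambda^{-1}B_2^T\bar{P}_2\Gamma_2 \Big\} A - C_1^T C_1, \tag{11.68}$$
$$\bar{P}_2 = A^T \Big\{ \bar{P}_2 - 2\bar{P}_2 B_1 B_\gamma^{-1}\Gamma_1 - 2\bar{P}_2 B_2\Lambda^{-1}B_2^T\bar{P}_2\Gamma_2 + 2\Gamma_1^T B_\gamma^{-T} B_1^T \bar{P}_2 B_2\Lambda^{-1}B_2^T\bar{P}_2\Gamma_2 + \Gamma_2^T\bar{P}_2 B_2\Lambda^{-T}\Lambda^{-1}B_2^T\bar{P}_2\Gamma_2 \Big\} A + C_1^T C_1. \tag{11.69}$$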
(11.70) |
Then the Nash-equilibrium strategies uniquely specified by
(11.71) |
(11.72) |
where
$$\Lambda := I + B_2^T\bar{P}_2 B_2, \quad \Gamma_1 := B_1^T\bar{P}_1\left[I - B_2\Lambda^{-1}B_2^T\bar{P}_2\right], \quad \Gamma_2 := I - B_1 B_\gamma^{-1}\Gamma_1,$$
solve the infinite-horizon DSFBMH2HINLCP for the system. Moreover, the optimal costs for the game are given by
(11.73) |
(11.74) |
Proof: Take
$$\tilde{W}(x_k) = \frac{1}{2} x_k^T \bar{P}_1 x_k, \quad \bar{P}_1 < 0, \qquad \tilde{U}(x_k) = \frac{1}{2} x_k^T \bar{P}_2 x_k, \quad \bar{P}_2 > 0$$
and apply the results of the theorem. □
11.3 Extension to a General Class of Discrete-Time Nonlinear Systems
In this section, we similarly extend the results of the previous section to a more general class of nonlinear discrete-time systems which is not necessarily affine. We consider the following state-space model defined on X⊂ℜn in local coordinates (x1,…, xn)
$$\Sigma: \begin{cases} x_{k+1} = \tilde{F}(x_k, w_k, u_k), \quad x(k_0) = x_0 \\ z_k = \tilde{Z}(x_k, u_k) \\ y_k = x_k, \end{cases} \tag{11.75}$$
where all the variables have their previous meanings, while ˜F: X×W×U→X , ˜Z: X×U→ℜs are smooth functions of their arguments. In addition, we assume that ˜F(0,0,0)=0 and ˜Z(0,0)=0. Furthermore, define similarly the Hamiltonian functions corresponding to the cost functionals (11.32), (11.33), ˜Ki: X×W×U×ℜ→ℜ, i=1,2 respectively:
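$$\tilde{K}_1(x, w, u, \tilde{W}) = \tilde{W}(\tilde{F}(x, w, u)) - \tilde{W}(x) + \frac{1}{2}\gamma^2 \|w\|^2 - \frac{1}{2}\|\tilde{Z}(x, u)\|^2,$$
$$\tilde{K}_2(x, w, u, \tilde{U}) = \tilde{U}(\tilde{F}(x, w, u)) - \tilde{U}(x) + \frac{1}{2}\|\tilde{Z}(x, u)\|^2,$$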
for some smooth functions ˜W,˜U: X→ℜ. In addition, define also
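$$\partial^2 \tilde{K}(x) \triangleq \begin{bmatrix} \dfrac{\partial^2 \tilde{K}_2}{\partial u^2} & \dfrac{\partial^2 \tilde{K}_2}{\partial w \partial u} \\ \dfrac{\partial^2 \tilde{K}_1}{\partial u \partial w} & \dfrac{\partial^2 \tilde{K}_1}{\partial w^2} \end{bmatrix}(x) = \begin{bmatrix} s_{11}(x) & s_{12}(x) \\ s_{21}(x) & s_{22}(x) \end{bmatrix},$$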
where
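$$s_{11}(0) = \left[\left(\frac{\partial \tilde{F}}{\partial u}\right)^T \frac{\partial^2 \tilde{U}}{\partial \lambda^2}(0) \frac{\partial \tilde{F}}{\partial u} + \left(\frac{\partial \tilde{Z}}{\partial u}\right)^T \frac{\partial \tilde{Z}}{\partial u}\right]_{x=0,\,w=0,\,u=0},$$
$$s_{12}(0) = \left[\left(\frac{\partial \tilde{F}}{\partial u}\right)^T \frac{\partial^2 \tilde{U}}{\partial \lambda^2}(0) \frac{\partial \tilde{F}}{\partial w}\right]_{x=0,\,w=0,\,u=0},$$
$$s_{21}(0) = \left[\left(\frac{\partial \tilde{F}}{\partial w}\right)^T \frac{\partial^2 \tilde{W}}{\partial \lambda^2}(0) \frac{\partial \tilde{F}}{\partial u}\right]_{x=0,\,w=0,\,u=0},$$
$$s_{22}(0) = \left[\left(\frac{\partial \tilde{F}}{\partial w}\right)^T \frac{\partial^2 \tilde{W}}{\partial \lambda^2}(0) \frac{\partial \tilde{F}}{\partial w} + \gamma^2 I\right]_{x=0,\,w=0,\,u=0}.$$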
We then make the following assumption.
Assumption 11.3.1 For the Hamiltonian functions, ˜K1,˜K2, we assume
$$s_{22}(0) > 0, \quad \det\left[s_{11}(0) - s_{12}(0) s_{22}^{-1}(0) s_{21}(0)\right] \neq 0.$$
Under the above assumption, the Hessian matrix ∂2˜K(0) is nonsingular, and therefore by the Implicit-function Theorem, there exists an open neighborhood M0 of x = 0 such that the equations
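$$\frac{\partial \tilde{K}_1}{\partial w}(x, \tilde{w}^\star(x), \tilde{u}^\star(x)) = 0, \quad \frac{\partial \tilde{K}_2}{\partial u}(x, \tilde{w}^\star(x), \tilde{u}^\star(x)) = 0$$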
have unique solutions ˜u⋆(x),˜w⋆(x), with ˜u⋆(0)=0, ˜w⋆(0)=0 . Moreover, the pair (˜u⋆,˜w⋆) constitutes a Nash-equilibrium solution to the dynamic game (11.32), (11.33), (11.75). The following theorem then summarizes the solution to the infinite-horizon problem for the general class of discrete-time nonlinear systems (11.75).
Theorem 11.3.1 Consider the discrete-time nonlinear system (11.75) and the DSFBMH2HINLCP for this system. Suppose Assumption 11.3.1 holds, and also the following:
(Ad1) the pair {˜F(x,0,0) , ˜Z(x,0)} is zero-state detectable;
(Ad2) there exists a pair of C2 locally negative and positive-definite functions ˜W,˜U: ˜M→ℜ respectively, defined in a neighborhood ˜M of x = 0, vanishing at x = 0 and satisfying the pair of coupled DHJIEs:
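$$\tilde{W}(\tilde{F}(x, \tilde{w}^\star(x), \tilde{u}^\star(x))) - \tilde{W}(x) + \frac{1}{2}\gamma^2 \|\tilde{w}^\star(x)\|^2 - \frac{1}{2}\|\tilde{Z}(x, \tilde{u}^\star(x))\|^2 = 0, \quad \tilde{W}(0) = 0,$$
$$\tilde{U}(\tilde{F}(x, \tilde{w}^\star(x), \tilde{u}^\star(x))) - \tilde{U}(x) + \frac{1}{2}\|\tilde{Z}(x, \tilde{u}^\star(x))\|^2 = 0, \quad \tilde{U}(0) = 0;$$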
(Ad3) the pair {F̃(x, w̃⋆(x), 0), Z̃(x, 0)} is locally zero-state detectable.
Then the state-feedback controls (˜u⋆(x),˜w⋆(x)) solve the dynamic game problem and the DSFBMH2HINLCP for the system (11.75). Moreover, the optimal costs of the policies are given by
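$$\tilde{J}^\star_{1d}(\tilde{w}^\star, \tilde{u}^\star) = \tilde{W}(x_0), \quad \tilde{J}^\star_{2d}(\tilde{w}^\star, \tilde{u}^\star) = \tilde{U}(x_0).$$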
Proof: The proof can be pursued along the same lines as the previous results. □
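For a non-affine system, the pair of stationarity equations that defines the strategies implicitly generally has to be solved numerically at each state. The following is a minimal sketch, assuming NumPy, of a Newton iteration with a finite-difference Jacobian. Everything in the example is hypothetical: the scalar map `F`, the output Z̃(x, u) = (x, u), and the quadratic candidates W̃(x) = −½·P1·x² and Ũ(x) = ½·P2·x², which are chosen only to make the residual easy to write down and are not claimed to solve the coupled DHJIEs.

```python
import numpy as np

GAMMA, P1, P2 = 2.0, 0.5, 1.0   # hypothetical gamma and quadratic weights

def F(x, w, u):
    """Hypothetical non-affine scalar dynamics with F(0, 0, 0) = 0."""
    return 0.5 * np.sin(x) + w + u + 0.1 * w * u

def residual(x, v):
    """Stack dK1/dw and dK2/du for W~(x) = -P1 x^2 / 2, U~(x) = P2 x^2 / 2
    and Z~(x, u) = (x, u)."""
    w, u = v
    lam = F(x, w, u)
    dF_dw = 1.0 + 0.1 * u
    dF_du = 1.0 + 0.1 * w
    dK1_dw = (-P1 * lam) * dF_dw + GAMMA**2 * w
    dK2_du = (P2 * lam) * dF_du + u
    return np.array([dK1_dw, dK2_du])

def newton_fd(res, v0, tol=1e-10, max_iter=50, h=1e-6):
    """Newton iteration using a forward-difference Jacobian of res."""
    v = np.asarray(v0, dtype=float)
    for _ in range(max_iter):
        r = res(v)
        if np.linalg.norm(r) < tol:
            break
        J = np.empty((r.size, v.size))
        for j in range(v.size):
            dv = np.zeros_like(v)
            dv[j] = h
            J[:, j] = (res(v + dv) - r) / h
        v = v - np.linalg.solve(J, r)
    return v

def solve_strategies(x):
    """Return the pair (w, u) solving the stationarity equations at state x."""
    return newton_fd(lambda v: residual(x, v), np.zeros(2))

v_star = solve_strategies(0.3)
print(v_star, residual(0.3, v_star))
```

Starting the iteration from (w, u) = (0, 0) mirrors the local nature of the Implicit-function Theorem argument: the solution is only expected to exist and be unique for x near the origin, and at x = 0 the iteration returns the zero pair immediately.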
This chapter is mainly based on the paper by Lin [180]. The approach adopted throughout the chapter was originally inspired by the paper by Limebeer et al. [179] for linear systems, and the chapter mainly extends the results of that paper to the nonlinear case. In addition, the discrete-time problem has also been developed. Finally, an application of the results to tracking control for robot manipulators can be found in [80].