4

Hamiltonian Mechanics and Hamilton-Jacobi Theory

Hamiltonian mechanics is a transformation theory that is an off-shoot of Lagrangian mechanics. It concerns itself with a systematic search for coordinate transformations which exhibit specific advantages for certain types of problems, notably in celestial and quantum mechanics. As such, the Hamiltonian approach to the analysis of a dynamical system does not, as it stands, represent an overwhelming advance over the Lagrangian method; one ends up with practically the same number of equations. However, the real advantage of the approach lies in the ease with which the transformed equations of motion, written in terms of a new set of position and momentum variables, can be integrated for specific problems, and in the deeper insight it provides into the formal structure of mechanics. The equal status accorded to coordinates and momenta as independent variables provides a new representation and greater freedom in selecting more relevant coordinate systems for different types of problems.

In this chapter, we study Lagrangian systems from the Hamiltonian standpoint. We shall consider natural mechanical systems for which the kinetic energy is a positive-definite quadratic form of the generalized velocities, and the Lagrangian function is the difference between the kinetic energy and the potential energy. Furthermore, as will be reviewed shortly, the Hamiltonian transformation of the equations of motion of a mechanical system always leads to the Hamilton-Jacobi equation (HJE), a first-order nonlinear PDE that must be solved in order to obtain the required transformation generating-function. It is therefore our aim in this chapter to give an overview of Hamilton-Jacobi theory (HJT) with emphasis on the HJE.

4.1    The Hamiltonian Formulation of Mechanics

To review the approach, we begin with the following definition.

Definition 4.1.1 A differentiable manifold M with a fixed positive-definite quadratic form ⟨ξ, ξ⟩ on every tangent space TMx, x ∈ M, is called a Riemannian manifold. The quadratic form is called a Riemannian metric.

Now, let the configuration space of the system be defined by a smooth n-dimensional Riemannian manifold M. If (φ, U) is a coordinate chart, we write φ = q = (q1,…, qn) for the local coordinates and q̇i = dqi/dt for the induced velocity coordinates in the tangent bundle TM|U = TU. We shall be considering natural mechanical systems which are defined as follows.

Definition 4.1.2 A Lagrangian mechanical system on a Riemannian manifold is called natural if the Lagrangian function L : TM × ℜ → ℜ is equal to the difference between the kinetic energy and the potential energy of the system defined as

L(q, q̇, t) = T(q, q̇, t) − V(q, t),

(4.1)

where T : TM × ℜ → ℜ is the kinetic energy which is given by the symmetric Riemannian quadratic form

T = ½⟨υ, υ⟩,   υ ∈ TqM

and V : M × ℜ → ℜ is the potential energy of the system (which may be independent of time).

More specifically, for natural mechanical systems, the kinetic energy is a positive-definite symmetric quadratic form of the generalized velocities,

T(q, q̇, t) = ½ q̇ᵀ Ψ(q, t) q̇.

(4.2)

Further, it is well known from Lagrangian mechanics, and can be derived using Hamilton’s principle of least action [37, 115, 122] (see also Theorem 4.2.1), that the equations of motion of a holonomic conservative1 mechanical system satisfy Lagrange’s equations of motion given by

d/dt(∂L/∂q̇i) − ∂L/∂qi = 0,   i = 1, …, n.

(4.3)

Therefore, the above equation (4.3) may always be written in the form

q̈ = g(q, q̇, t),

(4.4)

for some function g : TU × ℜ → ℜⁿ.

On the other hand, in the Hamiltonian formulation, we choose to replace all the q̇i by independent coordinates, pi, in such a way that

pi := ∂L/∂q̇i,   i = 1, …, n.

(4.5)

If we let

pi = hi(q, q̇),   i = 1, …, n,

(4.6)

then the Jacobian of h with respect to q̇, using (4.1), (4.2) and (4.5), is given by Ψ(q, t), which is positive-definite, and hence equation (4.5) can be inverted to yield

q̇i = fi(q1, …, qn, p1, …, pn, t),   i = 1, …, n,

(4.7)

for some continuous functions fi, i = 1, …, n. In this framework, the coordinates q = (q1, q2, …, qn)ᵀ are referred to as the generalized-coordinates and p = (p1, p2, …, pn)ᵀ as the generalized-momenta. Together, these variables form a new system of coordinates known as the phase-space of the system. If (U, φ), where φ = (q1, q2, …, qn), is a chart on M, then since pi : TU → ℜ, i = 1, …, n, they are elements of T∗U, and together with the qi’s they form a system of 2n local coordinates (q1, …, qn, p1, …, pn) for the phase-space T∗U of the system over U.

Now define the Hamiltonian function of the system H : T∗M × ℜ → ℜ as the Legendre transform2 of the Lagrangian function with respect to q̇ by

H(q, p, t) = pᵀq̇ − L(q, q̇, t),

(4.8)

and consider the differential of H with respect to q, p and t as

dH = (∂H/∂p)ᵀ dp + (∂H/∂q)ᵀ dq + (∂H/∂t) dt.

(4.9)

The above expression must be equal to the total differential of H given by (4.8) for p = ∂L/∂q̇:

dH = q̇ᵀ dp − (∂L/∂q)ᵀ dq − (∂L/∂t) dt.

(4.10)

Thus, in view of the independent nature of the coordinates, we obtain a set of three relationships:

q̇ = ∂H/∂p,   ∂L/∂q = −∂H/∂q,   and   ∂L/∂t = −∂H/∂t.

Finally, applying Lagrange’s equation (4.3) together with (4.5) and the preceding results, one obtains the expression for ṗ. Since we used Lagrange’s equation, q̇ = dq/dt and ṗ = dp/dt, and the resulting Hamiltonian canonical equations of motion are then given by

dq/dt = ∂H/∂p (q, p, t),

(4.11)

dp/dt = −∂H/∂q (q, p, t).

(4.12)

Therefore, we have proven the following theorem.

Theorem 4.1.1 The system of Lagrange’s equations (4.3) is equivalent to the system of 2n first-order Hamilton’s equations (4.11), (4.12).

In addition, for time-independent conservative systems, H(q, p) has a simple physical interpretation. From (4.8) and using (4.5), we have

H(q, p) = pᵀq̇ − L(q, q̇)
        = q̇ᵀ ∂L/∂q̇ − (T(q, q̇) − V(q))
        = q̇ᵀ ∂L/∂q̇ − T(q, q̇) + V(q)
        = 2T(q, q̇) − T(q, q̇) + V(q)
        = T(q, q̇) + V(q).

That is, H(q, p) is the total energy of the system. This completes the Hamiltonian formulation of the equations of motion, which can be seen as an off-shoot of the Lagrangian formulation. It can also be seen that, while the Lagrangian formulation involves n second-order equations, the Hamiltonian description sets up a system of 2n first-order equations in terms of the 2n variables p and q. This new system of coordinates gives new insight and physical meaning to the equations. However, the system of Lagrange’s equations and the system of Hamilton’s equations are completely equivalent and dual to one another.
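
As a quick numerical illustration of the canonical equations (4.11), (4.12), the following Python sketch (a minimal example using NumPy and SciPy; the harmonic-oscillator Hamiltonian H = p²/(2m) + kq²/2 and the parameter values are assumed here purely for illustration) integrates dq/dt = ∂H/∂p, dp/dt = −∂H/∂q and checks that H is conserved along the computed trajectory.

    import numpy as np
    from scipy.integrate import solve_ivp

    m, k = 1.0, 4.0                                  # illustrative parameter values
    H = lambda q, p: p**2/(2*m) + k*q**2/2

    def hamilton(t, y):
        q, p = y
        return [p/m,                                 # dq/dt =  dH/dp
                -k*q]                                # dp/dt = -dH/dq

    sol = solve_ivp(hamilton, (0.0, 10.0), [1.0, 0.0], rtol=1e-10, atol=1e-12)
    q, p = sol.y
    print(np.max(np.abs(H(q, p) - H(1.0, 0.0))))     # energy drift stays near round-off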

Furthermore, because of the symmetry of Hamilton’s equations (4.11), (4.12) and the even dimension of the system, a new structure emerges on the phase-space T∗M of the system. This structure is defined by a nondegenerate closed differential 2-form which in the above local coordinates is defined as:

ω² = dp ∧ dq = Σ_{i=1}^n dpi ∧ dqi.

(4.13)

Thus, the pair (T∗M, ω²) forms a symplectic-manifold, and together with the Cʳ Hamiltonian function H : T∗M → ℜ, defines a Hamiltonian mechanical system. With this notation, we have the following representation of a Hamiltonian system.

Definition 4.1.3 Let (T∗M, ω²) be a symplectic-manifold and let H : T∗M → ℜ be a Hamiltonian function. Then, the vector-field XH determined by the condition

ω²(XH, Y) = dH(Y)

(4.14)

for all vector-fields Y, is called the Hamiltonian vector-field with energy function H. We call the tuple (T∗M, ω², XH) a Hamiltonian system.

Remark 4.1.1 It is important to note that the nondegeneracy3 of ω2 guarantees that XH exists, and is a Cr−1 vector-field. Moreover, on a connected symplectic-manifold, any two Hamiltonians for the same vector-field XH have the same differential (4.14), so differ by a constant only.

We also have the following proposition [1].

Proposition 4.1.1 Let (q1,…, qn, p1,…, pn) be canonical coordinates so that ω2 is given by (4.13). Then, in these coordinates

XH = ( ∂H/∂p1, …, ∂H/∂pn, −∂H/∂q1, …, −∂H/∂qn ) = J·∇H

where J = ( 0  I ; −I  0 ). Thus, (q(t), p(t)) is an integral curve of XH if and only if Hamilton’s equations (4.11), (4.12) hold.

4.2    Canonical Transformation

Now suppose that a transformation of coordinates is introduced qi → Qi, pi → Pi, i = 1,…, n such that every Hamiltonian function transforms as H(q1,…, qn, p1,…, pn, t) → K(Q1,…, Qn, P1,…, Pn, t) in such a way that the new equations of motion retain the same form as in the former coordinates, i.e.,

dQ/dt = ∂K/∂P (Q, P, t)

(4.15)

dP/dt = −∂K/∂Q (Q, P, t).

(4.16)

Such a transformation is called canonical, and it can greatly simplify the solution of the equations of motion, especially if Q, P are selected such that K(., ., .) is a constant independent of Q and P. When this happens, Q and P will also be constants and the solution of the equations of motion is immediately available (given the transformation); we simply transform back to the original coordinates under the assumption that the transformation is univalent and invertible. It therefore follows from this that:

1.  The identity transformation is canonical;

2.  The inverse of a canonical transformation is a canonical transformation;

3.  The product of two or more canonical transformations is also a canonical transformation;

4.  A canonical transformation must preserve the differential-form ω² = dp ∧ dq or, equivalently, preserve the canonical form of the equations of motion (4.15), (4.16).

Canonical invariants such as Poisson brackets [38, 115] can be used to check whether a given transformation (q, p) ↦ (Q, P) is canonical or not. For any two given C1-functions u(q, p), υ(q, p), their Poisson bracket is defined as

[u, υ]q,p = Σ_{i=1}^n ( ∂u/∂qi ∂υ/∂pi − ∂u/∂pi ∂υ/∂qi ).

(4.17)

It can then be shown that a transformation (q, p) ↦ (Q, P ) is canonical if and only if:

[Qi, Qk]q,p = 0,   [Pi, Pk]q,p = 0,   [Pi, Qk]q,p = −δik,   i, k = 1, 2, …, n

(4.18)

are satisfied, where δik is the Kronecker delta.
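
As a minimal symbolic check of the criterion (4.18), the following sympy sketch verifies that the transformation to action-angle variables of a unit-frequency harmonic oscillator (an assumed example, not taken from the text) is canonical: its Poisson bracket (4.17) evaluates to 1, i.e., [P, Q]q,p = −1.

    import sympy as sp

    q, p = sp.symbols('q p', positive=True)

    # Action-angle variables of the unit-frequency oscillator (illustrative example)
    Q = sp.atan(q/p)                 # angle
    P = (q**2 + p**2)/2              # action

    # Poisson bracket (4.17) for a single degree of freedom
    pb = lambda u, v: sp.diff(u, q)*sp.diff(v, p) - sp.diff(u, p)*sp.diff(v, q)

    print(sp.simplify(pb(Q, P)))     # 1, hence [P, Q]_{q,p} = -1 and (4.18) holds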

Hamilton (1838) developed a method for obtaining the desired transformation equations using what is today known as Hamilton’s principle, which we introduce hereafter.

Definition 4.2.1 Let γ = {(t, q) : q = q(t), t0 ≤ t ≤ t1} be a curve in the (t, q) plane. Define the functional Φ(γ) (which we assume to be differentiable) by

Φ(γ) = ∫_{t0}^{t1} L(q(τ), q̇(τ)) dτ.

Then, the curve γ is an extremal of the functional Φ(.) if δΦ(γ) = 0 or dΦ(γ) = 0 ∀t ∈ [t0, t1], where δ is the variational operator.

Theorem 4.2.1 (Hamilton’s Principle of Least-Action) [37, 115, 122, 127]. The motion of a mechanical system with Lagrangian function L(., ., .), coincides with the extremals of the functional Φ(γ).

Accordingly, define the Lagrangian function of the system L : T M × ℜ → ℜ as the Legendre transform [37] of the Hamiltonian function by

L(q, q̇, t) = pᵀq̇ − H(q, p, t).

(4.19)

Then, in the new coordinates, the new Lagrangian function is

L̄(Q, Q̇, t) = PᵀQ̇ − K(Q, P, t).

(4.20)

Since both L(., ., .) and L̄(., ., .) describe the motion of the same system, each must separately satisfy Hamilton’s principle. However, L(., ., .) and L̄(., ., .) need not be equal in order to satisfy this requirement. Indeed, we can write

L(q, q̇, t) = L̄(Q, Q̇, t) + dS/dt (q, p, Q, P, t)

(4.21)

for some arbitrary function S : X × X̄ × ℜ → ℜ, where X, X̄ ⊂ T∗M (see also [122], page 286). Since dS is an exact differential (i.e., it is the derivative of a scalar function),

δ[ ∫_{t0}^{t1} dS/dt (q, p, Q, P, t) dt ] = δ S(q, p, Q, P, t)|_{t0}^{t1} = 0.

(4.22)

Now applying Hamilton’s principle to the time integral of both sides of equation (4.21), we get

δ[ ∫_{t0}^{t1} L(q, q̇, t) dt ] = δ[ ∫_{t0}^{t1} L̄(Q, Q̇, t) dt ] + δ[ ∫_{t0}^{t1} dS/dt (q, p, Q, P, t) dt ] = 0;

(4.23)

and therefore by (4.22),

δ[ ∫_{t0}^{t1} L̄(Q, Q̇, t) dt ] = 0.

(4.24)

Thus, to guarantee that a given change of coordinates, say,

qi=ϕi(Q,P,t)

(4.25)

pi=ψi(Q,P,t)

(4.26)

is canonical, from (4.19), (4.20) and (4.21), it is enough that

pᵀq̇ − H = PᵀQ̇ − K + dS/dt.

(4.27)

This condition is also required [122]. Consequently, the above equation is equivalent to

pᵀdq − PᵀdQ = (H − K)(q, p, Q, P, t) dt + dS(q, p, Q, P, t),

(4.28)

which requires the expression on the left-hand side to also be an exact differential. Further, it can be verified that the presence of S(.) in (4.21) does not alter the canonical structure of the Hamiltonian equations. Applying Hamilton’s principle to the right-hand-side of (4.21), we have, from (4.24), the Euler-Lagrange equation (4.3) and the argument following it, that

dQ/dt = ∂K/∂P (Q, P, t)

(4.29)

dP/dt = −∂K/∂Q (Q, P, t).

(4.30)

Hence the canonical nature of the equations is preserved.

4.2.1    The Transformation Generating Function

As proposed in the previous section, the equations of motion of a given Hamiltonian system can often be simplified significantly by a suitable transformation of variables such that all the new position and momentum coordinates (Qi, Pi) are constants. In this subsection, we discuss Hamilton’s approach for finding such a transformation.

We have already seen that an arbitrary generating function S does not alter the canonical nature of the equations of motion. The next step is to show that, first, if such a function is known, then the transformation we so anxiously seek follows directly. Secondly, that the function can be obtained by solving a certain partial-differential equation (PDE).

The generating function S relates the old to the new coordinates via the equation

S = ∫ (L − L̄) dt = f(q, p, Q, P, t).

(4.31)

Therefore, S is a function of 4n + 1 variables of which only 2n are independent. Hence, no more than four independent sets of relationships among the dependent coordinates can exist. Two such relationships expressing the old sets of coordinates in terms of the new set are given by equations (4.25), (4.26). Consequently, only two independent sets of relationships among the coordinates remain for defining S and no more than two of the four sets of coordinates may be involved. Therefore, there are four possibilities:

S1=f1(q,Q,t);S2=f2(q,P,t);

(4.32)

S3=f3(p,Q,t);S4=f4(p,P,t).

(4.33)

Any one of the above four types of generating functions may be selected, and a transformation obtained from it. For example, if we consider the generating function S1, taking its differential, we have

dS1 = Σ_{i=1}^n ∂S1/∂qi dqi + Σ_{i=1}^n ∂S1/∂Qi dQi + ∂S1/∂t dt.

(4.34)

Again, taking the differential as defined by (4.28), we have

dS1 = Σ_{i=1}^n pi dqi − Σ_{i=1}^n Pi dQi + (K − H) dt.

(4.35)

Finally, using the independence of coordinates, we equate coefficients, and obtain the desired transformation equations

pi = ∂S1/∂qi (q, Q, t),   Pi = −∂S1/∂Qi (q, Q, t),   K − H = ∂S1/∂t (q, Q, t),   i = 1, …, n.

(4.36)

A similar derivation can be applied to the remaining three types of generating functions, and in addition we can apply the Legendre transformation. Thus, for the generating functions S2(., ., .), S3(., ., .) and S4(., ., .), we have

pi = ∂S2/∂qi (q, P, t),   Qi = ∂S2/∂Pi (q, P, t),   K − H = ∂S2/∂t (q, P, t),   i = 1, …, n,

(4.37)

qi = −∂S3/∂pi (p, Q, t),   Pi = −∂S3/∂Qi (p, Q, t),   K − H = ∂S3/∂t (p, Q, t),   i = 1, …, n,

(4.38)

qi = −∂S4/∂pi (p, P, t),   Qi = ∂S4/∂Pi (p, P, t),   K − H = ∂S4/∂t (p, P, t),   i = 1, …, n,

(4.39)

respectively. It should however be remarked that canonical transformations expressed using arbitrary generating functions often have the consequence that the distinct meaning of the generalized coordinates and momenta is blurred. For example, consider the generating function S = S1(q, Q) = qᵀQ. Then, it follows from the foregoing that

pi = ∂S1/∂qi = Qi,   Pi = −∂S1/∂Qi = −qi,   K = H + ∂S1/∂t = H(−P, Q, t),   i = 1, …, n,

(4.40)

which implies that Pi has the units of qi (up to a sign change), while Qi has the units of pi: the roles of coordinates and momenta are effectively interchanged.

A canonical transformation that transforms only corresponding coordinates among themselves is called a point transformation. In this case, Q(q, t) does not depend on p but only on q and possibly t, and the meaning of the coordinates is preserved. This property of a point transformation can also be demonstrated using the generating function S2. Consider for instance the transformation Q = ψ(q, t) of the coordinates among each other such that

S=S2(q,P,t)=ψT(q,t)P.

(4.41)

Then, the resulting canonical equations are given by

p = ∂S2/∂q = (∂ψ/∂q)ᵀ P,   Q = ∂S2/∂P = ψ(q, t),   K = H + ∂ψᵀ(q, t)/∂t P,

(4.42)

and it is clear that the meaning of the coordinates is preserved in this case.
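
To make the construction (4.41), (4.42) concrete, the following sympy sketch takes the point transformation to plane polar coordinates as an illustrative choice of ψ (an assumption made only for this example), and inverts p = (∂ψ/∂q)ᵀP to express the new momenta in terms of (q, p); the result is the familiar radial momentum and angular momentum.

    import sympy as sp

    x, y, px, py = sp.symbols('x y p_x p_y', real=True)
    qvec = sp.Matrix([x, y])
    pvec = sp.Matrix([px, py])

    # Point transformation Q = psi(q): Cartesian -> polar (illustrative choice)
    psi = sp.Matrix([sp.sqrt(x**2 + y**2), sp.atan2(y, x)])

    # From S2 = psi(q)^T P, equation (4.42) gives p = (d psi/d q)^T P,
    # hence P = ((d psi/d q)^T)^(-1) p.
    Jt = psi.jacobian(qvec).T
    P = (Jt.inv() * pvec).applyfunc(sp.simplify)

    print(P[0])    # radial momentum (x*p_x + y*p_y)/sqrt(x**2 + y**2)
    print(P[1])    # angular momentum x*p_y - y*p_x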

4.2.2    The Hamilton-Jacobi Equation (HJE)

In this subsection, we turn our attention to the last missing link in the Hamiltonian transformation theory, i.e., an approach for determining the transformation generating function, S. There is only one equation available

H(q, p, t) + ∂S/∂t = K(Q, P, t).

(4.43)

However, there are two unknown functions in this equation, namely S and K. Therefore, the best we can do is to assume a solution for one and then solve for the other. A convenient and intuitive strategy is to arbitrarily set K to be identically zero! Under this condition, Q̇ and Ṗ vanish, resulting in Q = α and P = β as constants. The inverse transformation then yields the motion q(α, β, t), p(α, β, t) in terms of these constants of integration.

Consider now generating functions of the first type. Having forced a solution K ≡ 0, we must now solve the PDE:

H(q, ∂S/∂q, t) + ∂S/∂t = 0

(4.44)

for S, where ∂S/∂q = (∂S/∂q1, …, ∂S/∂qn)ᵀ. This equation is known as the Hamilton-Jacobi equation (HJE), and was improved and modified by Jacobi in 1838. For a given function H(q, p, t), this is a first-order PDE in n + 1 variables for the unknown function S, which is traditionally called Hamilton’s principal function. We need a solution of this equation which depends on n arbitrary independent constants of integration α1, α2,…, αn. Such a solution S(q, α, t) is called a “complete solution” of the HJE (4.44), and solving the HJE is equivalent to finding the solution to the equations of motion (4.11), (4.12). On the other hand, the solution of (4.44) can be obtained from the solution of the equations (4.11), (4.12) using the method of characteristics [95]. However, it is generally not simpler to solve (4.44) instead of (4.11), (4.12).

If a complete solution S(q, α, t) of (4.44) can be found and if the generating function S = S1(q, Q, t) is used, then one obtains

∂S1/∂qi = pi,   i = 1, …, n,

(4.45)

∂S1/∂αi = βi,   i = 1, …, n.

(4.46)

Moreover, since the constants αi are independent, the Jacobian matrix ∂²S1/∂q∂α is nonsingular, and therefore by the Implicit-function Theorem, the above two equations can be solved to recover the original variables q(α, β, t) and p(α, β, t).

4.2.3    Time-Independent Hamilton-Jacobi Equation and Separation of Variables

The preceding section has laid down a systematic approach to the solution of the equations of motion via a transformation theory that culminates in the HJE. However, implementation of the above procedure is difficult, because the chances of success are limited by the lack of efficient mathematical techniques for solving nonlinear PDEs. At present, the only general technique is the method of separation of variables. If the Hamiltonian is explicitly a function of time, then separation of variables is not readily achieved for the HJE. On the other hand, if the Hamiltonian is not explicitly a function of time or is independent of time, which arises in many dynamical systems of practical interest, then the HJE separates easily. The solution to (4.44) can then be formulated in the form

S(q, α, t) = W(q, α) − α1 t.

(4.47)

Consequently, the use of (4.47) in (4.44) yields the following restricted time-independent HJE in W:

H(q, ∂W/∂q) = α1,

(4.48)

where α1, one of the constants of integration, is equal to the constant value of H, i.e., it is an energy constant (if the kinetic energy of the system is a homogeneous quadratic form, the constant equals the total energy, E). Moreover, since W does not involve time, the new and the old Hamiltonians are equal, and it follows that K = α1. The function W, known as Hamilton’s characteristic function, thus generates a canonical transformation in which all the new coordinates are cyclic.4 Further, if we again consider generating functions of the first kind, i.e., S = S1(q, Q, t), then from (4.45), (4.46) and (4.47), we have the following system

∂W/∂qi = pi,   i = 1, 2, …, n,    ∂W/∂α1 = t + β1,    ∂W/∂αi = βi,   i = 2, …, n.

(4.49)

The above system of equations can then be solved for the qi in terms of αi, βi and time t.

At this point, it might appear that little practical advantage has been gained in solving a first-order nonlinear PDE, which is notoriously difficult to solve, instead of a system of 2n ODEs. Nevertheless, under certain conditions, and when the Hamiltonian is independent of time, it is possible to separate the variables in the HJE, and the solution can then be obtained by integration. In this event, the HJE becomes a useful computational tool.

Unfortunately, there is no simple criterion (for orthogonal coordinate systems the so-called Staeckel conditions [115] have proven to be useful) for determining when the HJE is separable. For some problems, e.g., the three-body problem, it is impossible to separate the variables, while for others it is transparently easy. Fortunately, a great majority of systems of current interest in quantum mechanics and atomic physics are of the latter class. Moreover, it should also be emphasized that the question of whether the HJE is separable depends on the system of generalized coordinates employed. Indeed, the one-body central force problem is separable in polar coordinates, but not in Cartesian coordinates.

To illustrate the Hamilton-Jacobi technique for the time-independent case, we consider an example of the harmonic oscillator.

Example 4.2.1 [115]. Consider the harmonic oscillator with Hamiltonian function

H = p²/(2m) + kq²/2.

The corresponding HJE (4.48) is given by

(1/2m)(∂W/∂q)² + kq²/2 = α

which can be immediately integrated to yield

W(q, α) = √(mk) ∫ √(2α/k − q²) dq.

Thus,

S(q, α, t) = √(mk) ∫ √(2α/k − q²) dq − αt,

and

β = ∂S/∂α = √(m/k) ∫ dq/√(2α/k − q²) − t.

The above equation can now be integrated to yield

t + β = √(m/k) cos⁻¹( q √(k/2α) ).

Now, if we let ω = √(k/m), then the above equation can be solved for q to get the solution

q(t) = √(2α/k) cos( ω(t + β) )

with α, β constants of integration.
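
The computations in the example are easily verified symbolically. The following sympy sketch (symbol names are arbitrary; W is kept as an unevaluated integral so that ∂W/∂q returns the integrand) checks that S = W − αt satisfies the HJE (4.44) for this Hamiltonian, and that the recovered trajectory solves the oscillator equation.

    import sympy as sp

    q, s, t, m, k, alpha, beta = sp.symbols('q s t m k alpha beta', positive=True)

    # Hamilton's characteristic function W(q, alpha) from the example
    W = sp.sqrt(m*k) * sp.Integral(sp.sqrt(2*alpha/k - s**2), (s, 0, q))
    S = W - alpha*t

    # HJE (4.44): (1/2m)(dS/dq)^2 + k q^2/2 + dS/dt = 0
    residual = sp.diff(S, q)**2/(2*m) + k*q**2/2 + sp.diff(S, t)
    print(sp.simplify(residual))                      # 0

    # q(t) = sqrt(2 alpha/k) cos(omega (t + beta)) solves m q'' + k q = 0
    omega = sp.sqrt(k/m)
    qt = sp.sqrt(2*alpha/k)*sp.cos(omega*(t + beta))
    print(sp.simplify(m*sp.diff(qt, t, 2) + k*qt))    # 0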

4.3    The Theory of Nonlinear Lattices

In this section, we discuss the theory of nonlinear lattices as a special class and an example of Hamiltonian systems that are integrable. Later, we shall also show how the HJE arising from the A2-Toda lattice can be solved.

Historically, the exact treatment of oscillations in nonlinear lattices began in earnest in the early 1950’s, when Fermi, Pasta and Ulam (FPU) numerically studied the problem of energy partition. Fermi et al. wanted to verify by numerical experiment whether there is energy flow between the modes of linear-lattice systems when nonlinear interactions are introduced; that is, they wanted to verify what is called the equipartition of energy in statistical mechanics. However, to their disappointment, only a little energy partition occurred, and the state of the systems was found to return periodically to the initial state.

Later, Ford and co-workers [258] showed, using perturbation methods and numerical calculation, that though resonance generally enhances energy sharing, it has no intimate connection to the periodic phenomenon, and that nonlinear lattices rather exhibit stable motion (periodic, when the energy is not too high) or pulses (also known as solitons), which they called nonlinear normal modes. This fact also indicates that there should be some nonlinear lattice which admits rigorous periodic waves, and in which certain pulses (lattice solitons) will be stable. This remarkable property led to the finding of an integrable one-dimensional lattice with exponential interaction, also known as the Toda lattice.

The Toda lattice as a Hamiltonian system describes the motion of n particles moving in a straight line with “exponential interaction” between them. Mathematically, it is equivalent to a problem in which a single particle moves in ℜⁿ. Let the positions of the particles at time t (in ℜ) be q1(t),…, qn(t), respectively. We assume that each particle has mass 1. The momentum of the i-th particle at time t is therefore pi = q̇i. The Hamiltonian function for the finite (or non-periodic) lattice is defined to be

H(q, p) = ½ Σ_{j=1}^n pj² + Σ_{j=1}^{n−1} e^{2(qj − qj+1)}.

(4.50)

Therefore, the canonical equations for the system are given by

dqj/dt = pj,   j = 1, …, n,
dp1/dt = −2e^{2(q1 − q2)},
dpj/dt = −2e^{2(qj − qj+1)} + 2e^{2(qj−1 − qj)},   j = 2, …, n−1,
dpn/dt = 2e^{2(qn−1 − qn)}.

(4.51)

It may be assumed in addition that Σ_{j=1}^n qj = Σ_{j=1}^n pj = 0, and the coordinates q1,…, qn can be chosen so that this condition is satisfied. For the periodic lattice, in which the first particle interacts with the last, the Hamiltonian function is defined by

H̃(q, p) = ½ Σ_{j=1}^n pj² + Σ_{j=1}^{n−1} e^{2(qj − qj+1)} + e^{2(qn − q1)}.

(4.52)

We may also consider the infinite lattice, in which there are infinitely many particles.

Nonlinear lattices can provide models for nonlinear phenomena such as wave propagation in nerve systems, chemical reactions, certain ecological systems, and a host of electrical and mechanical systems. For example, it is easily shown that a linear lattice is equivalent to a ladder network composed of capacitors C and inductors L, while a one-dimensional nonlinear lattice is equivalent to a ladder circuit with nonlinear L or C. To show this, let In denote the current, Qn the charge on the n-th capacitor, Φn the flux in the n-th inductor, and write the equations for the circuit as

dQn/dt = In − In−1,    dΦn/dt = Vn − Vn+1.

(4.53)

Now assume that the inductors and capacitors are nonlinear in such a way that

Qn = Cυ0 ln(1 + Vn/υ0),    Φn = Li0 ln(1 + In/i0)

where (C, v0, L, i0) are constants. Then equations (4.53) give

dQn/dt = i0( e^{Φn−1/(Li0)} − e^{Φn/(Li0)} ),    dΦn/dt = υ0( e^{Qn−1/(Cυ0)} − e^{Qn/(Cυ0)} )

which are in the form of a lattice with exponential interaction (or Toda system).

Stimulated by Ford’s numerical work which revealed the likely integrability of the Toda lattice, Henon and Flaschka [258] independently showed the integrability of the Toda lattice analytically, and began an analytical survey of the lattice. At the same time, the inverse scattering method of solving the initial value problem for the Kortoweg-de Vries equation (KdV) had been firmly formulated by Lax [258], and this method was applied to the infinite lattice to derive a solution using matrix formalism which led to a simplification of the equations of motion. To introduce this formalism, define the following matrices

L = ( p1       Q1,2     0        ⋯   0         0
      Q1,2     p2       Q2,3     ⋯   0         0
      0        Q2,3     p3       ⋯   0         0
      ⋮        ⋮        ⋮        ⋱   ⋮         ⋮
      0        0        0        ⋯   pn−1      Qn−1,n
      0        0        0        ⋯   Qn−1,n    pn )

(4.54)

M = ( 0         Q1,2      0         ⋯   0          0
      −Q1,2     0         Q2,3      ⋯   0          0
      0         −Q2,3     0         ⋯   0          0
      ⋮         ⋮         ⋮         ⋱   ⋮          ⋮
      0         0         0         ⋯   0          Qn−1,n
      0         0         0         ⋯   −Qn−1,n    0 )

(4.55)

where Qi,j = e^{(qi − qj)}. We then have the following proposition [123].

Proposition 4.3.1 The Hamiltonian system for the non-periodic Toda lattice (4.50)-(4.51) is equivalent to the Lax equation L̇ = [L, M], where the functions L, M take values in sl(n, ℜ)5 and [., .] is the Lie bracket operation in sl(n, ℜ).
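
Proposition 4.3.1 implies, in particular, that the spectrum of L is invariant along the flow of (4.51). The following Python sketch (using NumPy and SciPy; the lattice size n = 4 and the random initial data are assumed purely for illustration) integrates (4.51) numerically and checks that both the Hamiltonian (4.50) and the eigenvalues of L in (4.54) are conserved to integration accuracy.

    import numpy as np
    from scipy.integrate import solve_ivp

    n = 4

    def toda_rhs(t, y):
        q, p = y[:n], y[n:]
        dq = p.copy()
        dp = np.zeros(n)
        inter = 2.0*np.exp(2.0*(q[:-1] - q[1:]))   # 2 e^{2(q_j - q_{j+1})}
        dp[:-1] -= inter                           # -2 e^{2(q_j - q_{j+1})} on particle j
        dp[1:]  += inter                           # +2 e^{2(q_{j-1} - q_j)} on particle j
        return np.concatenate([dq, dp])

    def lax_L(q, p):
        Q = np.exp(q[:-1] - q[1:])                 # Q_{j,j+1} = e^{q_j - q_{j+1}}
        return np.diag(p) + np.diag(Q, 1) + np.diag(Q, -1)

    def energy(q, p):
        return 0.5*np.sum(p**2) + np.sum(np.exp(2.0*(q[:-1] - q[1:])))

    rng = np.random.default_rng(0)
    q0, p0 = rng.normal(size=n), rng.normal(size=n)
    q0 -= q0.mean(); p0 -= p0.mean()               # enforce sum q_j = sum p_j = 0

    sol = solve_ivp(toda_rhs, (0.0, 5.0), np.concatenate([q0, p0]),
                    rtol=1e-10, atol=1e-12, dense_output=True)

    eig0 = np.sort(np.linalg.eigvalsh(lax_L(q0, p0)))
    for t in (0.0, 2.5, 5.0):
        q, p = sol.sol(t)[:n], sol.sol(t)[n:]
        eig = np.sort(np.linalg.eigvalsh(lax_L(q, p)))
        print(t, abs(energy(q, p) - energy(q0, p0)), np.max(np.abs(eig - eig0)))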

Using the above matrix formalism, the solution of the Toda system (4.51) can be derived [123, 258].

Theorem 4.3.1 The solution of the Hamiltonian system for the Toda lattice is given by

L(t) = Ad( (exp tV)1 )⁻¹ V,

where V = L(0), Ad(g)X = d/dt ( g exp(tX) g⁻¹ )|_{t=0} = gXg⁻¹ for any X ∈ sl(n, ℜ), g ∈ SL(n, ℜ), and the subscript 1 represents the projection (exp tW)1 = exp tπ1W = exp tW1 onto the first component in the decomposition W = W1 ⊕ W2 ∈ sl(n, ℜ).

The solution can be explicitly written in the case of n = 2. Letting q1 = −q, q2 = q, p1 = −p and p2 = p, we have

L = ( −p  Q ; Q  p ),    M = ( 0  Q ; −Q  0 ),

(4.56)

where Q = e^{−2q}. Then the solution of L̇ = [L, M] with

L(0) = ( 0  υ ; υ  0 ),

is

L(t) = Ad[ ( exp t( 0  υ ; υ  0 ) )1 ]⁻¹ ( 0  υ ; υ  0 ).

Now

exp t( 0  υ ; υ  0 ) = ( cosh tυ  sinh tυ ; sinh tυ  cosh tυ ).

The decomposition SL(2, ℜ) = SO(2)·N̂2 is given by

( a  b ; c  d ) = [ 1/√(b² + d²) ( d  b ; −b  d ) ] [ 1/√(b² + d²) ( 1  0 ; ab + cd  b² + d² ) ].

Hence,

[ ( exp t( 0  υ ; υ  0 ) )1 ]⁻¹ = 1/√(sinh²tυ + cosh²tυ) ( cosh tυ  −sinh tυ ; sinh tυ  cosh tυ ).

Therefore,

L(t) = υ/(sinh²tυ + cosh²tυ) ( −2 sinh tυ cosh tυ   1 ; 1   2 sinh tυ cosh tυ ).

This means that

p(t) = υ sinh 2tυ / cosh 2tυ,    Q(t) = υ / cosh 2tυ.

Furthermore, if we recall that Q(t) = e^{−2q(t)}, it follows that

q(t) = −½ log( υ / cosh 2tυ ) = −½ log υ + ½ log cosh 2υt.

(4.57)
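
Under the conventions adopted above (q1 = −q, q2 = q), the reduced coordinate satisfies q̈ = 2e^{−4q} by (4.51), and the closed form (4.57) can be verified directly. The following sympy sketch performs this check.

    import sympy as sp

    t, v = sp.symbols('t upsilon', positive=True)

    # Closed-form solution (4.57), with Q = exp(-2q) = upsilon/cosh(2 t upsilon)
    q = -sp.Rational(1, 2)*sp.log(v) + sp.Rational(1, 2)*sp.log(sp.cosh(2*v*t))

    # Reduced equation of motion from (4.51) with n = 2:  q'' = 2 e^{-4q}
    residual = sp.diff(q, t, 2) - 2*sp.exp(-4*q)
    print(sp.simplify(residual.rewrite(sp.exp)))   # 0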

4.3.1    The G2-Periodic Toda Lattice

In the study of the generalized periodic Toda lattice, Bogoyavlensky [72] showed that various models of the Toda lattice which admit the [L, M]-Lax representation correspond to certain simple Lie-algebras which he called the A, B, C, D and G2 periodic Toda systems. In particular, the G2 is a two-particle system and corresponds to the Lie algebra g2 which is 14-dimensional, and has been studied extensively in the literature [3, 4, 201]. The Hamiltonian for the g2 system is given by

H(q,p)=12(p21+p22)+e(1/3)q1+e(3/2)q1+(1/2)q2+eq2,

(4.58)

and the Lax equation corresponding to this system is given by dA/dt = [A, B], where

A(t)=a1(t)(Xβ3+Xβ3)+a2(t)(Xγ1+Xγ1)+a3(t)(s1Xγ3+sXγ3)+b1(t)H1+b2H2B(t)=a1(t)(Xβ3Xβ3)+a2(t)(Xγ1Xγ1)+a3(t)(s1Xγ3sXγ3),

s is a parameter, βi, i = 1, 2, 3 and the γj, j = 1, 2, 3 are the short and long roots of the g2 root system respectively, while X(−) are the corresponding Chevalley basis vectors. Using the following change of coordinates [201]:

a1(t)=126e(1/23)q1(t),a2(t)=122e(3/4)q1(t)+(1/4)q2(t),a3(t)=122e(1/2)q2(t),b1(t)=123p1(t)+14p2(t),b2(t)=123p1(t),

we can represent the g2 lattice as

˙a1=a1b2,˙a2=a2(b1b2),˙a3=a3(2b1b2),˙b1=2(a21a22+a23),˙b2=4a21+2a22,H=12A(t),A(t)=8(3a21+a22+a23+a23+b21+b1b2+b22).

Here, the coordinate a2(t) may be regarded as superfluous, and can be eliminated using the fact that 4a1³a2²a3 = c is a constant of the motion.

4.4    The Method of Characteristics for First-Order Partial-Differential Equations

In this section, we present the well-known method of characteristics for solving first-order PDEs. It is by far the most generally known method for handling first-order nonlinear PDEs in n independent variables. It involves converting the PDE into an appropriate system of first-order ordinary differential equations (ODEs), which are in turn solved together to obtain the solution of the original PDE. It will be seen during the development that Hamilton’s canonical equations are nothing but the characteristic equations of the Hamilton-Jacobi equation; thus, solving the canonical equations is equivalent to solving the PDE and vice-versa. The presentation will follow closely those of Fritz John [109] and Evans [95].

4.4.1    Characteristics for Quasi-Linear Equations

We begin with a motivational discussion of the method by considering quasi-linear equations, and then we consider the general first-order nonlinear equation.

The general first-order equation for a function v = v(x, y,…, z) in n variables is of the form

f(x, y, …, z, υ, υx, υy, …, υz) = 0,

(4.59)

where x, y,…, z ∈ℜ, v : ℜn →ℜ. The HJE and many first-order PDEs in classical and continuum mechanics, calculus of variations and geometric optics are of the above type. A simpler case of the above equation is the quasi-linear equation in two variables:

a(x,y,υ)υx+b(x,y,υ)υy=c(x,y,υ)

(4.60)

in two independent variables x, y. The function v(x, y) is represented by a surface z = v(x, y) called an integral surface which corresponds to a solution of the PDE. The functions a(x, y, z), b(x, y, z) and c(x, y, z) define a field of vectors in the xyz-space, while (vx, vy, −1) is the normal to the surface z = v(x, y).

We associate to the field of characteristic directions (a, b, c) a family of characteristic curves that are tangent to these directions. Along any characteristic curve (x(t), y(t), z(t)), where t is a parameter, the following system of ODEs must be satisfied:

dxdt=a(x,y,z),dydt=b(x,y,z),dzdt=c(x,y,z).

(4.61)

If a surface S : z = v(x, y) is a union of characteristic curves, then S is an integral surface; for then through any point P of S, there passes a characteristic curve Γ contained in S.

Next, we consider the Cauchy problem for the quasi-linear equation (4.60). It is desired to find a definite method for finding solutions of the PDE from a given “data” on the problem. A simple way of selecting a particular candidate solution v(x, y) out of an infinite set of solutions, consists in prescribing a curve Γ in xyz-space which is to be contained in the integral surface z = v(x, y). Without any loss of generality, we can represent Γ parametrically by

x=f(s),y=g(s),z=h(s)

(4.62)

and we seek a solution v(x, y) such that

h(s) = υ(f(s), g(s))   ∀ s.

(4.63)

The above problem is the Cauchy problem for (4.60). Our first aim is to derive conditions for a local solution to (4.60) in the vicinity of x0 = f(s0), y0 = g(s0). Accordingly, assume the functions f(s), g(s), h(s) ∈ C1 in the neighborhood of some point P0 that is parameterized by s0, i.e.,

P0=(x0,y0,z0)=(f(s0),g(s0),h(s0)).

(4.64)

Assume also the coefficients a(x, y, z), b(x, y, z), c(x, y, z) ∈ C1 near P0. Then, we can describe the characteristic curves through Γ near P0 by the solution

x=X(s,t),y=Y(s,t),z=Z(s,t)

(4.65)

of the characteristic equations (4.61) which reduce to f(s), g(s), h(s) at t = 0. Therefore, the functions X, Y, Z must satisfy

Xt=a(X,Y,Z),Yt=b(X,Y,Z),Zt=c(X,Y,Z)

(4.66)

identically in s, t and also satisfy the initial conditions

X(s,0)=f(s),Y(s,0)=g(s),Z(s,0)=h(s).

(4.67)

By the theorem on existence and uniqueness of solutions to systems of ODEs, it follows that there exists a unique set of functions X(s, t), Y(s, t), Z(s, t) of class C1 satisfying (4.66), (4.67) for (s, t) near (s0, 0). Further, if we can solve equation (4.65) for s, t in terms of x, y, say s = S(x, y) and t = T(x, y), then z can be expressed as

z=υ(x,y)=Z(S(x,y),T(x,y)),

(4.68)

which represents an integral surface Σ. By (4.64), (4.67), x0 = X (s0, 0), y0 = Y (s0, 0), and by the Implicit-function Theorem, there exist solutions s = S(x, y), t = T (x, y) of

x=X(S(x,y),T(x,y)),y=Y(S(x,y),T(x,y))

(4.69)

of class C1 in a neighborhood of (x0, y0) and satisfying

s0=S(x0,y0),0=T(x0,y0)

provided the Jacobian determinant

| Xs(s0, 0)   Ys(s0, 0) ; Xt(s0, 0)   Yt(s0, 0) | ≠ 0.

(4.70)

By (4.66), (4.67) the above condition is further equivalent to

| fs(s0)   gs(s0) ; a(x0, y0, z0)   b(x0, y0, z0) | ≠ 0.

(4.71)

The above gives the local existence condition for the solution of the Cauchy problem for the quasi-linear equation. Uniqueness follows from the following theorem [109].

Theorem 4.4.1 Let P = (x0, y0, z0) lie on the integral surface S : z = v(x, y), and let Γ be the characteristic curve through P. Then Γ lies completely on S.

Example 4.4.1 [109] Consider the initial value problem for the quasi-linear equation

υy + cυx = 0,   c a constant,   and   υ(x, 0) = h(x).

Solution:

Parametrize the initial curve Γ corresponding to the initial condition above by

x = s,   y = 0,   z = h(s).

Then the characteristic equations are given by

dx/dt = c,   dy/dt = 1,   dz/dt = 0.

Solving these equations gives

x = X(s, t) = s + ct,   y = Y(s, t) = t,   z = Z(s, t) = h(s).

Finally, eliminating s and t from the above solutions, we get the general solution of the equation

z = υ(x, y) = h(x − cy).
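
The same answer is obtained by tracing the characteristics numerically. The following Python sketch (using SciPy; the wave speed c = 2 and the Gaussian profile h are assumed only for this illustration) integrates (4.61) from a point of the initial curve and confirms that z remains equal to h(x − cy) along the characteristic.

    import numpy as np
    from scipy.integrate import solve_ivp

    c = 2.0                                  # illustrative wave speed
    h = lambda x: np.exp(-x**2)              # illustrative initial profile

    # Characteristic equations (4.61) for v_y + c v_x = 0
    def char_rhs(t, w):
        x, y, z = w
        return [c, 1.0, 0.0]                 # dx/dt = c, dy/dt = 1, dz/dt = 0

    s = 0.7                                  # starting point on the initial curve y = 0
    sol = solve_ivp(char_rhs, (0.0, 1.5), [s, 0.0, h(s)], dense_output=True)

    x1, y1, z1 = sol.sol(1.5)
    print(z1, h(x1 - c*y1))                  # equal: v is constant along the characteristic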

Next, we develop the method for the general first-order equation (4.59) in n independent variables.

4.4.2    Characteristics for the General First-Order Equation

We now consider the general nonlinear first-order PDE (4.59) written in vectorial notation as

F(Dυ, υ, x) = 0,   x ∈ U ⊆ ℜⁿ,   subject to the boundary condition υ = g on O,

(4.72)

where Dυ = (υx1, υx2, …, υxn), O ⊆ ∂U, g : O → ℜ, and F ∈ C∞(ℜⁿ × ℜ × U), g ∈ C∞(O).

Now suppose v solves (4.72), and fix any point x ∈ U. We wish to calculate v(x) by finding some curve lying within U, connecting x with a point x0 ∈ O and along which we can compute v. Since v(x0) = g(x0), we hope to be able to find v along the curve connecting x0 and x.

To find the characteristic curve, let us suppose that it is described parametrically by the function x(s) = (x1(s), x2(s),…, xn(s)), the parameter s lying in some subinterval of ℜ. Assume v is a C2 solution of (4.72), and let

z(s)=υ(x(s)),

(4.73)

p(s)=Dυ(x(s));

(4.74)

i.e., p(s) = (p1(s), p2(s),…, pn(s)) = (vx1(s), vx2(s),…, vxn(s)). Then,

ṗi(s) = Σ_{j=1}^n ∂²υ/∂xi∂xj (x(s)) ẋj(s),

(4.75)

where the differentiation is with respect to s. On the other hand, differentiating (4.72) with respect to xi, we get

Σ_{j=1}^n ∂F/∂pj (Dυ, υ, x) ∂²υ/∂xj∂xi + ∂F/∂z (Dυ, υ, x) ∂υ/∂xi + ∂F/∂xi (Dυ, υ, x) = 0.

(4.76)

Now, if we set

dxj/ds (s) = ∂F/∂pj (p(s), z(s), x(s)),   j = 1, …, n,

(4.77)

and assuming that the above relation holds, then evaluating (4.76) at x = x(s), we obtain the identity

Σ_{j=1}^n ∂F/∂pj (p(s), z(s), x(s)) ∂²υ/∂xj∂xi (x(s)) + ∂F/∂z (p(s), z(s), x(s)) pi(s) + ∂F/∂xi (p(s), z(s), x(s)) = 0.

(4.78)

Next, substituting (4.77) in (4.75) and using the above identity (4.78), we get

ṗi(s) = −∂F/∂xi (p(s), z(s), x(s)) − ∂F/∂z (p(s), z(s), x(s)) pi(s),   i = 1, …, n.

(4.79)

Finally, differentiating z we have

ż(s) = Σ_{j=1}^n ∂υ/∂xj (x(s)) ẋj(s) = Σ_{j=1}^n pj(s) ∂F/∂pj (p(s), z(s), x(s)).

(4.80)

Thus, we finally have the following system of ODEs:

ṗ(s) = −DxF(p(s), z(s), x(s)) − DzF(p(s), z(s), x(s)) p(s),
ż(s) = DpF(p(s), z(s), x(s))·p(s),
ẋ(s) = DpF(p(s), z(s), x(s)),

(4.81)

where Dx, Dp, Dz are the derivatives with respect to x, p, z respectively. The above system of 2n + 1 first-order ODEs comprises the characteristic equations of the nonlinear PDE (4.72). The functions p(s), z(s), x(s) together are called the characteristics, while x(s) is called the projected characteristic onto the physical region U ⊆ ℜⁿ. Furthermore, if v ∈ C² solves the nonlinear PDE (4.72) in U and x(s) solves the last equation in (4.81), then p(s) solves the first equation and z(s) solves the second, for those s such that x(s) ∈ U.

Example 4.4.2 [95] Consider the fully nonlinear equation

υx1 υx2 = υ,   x ∈ U = {x1 > 0};    υ = x2²  on  Γ = {x1 = 0} = ∂U.

Solution

Thus, F (Dv, v, x) = F (p, z, x) = p1p2 − z, and the characteristic equations (4.81) become

ṗ1 = p1,   ṗ2 = p2,   ẋ1 = p2,   ẋ2 = p1,   ż = 2p1p2.

Integrating the above system, we get

x1(s) = p2^0(e^s − 1),   x2(s) = x^0 + p1^0(e^s − 1),   p1(s) = p1^0 e^s,   p2(s) = p2^0 e^s,   z(s) = z^0 + p1^0 p2^0(e^{2s} − 1),   z^0 = (x^0)².

We must now determine the initial values p^0 = (p1^0, p2^0). Since υ = x2² on Γ, we have p2^0 = υx2(0, x^0) = 2x^0. Then from the PDE, υx1 = υ/υx2, so p1^0 = (x^0)²/(2x^0) = x^0/2. Upon substituting in the above equations, we get

x1(s) = 2x^0(e^s − 1),   x2(s) = (x^0/2)(e^s + 1),   p1(s) = (x^0/2)e^s,   p2(s) = 2x^0 e^s,   z(s) = (x^0)² e^{2s}.

Finally, we must eliminate s and x^0 in the above system to obtain the general solution. In this regard, fix (x1, x2) ∈ U and select s and x^0 such that (x1, x2) = (x1(s), x2(s)) = (2x^0(e^s − 1), (x^0/2)(e^s + 1)). Consequently, we get

x^0 = (4x2 − x1)/4,    e^s = (x1 + 4x2)/(4x2 − x1),

and

υ(x1, x2) = z(s) = (x^0)² e^{2s} = (x1 + 4x2)²/16.
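
The result is easily verified symbolically; the following sympy sketch checks both the PDE and the boundary condition of the example.

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2', real=True)
    v = (x1 + 4*x2)**2 / 16

    print(sp.simplify(sp.diff(v, x1)*sp.diff(v, x2) - v))   # 0:  v_{x1} v_{x2} = v
    print(sp.simplify(v.subs(x1, 0) - x2**2))               # 0:  v = x2^2 on {x1 = 0}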

4.4.3    Characteristics for the Hamilton-Jacobi Equation

Let us now consider the characteristic equations for our Hamilton-Jacobi equation discussed in the beginning of the chapter, which is a typical nonlinear first-order PDE:

G(Dυ, υt, υ, x, t) = υt + H(Dυ, x) = 0,

(4.82)

where Dv = Dxv and the remaining variables have their usual meaning. For convenience, let q = (Dv, vt) = (p, pn+1), y = (x, t). Therefore,

G(q,z,y)=pn+1+H(p,x);

(4.83)

and

DqG=(DpH(p,x),1),DyG=(DxH(p,x),0),DzG=0.

Thus, the characteristic equations (4.81) become

ẋi(s) = ∂H/∂pi (p(s), x(s)),   (i = 1, 2, …, n),    ẋn+1(s) = 1,
ṗi(s) = −∂H/∂xi (p(s), x(s)),   (i = 1, 2, …, n),    ṗn+1(s) = 0,
ż(s) = DpH(p(s), x(s))·p(s) + pn+1,

(4.84)

which can be rewritten in vectorial form as

ṗ(s) = −DxH(p(s), x(s)),
ẋ(s) = DpH(p(s), x(s)),
ż(s) = DpH(p(s), x(s))·p(s) − H(p(s), x(s)).

(4.85)

The first two of the above equations are clearly Hamilton’s canonical equations, while the third equation defines the characteristic surface. The variable z is also termed the action variable, which represents the cost-functional for the variational problem

z(t) = min_{x(.)} ∫_0^t L(x(s), ẋ(s)) ds

corresponding to the Hamilton-Jacobi equation, where L(x, ẋ) is the Lagrangian function

L(x, ẋ) = pᵀẋ − H(x, p).

Thus, we have made a connection between Hamilton’s canonical equations and the Hamilton-Jacobi equation, and it is clear that a solution for one implies a solution for the other. Nevertheless, neither is easy to solve in general, although, for some systems, the PDE does sometimes offer some leeway, and in fact, this is the motivation behind Hamilton-Jacobi theory.

4.5    Legendre Transform and Hopf-Lax Formula

Though the method of characteristics provides a remarkable way of integrating the HJE, in general the characteristic equations, and in particular Hamilton’s canonical equations (4.85), are very difficult to integrate. Thus, other approaches for integrating the HJE had to be sought. One method, due to Hopf and Lax [95], which applies to Hamiltonians that are independent of q, deserves mention. For simplicity we shall assume that M is an open subset of ℜⁿ, and consider the initial-value problem for the HJE

υt + H(Dυ) = 0   in ℜⁿ × (0, ∞),     υ = g   on ℜⁿ × {t = 0},

(4.86)

where g : ℜn →ℜ and H : ℜn →ℜ is the Hamiltonian function which is independent of q. Let the Lagrangian function L : T M →ℜ satisfy the following assumptions.

Assumption 4.5.1 The Lagrangian function is convex and lim_{|q|→∞} L(q)/|q| = +∞.

Note that the convexity in the above assumption also implies continuity. Furthermore, for simplicity, we write L = L(q), where q now plays the role of the velocity variable q̇. We then have the following definition.

Definition 4.5.1 The Legendre transform of L is the Hamiltonian function H defined by

H(p) = sup_{q∈ℜⁿ} { p·q − L(q) },   p ∈ T∗qℜⁿ = ℜⁿ,
     = p·q∗ − L(q∗) = p·q∗(p) − L(q∗(p)),

for some q∗ = q∗(p).

We note that the “sup” in the above definition is really a “max,” i.e., there exists some q∗ ∈ ℜⁿ for which the mapping q ↦ p·q − L(q) has a maximum at q = q∗. Further, if L is differentiable, then the equation p = DL(q∗) is solvable for q∗ in terms of p, i.e., q∗ = q∗(p), and hence the last expression above.

An important property of the Legendre transform [37] is that it is involutive, i.e., if Lg denotes the Legendre transform operation, then Lg²(L) = L and Lg(H) = L. A stronger result is the following.
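
The involution property is easy to check in simple cases. The following sympy sketch computes the Legendre transform of the convex Lagrangian L(q) = q⁴/4 (an illustrative choice) and verifies that transforming twice returns L; the helper function legendre is a name introduced here only for this sketch.

    import sympy as sp

    q, p = sp.symbols('q p', positive=True)

    def legendre(f, var, conj):
        # Legendre transform of a smooth, strictly convex f(var), evaluated at the
        # stationary point where conj = f'(var)
        crit = sp.solve(sp.Eq(conj, sp.diff(f, var)), var)[0]
        return sp.simplify((conj*var - f).subs(var, crit))

    L = q**4 / 4                                 # illustrative convex Lagrangian
    H = legendre(L, q, p)                        # H(p) = (3/4) p^(4/3)
    print(H)
    print(sp.simplify(legendre(H, p, q) - L))    # 0: the transform is involutive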

Theorem 4.5.1 (Convex duality of Hamiltonians and Lagrangians). Assume L satisfies Assumption 4.5.1, and define H as the Legendre transform of L. Then, H also satisfies the following:

(i) the mapping p ↦ H(p) is convex,

(ii) lim_{|p|→∞} H(p)/|p| = +∞.

We now use the variational principle to obtain the solution of the initial-value problem (4.86), namely, the Hopf-Lax formula. Accordingly, consider the following variational problem of minimizing the action function:

∫_0^t L(ẇ(s)) ds + g(w(0))

(4.87)

over functions w : [0, t] →ℜn with w(t) = x. The value-function (or cost-to-go) for this minimization problem is given by

υ(x, t) := inf { ∫_0^t L(ẇ(s)) ds + g(y) | w(0) = y, w(t) = x },

(4.88)

with the infimum taken over all C1 functions w(.) with w(t) = x. Then we have the following result.

Theorem 4.5.2 (Hopf-Lax Formula). Assume g is Lipschitz continuous. Then, for x ∈ ℜⁿ, t > 0, the solution v = v(x, t) to the variational problem (4.87) is

υ(x, t) = min_{y∈ℜⁿ} { t L( (x − y)/t ) + g(y) }.

(4.89)

The proof of the above theorem can be found in [95].

The next theorem asserts that the Hopf-Lax formula indeed solves the initial-value problem of the HJE (4.86) whenever v in (4.89) is differentiable.

Theorem 4.5.3 Assume H is convex and lim_{|p|→∞} H(p)/|p| = +∞. Further, suppose x ∈ ℜⁿ, t > 0, and v in (4.89) is differentiable at the point (x, t) ∈ ℜⁿ × (0, ∞). Then (4.89) satisfies the HJE (4.86) at (x, t), with the initial value v(x, 0) = g(x).

Again the proof of the above theorem can be found in [95]. The Hopf-Lax formula provides a reasonable weak solution (a Lipschitz-continuous function which satisfies the PDE almost everywhere) to the initial-value problem for the HJE. It is useful for variational problems and mechanical systems for which the Hamiltonians are independent of the configuration coordinates, but is very limited for more general problems.
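
As a numerical illustration, the following NumPy sketch evaluates the Hopf-Lax formula (4.89) by direct minimization over a grid, for the assumed choices H(p) = |p|²/2 (so that L(q) = |q|²/2) and the Lipschitz initial data g(x) = |x|, and compares the result with the closed-form minimum.

    import numpy as np

    L = lambda q: 0.5*q**2                  # L(q) = |q|^2/2, so H(p) = |p|^2/2
    g = lambda y: np.abs(y)                 # Lipschitz initial data (assumed example)

    def hopf_lax(x, t, ygrid):
        return np.min(t*L((x - ygrid)/t) + g(ygrid))

    ygrid = np.linspace(-10.0, 10.0, 20001)
    t = 1.0
    for x in (0.0, 0.3, 2.0):
        v_num = hopf_lax(x, t, ygrid)
        # closed-form minimum: x^2/(2t) if |x| <= t, and |x| - t/2 otherwise
        v_exact = x**2/(2*t) if abs(x) <= t else abs(x) - t/2
        print(x, v_num, v_exact)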

4.5.1    Viscosity Solutions of the HJE

It was long recognized that the HJE, being nonlinear, may not admit classical (or smooth) solutions even for simple situations [56, 98, 287]. To overcome this difficulty, Crandall and Lions [186] introduced the concept of viscosity (or generalized) solutions in the early 1980s [56, 83, 98, 186, 287], which has since had wide application. Under the assumption of differentiability of v, any solution v of the HJE will be referred to as a classical solution if it satisfies it for all x ∈ X. In most cases however, the function v fails to be differentiable at some points x ∈ X, and hence cannot satisfy the HJE everywhere in X. In such cases, we would like to consider solutions that are closest to being differentiable in an extended sense. The closest such idea is that of Lipschitz-continuous solutions. This leads to the concept of generalized solutions which we now define [83, 98].

Definition 4.5.2 Consider the more general form of the Hamiltonian function H : T∗M → ℜ and the Cauchy problem

H(x,Dxυ(x))=0,υ(x,0)=g(x)

(4.90)

where Dxv(x) denotes some derivative of v at x, which is not necessarily a classical derivative. Now suppose v is locally Lipschitz on N, i.e., for every compact set O ⊂ N and x1, x2 ∈ O there exists a constant kO > 0 such that

|υ(x1) − υ(x2)| ≤ kO ‖x1 − x2‖

(it is Lipschitz if kO = k, independent of O); then v is a generalized solution of (4.90) if it satisfies it for almost all x ∈ X.

Moreover, since every locally Lipschitz function is differentiable at almost all points x ∈ N, the idea of generalized solutions indeed makes sense. However, the concept also implies the lack of uniqueness of generalized solutions; thus, there can be infinitely many generalized solutions. In this section, we shall restrict ourselves to the class of generalized solutions referred to as viscosity solutions, which are unique. Other types of generalized solutions such as “minimax” and “proximal” solutions are also available in the literature [83]. We define viscosity solutions next.

Assume v is continuous in N, and define the following sets, which are respectively the superdifferential and the subdifferential of v at x ∈ N:

D⁺υ(x) = { p ∈ ℜⁿ : limsup_{x′→x, x′∈N} [ υ(x′) − υ(x) − p·(x′ − x) ] / ‖x′ − x‖ ≤ 0 },

(4.91)

D⁻υ(x) = { q ∈ ℜⁿ : liminf_{x′→x, x′∈N} [ υ(x′) − υ(x) − q·(x′ − x) ] / ‖x′ − x‖ ≥ 0 }.

(4.92)

Remark 4.5.1 If both D⁺v(x) and D⁻v(x) are nonempty at some x, then D⁺v(x) = D⁻v(x) and v is differentiable at x. We now have the following definitions of viscosity solutions.

Definition 4.5.3 A continuous function v is a viscosity solution of the HJE (4.90) if it is both a viscosity subsolution and a viscosity supersolution, i.e., it satisfies the following conditions:

H(x, p) ≤ 0,   ∀x ∈ N, ∀p ∈ D⁺υ(x)

(4.93)

H(x, q) ≥ 0,   ∀x ∈ N, ∀q ∈ D⁻υ(x)

(4.94)

respectively.

An alternative definition of viscosity subsolutions and supersolutions is given in terms of test functions as follows.

Definition 4.5.4 A continuous function v is a viscosity subsolution of the HJE (4.90) if, for any φ ∈ C¹,

H(x0, Dφ(x0)) ≤ 0

at any local maximum point x0 of v − φ. Similarly, v is a viscosity supersolution if, for any φ ∈ C¹,

H(x0, Dφ(x0)) ≥ 0

at any local minimum point x0 of v − φ.
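
To see the two tests of Definition 4.5.4 in action, consider the assumed one-dimensional example (not taken from the text) H(x, v′) = |v′| − 1 = 0 on (−1, 1) with v(±1) = 0: the candidate v(x) = 1 − |x| passes both tests, whereas w(x) = |x| − 1 fails the supersolution test at its kink. The following NumPy sketch illustrates this.

    import numpy as np

    H = lambda x, p: abs(p) - 1.0            # H(x, p) = |p| - 1 (assumed example)
    v = lambda x: 1.0 - np.abs(x)
    w = lambda x: np.abs(x) - 1.0
    xs = np.linspace(-1.0, 1.0, 2001)

    # Test functions phi(x) = c*x with |c| < 1: v - phi has a local maximum at x0 = 0,
    # and the subsolution test H(x0, phi'(x0)) <= 0 passes there.
    for c in (-0.5, 0.0, 0.5):
        x0 = xs[np.argmax(v(xs) - c*xs)]
        print(c, x0, H(x0, c))               # x0 = 0 and H <= 0 in each case

    # For w, the test function phi = 0 gives a local minimum of w - phi at x0 = 0,
    # where H(0, 0) = -1 < 0: the supersolution test fails.
    x0 = xs[np.argmin(w(xs))]
    print(x0, H(x0, 0.0))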

Finally, for the theory of viscosity solutions to be meaningful, it should be consistent with the notion of classical solutions. Thus, we have the following relationship between viscosity solutions and classical solutions [56].

Proposition 4.5.1 If v ∈ C¹(N) is a classical solution of the HJE (4.90), then v is a viscosity solution, and conversely, if v ∈ C¹(N) is a viscosity solution of (4.90), then v is a classical solution.

This states in essence that every classical solution is a viscosity solution. Furthermore, the following proposition gives a connection with Lipschitz-continuous solutions [56].

Proposition 4.5.2 (a) If v ∈ C(N) is a viscosity solution of (4.90), then

H(x,Dxv)=0

at any point x ∈ N where v is differentiable; (b) if v is locally Lipschitz-continuous and is a viscosity solution of (4.90), then

H(x,Dxv)=0

almost everywhere in N.

Lastly, the following proposition guarantees uniqueness of viscosity solutions [95, 98].

Proposition 4.5.3 Suppose H(x, p) satisfies the following Lipschitz conditions:

|H(x, p) − H(x, q)| ≤ k‖p − q‖,     |H(x, p) − H(x′, p)| ≤ k‖x − x′‖(1 + ‖p‖)

for some k ≥ 0 and all x, x′, p, q ∈ ℜⁿ; then there exists at most one viscosity solution to the HJE (4.90).

The theory of viscosity solutions is however not limited to the HJE. Indeed, the theory applies to any first-order equation of the types that we have discussed in the beginning of the chapter and also second-order equations of parabolic type.

4.6    Notes and Bibliography

The material in Sections 1-4 on Hamiltonian mechanics is collected from the References [1, 37, 115, 122, 200] and [127], though we have relied more heavily on [122, 127] and [115]. On the other hand, the geometry of the Hamilton-Jacobi equation and a deeper discussion of its associated Lagrangian-submanifolds can be found in [1, 37, 200]. In fact, these are the standard references on the subject. A more classical treatment of the HJE can also be found in Whittaker [271] and a host of hundreds of excellent books in many libraries.

Section 4.3, dealing with an introduction to nonlinear lattices, is mainly from [258], and more advanced discussions on the subject can be found in the References [3]-[5], [72, 123, 201].

Finally, Section 4.4 on first-order PDEs is mainly from [95, 109]. More exhaustive discussion on viscosity and generalized solutions of HJEs can be found in [56, 83] from the deterministic point of view, and in [98, 287] from the stochastic point of view.

1Holonomic if the constraints on the system are expressible as equality constraints on the coordinates. Conservative if the forces are derivable from a (possibly time-dependent) potential.

2To be defined later, see also [37].

3ω2 is nondegenerate if ω2(X1, X2) = 0 for all vector-fields X2 (smooth sections of TT∗M) implies X1 = 0.

4A coordinate qi is cyclic if it does not enter into the Lagrangian, i.e., ∂L/∂qi = 0.

5The Lie-algebra of SL(n, ℜ), the special linear group of matrices on ℜ with determinant 1 [38].
