III.49 Linear and Nonlinear Waves and Solitons


Richard S. Palais

1 John Scott Russell and the Great Wave of Translation

To the world at large, John Scott Russell is known as the naval architect who designed The Great Eastern, a steamship larger than any built before. But long after The Great Eastern has been forgotten, Russell will be remembered by mathematicians as the man who, despite limited mathematical training and background, was the first person to recognize the highly important mathematical concept known as a soliton, which he referred to as “the great wave of translation.” Here is his oft-quoted passage in which he describes how he first became acquainted with it:

I was observing the motion of a boat which was rapidly drawn along a narrow channel by a pair of horses, when the boat suddenly stopped—not so the mass of water in the channel which it had put in motion; it accumulated round the prow of the vessel in a state of violent agitation, then suddenly leaving it behind, rolled forward with great velocity, assuming the form of a large solitary elevation, a rounded, smooth and well-defined heap of water, which continued its course along the channel apparently without change of form or diminution of speed. I followed it on horseback, and overtook it still rolling on at a rate of some eight or nine miles an hour, preserving its original figure some thirty feet long and a foot to a foot and a half in height. Its height gradually diminished, and after a chase of one or two miles I lost it in the windings of the channel. Such, in the month of August 1834, was my first chance interview with that singular and beautiful phenomenon which I have called the Wave of Translation.

Russell (1844)

You may feel that there is nothing unusual about what Russell describes here, and indeed many before and since have watched this same scenario play out without noticing anything out of the ordinary. But Russell was very familiar with wave phenomena and had a scientist’s keenly observant eye. What struck him was the remarkable stability of the bow wave as it traveled over a long distance. He knew that if one tried to create a traveling water wave on, say, a calm lake, it would soon disperse into a train of smaller wavelets—it would not just go marching along as a single “heap” over a long distance. There was clearly something very special about water waves traveling in a narrow and shallow channel.

Russell became fascinated—even a little obsessed—with his discovery. He built a wave tank behind his home and proceeded to do extensive experiments, recording the results as data and sketches in his notebooks. He found, for example, that the speed of a soliton depended on its height, and he was even able to discover the correct formula for the speed as a function of height. More surprising still, in Russell’s notebooks one finds remarkable sketches of a two-soliton interaction—something that would evoke amazement when it was rediscovered as a rigorous solution to the KdV equation (see section 3 below) more than a hundred years later.

However, as we shall see, solitons are very much a nonlinear phenomenon, and when some of the best mathematicians of Russell’s day, notably Stokes and Airy, tried to understand Russell’s observations using the linearized theory of water waves that was then available, they failed to find any trace of soliton-like behavior and expressed doubts that what Russell had seen was real.

It was not until after Russell’s death, with the more sophisticated nonlinear mathematical treatment by Boussinesq in 1871 and by Korteweg and de Vries in 1895, that Russell’s careful observations and experiments were at last seen to be in complete agreement with mathematical theory. And it took another seventy years before the full importance of the great wave of translation was recognized, after which it became an object of intensive study for the rest of the twentieth century.

2 The Korteweg-de Vries Equation

Korteweg and de Vries were the first people to derive the appropriate differential equation to describe the motion of a wave in a shallow channel. We can write their equation, usually called the KdV equation, in a succinct form as follows:

u_t + uu_x + δ²u_xxx = 0.

Here, u is a function of two variables, x and t, which represent space and time, respectively. “Space” is one dimensional, so x is a real number, and u(x, t) represents the height of the wave at x at time t. The notation u_t is shorthand for ∂u/∂t; similarly, u_x stands for ∂u/∂x and u_xxx stands for ∂³u/∂x³.

This is an example of an evolution equation: if, for each t, we write u(t) for the function from ℝ to ℝ that takes x to u(x, t), then it describes how the function u(t) “evolves” over time. The Cauchy problem for an evolution equation is the problem of determining this evolution from knowledge of its initial value u(0).

2.1 Some Model Equations

To put the KdV equation into perspective, it is useful to think briefly about three other evolution equations. The first is the classic WAVE EQUATION [I.3 §5.4]

u_tt - c²u_xx = 0.

To solve the Cauchy problem for this equation, we factor the wave operator (∂²/∂t²) - c²(∂²/∂x²) as a product ((∂/∂t) - c(∂/∂x))((∂/∂t) + c(∂/∂x)). Then we transform to so-called characteristic coordinates ξ = x - ct, η = x + ct. The equation becomes ∂²u/∂ξ∂η = 0, which clearly has the general solution u(ξ, η) = F(ξ) + G(η). Transforming back to “laboratory coordinates” x, t, the general solution is u(x, t) = F(x - ct) + G(x + ct). If the initial shape of the wave is u(x, 0) = u₀(x) and its initial velocity is u_t(x, 0) = υ₀(x), then an easy algebraic computation gives the following very explicit formula:

u(x, t) = ½[u₀(x - ct) + u₀(x + ct)] + (1/2c) ∫_{x-ct}^{x+ct} υ₀(y) dy,

known as “d’Alembert’s solution” of the Cauchy problem for the wave equation.

Note the geometric interpretation in the important “plucked string” case, υ₀ = 0: the initial profile u₀ breaks up into the sum of two “traveling waves,” both with the same profile ½u₀, one traveling to the right and the other to the left, both with speed c. It is an easy exercise to derive d’Alembert’s solution using the following hint: since u₀(x) = F(x) + G(x), we have u₀´(x) = F´(x) + G´(x), while υ₀(x) = u_t(x, 0) = -cF´(x) + cG´(x).
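As a minimal numerical sketch of d’Alembert’s solution, assuming the standard formula u(x, t) = ½[u₀(x - ct) + u₀(x + ct)] + (1/2c)∫υ₀ over [x - ct, x + ct]: the function name `dalembert`, the trapezoid-rule quadrature, and the Gaussian test profile are illustrative choices, not part of the text.

```python
import numpy as np

def dalembert(u0, v0, c, x, t, n_quad=2001):
    """Evaluate d'Alembert's solution of u_tt - c^2 u_xx = 0 at the point (x, t).

    u0, v0 : vectorized callables for the initial shape and initial velocity.
    The velocity integral uses the trapezoid rule (an assumption of this
    sketch; any quadrature rule would do).
    """
    travelling = 0.5 * (u0(x - c * t) + u0(x + c * t))
    y = np.linspace(x - c * t, x + c * t, n_quad)
    vals = v0(y)
    h = y[1] - y[0]
    integral = h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))
    return travelling + integral / (2.0 * c)

# Plucked string (v0 = 0): the bump splits into two half-height copies.
bump = lambda s: np.exp(-s**2)
zero = lambda s: np.zeros_like(s)
print(dalembert(bump, zero, c=1.0, x=3.0, t=3.0))  # close to 0.5 * bump(0) = 0.5
```

At x = ct the right-moving copy has arrived with half the original height, which is the “plucked string” picture described above.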

The next equation to think about is

u_t + u_xxx = 0, (1)

which we can obtain from the KdV equation (taking δ = 1) if we drop the nonlinear term uu_x. This equation is not just linear but also translation invariant (meaning that if u(x, t) is a solution, then so is u(x - x₀, t - t₀) for any constants x₀ and t₀). Such equations can be solved using THE FOURIER TRANSFORM [III.27]. Let us try to find a “plane-wave” solution of the form u(x, t) = e^{i(kx-ωt)}. If we substitute this into (1), then we obtain the equation

-iωe^{i(kx-ωt)} = ik³e^{i(kx-ωt)},

and therefore the simple algebraic equation ω + k³ = 0. This is called the dispersion relation of (1): with the help of the Fourier transform it is not hard to show that every solution is a superposition of solutions of the form e^{i(kx-ωt)}, and the dispersion relation tells us how the “wave number” k is related to the “angular frequency” ω in each of these elementary solutions.

The function e^{i(kx-ωt)} represents a wave that travels at a speed of ω/k, which we have just shown to be equal to -k². Therefore, the different plane-wave components of the solution travel at different speeds: the higher the angular frequency, the greater the speed. For this reason, the equation (1) is called dispersive.
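The superposition principle can be illustrated numerically. The sketch below assumes periodic boundary conditions on [0, length) (the text works on the whole line) and uses NumPy's FFT; the function name `evolve_linear_dispersive` is hypothetical. Each Fourier mode is simply multiplied by the phase factor dictated by ω = -k³.

```python
import numpy as np

def evolve_linear_dispersive(u0, length, t):
    """Solve u_t + u_xxx = 0 with periodic boundary conditions on [0, length).

    Each Fourier mode e^{ikx} becomes e^{i(kx - omega t)} with omega = -k^3,
    i.e. it is multiplied by exp(-i * omega * t).
    """
    n = u0.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)  # wave numbers
    omega = -k**3                                      # dispersion relation
    u_hat = np.fft.fft(u0) * np.exp(-1j * omega * t)
    return np.real(np.fft.ifft(u_hat))

n, length = 64, 2.0 * np.pi
x = length * np.arange(n) / n
t = 0.3
for k in (1, 2, 3):
    u = evolve_linear_dispersive(np.cos(k * x), length, t)
    # A pure mode is translated at speed omega / k = -k^2 (to the left).
    print(k, np.max(np.abs(u - np.cos(k * (x + k**2 * t)))))
```

Because the three modes move at speeds -1, -4, and -9, any profile built from them changes shape as it evolves, which is exactly what “dispersive” means here.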

What happens if instead we omit the u_xxx term from the KdV equation? Then we obtain the inviscid Burgers equation

u_t + uu_x = 0. (2)

The term uu_x can be rewritten as (∂/∂x)(½u²). Let us consider the integral ∫_{-∞}^{∞} u(x, t) dx, which is a function of t. The derivative of this function is ∫_{-∞}^{∞} u_t dx, which equation (2) tells us is equal to

-∫_{-∞}^{∞} (∂/∂x)(½u²) dx,

which equals -[½u(x, t)²]_{-∞}^{∞}. Therefore, if u(x, t)² vanishes at infinity, then ∫_{-∞}^{∞} u(x, t) dx is a “constant of the motion.” We say that the inviscid Burgers equation is a conservation law. (The argument we have just used can be used for any equation of the form u_t = (F(u))_x, where F is a smooth function of u and its partial derivatives with respect to x. This is known as the general conservation law. For example, taking F(u) = -(½u² + δ²u_xx) gives rise to the KdV equation.)

The inviscid Burgers equation (and other conservation laws where F is a function just of u) can be solved using the method of characteristics. The idea of this method is to look for smooth curves (x(s), t(s)) in the xt-plane along which the solution to the Cauchy problem is constant. Suppose that s₀ is such that t(s₀) = 0, and write x₀ for x(s₀). Then the constant value that the solution u(x, t) will have to take along this curve is u(x₀, 0), which we also write as u₀(x₀). The derivative of u along this so-called characteristic curve is (d/ds)u(x(s), t(s)) = u_x x´ + u_t t´, so if we want the solution to be constant along the curve, then we need this to be 0. Therefore, using the fact that u_t = -uu_x, we find that

dx/dt = x´(s)/t´(s) = -u_t/u_x = u(x(s), t(s)) = u₀(x₀),

so the characteristic curve is a straight line of slope u₀(x₀). In other words, u has the constant value u₀(x₀) along the line x = x₀ + u₀(x₀)t.

Note the following geometric interpretation of this last result: to find the wave profile at time t (i.e., the graph of the map x ↦ u(x, t)), we translate each point (x, u₀(x)) of the initial profile to the right by the amount u₀(x)t. Suppose we look at a portion of the initial profile where u₀ is decreasing. Then the earlier, and higher, parts of the initial wave are translated at a greater speed (since u₀(x) is larger), so that the negative slope of the wave becomes more negative. Indeed, after a finite time the earlier part of the wave “catches up” with the later part, which means that we no longer have a graph of a function. The first time at which this sort of problem happens is called the “breaking time,” since one can visualize it as the breaking of a wave. This process is usually referred to as shock formation, or steepening and breaking of the wave profile: once again, the phenomenon occurs for many other conservation laws.
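The translation rule can be tried out directly. In the sketch below the function names are hypothetical, and the breaking-time formula T_B = -1/min u₀´ is not stated in the text: it is the first time at which the map x₀ ↦ x₀ + u₀(x₀)t stops being increasing, that is, at which 1 + u₀´(x₀)t reaches 0 for some x₀. The initial profile cos(πx) is chosen only as an example.

```python
import numpy as np

def characteristic_profile(x0, u0, t):
    """Move each point (x0, u0(x0)) of the initial profile right by u0(x0) * t."""
    return x0 + u0 * t, u0

def breaking_time(x0, u0):
    """First time two characteristics cross: T_B = -1 / min u0'(x0).

    The slope is estimated by finite differences; returns inf when u0 is
    nowhere decreasing (no shock forms).
    """
    steepest = np.gradient(u0, x0).min()
    return np.inf if steepest >= 0 else -1.0 / steepest

x0 = np.linspace(0.0, 2.0, 2001)
u0 = np.cos(np.pi * x0)
t_b = breaking_time(x0, u0)
print(t_b)  # close to 1/pi for this profile

# Before T_B the translated points are still in order; afterwards they are not,
# so the translated curve is no longer the graph of a function.
for t in (0.9 * t_b, 1.5 * t_b):
    x_t, _ = characteristic_profile(x0, u0, t)
    print(t, bool(np.all(np.diff(x_t) > 0)))
```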

2.2 Split-Stepping

Now let us return to the KdV equation itself, in the form u_t = -uu_x - u_xxx. Why is it that this equation gives rise to the remarkable stability of the solutions that was observed experimentally by Russell? Intuitively, the reason is that there is a balance between the dispersing effect of the u_xxx term and the shock-forming effect of the uu_x term.

There turns out to be a very general technique for analyzing balances of this kind. In the pure-mathematics community it is usually called the Trotter product formula, while in the applied-mathematics and numerical-analysis communities it is called split-stepping. The rough idea is simple: as t increases to t + Δt, you first change u to u - u_xxxΔt as would be required by the equation u_t = -u_xxx, and then you take a further step to u - u_xxxΔt - uu_xΔt, the small change required by the equation u_t = -uu_x. To work out the function u(x, t), you start at the initial function u₀ and take a succession of alternating small steps of this form. You then take the limit as the step size tends to zero.

Split-stepping suggests a way to understand the mechanism by which dispersion from u_xxx balances shock formation from uu_x in KdV. If we imagine the evolution of the wave profile as made up of a succession of pairs of small steps in this way, then when u, u_x, and u_xxx are not too large, the steepening mechanism will dominate. But as the time t approaches the breaking time T_B, u remains bounded (since it is made out of horizontally translated parts of u₀). It is not hard to prove that the maximum slope (that is, the maximum value of u_x) blows up like the function (T_B - t)^-1, while at the same place u_xxx blows up like the function (T_B - t)^-5. Thus, near the breaking time, and breaking point, the u_xxx term will dwarf the nonlinearity and will disperse the incipient shock. Thus, the stability is caused by a kind of negative feedback. Computer simulations show just such a scenario playing out.
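A minimal split-step sketch for u_t = -uu_x - u_xxx follows, under several assumptions that are not in the text: periodic boundary conditions, a Fourier (pseudo-spectral) discretization, an exact Fourier-space solve for the dispersive substep in place of the explicit increment u - u_xxxΔt, and a Runge-Kutta (RK4) step for the steepening substep, both swapped in for numerical stability. The names `kdv_split_step` and `soliton`, the grid, and the step size are illustrative. The test profile 3c sech²(½√c (x - ct)) is the traveling wave of this form of the equation.

```python
import numpy as np

def kdv_split_step(u0, length, dt, n_steps):
    """Advance u_t = -u u_x - u_xxx on a periodic interval by alternating substeps."""
    n = u0.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)  # wave numbers
    linear_phase = np.exp(1j * k**3 * dt)  # exact solution operator of u_t = -u_xxx

    def burgers_rhs(v):
        # -v v_x in conservation form -(v^2 / 2)_x, differentiated spectrally
        return -np.real(np.fft.ifft(1j * k * np.fft.fft(0.5 * v * v)))

    u = np.asarray(u0, dtype=float).copy()
    for _ in range(n_steps):
        # Substep 1: dispersion only, u_t = -u_xxx
        u = np.real(np.fft.ifft(linear_phase * np.fft.fft(u)))
        # Substep 2: steepening only, u_t = -u u_x (one RK4 step)
        k1 = burgers_rhs(u)
        k2 = burgers_rhs(u + 0.5 * dt * k1)
        k3 = burgers_rhs(u + 0.5 * dt * k2)
        k4 = burgers_rhs(u + dt * k3)
        u = u + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return u

def soliton(x, c, t=0.0):
    """Traveling wave 3c sech^2(sqrt(c)/2 (x - c t)) of u_t + u u_x + u_xxx = 0."""
    return 3.0 * c / np.cosh(0.5 * np.sqrt(c) * (x - c * t)) ** 2

n, length = 256, 40.0
x = -length / 2 + length * np.arange(n) / n
u_start = soliton(x, c=4.0)
u_end = kdv_split_step(u_start, length, dt=5e-4, n_steps=1000)  # final time 0.5
print(x[np.argmax(u_end)])           # the peak should have moved from 0 to about c * t = 2
print(u_end.mean() - u_start.mean())  # the integral of u is conserved up to rounding
```

Neither substep alone preserves the profile: the first disperses it and the second steepens it. Only their alternation keeps the pulse together, which is the balance described above.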

3 Solitons and Their Interactions

We have just seen that the KdV equation expresses a balance between dispersion from its third-derivative term and the shock-forming tendency of its nonlinear term, and in fact many models of one-dimensional physical systems that exhibit mild dispersion and weak nonlinearity lead to KdV as the controlling equation at some level of approximation.

In their 1895 paper, Korteweg and de Vries introduced the KdV equation and gave a convincing mathematical argument that this was the equation that governed wave motion in a shallow canal. They also showed by explicit computation that it admitted traveling-wave solutions that had exactly the properties that had been described by Russell, including the relation of height to speed that Russell had determined experimentally with the help of his wave tank.

But it was only much later that further remarkable properties of the KdV equation became evident. In 1954, Fermi, Pasta, and Ulam (FPU) used one of the very first digital computers to perform numerical experiments on an elastic string with a nonlinear restoring force, and their results contradicted the then current expectations of how energy should distribute itself among the normal modes of such a system. A decade later, Zabusky and Kruskal reexamined the FPU results in a famous paper in which they showed that the FPU string was well approximated by the KdV equation. They then did their own computer experiments, solving the Cauchy problem for KdV with initial conditions corresponding to those used in the FPU experiments. In the results of these simulations they observed the first example of a “soliton,” a term that they coined to describe a remarkable particle-like behavior (elastic scattering) exhibited by certain KdV solutions. Zabusky and Kruskal showed how the coherence of solitons explained the anomalous results observed by Fermi, Pasta, and Ulam. But in solving that mystery they had uncovered a larger one: the behavior of KdV solitons was unlike anything seen before in applied mathematics, and the search for an explanation of their remarkable behavior led to a series of discoveries that changed the course of applied mathematics for the next thirty years. We shall now fill in some of the mathematical details behind the above sketch, beginning with a discussion of explicit solutions to the KdV equation.

It is straightforward to find the traveling-wave solutions of KdV. From here on we use the normalization u_t + 6uu_x + u_xxx = 0, to which the equation of section 2 can be brought by rescaling u, x, and t. First, we substitute a traveling wave u(x, t) = f(x - ct) into KdV, obtaining the ordinary differential equation -cf´ + 6ff´ + f´´´ = 0. If we add as a boundary condition that f should vanish at infinity, then a routine computation leads to the following two-parameter family of traveling-wave solutions:

u(x, t) = 2a² sech²(a(x - 4a²t + d)).

These are the solitary waves seen by Russell, and they are now usually referred to as the 1-soliton solutions of KdV. Note that their amplitude, 2a², is just half their speed, 4a², while their “width” is proportional to a^-1. Thus, taller solitary waves are thinner and move faster.
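The “routine computation” can be sketched as follows. This is an outline under stated assumptions, not the author's own derivation: the ODE is -cf´ + 6ff´ + f´´´ = 0 in the variable ξ = x - ct, and f, f´, and f´´ all tend to 0 at infinity, so that both constants of integration vanish.

```latex
\begin{aligned}
-cf' + 6ff' + f''' &= 0 \\
\Longrightarrow\quad -cf + 3f^{2} + f'' &= 0
  && \text{(integrate once in } \xi\text{)} \\
\Longrightarrow\quad -\tfrac{c}{2}f^{2} + f^{3} + \tfrac{1}{2}(f')^{2} &= 0
  && \text{(multiply by } f' \text{ and integrate)} \\
\Longrightarrow\quad (f')^{2} &= f^{2}\,(c - 2f). \\[4pt]
\text{Check: } f = \tfrac{c}{2}\,\operatorname{sech}^{2}\!\bigl(\tfrac{\sqrt{c}}{2}\,\xi\bigr)
  &\;\Rightarrow\; f' = -\sqrt{c}\, f \tanh\!\bigl(\tfrac{\sqrt{c}}{2}\,\xi\bigr), \\
(f')^{2} = c f^{2}\bigl(1 - \operatorname{sech}^{2}\bigr)
  &= c f^{2}\Bigl(1 - \tfrac{2f}{c}\Bigr) = f^{2}\,(c - 2f).
\end{aligned}
```

Writing c = 4a² and allowing a translation d in ξ turns f into 2a² sech²(a(x - 4a²t + d)), the family displayed above.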

Next, following Toda, we will “derive”¹ the 2-soliton solutions of KdV. Rewrite the 1-soliton solution as u(x, t) = 2(∂²/∂x²) log cosh(a(x - 4a²t + d)), or u(x, t) = 2(∂²/∂x²) log K(x, t), where K(x, t) = 1 + e^{2a(x-4a²t+d)}. We now try to generalize, looking for solutions of the form u(x, t) = 2(∂²/∂x²) log K(x, t), with K(x, t) = 1 + A₁e^{2η₁} + A₂e^{2η₂} + A₃e^{2(η₁+η₂)}, where η_i = a_i(x - 4a_i²t + d_i), and we shall choose the A_i and d_i by substituting into KdV and seeing what works. One can check that KdV is satisfied for u(x, t) of this form and arbitrary A₁, A₂, a₁, a₂, d₁, d₂, provided that we define A₃ = ((a₂ - a₁)/(a₁ + a₂))²A₁A₂, and solutions of KdV arising in this way are called the KdV 2-soliton solutions.

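It can now be shown that for the choices a₁ = 1 and a₂ = 2 (with A₁ = A₂ = 3 and d₁ = d₂ = 0),

u(x, t) = 12 (3 + 4 cosh(2x - 8t) + cosh(4x - 64t)) / (3 cosh(x - 28t) + cosh(3x - 36t))².

In particular, u(x, 0) = 6 sech²(x), u(x, t) is asymptotically equal to 2 sech²(x - 4t - φ) + 8 sech²(2(x - 16t) + φ) when t is large and negative, and u(x, t) is asymptotically equal to 2 sech²(x - 4t + φ) + 8 sech²(2(x - 16t) - φ) when t is large and positive, where φ = ½ log(3).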
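These claims are easy to sanity-check numerically. The sketch below assumes the closed form u = 12(3 + 4 cosh(2x - 8t) + cosh(4x - 64t)) / (3 cosh(x - 28t) + cosh(3x - 36t))² for the case a₁ = 1, a₂ = 2, and the phase shift φ = ½ log 3; both are assumptions of this example, as are the function names and the sample times t = ±1.5.

```python
import numpy as np

def sech2(z):
    return 1.0 / np.cosh(z) ** 2

def two_soliton(x, t):
    """Closed form assumed here for a1 = 1, a2 = 2 (normalization u_t + 6 u u_x + u_xxx = 0)."""
    num = 3.0 + 4.0 * np.cosh(2 * x - 8 * t) + np.cosh(4 * x - 64 * t)
    den = (3.0 * np.cosh(x - 28 * t) + np.cosh(3 * x - 36 * t)) ** 2
    return 12.0 * num / den

def separated_pair(x, t):
    """Sum of two 1-solitons carrying the phase shifts expected for large |t|."""
    phi = 0.5 * np.log(3.0)
    s = np.sign(t)  # the shifts change sign between t << 0 and t >> 0
    return 2.0 * sech2(x - 4 * t + s * phi) + 8.0 * sech2(2 * (x - 16 * t) - s * phi)

x = np.linspace(-50.0, 50.0, 4001)
# Merged lump at t = 0 versus 6 sech^2(x)
print(np.max(np.abs(two_soliton(x, 0.0) - 6.0 * sech2(x))))
# Well before and well after the collision versus two shifted 1-solitons
for t in (-1.5, 1.5):
    print(t, np.max(np.abs(two_soliton(x, t) - separated_pair(x, t))))
```

All three printed discrepancies should be at the level of floating-point rounding, since the solitons are already far apart at |t| = 1.5.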

Note what this says. If we follow the evolution from -T to T (where T is large and positive), we first see the superposition of two 1-solitons: a larger and thinner one to the left of, and catching up with, a shorter, fatter, and slower-moving one to the right. Around t = 0 they merge into a single lump (with the shape 6 sech² (x)), and then they separate again, with their original shapes restored—but now the taller and thinner one is to the right. It is almost as if they had passed right through each other. The only effect of their interaction is the pair of phase shifts: the slower one is retarded slightly from where it would have been, and the faster one is slightly ahead of where it would have been. Except for these phase shifts, the final result is what we might expect from a linear interaction. It is only if we look closely at the interaction as the two solitons meet that we can detect its highly nonlinear nature. (Note, for example, that at time t = 0, the maximum amplitude, 6, of the combined wave is actually less than the maximum amplitude, 8, of the taller wave when they are separated.) But of course the really striking fact is the resilience of the two individual solitons: their ability to put themselves back together after the collision. Not only is no energy radiated away, but their actual shapes are preserved. (Remarkably, Russell (1844, p. 384) gives a sketch of a 2-soliton interaction experiment that he had carried out in his wave tank!)

Now back to the computer experiment of Zabusky and Kruskal. For numerical reasons, they chose to deal with the case of periodic boundary conditions: in effect, studying the KdV equation u_t + uu_x + δ²u_xxx = 0 (which they label (1)) on the circle instead of on the line. For their published report, they chose δ = 0.022 and used the initial condition u(x, 0) = cos(πx). With the above background in mind, it is interesting to read the following extract from their 1965 report, which contains the first use of the term “soliton”:

(I) Initially the first two terms of Eq. (1) dominate and the classical overtaking phenomenon occurs; that is u steepens in regions where it has negative slope. (II) Second, after u has steepened sufficiently, the third term becomes important and serves to prevent the formation of a discontinuity. Instead, oscillations of small wavelength (of order δ) develop on the left of the front. The amplitudes of the oscillations grow, and finally each oscillation achieves an almost steady amplitude (that increases linearly from left to right) and has the shape of an individual solitary-wave of (1). (III) Finally, each “solitary wave pulse” or soliton begins to move uniformly at a rate (relative to the background value of u from which the pulse rises) which is linearly proportional to its amplitude. Thus, the solitons spread apart. Because of the periodicity, two or more solitons eventually overlap spatially and interact nonlinearly. Shortly after the interaction they reappear virtually unaffected in size or shape. In other words, solitons “pass through” one another without losing their identity. Here we have a nonlinear physical process in which interacting localized pulses do not scatter irreversibly.

Zabusky and Kruskal (1965)
