III.64 Optimization and Lagrange Multipliers

Keith Ball


1 Optimization

Soon after being introduced to calculus, most students learn of its application to optimization: that is, to the problem of finding the largest or smallest value of a given differentiable function, which is usually referred to as the objective function. A very helpful observation is that if f is an objective function that is maximized or minimized at x, then the tangent to the graph at the point (x, f(x)) will be horizontal, since otherwise we could find some value x′ close to x for which f(x′) is larger (or, in the case of a minimum, smaller). This means that we can narrow down the search for the maximum and minimum values of f by looking just at the values of f(x) for which f′(x) = 0.
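To make this concrete, here is a minimal sketch in Python, assuming the sympy library is available and using a sample objective f(x) = x³ - 3x of our own choosing (not from the text):

```python
import sympy as sp

x = sp.symbols('x')
f = x**3 - 3*x                           # a sample objective, not from the text

critical = sp.solve(sp.diff(f, x), x)    # the points where f'(x) = 0
print(critical)                          # [-1, 1]
print([f.subs(x, c) for c in critical])  # [2, -2]: a local maximum and a local minimum
```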

Now suppose that we have an objective function of more than one variable, such as, for example, the function

F(x, y) = 2x + 10y - x² + 2xy - 3y².

The “graph” of F is obtained by plotting the values F(x, y) of F as heights above the corresponding points (x, y) of the plane, so now it is a surface instead of a curve. A smooth surface possesses not a tangent line at each point, but a tangent plane. If F has a maximum value, it will occur at a point where the tangent plane is horizontal.

The tangent plane at each point (x, y) is the graph of the linear function that best approximates F near (x, y). For small values of h and k, F(x + h, y + k) will be approximately equal to F(x, y) plus a function of the form

(h, k) ↦ ah + bk,

that is, F(x, y) plus a linear function of h and k. As explained in SOME FUNDAMENTAL MATHEMATICAL DEFINITIONS [I.3 §5.3], the derivative of F at (x, y) is this linear map. The map can be represented by the pair of numbers (a, b), which can in turn be thought of as a vector in ℝ². This derivative vector is usually called the gradient of the function F at the point (x, y) and is written ∇F(x, y). In vector notation (writing x for (x, y) and h for (h, k)), the approximation to F near (x, y) is

F(x + h) ≈ F(x) + h · ∇F(x).    (1)

Thus, ∇F points in the direction in which F increases most rapidly if you start at x, and the magnitude of ∇F is the slope of the “graph” of F in this direction.
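As an illustration, the following sketch (assuming numpy, and at an arbitrarily chosen base point) checks the approximation (1) numerically for the function F above; the formula for ∇F used here is the one derived in the next paragraph:

```python
import numpy as np

def F(x, y):
    return 2*x + 10*y - x**2 + 2*x*y - 3*y**2

def grad_F(x, y):
    # the partial derivatives computed in the next paragraph
    return np.array([2 - 2*x + 2*y, 10 + 2*x - 6*y])

x0, y0 = 1.0, 1.0                       # an arbitrary base point
h = np.array([1e-3, -2e-3])             # a small step (h, k)

exact = F(x0 + h[0], y0 + h[1])
linear = F(x0, y0) + grad_F(x0, y0) @ h
print(exact, linear)                    # agree to about 1e-5, the quadratic remainder
```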

The components a and b of the gradient can be calculated using partial differentiation. The number a tells us how quickly F(x, y) changes as we vary x while keeping y fixed: so to find a, we differentiate F(x, y) = 2x + 10y - x² + 2xy - 3y² with respect to x, treating y as a constant. In this case we get the partial derivative

∂F/∂x = 2 - 2x + 2y.

Similarly,

∂F/∂y = 10 + 2x - 6y.

Now, if we want to locate points where the tangent plane is horizontal, then we want to find the points at which the gradient is zero: that is, the points at which the vector (a, b) is the zero vector. So we solve the pair of simultaneous equations

   2 - 2x + 2y = 0,

10 + 2x - 6y = 0

to get x = 4, y = 3. Thus, the only candidate for the maximum is the point (4, 3), where F takes the value 19. It can be checked that 19 is indeed the maximum value of F.
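The same candidate point can be found mechanically; a minimal sketch, assuming the sympy library:

```python
import sympy as sp

x, y = sp.symbols('x y')
F = 2*x + 10*y - x**2 + 2*x*y - 3*y**2

# Solve the pair of simultaneous equations dF/dx = 0, dF/dy = 0.
sol = sp.solve([sp.diff(F, x), sp.diff(F, y)], [x, y])
print(sol)           # {x: 4, y: 3}
print(F.subs(sol))   # 19
```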

2 The Gradient and Contours

One of the most common ways of representing surfaces (landscapes on maps, for example) is by means of contour lines, or curves of constant height. In the xy-plane, we plot several curves of the form F(x, y) = V, for various “representative” values of V. For the function we considered earlier,

F(x, y) = 2x + 10y - x² + 2xy - 3y²,

the values 0, 8, 14, 18, 19 yield the contour plot shown in figure 1. The 14 contour, for example, contains all the points at which the surface has height 14. The figure indicates that this particular surface is an elliptical hump whose peak occurs at (4, 3) and has height 19.
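A plot along these lines can be reproduced with matplotlib; the plotting window below is our own choice, not from the text:

```python
import numpy as np
import matplotlib.pyplot as plt

# Sample F on a grid around the peak (window chosen by hand).
xs, ys = np.meshgrid(np.linspace(0, 8, 400), np.linspace(0, 6, 400))
zs = 2*xs + 10*ys - xs**2 + 2*xs*ys - 3*ys**2

cs = plt.contour(xs, ys, zs, levels=[0, 8, 14, 18, 19])
plt.clabel(cs)
plt.plot(4, 3, 'k+')   # the peak at (4, 3), height 19
plt.show()
```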

There is a simple geometrical relationship between the contour lines and the gradient vector. The vector equation (1) shows that the direction h in which F is instantaneously constant is the direction that makes the scalar product h · ∇F equal to 0: the direction perpendicular to ∇F. At each point, the gradient vector is therefore perpendicular to the contour through that point. This fact underlies the method of Lagrange multipliers that we shall discuss in the next section.
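The following sketch (assuming numpy, at an arbitrarily chosen point) checks this perpendicularity numerically: stepping at right angles to the gradient changes F only to second order, while stepping along the gradient changes it at full speed:

```python
import numpy as np

def F(x, y):
    return 2*x + 10*y - x**2 + 2*x*y - 3*y**2

def grad_F(x, y):
    return np.array([2 - 2*x + 2*y, 10 + 2*x - 6*y])

p = np.array([2.0, 2.0])           # an arbitrary point on some contour
g = grad_F(*p)
t = np.array([-g[1], g[0]])        # the gradient rotated by 90 degrees
t = t / np.linalg.norm(t)          # unit vector along the contour direction

eps = 1e-4
print(F(*(p + eps*t)) - F(*p))     # O(eps**2): along the contour, F barely moves
print(F(*(p + eps*g/np.linalg.norm(g))) - F(*p))   # about eps*|grad F|
```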

[Figure 1. A contour plot.]

[Figure 2. Constrained optimization.]

3 Constrained Optimization and Lagrange Multipliers

It often happens that we are interested in the maximum or minimum value of an objective function that depends upon several variables whose values are constrained to satisfy certain equations or inequalities. Consider, for example, the following problem.

Find the maximum value of

F(x, y) = 4y - x

over all pairs (x,y) satisfying the constraint

G(x, y) = x² - xy + y² - x + y - 4 = 0.    (2)

Figure 2 shows the curve in the xy-plane defined by G(x,y) = 0 (an ellipse), and also a number of contour lines of the function 4y - x. Our aim is to find the largest that 4y - x can be if (x, y) is a point on the curve. So we want to find the largest value of V for which the corresponding contour 4y - x = V contains a point on the curve. The value of V increases as the lines move up the diagram, and the uppermost line that touches the curve is the one labeled 4y - x = 7. So the maximum value we are looking for is 7, and it occurs at the point where the line 4y - x = 7 touches the curve. It is easy to check that this point is (1, 2).
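For comparison, a numerical solver reaches the same point. A minimal sketch assuming scipy; the starting point, and the expectation that the solver converges to the maximum from it, are our own assumptions:

```python
import numpy as np
from scipy.optimize import minimize

F = lambda v: 4*v[1] - v[0]
G = lambda v: v[0]**2 - v[0]*v[1] + v[1]**2 - v[0] + v[1] - 4

# Maximize F by minimizing -F, subject to the equality constraint G = 0.
res = minimize(lambda v: -F(v), x0=np.zeros(2),
               constraints=[{'type': 'eq', 'fun': G}])
print(res.x, F(res.x))   # approximately [1, 2] and 7
```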

How could we locate this point algebraically, rather than by drawing? The important thing to notice is that the optimizing line is tangent to the curve: the line and the curve are parallel at their common point. The line was chosen to be a contour of the function F. The curve is also a contour: the 0 contour of G. From the discussion in the previous section we know that these contours are perpendicular to the gradients of F and G, respectively (at the point in question). So the two gradient vectors are parallel to one another and are therefore multiples of one another: ∇F = λ∇G, say.

We thus have a way to hunt for solutions to the constrained optimization problem

maximize F(x, y)   subject to G(x, y) = 0.

We look for a point (x, y) and a number λ such that

∇F(x, y) = λ∇G(x, y),    G(x, y) = 0.    (3)

For our example (2), the gradient equation gives two partial derivative equations,

-1 = λ(2x - y - 1),      4 = λ(-x + 2y + 1),

from which we conclude that

x = (λ + 2)/(3λ),    y = (7 - λ)/(3λ).    (4)

Substituting these values into the equation G(x, y) = 0, we obtain

(13 - 13λ²)/(3λ²) = 0,    that is,    λ² = 1,

which has two solutions: λ = 1 and λ = -1. If we substitute λ = 1 into (4), we get the point (1, 2) where F is at its maximum. (Taking λ = -1 yields the minimum.)
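These steps can be replayed symbolically; a minimal sketch, assuming sympy:

```python
import sympy as sp

x, y, lam = sp.symbols('x y lambda')
G = x**2 - x*y + y**2 - x + y - 4

# Solve the two gradient equations of (3) for x and y in terms of lambda ...
sol = sp.solve([-1 - lam*(2*x - y - 1), 4 - lam*(-x + 2*y + 1)], [x, y])
print(sol[x], sol[y])    # (lambda + 2)/(3*lambda), (7 - lambda)/(3*lambda), matching (4)

# ... then substitute into the constraint G = 0 and solve for lambda.
print(sp.solve(G.subs(sol), lam))   # [-1, 1]
```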

The number λ that we introduced to solve the problem is called a Lagrange multiplier. It is possible to reformulate the problem by defining the Lagrangian

Λ(x, y, λ) = F(x, y) - λG(x, y)

and then condensing the equations (3) into a single equation

∇Λ = 0.

The reason this works is that if we differentiate Λ with respect to λ, then we obtain -G(x, y), so requiring this partial derivative to be zero is equivalent to requiring G(x, y) to be zero. And asking for the other two partial derivatives to be zero is equivalent to requiring that ∇F = λ∇G. The remarkable fact about this reformulation is that it has turned a constrained optimization problem involving x and y into an unconstrained problem involving x, y, and λ.
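A short sympy sketch illustrating the reformulation: differentiating the Lagrangian with respect to λ recovers (minus) the constraint, and solving the single equation ∇Λ = 0 recovers both the maximum and the minimum found above:

```python
import sympy as sp

x, y, lam = sp.symbols('x y lambda')
F = 4*y - x
G = x**2 - x*y + y**2 - x + y - 4
Lam = F - lam*G                      # the Lagrangian

print(sp.diff(Lam, lam))             # -G, expanded
eqs = [sp.diff(Lam, v) for v in (x, y, lam)]
print(sp.solve(eqs, [x, y, lam]))    # (1, 2, 1) and (-1/3, -8/3, -1), in some order
```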

4 The General Method of Lagrange Multipliers

In real problems we may wish to optimize a function F of many variables x1, . . . , xn under many constraints G1(x1, . . . , xn) = 0, G2(x1, . . . , xn) = 0, . . . , Gm(x1, . . . , xn) = 0. In this case we introduce a Lagrange multiplier for each constraint and define the Lagrangian Λ by the formula

Λ(x1, . . . , xn, λ1, . . . , λm) = F(x1, . . . , xn) - λ1G1(x1, . . . , xn) - . . . - λmGm(x1, . . . , xn).

The partial derivative of Λ with respect to λi is zero if and only if Gi(x1, . . . , xn) = 0. And the partial derivatives with respect to the xi will all be zero if and only if ∇F = λ1∇G1 + . . . + λm∇Gm. This tells us that any direction that is perpendicular to all the gradients ∇Gi (and therefore lies in all their “contour hypersurfaces”) will be perpendicular to the gradient ∇F as well, so we cannot find a direction in which F increases while all the constraints are satisfied.
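As an illustration of the general recipe, here is a sketch (assuming sympy) that builds the system ∇Λ = 0 for any number of constraints; the example problem at the end is hypothetical, not from the text:

```python
import sympy as sp

def lagrangian_equations(F, Gs, xs):
    """Return the equations grad(Lambda) = 0, plus the full variable list."""
    lams = sp.symbols(f'lambda1:{len(Gs) + 1}')    # lambda1, ..., lambdam
    Lam = F - sum(l*G for l, G in zip(lams, Gs))   # the general Lagrangian
    variables = list(xs) + list(lams)
    return [sp.diff(Lam, v) for v in variables], variables

# A hypothetical example: extremize x + y + z over the unit sphere,
# with the extra constraint z = 0.
x, y, z = sp.symbols('x y z')
eqs, vs = lagrangian_equations(x + y + z,
                               [x**2 + y**2 + z**2 - 1, z],
                               [x, y, z])
print(sp.solve(eqs, vs))   # the two points (sqrt(2)/2, sqrt(2)/2, 0) and its negative
```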

Problems of this kind occur frequently in economics, where the objective function F is a cost (which we are probably trying to minimize), and the constraints force us to allocate spending among different items so as to satisfy certain overall demands. For instance, we might want to minimize the total cost of supplies of various different foodstuffs that between them had to satisfy various nutritional demands. In this case, the Lagrange multipliers have an interpretation as “notional prices.” As we have just seen, at the optimum point we have an equation of the form ∇F = λ1∇G1 + . . . + λm∇Gm. This tells us how much F will vary as we vary the Gi by small amounts: that is, it tells us the costs associated with increasing the various demands.

For a further use of Lagrange multipliers, see THE MATHEMATICS OF TRAFFIC IN NETWORKS [VII.4].
