III.84 The Simplex Algorithm

Richard Weber


1 Linear Programming

The simplex algorithm is the preeminent tool for solving some of the most important mathematical problems arising in business, science, and technology. In these problems, which are called linear programs, we are to maximize (or minimize) a linear function subject to linear constraints. An example is the diet problem posed by the U.S. Air Force in 1947: find quantities of seventy-seven differently priced foodstuffs (cheese, spinach, etc.) to satisfy a man’s minimum daily requirements for nine nutrients (protein, iron, etc.) at least cost. Further applications occur in choosing the elements of an investment portfolio, rostering an airline’s crew, and finding optimal strategies in two-person games. The study of linear programming has inspired many of the central ideas of optimization theory, such as DUALITY [III.19], the importance of convexity, and COMPUTATIONAL COMPLEXITY [IV.20].

The input data of a linear program (LP) consists of two vectors b ∈ Rm and c ∈ Rn, and an m × n matrix A = (aij). The problem is to find values for n non-negative decision variables, x1, . . . , xn, to maximize the objective function c1x1 + . . . + cnxn, subject to m constraints, ai1x1 + . . . + ainxn ≤ bi, i = 1, . . . , m. In the diet problem, n = 77 and m = 9. In the following simple example (not a diet problem), n = 2 and m = 3. In serious real-life problems, n and m can be greater than 100 000.

[Figure 1. Feasible region “P” of an LP.]

Maximize    x1 + 2x2
subject to  -x1 + 2x2 ≤ 2,
             x1 +  x2 ≤ 4,
            2x1 -  x2 ≤ 5,
             x1, x2 ≥ 0.

The constraints define a feasible region for (x1, x2), a convex polygon that is depicted by the shaded region “P” in figure 1. The two dotted lines mark those x where the objective function takes the value 4 and where it takes the value 6. Clearly, it is maximized at point C.
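As a quick check of this example, the following short script (a sketch assuming Python with NumPy and SciPy installed; SciPy’s linprog minimizes, so we negate the objective) recovers the optimum x = (2, 2) with value 6 at vertex C.

import numpy as np
from scipy.optimize import linprog

# Data of the example LP: maximize c^T x subject to A x <= b, x >= 0.
c = np.array([1.0, 2.0])
A = np.array([[-1.0,  2.0],    # -x1 + 2*x2 <= 2
              [ 1.0,  1.0],    #  x1 +   x2 <= 4
              [ 2.0, -1.0]])   # 2*x1 -  x2 <= 5
b = np.array([2.0, 4.0, 5.0])

# linprog minimizes, so pass -c; the default bounds are already x >= 0.
res = linprog(-c, A_ub=A, b_ub=b, method="highs")
print(res.x, -res.fun)         # expected: [2. 2.] 6.0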

The general story is similar to that of the example. If the feasible region P = {x : Ax ≤ b, x ≥ 0} is nonempty, then it is a convex polytope in Rn, and an optimal solution can be found at one of its vertices. It is helpful to introduce “slack variables” x3, x4, x5 ≥ 0 to take up the slack on the left of the inequality constraints. We can write

-x1 + 2x2 + x3 = 2,
 x1 +  x2 + x4 = 4,
2x1 -  x2 + x5 = 5.

We now have three equations in five variables, so we can set any two of the variables x1, . . . , x5 equal to 0, and solve the equations for the other three variables (or solve a perturbation of them if they happen not to be independent). There are ten ways to choose two variables from five. Not all of the ten corresponding solutions satisfy x1, x2, x3, x4, x5 ≥ 0, but five of them do. These are called basic feasible solutions (BFSs), and correspond to the vertices of P marked O, B, C, D, E.
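This enumeration can be carried out directly. The following sketch (NumPy assumed; the tolerance 1e-9 is an arbitrary choice of mine) tries all ten ways of setting two of the five variables to zero, solves the three equations for the rest, and keeps the nonnegative solutions, which are the five BFSs.

from itertools import combinations
import numpy as np

# Equality system for x = (x1, ..., x5) once the slack variables are added.
A_eq = np.array([[-1.0,  2.0, 1.0, 0.0, 0.0],
                 [ 1.0,  1.0, 0.0, 1.0, 0.0],
                 [ 2.0, -1.0, 0.0, 0.0, 1.0]])
b = np.array([2.0, 4.0, 5.0])

for zeros in combinations(range(5), 2):      # the ten ways to fix two variables at 0
    keep = [j for j in range(5) if j not in zeros]
    B = A_eq[:, keep]
    if abs(np.linalg.det(B)) < 1e-9:         # the three equations are not independent
        continue
    x = np.zeros(5)
    x[keep] = np.linalg.solve(B, b)
    if np.all(x >= -1e-9):                   # nonnegative, hence a BFS
        print(np.round(x, 6))                # the five vertices O, B, C, D, E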

2 How the Algorithm Works

George Dantzig invented the simplex algorithm in 1947 as a means of solving the Air Force’s diet problem mentioned at the start. The word “program” was not yet used to mean computer code, but was a military term for a logistic plan or schedule. The fundamental fact on which the algorithm relies is that if an LP has a bounded optimal solution, then the optimum value is attained at a BFS, i.e., at a vertex (or so-called “extreme point”) of the polytope of feasible points, P. Another name for the feasible polytope is “simplex,” which is where the algorithm gets its name. It works as follows.

Step 0. Pick a BFS.

Step 1. Test whether this BFS is optimal. If so, stop. If not, go to step 2.

Step 2. Find a better BFS.

Repeat from step 1.

Since there are only finitely many BFSs (i.e., vertices of P), the algorithm must stop.

Now that we have an overview, let us look at the details. Suppose that at step 0 we pick the BFS of x = (x1, x2, x3, x4, x5) = (0, 0, 2, 4, 5), corresponding to vertex O. At step 1 we wish to know whether the objective function can be increased if x1 or x2 is increased from 0. So we write x3, x4, x5, and the objective function cTx in terms of x1 and x2, and display this as dictionary 1.

Dictionary 1:

x3 = 2 + x1 - 2x2,
x4 = 4 - x1 - x2,
x5 = 5 - 2x1 + x2,
cTx = 0 + x1 + 2x2.

The last equation in the dictionary shows that we can increase the value of cTx by increasing either x1 or x2 from 0. Suppose that we increase x2. The first and second equations show that x3 and x4 must decrease, and we cannot increase x2 beyond 1, at which point x3 = 0 and x4 = 3, x5 = 6. Increasing x2 as much as possible, we complete step 2 and arrive at a new BFS of x = (0, 1, 0, 3, 6), which is vertex B. Now we are ready for step 1 again, and so we write x2, x4, x5, and cTx in terms of the variables that are now zero, namely x1, x3, to give dictionary 2.

Dictionary 2:

x2 = 1 + (1/2)x1 - (1/2)x3,
x4 = 3 - (3/2)x1 + (1/2)x3,
x5 = 6 - (3/2)x1 - (1/2)x3,
cTx = 2 + 2x1 - x3.

This shows that cTx can be increased by increasing x1 from 0, but that x1 can increase no further than 2 because at that point x4 = 0. This brings us to a new solution (2, 2, 0, 0, 3), which is vertex C. Once more, we are ready for step 1, and so we compute dictionary 3, now writing things in terms of x3 and x4, which are 0. The algorithm now stops because, as we require x3, x4 ≥ 0, the bottom line of dictionary 3, cTx = 6 - (1/3)x3 - (4/3)x4, proves that cTx ≤ 6 for all feasible x.
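The same three dictionaries can be generated mechanically. The sketch below (my own Python/NumPy rendering, not the article’s; it keeps the data in a tableau, chooses the entering variable by the largest objective coefficient, which reproduces the choice of x2 first, and assumes the LP is bounded, so no unboundedness check is made) prints dictionaries 1, 2, and 3 and stops at the optimum.

import numpy as np

# Tableau for the example: rows are the three constraints (with slacks x3, x4, x5)
# and, last, the objective row stored as  cTx - x1 - 2*x2 = 0.  The final column
# is the right-hand side.
T = np.array([
    [-1.0,  2.0, 1.0, 0.0, 0.0, 2.0],
    [ 1.0,  1.0, 0.0, 1.0, 0.0, 4.0],
    [ 2.0, -1.0, 0.0, 0.0, 1.0, 5.0],
    [-1.0, -2.0, 0.0, 0.0, 0.0, 0.0],
])
basis = [2, 3, 4]                            # x3, x4, x5 are basic at vertex O
names = ["x1", "x2", "x3", "x4", "x5"]

def show(k):
    # Print the current tableau in the "dictionary" form used in the text.
    nonbasic = [j for j in range(5) if j not in basis]
    print(f"dictionary {k}:")
    for i, v in enumerate(basis):
        terms = " ".join(f"{-T[i, j]:+g} {names[j]}" for j in nonbasic)
        print(f"  {names[v]} = {T[i, -1]:g} {terms}")
    terms = " ".join(f"{-T[-1, j]:+g} {names[j]}" for j in nonbasic)
    print(f"  cTx = {T[-1, -1]:g} {terms}")

k = 1
show(k)
while True:
    j = int(np.argmin(T[-1, :5]))            # entering variable: largest objective coefficient
    if T[-1, j] >= 0:
        break                                # no coefficient can improve cTx: optimal
    # Ratio test: how far x_j can grow before some basic variable reaches 0.
    ratios = [T[i, -1] / T[i, j] if T[i, j] > 0 else np.inf for i in range(3)]
    r = int(np.argmin(ratios))
    T[r] /= T[r, j]                          # pivot: eliminate column j from the other rows
    for i in range(4):
        if i != r:
            T[i] -= T[i, j] * T[r]
    basis[r] = j
    k += 1
    show(k)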

There is other important information in the final dictionary. If b is changed to b + ψ, for small ψT = (ψ1, ψ2, ψ3), then the maximum value of cTx will change to 6 + (1/3)ψ1 + (4/3)ψ2 + 0ψ3. The coefficient 1/3 is called a “shadow price,” because it is what we should be willing to pay per unit increase in b1.
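One way to see the shadow prices numerically is to re-solve the LP with each right-hand side nudged upward and compare optimal values; the sketch below (SciPy assumed, and the step size 0.01 is an arbitrary choice) returns roughly 1/3, 4/3, and 0 for the three constraints.

import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0])
A = np.array([[-1.0, 2.0], [1.0, 1.0], [2.0, -1.0]])
b = np.array([2.0, 4.0, 5.0])

def opt_value(rhs):
    # Optimal value of the example LP with right-hand side vector rhs.
    return -linprog(-c, A_ub=A, b_ub=rhs, method="highs").fun

base = opt_value(b)                          # 6.0
for i in range(3):
    bumped = b.copy()
    bumped[i] += 0.01                        # small increase in b_{i+1}
    print(f"shadow price of b{i + 1} is about {(opt_value(bumped) - base) / 0.01:.4f}")
# prints roughly 0.3333, 1.3333, 0.0000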

3 How the Algorithm Performs

In running the simplex algorithm the serious work comes in computing the dictionaries. To find dictionary 2, we could use the first equation of dictionary 1 to rewrite x2 in terms of x1 and x3, and then substitute for x2 in the other equations. Versions of the simplex algorithm have been invented that reduce the computing effort by taking advantage of special structure in the matrix A, such as the fact that most of its entries are zero. The dictionary data is often held in a so-called tableau of coefficients.

There are many other practical and theoretical issues. One concerns the selection of the pivot, that is, the variable that is to be increased from 0. Starting at O, and depending on which of x1 and x2 we choose as the first variable to increase from zero, the path to C can be O, E, D, C or O, B, C. There is no known way to guarantee that the algorithm takes the shortest path.

The question of how many steps the simplex algorithm really needs is related to the famous Hirsch conjecture: that for any bounded n-dimensional polytope with m facets, the diameter (defined as the maximum number of edges on the shortest edge-traversing path between any two vertices) is at most m - n. If this were true, it would suggest that some version of the simplex algorithm might run in a number of steps that grows only linearly in the numbers of variables and constraints. However, Klee and Minty (1972) have given an example based on a perturbed n-dimensional cube (m = 2n facets and diameter n), in which if the algorithm selects among possible pivots by choosing the one for which the objective function increases at the greatest rate per unit increase in that variable, then it visits all 2^n vertices before reaching the optimum. Indeed, for most deterministic pivot selection rules, examples are known in which the number of steps grows exponentially in n.

Fortunately, things are usually much better in practical problems than in worst-case examples. Typically, only O(m) steps are needed to solve a problem with m constraints. Moreover, Khachian (1979) proved (by analysis of the so-called ellipsoid algorithm) that linear programs can in principle be solved by an algorithm whose running time grows only polynomially in n. Thus linear programming is much easier than “integer linear programming,” in which x1, . . . , xn are required to be integers and for which no algorithm with polynomial running time is known.

Karmarkar (1984) pioneered development of “interior” methods for linear programming problems. These move through the interior of the polytope P, rather than among its vertices, and can sometimes solve large LPs more quickly than the simplex algorithm. Modern computer software uses both methods and can easily solve LPs with millions of variables and constraints.
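For what it is worth, SciPy’s linprog exposes both kinds of solver through the HiGHS library, so the toy example above can be solved either way (the method names "highs-ds" for dual simplex and "highs-ipm" for interior point assume a reasonably recent SciPy); on a problem this small both return x = (2, 2) instantly, and the difference only matters at scale.

import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0])
A = np.array([[-1.0, 2.0], [1.0, 1.0], [2.0, -1.0]])
b = np.array([2.0, 4.0, 5.0])

# "highs-ds" is HiGHS's dual simplex solver, "highs-ipm" its interior-point solver.
for method in ("highs-ds", "highs-ipm"):
    res = linprog(-c, A_ub=A, b_ub=b, method=method)
    print(method, res.x, -res.fun)           # both: x = [2. 2.], value 6.0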

Further Reading

Dantzig, G. 1963. Linear Programming and Extensions. Princeton, NJ: Princeton University Press.

Karmarkar, N. 1984. A new polynomial-time algorithm for linear programming. Combinatorica 4:373-95.

Khachian, L. G. 1979. A polynomial algorithm in linear programming. Soviet Mathematics Doklady 20:191-94.

Klee, V., and G. Minty. 1972. How good is the simplex algorithm? In Inequalities III, edited by O. Shisha, volume 16, pp. 159-75. New York: Academic Press.


Solitons

See LINEAR AND NONLINEAR WAVES AND SOLITONS [III.49]

