MATH 5330: Computational Methods of Linear Algebra
Lecture Note 17: The Simplex Method
Xianyi Zeng
Department of Mathematical Sciences, UTEP
1 The Pivots
At the end of the previous lecture we showed that if an optimal feasible solution to a linear program
exists, it is always possible to find such a solution in the subset of basic solutions. The brute force
method will try all the n!/(m!(n − m)!) possible combinations of the m non-zero components of x.
The method as it is of course does not make practical sense due to high computational cost;
however, it sheds some light on a more efficient way to solve the problem. The basic idea is, if we
start with one basic solution w.r.t. an m × m sub-matrix B1 of A, we want to move on to the
next m × m sub-matrix B2 so that it is one step closer to the true solution. That is, we still move
on the subset of all n!/(m!(n − m)!) basic solutions but take the action wisely to avoid too many
steps.
The tool for making these moves is called the pivot, and it is described as follows. With a chosen
m × m sub-matrix B, the corresponding variables of x are called the basic ones and the other n − m
variables are called nonbasic ones. A pivot means that a basic variable is swapped with a nonbasic
one. The first question to ask is what computations are required to perform a pivot. With this
tool in hand, we need to find the appropriate variables to swap in and swap out. The basic idea is
that: first we want to find any basic feasible solution, then starting here all the subsequent moves
remain on the basic feasible ones with the target to reduce the objective value with each move.
A single pivot. Let the column vectors of A be a1 ,···,an , and we write the constraint Ax = b as:
x1 a1 + x2 a2 + ··· + xn an = b .
It is customary to write this equality in the following tableau:
    a11  a12  ···  a1n | b1
    a21  a22  ···  a2n | b2
    ..   ..        ..  | ..
    am1  am2  ···  amn | bm                                  (1.1)
Suppose we have chosen a set of m basic variables, say the first m, and suppose that the submatrix
A(1 : m,1 : m) is non-singular. Then with the help of the algorithms in our first lectures, say the
Gaussian elimination, we can convert the corresponding tableau to:
    1  0  ···  0  y1,m+1  y1,m+2  ···  y1n | y10
    0  1  ···  0  y2,m+1  y2,m+2  ···  y2n | y20
    ..  ..     ..   ..      ..          .. |  ..
    0  0  ···  1  ym,m+1  ym,m+2  ···  ymn | ym0             (1.2)
Clearly, the solution to (1.2) is given by x = [y10 y20 ··· ym0 0 ··· 0]t . A tableau of the form (1.2)
is said to be in canonical form, i.e., one in which an m × m sub-matrix is a permutation matrix.
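The Gaussian-elimination step that brings a tableau [A | b] into canonical form with respect to a chosen set of basis columns can be sketched in NumPy as follows (a minimal illustration; the function name and the absence of pivoting safeguards are assumptions of this sketch):

```python
import numpy as np

def to_canonical(T, basis):
    """Row-reduce tableau T = [A | b] so that the columns listed in `basis`
    become the identity (Gauss-Jordan; assumes the pivots are nonzero)."""
    T = T.astype(float).copy()
    for row, col in enumerate(basis):
        T[row] /= T[row, col]                 # scale the pivot row to 1
        for r in range(T.shape[0]):
            if r != row:
                T[r] -= T[r, col] * T[row]    # eliminate the column elsewhere
    return T

# illustrative tableau; basis columns (a4, a5, a6) are already the identity
T = np.array([[2., 1, 1, 1, 0, 0, 2],
              [1, 2, 3, 0, 1, 0, 5],
              [2, 2, 1, 0, 0, 1, 6]])
print(to_canonical(T, basis=[3, 4, 5]))   # already canonical: unchanged
```

Choosing a different set of basis columns reduces the same tableau to the canonical form with respect to that basis.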
We still denote the columns corresponding to (1.2) by a1 ,···,an to avoid overly complicated
notations. Now we want to swap out the basic variable xp for some 1 ≤ p ≤ m and swap in xq with
m < q ≤ n, and put the new tableau into the canonical form w.r.t. the new basic variables. The
target is clearly to use row operations to transform the column aq into the unit vector ep ∈ Rm ,
without changing ai for 1 ≤ i ≤ m and i ≠ p. This task essentially amounts to representing all the
vectors as a linear combination of ai , 1 ≤ i ≤ m, i ≠ p, and aq , instead of ai , 1 ≤ i ≤ m. From:
aq = Σ_{i=1, i≠p}^{m} yiq ai + ypq ap ,

and assuming ypq ≠ 0, there is:

ap = (1/ypq ) aq − Σ_{i=1, i≠p}^{m} (yiq /ypq ) ai .
Hence we have for all j > m:

aj = Σ_{i=1}^{m} yij ai = Σ_{i=1, i≠p}^{m} ( yij − (yiq /ypq ) ypj ) ai + (ypj /ypq ) aq .
This formula also applies to a0 , defined as the rightmost column of the tableau: a0 =[y10 y20 ··· ym0 ]t .
To this end, if we denote the elements of the new tableau by y′ij , there is:

y′ij = yij − (yiq /ypq ) ypj ,   i ≠ p ,
y′pj = ypj /ypq .                                            (1.3)
We can check that (1.3) holds for all 1 ≤ i ≤ m and 0 ≤ j ≤ n. For example, if 1 ≤ j ≤ m and
i, j ≠ p, there is:

y′ij = δij − (yiq /ypq ) δpj = δij ,
exactly what we expected. Thus (1.3) is used to obtain the entire new tableau after the pivot,
including the rightmost column.
Note that the formula (1.3) holds for all scenarios, i.e., for any set of m basic variables. In the
general case, the basic variable to be swapped out is the xi whose column satisfies ai = ep ∈ Rm ;
see Exercise 1.
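The update formula (1.3) translates directly into code. The following NumPy sketch (the function name is an assumption) applies a single pivot to the full tableau, right-hand side included; the pivoted column becomes the unit vector ep as intended:

```python
import numpy as np

def pivot_update(Y, p, q):
    """Apply formula (1.3) to the tableau Y = (y_ij):
       y'_pj = y_pj / y_pq,
       y'_ij = y_ij - (y_iq / y_pq) * y_pj   for i != p."""
    Ynew = Y.astype(float).copy()
    Ynew[p] = Y[p] / Y[p, q]
    for i in range(Y.shape[0]):
        if i != p:
            Ynew[i] = Y[i] - (Y[i, q] / Y[p, q]) * Y[p]
    return Ynew

# small illustration: pivot on the (0, 0) entry
Y = np.array([[2., 1., 4.],
              [1., 3., 5.]])
print(pivot_update(Y, p=0, q=0))   # column 0 becomes e_0 = (1, 0)
```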
[Figure: the constraint lines x1 + 2x2 = 4, 4x1 + 2x2 = 12, and −x1 + x2 = 1 in the (x1 ,x2 )-plane,
with basic solutions labeled O, A, B, C, D, E, F, G, H, I.]
Figure 1: Pivots and move along edge segments.
In order to have a geometrical interpretation of the pivoting, we consider the example in the
previous lecture as depicted in Figure 1. Here it is understood that we’re dealing with inequality
constraints, and each variable is the slack variable corresponding to one of them. In this figure,
a basic solution is the intersection of two constraint lines including x1 = 0 and x2 = 0; the solid
dots denote the basic feasible solutions whereas the empty ones are basic infeasible solutions. The
effect of a pivot is to move from one basic solution to another along one of the constraint lines. For
example: (1) moving from D to E is a motion within the feasible region; (2) moving from C to B
is a motion that leads into the feasible region; (3) moving from F to I leads the way out of the
feasible region; and (4) moving from A to C is also allowed in the pivots, which will first enter the
feasible region and then move out.
Pivots maintaining the feasibility. Next we think about where can we move from a particular
basic solution, say the point A of Figure 1. As it has been described before, we always need to
move along constraint lines; and in this case it is either the points D,E,I or the points O,B,C, so
there are six candidate basic solutions. There are two ways to divide these candidates into groups:
• Suppose we first decide on the nonbasic variable xq to be swapped in. It means that in the
current state xq = 0, and the equality part of the corresponding constraint is satisfied – once
xq is swapped in we'll likely have the strict inequality instead. Using the current linear
program as an example, giving up the equality x2 = 0 means moving to one of D, E, and I;
similarly giving up the equality −x1 + x2 = 1 indicates the next basic solution is one of O, B,
and C. In either case, we can identify which basic variable to be swapped out at the three
points – if B is chosen, the slack variable for 4x1 + 2x2 ≤ 12 is to be swapped out.
In higher dimensions, this interpretation still holds – we move off a constraint hyperplane
that corresponds to a nonbasic variable while remaining on all the constraint hyperplanes that
correspond to the other n − m − 1 nonbasic variables. Hence the possible moves are along the
line defined by these n − m − 1 hyperplanes.
• Alternatively, we can first decide on the basic variable xp to be swapped out – that is, we
choose a constraint line away from the current basic solution and move towards that line. For
example, if we choose to swap out the variable corresponding to x1 + 2x2 = 4, there are two
possible moves, E and C. The other two groups are I and B, and D and O.
The first strategy is often used to design a pivot, such that if the initial point is a basic feasible
solution we can almost always find the next one as a basic feasible solution or decide that the
feasible region is unbounded, see the discussion below.
To illustrate how this works, we first assume that the current basic feasible solution is not
degenerate, that is, all basic variables are non-zero. The degenerate case does not occur often, and
we'll talk about the modifications to handle it later. Without loss of generality, we suppose the first
m variables are the basic ones, and have decided to swap in xq where q > m is given. There are two
equalities:
x1 a1 + x2 a2 + ··· + xm am = b ,
where xi > 0 for 1 ≤ i ≤ m and
aq = y1q a1 + y2q a2 + ··· + ymq am .
The idea is to take a linear combination of the two so that the coefficient of one ai , 1 ≤ i ≤ m
disappears:
(x1 − εy1q )a1 + (x2 − εy2q )a2 + ··· + (xm − εymq )am + εaq = b .
(1.4)
Starting with ε = 0 we increase its value gradually to make sure that the result is feasible for xq ,
and hope that one of the first m coefficients will reach zero. Clearly, there are two scenarios. (1) If
some yiq is positive, then we set:

ε = min_{1≤i≤m, yiq >0} xi /yiq ,                            (1.5)
and set p to the index that achieves this minimum – if there is more than one such index, we
set p to be any of them. (2) If all yiq are non-positive, then we can set ε to an arbitrarily large value
and obtain feasible points whose variables grow without bound. In the second scenario, the feasible
region is unbounded, yet another special case that we will deal with later.
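The ratio test (1.5), together with the unboundedness check, can be sketched as follows (a minimal illustration; the function name and the return convention are assumptions):

```python
import numpy as np

def ratio_test(x_B, y_q):
    """Choose epsilon and the leaving row p by (1.5).
    Returns (p, eps); p is None when all y_iq <= 0 (unbounded direction)."""
    ratios = [(x_B[i] / y_q[i], i) for i in range(len(y_q)) if y_q[i] > 0]
    if not ratios:
        return None, np.inf
    eps, p = min(ratios)      # tuple comparison: smallest ratio, then index
    return p, eps

# illustrative data: x_B = (2, 5, 6), column y_q = (2, 1, 2)
print(ratio_test(np.array([2., 5., 6.]), np.array([2., 1., 2.])))  # (0, 1.0)
```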
Decreasing the objective function. Continuing the previous discussion, once a pivot is
found to move from one basic feasible solution to another, the objective function will change
accordingly. Of course, we hope this value to decrease by choosing an appropriate xq to swap in.
At the current state, the objective value is:
z0 = ct x = c1 y10 + c2 y20 + ··· + cm ym0 .
This calculation is based on the tableau (1.2) and by setting the last n − m variables to zero. If
there is no restriction to use basic solutions only, we can always assign xm+1 ,···,xn first and use
(1.2) to find the other m variables:

x1 = y10 − Σ_{j=m+1}^{n} y1j xj ,
x2 = y20 − Σ_{j=m+1}^{n} y2j xj ,
..
xm = ym0 − Σ_{j=m+1}^{n} ymj xj ;
in fact, this method can be used to obtain the objective value of all x that satisfies the constraint
Ax = b. The corresponding objective function is:
z = Σ_{i=1}^{m} ci ( yi0 − Σ_{j=m+1}^{n} yij xj ) + Σ_{j=m+1}^{n} cj xj = z0 + Σ_{j=m+1}^{n} (cj − zj )xj ,      (1.6)

where:

zj = y1j c1 + y2j c2 + ··· + ymj cm ,   m+1 ≤ j ≤ n .
Note that if some cq − zq < 0, we can try to find an admissible pivot according to the previous part
to swap xq in. If this pivot is successful, there will be xq > 0 as given by (1.5), and the objective
function changes by (cq − zq )xq < 0. This improvement is guaranteed by the following theorem.
Theorem 1.1 (Improvement of Basic Feasible Solutions). Given a non-degenerate basic feasible
solution with objective function value z0 . If for a nonbasic variable xq there is cq −zq < 0 (as defined
before), then there is a feasible solution with objective function value z < z0 . In particular, if xq can
be swapped with a variable in the original basic ones, this pivot yields a new basic feasible solution
with smaller objective value; and if such a swap is not possible then the feasible region is unbounded
and the objective function can be made arbitrarily large and negative.
Proof. Suppose again that the current basic variables are the first m variables of x and suppose
that cq − zq < 0 for some q > m. We compute (x1 ,x2 ,···,xm ) corresponding to xq = ε and xj = 0,
j ≥ m+1, j ≠ q. It is not difficult to see that the resulting xε is exactly given by the coefficients in
(1.4), so that xq can be swapped in if and only if at least one yiq , 1 ≤ i ≤ m, is positive.
If this is true, we set xq to (1.5) and the new solution is basic and feasible with the new objective
function value:
z = z0 + (cq − zq )xq < z0 .
If xq cannot be swapped, then all yiq are non-positive so that we can set ε to be arbitrarily large
and the corresponding solution xε is always feasible. For such a feasible solution, there is:
zε = z0 + (cq − zq )ε → −∞   as   ε → +∞ .
Hence in this case, the feasible region is unbounded and the objective function can achieve arbitrarily
large negative value.
What if all cj − zj are non-negative for m + 1 ≤ j ≤ n? In this case, the solution is actually
optimal!
Theorem 1.2 (Optimality Condition). If for some basic feasible solution there is cj − zj ≥ 0 for
all nonbasic variables xj , then this solution is optimal.
Proof. Let x′ be any feasible solution; following the previous derivation, its objective value is given
by:

z′ = z0 + Σ_j (cj − zj )x′j .

But x′j ≥ 0 since x′ is feasible; hence z′ ≥ z0 due to the assumption that cj − zj ≥ 0, ∀j. Thus x must
be optimal.
Semi-summary. So far we already have a fairly reasonable method to solve the linear programming problem, called the simplex method. It involves the following steps:
1. Start with a basic feasible solution, form the tableau (1.2).
2. Compute the relative cost coefficient rj = cj −zj ; if all rj ≥ 0, stop – the current basic feasible
solution is optimal.
3. Select q such that rq < 0 (usually the most negative one, or the one with the smallest index).
4. Calculate the ratios yi0 /yiq for yiq >0 and a basic variable xi . If no such yiq >0 exists, stop and
the problem is unbounded. Otherwise, select p as the index i corresponding to the smallest
ratio.
5. Pivot on the pq-component and update all rows including the last in the tableau; return to
step 2.
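The five steps above can be assembled into a compact tableau implementation (a sketch under stated assumptions: the tableau is already canonical w.r.t. the given basis, ties are broken by NumPy's argmin, and the tolerances are ad hoc):

```python
import numpy as np

def simplex(T, basis, c, max_iter=100):
    """Tableau simplex for min c^t x s.t. Ax = b, x >= 0, following steps 1-5.
    T = [A | b] must be canonical w.r.t. `basis` with a feasible right-hand side."""
    T = T.astype(float).copy()
    c = np.asarray(c, float)
    m = T.shape[0]
    basis = list(basis)
    for _ in range(max_iter):
        r = c - c[basis] @ T[:, :-1]        # relative costs r_j = c_j - z_j
        if np.all(r >= -1e-12):             # step 2: optimal
            x = np.zeros(len(c))
            x[basis] = T[:, -1]
            return x, 'optimal'
        q = int(np.argmin(r))               # step 3: most negative r_q
        col = T[:, q]
        if np.all(col <= 1e-12):            # step 4: unbounded
            return None, 'unbounded'
        ratios = [(T[i, -1] / col[i], i) for i in range(m) if col[i] > 1e-12]
        _, p = min(ratios)
        T[p] /= T[p, q]                     # step 5: pivot on (p, q)
        for i in range(m):
            if i != p:
                T[i] -= T[i, q] * T[p]
        basis[p] = q
    raise RuntimeError('iteration limit reached')
```

Applied to the worked example that follows (c = (−3,−1,−3,0,0,0), starting basis (x4 ,x5 ,x6 )), this sketch returns the optimal solution (0.2, 0, 1.6, 0, 0, 4) with objective value −5.4.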
A working example. We shamelessly borrow this example from [1]: Maximize 3x1 + x2 + 3x3
subject to:
2x1 + x2 + x3 + x4 = 2
x1 + 2x2 + 3x3 + x5 = 5
2x1 + 2x2 + x3 + x6 = 6
and the non-negativity constraints:
x1 ≥ 0 , x2 ≥ 0 , x3 ≥ 0 , x4 ≥ 0 , x5 ≥ 0 , x6 ≥ 0 .
This problem is clearly derived from a standard maximum one for x1 ,x2 ,x3 , with x4 , x5 , and x6
being the slack variables. First we reverse the sign of the objective function and set the target to
minimize −3x1 − x2 − 3x3 , and obtain the following (enhanced) initial tableau that is similar to
(1.2):
      a1   a2   a3   a4   a5   a6 |  b
       2    1    1    1    0    0 |  2
       1    2    3    0    1    0 |  5
       2    2    1    0    0    1 |  6
 rt : −3   −1   −3    0    0    0 |  0
In this tableau, the basic variables are (x4 ,x5 ,x6 ) and we’re fortunate enough to start with a basic
feasible solution (0,0,0,2,5,6) so that the previous algorithm directly applies. The reference objective
value is clearly c0 = 0, and we can compute the relative cost coefficients for the nonbasic variables:
r1 = c1 − z1 = −3 − (2 × 0 + 1 × 0 + 2 × 0) = −3
r2 = c2 − z2 = −1 − (1 × 0 + 2 × 0 + 2 × 0) = −1
r3 = c3 − z3 = −3 − (1 × 0 + 3 × 0 + 1 × 0) = −3 ;
the zeroes in the brackets are the coefficients for x4 ,x5 ,x6 , our basic variables, in the objective
function. The relative cost coefficients are represented as the last row of the enhanced tableau.
Because all three cost coefficients are negative, the three current non-basic variables can all be
swapped in by a pivot. For example, if we choose to swap in x1 , from:
a1 = 2a4 + a5 + 2a6 ,
all three coefficients are positive, hence the basic variable to be swapped out is the one that produces
the smallest of:

x4 : 2/2 ;   x5 : 5/1 ;   x6 : 6/2 ,

or x4 – this pivot is circled in the first column. Similarly, the two other pivots are x4 for x2 , and
x5 for x3 , respectively.
Remember that the pq-pivot produces the new basic variable with the value y′p0 = yp0 /ypq , so
that the decrease (or more generally the change) of the objective value is rq y′p0 = rq yp0 /ypq . For the
three candidate variables to be swapped in, this quantity is:

x1 : −3 × 2/2 = −3 ;   x2 : −1 × 2/1 = −2 ;   x3 : −3 × 5/3 = −5 .
All three options lead to a decrease in the objective function (as shown by Theorem 1.1); but
normally we'd not bother to calculate all of them and would instead pick any one with a negative
relative cost coefficient. In the simple case of hand calculation here, we decide to avoid division by
an integer larger than one so that x2 is picked for the pivot. In this case the decrease in the objective
function is given by −1 × 2 = −2. After updating the tableau and computing the new relative cost coefficients, we
end up with the second tableau as below:
      a1   a2   a3   a4   a5   a6 |  b
       2    1    1    1    0    0 |  2
      −3    0    1   −2    1    0 |  1
      −2    0   −1   −2    0    1 |  2
 rt : −1    0   −2    1    0    0 |  2
Now the new relative cost coefficients to be computed are r1 ,r3 ,r4 :
r1 = c1 − z1 = −3 − [2 × (−1) + (−3) × 0 + (−2) × 0] = −1
r3 = c3 − z3 = −3 − [1 × (−1) + 1 × 0 + (−1) × 0] = −2
r4 = c4 − z4 = 0 − [1 × (−1) + (−2) × 0 + (−2) × 0] = 1 ,
as described in the last row of the second tableau. Note that the current objective value times
−1 can be computed using the same formula as the relative cost coefficients, assuming that c0 = 0.
This number is marked at the lower-right corner of the tableau.
Now we only have two candidates for swapping in, x1 and x3 ; and the corresponding pivots are
circled in the tableau. We pick x3 to swap in, which corresponds to swapping out x5 , and the
resulting tableau is shown next:
      a1   a2   a3   a4   a5   a6 |  b
       5    1    0    3   −1    0 |  1
      −3    0    1   −2    1    0 |  1
      −5    0    0   −4    1    1 |  3
 rt : −7    0    0   −3    2    0 |  4
The new relative cost coefficients for x1 , x4 , and x5 are:
r1 = c1 − z1 = −3 − [5 × (−1) + (−3) × (−3) + (−5) × 0] = −7
r4 = c4 − z4 = 0 − [3 × (−1) + (−2) × (−3) + (−4) × 0] = −3
r5 = c5 − z5 = 0 − [(−1) × (−1) + 1 × (−3) + 1 × 0] = 2 ,
and the decrease in the objective value is −2 × 1 = −2, or equivalently ct x = −2 − 2 = −4.
With the two candidates x1 and x4 to swap in, the pivots are circled and we decide to go with
swapping in x1 and swapping out x2 ; the resulting tableau is:
      a1   a2   a3   a4   a5   a6 |  b
       1  0.2    0  0.6 −0.2    0 | 0.2
       0  0.6    1 −0.2  0.4    0 | 1.6
       0    1    0   −1    0    1 |  4
 rt :  0  1.4    0  1.2  0.6    0 | 5.4
The three relative cost coefficients are:
r2 = c2 − z2 = −1 − [0.2 × (−3) + 0.6 × (−3) + 1 × 0] = 1.4
r4 = c4 − z4 = 0 − [0.6 × (−3) + (−0.2) × (−3) + (−1) × 0] = 1.2
r5 = c5 − z5 = 0 − [(−0.2) × (−3) + 0.4 × (−3) + 0 × 0] = 0.6 ,
all are positive! By Theorem 1.2, we have obtained an optimal feasible solution; and the minimum
objective value is −4 + (−7) × 0.2 = −5.4.
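This result can be cross-checked with an off-the-shelf solver, assuming SciPy is available (`scipy.optimize.linprog` is not part of this course's toolbox; this is only a sanity check):

```python
import numpy as np
from scipy.optimize import linprog

# minimize -3x1 - x2 - 3x3 subject to the three equality constraints above
c = [-3, -1, -3, 0, 0, 0]
A_eq = [[2, 1, 1, 1, 0, 0],
        [1, 2, 3, 0, 1, 0],
        [2, 2, 1, 0, 0, 1]]
b_eq = [2, 5, 6]
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 6)
print(res.fun)   # -5.4 up to round-off
print(res.x)     # approximately (0.2, 0, 1.6, 0, 0, 4)
```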
2 Other Issues of the Simplex Method
If no degenerate basic feasible solution occurs, the preceding method is guaranteed to terminate as
there are only a finite number of states. In practice, people usually find that the method
terminates in no more than 3m/2 iterations.
However, there are a few issues we need to address for practical use of the method. First, what
can we do if a degenerate basic feasible solution occurs? Second, how do we find an initial basic
feasible solution? Third, how do we implement the algorithm efficiently both in terms of
computational cost and storage?
Dealing with degeneracy. Suppose we have already found a basic feasible solution and let us take
a look at what will happen in the simplex method if it is degenerate. The assumption that the
current iterate is not degenerate appears in using pivots to move from one basic feasible solution to
another. Particularly, if some xi = 0 while yiq > 0 in (1.4), we find that ε = 0 so that the
pivot to take xq in is not possible. With a slight modification, however, this is not an issue because
we can simply swap out the zero basic variable, say xp , and take in xq . In this case, we reached
another degenerate basic feasible solution where xq has the zero value. With bad luck, this process
can continue for a series of steps until, finally, the original degenerate solution is again obtained
and we have a cycle in executing the method.
There are a few facts about the degenerate case in the simplex method:
• If the algorithm hits a degenerate basic feasible solution, it is still possible to avoid a cycle,
because it may happen that yiq ≤ 0 for every zero basic variable xi .
• Based on the previous fact, cycling actually rarely occurs in the applications of the simplex
method as it is.
• It has never been fully understood why cycling is prevented from happening – some authors
conjecture that when cycling does occur, users may think the algorithm is just taking too
long to reach an optimal solution, so they kill it and never report (or even check
for) cycling.
Because there is no theoretical guarantee that the cycling will not occur, it is recommended to add
some “safeguard” feature to avoid it – namely to make sure that the simplex method will terminate
in finite steps even if degenerate solutions occur during the pivoting process. A simple rule to achieve
this is due to Bland:
Definition 1 (Bland’s Rule). In the simplex method, choose the pivot according to:
(a) Select the column to enter the basis as the lowest indexed column with negative relative cost
coefficient.
(b) If ties occur in determining which variable is to be swapped out, select the one with the lowest
index.
This rule is surprisingly simple, and one can prove that it will prevent cycling.
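A minimal sketch of the two parts of Bland's rule (the helper names are assumptions; `r` holds the relative cost coefficients and `ratios` maps each candidate basic variable to its ratio yi0 /yiq ):

```python
def bland_entering(r, nonbasic):
    """Part (a): lowest-indexed column with negative relative cost, or None."""
    for j in sorted(nonbasic):
        if r[j] < 0:
            return j
    return None                       # all r_j >= 0: current solution optimal

def bland_leaving(ratios):
    """Part (b): among rows tied at the minimum ratio, swap out the
    lowest-indexed basic variable."""
    best = min(ratios.values())
    return min(i for i, t in ratios.items() if t == best)

print(bland_entering([0, -1, 2, -3], {1, 2, 3}))   # 1, not the more negative 3
print(bland_leaving({4: 2.0, 1: 2.0, 5: 3.0}))     # tie at 2.0 -> variable 1
```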
Finding one basic feasible solution. We shall now discuss how to find at least one basic feasible
solution. In the special case when the constraints are given by:
Ax ≤ b ≥ 0 ,   and   x ≥ 0 ,
which is a special case of the standard maximum form, a basic feasible solution is directly available
after the problem is transformed into a standard one with only equality constraints. Particularly,
slack variables s are introduced for each constraint so that we have:

Ax + s = b ≥ 0 ,   and   x ≥ 0 , s ≥ 0 .

These equality constraints already lead to a tableau in the canonical form with the basic variables
being the components of s. Hence a basic feasible solution is immediately available, and it is given
by x = 0 and s = b ≥ 0.
This method should be used whenever possible, for example, also in the case of the standard
minimum problem when the constraints are given by y t A ≥ ct and c ≤ 0. In the general situation,
however, such a basic feasible solution is not easily obtained. Interestingly, we can solve another
linear programming problem to obtain it. The key idea is to introduce artificial variables.
Let the constraints in the standard form be given by:

Ax = b ≥ 0 ,   and   x ≥ 0 .                                 (2.1)
We construct the following standard linear program:

minimize y1 + y2 + ··· + ym ,   such that   Ax + y = b   and   x ≥ 0 , y ≥ 0 .       (2.2)
If (2.1) has at least one feasible solution, the optimal value of (2.2) is 0, achieved by setting y = 0.
Note that (2.2) bears a natural basic feasible solution given by x = 0 and y = b; thus we can use
the simplex method to find an optimal solution to it. There are two scenarios:
• If the optimal objective value is zero, the final basic solution must satisfy y = 0 – thus the x
part of the result is a basic feasible solution to (2.1).
• If the optimal objective value is not zero, then (2.1) has no feasible solutions at all.
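Constructing the auxiliary program (2.2) is mechanical; the sketch below (function name assumed) also negates any rows with bi < 0 so that b ≥ 0 and (x, y) = (0, b) is a valid starting basic feasible solution:

```python
import numpy as np

def phase_one_problem(A, b):
    """Build the auxiliary program (2.2): min 1^t y s.t. Ax + y = b, x, y >= 0.
    Rows with b_i < 0 are negated first so that b >= 0 and (x, y) = (0, b)
    is a basic feasible starting point."""
    A = np.asarray(A, float).copy()
    b = np.asarray(b, float).copy()
    neg = b < 0
    A[neg] *= -1
    b[neg] *= -1
    m, n = A.shape
    A_aux = np.hstack([A, np.eye(m)])         # identity columns for y
    c_aux = np.concatenate([np.zeros(n), np.ones(m)])
    basis = list(range(n, n + m))             # start from the artificial basis
    return A_aux, b, c_aux, basis

# the constraint data of Exercise 2 at the end of this note
A_aux, b_aux, c_aux, basis = phase_one_problem([[2, 1, 2], [3, 3, 1]], [4, 3])
print(A_aux)   # rows (2 1 2 1 0) and (3 3 1 0 1)
```

Feeding (A_aux, b_aux, c_aux, basis) to any simplex routine and checking whether the optimal value is zero realizes the two scenarios above.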
The matrix form. We can derive the matrix form, which is more tightly related to the real
implementation of the simplex method. Suppose for now again that the current basic variables are
the first m ones, and we write:

A = [B D] ,   x = [xB ; xD ] ,
where B ∈ Rm×m is non-singular, and the split of x is done in the obvious way. Clearly, the basic
feasible solution is given by xB = B −1 b (assumed to be non-negative) and xD = 0. Suppose we split
c in a similar way into cB and cD , the current objective value is:
z 0 = ctB xB = ctB B −1 b ;
and if we fix xD first and then compute xB = B −1 (b − DxD ), the corresponding objective value is:
z = ctB B −1 (b − DxD ) + ctD xD = z 0 + (ctD − ctB B −1 D)xD .
Hence the relative cost coefficients associated with the non-basic variables are:
r tD = ctD − ctB B −1 D .
Putting everything in the tableau form as before, the starting point is one that may not be in the
canonical form:
  B    D   |  b
 ctB  ctD  |  0
To derive the canonical form of the tableau by choosing B as a basis, we first left-multiply the first
row by B −1 to obtain:
  I   B −1 D  |  B −1 b

Left-multiply this by ctB and subtract the result from the last row; the tableau ends up as:

  I        B −1 D       |    B −1 b
  0   ctD − ctB B −1 D  |  −ctB B −1 b
We arrive at the relative cost coefficients r tD and the objective value for the basic feasible solution in
the last row of this canonical tableau. This procedure can be summarized as the following matrix
form of the simplex method, where we start with B −1 and xB = y 0 = B −1 b already computed:
1. Compute the current relative cost coefficients r tD = ctD − ctB B −1 D.
2. If r D ≥ 0, stop; the current basic feasible solution is optimal.
3. Select the vector aq to be swapped in, it could either be the one with the most negative
relative cost coefficient or the one with the smallest index.
4. Calculate y q = B −1 aq ; y q gives the coefficients of aq expressed in the basis formed by the
columns of B.
5. If no yiq > 0, stop and the problem is unbounded. Otherwise, calculate the ratio yi0 /yiq for
yiq > 0 to determine the vector to be swapped out.
6. Update B −1 and the current solution B −1 b; return to Step 1.
This algorithm is reminiscent of the previous one, but it has a major difference: we never
explicitly form the canonical tableau at each stage, and never need to update the part of A that is
not in the standard basis. Indeed, if we look at the memory usage of the algorithm, one
implementation involves:
• A ∈ Rm×n is part of the problem definition; this matrix is not changed throughout the method.
• An m × m matrix to store B −1 and an m × 1 integer vector to keep track of the indices of
the basic variables. Once the latter is known, the matrix D of the algorithm is simply the
columns of A not included in the basic ones.
• An n × 1 vector c to store the coefficients of the objective function.
• An (n − m) × 1 vector r to store the relative cost coefficients.
• In order to evaluate r, a further m × 1 vector λ with λt = ctB B −1 is first computed (of course
we do not want to compute B −1 D explicitly; that is exactly why, once aq is picked, we
compute y q = B −1 aq ).
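Steps 1–6 in this matrix form can be sketched with NumPy as follows (the function name and tolerances are assumptions; for simplicity B −1 is recomputed by a dense inverse each iteration rather than updated, which is exactly the inefficiency the LU variant later in this note removes):

```python
import numpy as np

def revised_simplex(A, b, c, basis, max_iter=100):
    """Matrix form of the simplex method (steps 1-6), keeping B^{-1}
    explicitly.  `basis` must index an initial basic feasible solution."""
    A = np.asarray(A, float); b = np.asarray(b, float); c = np.asarray(c, float)
    m, n = A.shape
    basis = list(basis)
    Binv = np.linalg.inv(A[:, basis])
    for _ in range(max_iter):
        x_B = Binv @ b
        lam = c[basis] @ Binv                  # lambda^t = c_B^t B^{-1}
        nonbasic = [j for j in range(n) if j not in basis]
        r = np.array([c[j] - lam @ A[:, j] for j in nonbasic])
        if np.all(r >= -1e-12):                # step 2: optimal
            x = np.zeros(n)
            x[basis] = x_B
            return x, 'optimal'
        q = nonbasic[int(np.argmin(r))]        # step 3: most negative r_q
        y_q = Binv @ A[:, q]                   # step 4
        if np.all(y_q <= 1e-12):               # step 5: unbounded
            return None, 'unbounded'
        ratios = [(x_B[i] / y_q[i], i) for i in range(m) if y_q[i] > 1e-12]
        _, p = min(ratios)
        basis[p] = q
        Binv = np.linalg.inv(A[:, basis])      # step 6 (naive update)
    raise RuntimeError('iteration limit reached')
```

On the running example of this note it reproduces the optimal solution (0.2, 0, 1.6, 0, 0, 4).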
Let us use the previous example again; the original matrix/tableau is repeated for reference:

  a1   a2   a3   a4   a5   a6 |  b
   2    1    1    1    0    0 |  2
   1    2    3    0    1    0 |  5
   2    2    1    0    0    1 |  6
and the objective function coefficients are:
ct = [−3 −1 −3 0 0 0] .
The starting B −1 is computed from x4 ,x5 ,x6 , and we keep track of this information as:

 Variables |   B −1    | xB | y 2
     4     |  1  0  0  |  2 |  1
     5     |  0  1  0  |  5 |  2
     6     |  0  0  1  |  6 |  2
Now ctB = [0 0 0], hence λt = ctB B −1 = [0 0 0] and:

r tD = ctD − λt D = ctD = [−3 −1 −3] ;

it corresponds to the three current non-basic variables x1 ,x2 ,x3 . As before, we pick a2 to swap in,
compute y 2 = B −1 a2 = [1 2 2]t , calculate the ratios between components of xB and y 2 ,
and pick the smallest positive ratio to perform the pivot. This number is circled in the previous
tableau.
With the updated basic variables x2 ,x5 ,x6 we grab the sub-matrix B of A and compute its
inverse to obtain the next tableau:

 Variables |    B −1    | xB | y 3
     2     |  1  0  0   |  2 |  1
     5     | −2  1  0   |  1 |  1
     6     | −2  0  1   |  2 | −1
Here xB is, again, computed as B −1 b. The other vectors are:

λt = ctB B −1 = [−1 0 0] B −1 = [−1 0 0] ,

r tD = ctD − λt D = [−3 −3 0] − [−1 0 0] [ 2 1 1 ; 1 3 0 ; 2 1 0 ] = [−1 −2 1] .
Choosing a3 to swap in we have y 3 = B −1 a3 as indicated in the last column of the tableau and
decide on a5 to be swapped out.
The new basic variables are x2 ,x3 ,x6 ; the corresponding B −1 , xB , λ, r tD , and the picked swap-in
vector y 1 are summarized below:

 Variables |    B −1    | xB | y 1
     2     |  3 −1  0   |  1 |  5
     3     | −2  1  0   |  1 | −3
     6     | −4  1  1   |  3 | −5

λt = [3 −2 0] ,   r tD = [−7 −3 2] .
There is only one variable to swap out, namely x2 .
Finally, with the basic variables x1 ,x3 ,x6 we proceed until computing r D :

 Variables |      B −1       |  xB
     1     |  0.6 −0.2  0    | 0.2
     3     | −0.2  0.4  0    | 1.6
     6     |  −1    0   1    |  4

λt = [−1.2 −0.6 0] ,   r tD = [1.4 1.2 0.6] ≥ 0 .
The final optimal solution is thus (0.2, 0, 1.6, 0, 0, 4).
LU-variant of the simplex method. The matrix version of the simplex method is considerably
more computationally efficient than the one using canonical tableaux; however, we still need to
invert an m × m matrix B at each iteration. Something the preceding algorithm has not made use of,
though, is the fact that two subsequent B’s only differ in one column. In an LU-variant of the
method, we keep track of the LU decomposition of B, and update L and U from one iteration to the
next. The overall method now assumes that we have a basic feasible solution and the corresponding
basis B. Furthermore, we know the LU decomposition B = LU (or more generally, BP = LU with
P being a permutation matrix) and the algorithm reads:
1. Compute the current solution by BxB = b.
2. Solve λt B = ctB and compute the current relative cost coefficients r tD = ctD − λt D.
3. If r D ≥ 0, stop; the current basic feasible solution is optimal.
4. Select the vector aq to be swapped in, it could either be the one with the most negative
relative cost coefficient or the one with the smallest index.
5. Calculate y q such that By q = aq .
6. If no yiq > 0, stop and the problem is unbounded. Otherwise, calculate the ratio yi0 /yiq for
yiq > 0 to determine the vector to be swapped out.
7. Update B and its LU decomposition; return to Step 1.
We just need to consider how the LU decomposition can be updated. Suppose B = [a1 a2 ··· am ]
and we swap out ap and swap in aq , which is attached as the last column to obtain:

B̄ = [a1 a2 ··· ap−1 ap+1 ··· am aq ] .

Of course, if the column vectors are arranged such that the indices are in increasing order, aq does
not have to be at the end of the new matrix B̄. However, we can always use the LU decomposition
with column permutation B̄P = LU so that we can write B̄ as described and update P accordingly.
Suppose B = LU ; then H = L−1 B̄ = [u1 u2 ··· up−1 up+1 ··· um L−1 aq ] is of the upper-Hessenberg
form, with nonzero entries only on and above the diagonal except for sub-diagonal entries in
columns p through m − 1; for example, with m = 5 and p = 2:

H = [ ∗ ∗ ∗ ∗ ∗ ]
    [   ∗ ∗ ∗ ∗ ]
    [   ∗ ∗ ∗ ∗ ]                                            (2.3)
    [     ∗ ∗ ∗ ]
    [       ∗ ∗ ]
We can construct a sequence of simple lower-triangular matrices Lk , k = p,···,m − 1, such that
Lm−1 ···Lp H = Ū is upper-triangular; the LU-decomposition of B̄ is then given by B̄ = L̄ Ū , where
L̄ = L Lp^{−1} ···Lm−1^{−1} ; see Exercise 3.
Finally, it is remarked that the QR decomposition can also be used for the same purpose. That
is, at every step we keep track of BP = QR, where Q is orthogonal and R is upper-triangular.
Following the same notation as before, the idea is that with B = QR already computed and B̄ the
same as before, H = Q−1 B̄ is also in an upper-Hessenberg form very similar to (2.3).
Then instead of using a sequence of lower-triangular matrices, we can use a sequence of Givens
rotations to put this H into upper-triangular form. This is known as the QR-variant of the simplex
method.
3 Other Remarks
The simplex method has been used for solving linear programs for more than half a century.
People long believed that the method would terminate in polynomial time – practice indicates
that almost all the time the simplex method finds the optimal solution in no more than 3m steps,
and even no more than 3m/2 steps if m and n are not so large (on the scale of hundreds). However,
no theoretical proof of the polynomial complexity of the simplex method was established, and
in 1972 Klee and Minty showed an example for which the method examines every possible
basic feasible solution. One form of their example is given by: Maximize Σ_{j=1}^{n} 10^{n−j} xj
subject to:

2 Σ_{j=1}^{i−1} 10^{i−j} xj + xi ≤ 100^{i−1} ,   i = 1,···,n ,

and x1 ,···,xn ≥ 0 .
After introducing a slack variable for each constraint, there are in total n equations and 2n non-
negative variables. People verified that if at each step the pivot with the largest reduced cost is
chosen, the simplex method takes 2^n − 1 pivot steps to reach the optimal solution (note that the
feasible region is a polytope with 2^n vertices for this problem).
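The Klee–Minty construction is easy to reproduce; the sketch below builds the data and, assuming SciPy is available, confirms the known optimum 100^{n−1} (a modern solver such as HiGHS does not follow the textbook largest-reduced-cost rule, so it will not need 2^n − 1 pivots):

```python
import numpy as np
from scipy.optimize import linprog

def klee_minty(n):
    """Klee-Minty data: maximize sum_j 10^(n-j) x_j subject to
    2*sum_{j<i} 10^(i-j) x_j + x_i <= 100^(i-1), x >= 0."""
    c = np.array([10.0 ** (n - j) for j in range(1, n + 1)])
    A = np.zeros((n, n))
    for i in range(1, n + 1):
        for j in range(1, i):
            A[i - 1, j - 1] = 2 * 10.0 ** (i - j)
        A[i - 1, i - 1] = 1.0
    b = np.array([100.0 ** (i - 1) for i in range(1, n + 1)])
    return c, A, b

c, A, b = klee_minty(3)
res = linprog(-c, A_ub=A, b_ub=b)        # negate c: linprog minimizes
print(-res.fun)                          # 100^(n-1) = 10000 for n = 3
```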
Due to the lack of a theoretical guarantee on the polynomial complexity, people started to look
for alternative methods that have this property. One of the first successful attempts is the ellipsoid
method due to Khachiyan in 1979. The basic idea is to convert a linear program to an equivalent
form that aims at finding a point in a polyhedron:

Ω = {y ∈ Rm : y t A ≤ ct , A ∈ Rm×n } .

To see how this equivalence is established, we need to consider the "dual problem" first, which
is the subject of the next lecture. For now, the ellipsoid method constructs a sequence of ellipsoids
in Rm , denoted by {Ek , k = 1,2,···}, such that each Ek contains the entire Ω and the volume of these
ellipsoids decreases at a fixed minimum rate as k → ∞. This sequence of ellipsoids with
decreasing volume can be constructed as long as their centers are not in Ω yet; hence with minor
assumptions on Ω (such as boundedness and non-empty interior) one concludes that a point in Ω
can be found in polynomial time.
Despite this beautiful theoretical bound on the complexity, the ellipsoid method is much slower
than the simplex method for many practical problems. But Khachiyan's work motivated many
researchers to construct other polynomial-complexity methods that can compete with the simplex
method in practical applications. One of the most successful ones is the interior-point method by
Karmarkar in 1984. The basic idea is to use nonlinear programming techniques to solve a linear
programming problem, for example, by introducing the "barrier" −µ Σ_i log(xi ) in the objective
function, where µ ≥ 0. With any positive µ, each xi must stay away from 0, otherwise the
objective function becomes extremely large. To find the solution to the original linear program, one
needs to find a suitable µ, solve the corresponding perturbed "barrier" problem, and apply a last
procedure such as a "purification" step to find a feasible corner whose objective value is no more
than that of the solution to the perturbed problem. The interior point method is out of the scope
of this class.
Exercises
Exercise 1. Suppose we're at a basic solution and the tableau is in the canonical form. The basic
variables are xj1 , xj2 , ···, and xjm , such that ajk = ek ∈ Rm for k = 1,···,m. Suppose we want to
perform the pivot that swaps out xjp for xq , where q ∉ {j1 ,j2 ,···,jm }. Show that this is possible
if ypq ≠ 0, and the coefficients of the new tableau are given by (1.3).
Exercise 2. Use the method of artificial variables to find a basic feasible solution to the following
constraints:
2x1 + x2 + 2x3 = 4
3x1 + 3x2 + x3 = 3
and the non-negativity constraints:
x1 ≥ 0 , x2 ≥ 0 , x3 ≥ 0 .
Exercise 3. We want to construct the sequence of lower-triangular matrices Lp ,···,Lm−1 so that
the upper-Hessenberg matrix H of (2.3) is converted to upper-triangular form: Lm−1 ···Lp H = U .
The strategy here is to define Lk as the identity matrix with one extra nonzero entry lk in the
(k + 1)-th row and k-th column:

Lk = [ 1             ]
     [   ..          ]
     [      1        ]
     [     lk  1     ]
     [           ..  ]
     [             1 ]

Here lk is used to eliminate the (k + 1,k)-th element of H. Compute these matrices Lk ,
k = p,···,m − 1.
References
[1] David G. Luenberger and Yinyu Ye. Linear and Nonlinear Programming, volume 228 of
International Series in Operations Research & Management Science. Springer International
Publishing, 4th edition, 2016.