MATHEMATICS OF OPERATIONS RESEARCH

ANDREW TULLOCH
TRINITY COLLEGE
THE UNIVERSITY OF CAMBRIDGE
Contents

1 Generalities
2 Constrained Optimization
  2.1 Lagrangian Multipliers
  2.2 Lagrange Dual
  2.3 Supporting Hyperplanes
3 Linear Programming
  3.1 Convexity and Strong Duality
  3.2 Linear Programs
  3.3 Linear Program Duality
  3.4 Complementary Slackness
4 Simplex Method
  4.1 Basic Solutions
  4.2 Extreme Points and Optimal Solutions
  4.3 The Simplex Tableau
  4.4 The Simplex Method in Tableau Form
5 Advanced Simplex Procedures
  5.1 The Two-Phase Simplex Method
  5.2 Gomory's Cutting Plane Method
6 Complexity of Problems and Algorithms
  6.1 Asymptotic Complexity
7 The Complexity of Linear Programming
  7.1 A Lower Bound for the Simplex Method
  7.2 The Idea for a New Method
8 Graphs and Flows
  8.1 Introduction
  8.2 Minimal Cost Flows
9 Transportation and Assignment Problems
10 Non-Cooperative Games
11 Strategic Equilibrium
12 Bibliography
1 Generalities

Course website: www.statslab.cam.ac.uk/~ff271/teachin/mor ([email protected])
2 Constrained Optimization

The general problem P is to minimize f(x) subject to h(x) = b and x ∈ X, where f : ℝⁿ → ℝ is the objective function, x ∈ ℝⁿ is the vector of decision variables, h(x) = b with h : ℝⁿ → ℝᵐ and b ∈ ℝᵐ is the functional constraint, and x ∈ X with X ⊆ ℝⁿ is the regional constraint.
Definition 2.1. The feasible set is X(b) = {x ∈ X : h(x) = b}.
An inequality constraint of the form g(x) ≤ b can be written as g(x) + z = b, where z ∈ ℝᵐ is called a slack variable, with regional constraint z ≥ 0.
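For example, the constraint x₁ + 2x₂ ≤ 3 becomes x₁ + 2x₂ + z = 3 with slack z ≥ 0.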
2.1 Lagrangian Multipliers
Definition 2.2. Define the Lagrangian of a problem as
L(x, λ) = f(x) − λᵀ(h(x) − b)   (2.1)
where λ ∈ ℝᵐ is a vector of Lagrange multipliers.
Theorem 2.3 (Lagrangian Sufficiency Theorem). Let x ∈ X and λ ∈ ℝᵐ be such that
L(x, λ) = inf_{x′∈X} L(x′, λ)   (2.2)
and h(x) = b. Then x is optimal for P.
Proof.
min_{x′∈X(b)} f(x′) = min_{x′∈X(b)} [f(x′) − λᵀ(h(x′) − b)]   (2.3)
                    ≥ min_{x′∈X} [f(x′) − λᵀ(h(x′) − b)]   (2.4)
                    = f(x) − λᵀ(h(x) − b)   (2.5)
                    = f(x)   (2.6)
Here (2.3) holds since h(x′) = b on X(b), (2.5) follows from the choice of x in (2.2), and (2.6) from h(x) = b.
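As a quick illustration (a minimal sketch, not from the notes), consider minimizing f(x) = x₁² + x₂² subject to x₁ + x₂ = b with X = ℝ²; the theorem applies directly:

```python
# Sketch of the Lagrangian Sufficiency Theorem on a toy problem:
# minimize x1^2 + x2^2 subject to x1 + x2 = b, with X = R^2.
# For fixed lam, L(x, lam) = x1^2 + x2^2 - lam*(x1 + x2 - b) is minimized
# over X at x1 = x2 = lam/2; choosing lam = b makes that minimizer feasible.
b = 3.0
lam = b                          # multiplier making the minimizer feasible
x = (lam / 2, lam / 2)           # unconstrained minimizer of L(., lam)
assert abs(sum(x) - b) < 1e-9    # h(x) = b, so x is optimal by Theorem 2.3
print(x, x[0] ** 2 + x[1] ** 2)  # (1.5, 1.5) with optimal value b^2/2 = 4.5
```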
2.2 Lagrange Dual
Definition 2.4. Let
φ(b) = inf_{x∈X(b)} f(x).   (2.7)
Define the Lagrange dual function g : ℝᵐ → ℝ by
g(λ) = inf_{x∈X} L(x, λ).   (2.8)
Then, for all λ ∈ ℝᵐ,
inf_{x∈X(b)} f(x) = inf_{x∈X(b)} L(x, λ) ≥ inf_{x∈X} L(x, λ) = g(λ).   (2.9)
That is, g(λ) is a lower bound on the optimal value of the primal problem.
This motivates the dual problem: maximize g(λ) subject to λ ∈ Y, where Y = {λ ∈ ℝᵐ : g(λ) > −∞}.
Theorem 2.5 (Duality). From (2.9), we see that the optimal value of the primal is always at least as large as the optimal value of the dual. This is weak duality.
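To make (2.9) concrete, the following sketch (illustrative only; the toy problem is the one used after Theorem 2.3, assuming NumPy is available) evaluates the dual function g(λ) = −λ²/2 + λb in closed form and checks weak duality numerically:

```python
# Weak duality check for: minimize x1^2 + x2^2 subject to x1 + x2 = b.
# For fixed lam, L is minimized at x1 = x2 = lam/2, giving
# g(lam) = -lam^2/2 + lam*b.
import numpy as np

b = 3.0
lams = np.linspace(-10.0, 10.0, 1001)
g = -lams ** 2 / 2 + lams * b          # closed-form dual function
primal_opt = b ** 2 / 2                # optimal primal value from the example
assert g.max() <= primal_opt + 1e-9    # weak duality: g(lam) <= primal optimum
print(g.max(), primal_opt)             # both 4.5 here, so strong duality holds
```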
2.3 Supporting Hyperplanes
Fix b ∈ ℝᵐ and consider φ as a function of c ∈ ℝᵐ. Further consider the hyperplane given by α : ℝᵐ → ℝ with
α(c) = β − λᵀ(b − c).   (2.10)
Now, try to find φ(b) as follows.
(i) For each λ, find
β_λ = max{β : α(c) ≤ φ(c) ∀c ∈ ℝᵐ}.   (2.11)
(ii) Choose λ to maximize β_λ.
Definition 2.6. Call α : ℝᵐ → ℝ a supporting hyperplane to φ at b if
α(c) = φ(b) − λᵀ(b − c)   (2.12)
and
φ(c) ≥ φ(b) − λᵀ(b − c)   (2.13)
for all c ∈ ℝᵐ.
Theorem 2.7. The following are equivalent:
(i) There exists a (non-vertical) supporting hyperplane to φ at b;
(ii) strong duality holds, that is, φ(b) = max_{λ∈Y} g(λ).
3 Linear Programming

3.1 Convexity and Strong Duality
Definition 3.1. Let S ⊆ ℝⁿ. S is convex if for all δ ∈ [0, 1], x, y ∈ S implies that δx + (1 − δ)y ∈ S.
f : S → ℝ is convex if for all x, y ∈ S and δ ∈ [0, 1], δf(x) + (1 − δ)f(y) ≥ f(δx + (1 − δ)y).
Visually, the region above the graph of f (its epigraph) is a convex set.
Definition 3.2. x ∈ S is an interior point of S if there exists ε > 0 such that {y : ‖y − x‖₂ ≤ ε} ⊆ S.
x ∈ S is an extreme point of S if for all y, z ∈ S and δ ∈ (0, 1), x = δy + (1 − δ)z implies x = y = z.
Theorem 3.3 (Supporting Hyperplane). Suppose that φ is convex and b lies in the interior of the set of points where φ is finite. Then there is a (non-vertical) supporting hyperplane to φ at b.
Theorem 3.4. Let X(b) = {x ∈ X : h(x) ≤ b} and φ(b) = inf_{x∈X(b)} f(x). Then φ is convex if X, f, and h are convex.
Proof. Let b₁, b₂ ∈ ℝᵐ be such that φ(b₁), φ(b₂) are defined. Let δ ∈ [0, 1] and b = δb₁ + (1 − δ)b₂. Consider x₁ ∈ X(b₁), x₂ ∈ X(b₂) and let x = δx₁ + (1 − δ)x₂.
By convexity of X, x ∈ X. By convexity of h,
h(x) = h(δx₁ + (1 − δ)x₂) ≤ δh(x₁) + (1 − δ)h(x₂) ≤ δb₁ + (1 − δ)b₂ = b.
Thus x ∈ X(b), and by convexity of f,
φ(b) ≤ f(x) = f(δx₁ + (1 − δ)x₂) ≤ δf(x₁) + (1 − δ)f(x₂).
Taking the infimum over x₁ ∈ X(b₁) and x₂ ∈ X(b₂) gives φ(b) ≤ δφ(b₁) + (1 − δ)φ(b₂).
3.2 Linear Programs
Definition 3.5. The general form of a linear program is
min{cᵀx : Ax ≥ b, x ≥ 0}.   (3.1)
The standard form of a linear program is
min{cᵀx : Ax = b, x ≥ 0}.   (3.2)
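As a numerical sketch (not from the notes; arbitrary example data, assuming SciPy is available), a general-form LP can be handed to scipy.optimize.linprog by rewriting Ax ≥ b as −Ax ≤ −b; the solver's default bounds already enforce x ≥ 0:

```python
# Solve the general form min{c^T x : Ax >= b, x >= 0} numerically.
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 1.0])
A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([4.0, 6.0])

# linprog expects A_ub x <= b_ub, so Ax >= b becomes -Ax <= -b;
# the default variable bounds (0, None) encode x >= 0.
res = linprog(c, A_ub=-A, b_ub=-b)
print(res.x, res.fun)   # optimum x = (1.6, 1.2) with value 2.8
```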
3.3 Linear Program Duality
Introduce slack variables to turn problem P into the form
min{cᵀx : Ax − z = b, x, z ≥ 0}.   (3.3)
We have X = {(x, z) : x, z ≥ 0} ⊆ ℝ^{m+n}. The Lagrangian is
L((x, z), λ) = cᵀx − λᵀ(Ax − z − b)   (3.4)
             = (cᵀ − λᵀA)x + λᵀz + λᵀb   (3.5)
and has a finite minimum if and only if
λ ∈ Y = {λ : cᵀ − λᵀA ≥ 0, λ ≥ 0}.   (3.6)
For a fixed λ ∈ Y, the minimum of L is attained when (cᵀ − λᵀA)x = 0 and λᵀz = 0, and thus
g(λ) = inf_{(x,z)∈X} L((x, z), λ) = λᵀb.   (3.7)
We obtain the dual problem
max{λᵀb : Aᵀλ ≤ c, λ ≥ 0}   (3.8)
and it can be shown (exercise) that the dual of the dual of a linear program is the original program.
3.4 Complementary Slackness
Theorem 3.6. Let x and λ be feasible for P and its dual. Then x and λ are
optimal if and only if
(cᵀ − λᵀA)x = 0   (3.9)
λᵀ(Ax − b) = 0   (3.10)
Proof. If x, λ are optimal, then
cᵀx = λᵀb   (3.11)
by strong duality, and
λᵀb = inf_{x′∈X} (cᵀx′ − λᵀ(Ax′ − b)) ≤ cᵀx − λᵀ(Ax − b) ≤ cᵀx,   (3.12)
where the last inequality uses primal and dual feasibility (λ ≥ 0 and Ax ≥ b). Then the inequalities must be equalities. Thus
λᵀb = cᵀx − λᵀ(Ax − b) = (cᵀ − λᵀA)x + λᵀb, so (cᵀ − λᵀA)x = 0,   (3.13)
and
cᵀx − λᵀ(Ax − b) = cᵀx, so λᵀ(Ax − b) = 0.   (3.14)
Conversely, if (cᵀ − λᵀA)x = 0 and λᵀ(Ax − b) = 0, then
cᵀx = cᵀx − λᵀ(Ax − b) = (cᵀ − λᵀA)x + λᵀb = λᵀb   (3.15)
and so by weak duality, x and λ are optimal.
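As a quick check (using the illustrative data from the sketch in Section 3.3, where x = (1.6, 1.2) and λ = (0.4, 0.2) are the optimal pair), both complementary slackness conditions evaluate to zero:

```python
# Verify (3.9) and (3.10) on an optimal primal-dual pair.
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([1.0, 1.0])
x = np.array([1.6, 1.2])      # primal optimum (computed beforehand)
lam = np.array([0.4, 0.2])    # dual optimum

print((c - A.T @ lam) @ x)    # (c^T - lam^T A) x = 0
print(lam @ (A @ x - b))      # lam^T (Ax - b) = 0
```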
4 Simplex Method

4.1 Basic Solutions
Maximize cᵀx subject to Ax = b, x ≥ 0, with A ∈ ℝ^{m×n}.
Call a solution x ∈ ℝⁿ of Ax = b basic if it has at most m non-zero entries; that is, there exists B ⊆ {1, . . . , n} with |B| = m and x_i = 0 if i ∉ B.
A basic solution x with x ≥ 0 is called a basic feasible solution (bfs).
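A brief sketch (arbitrary example data, assuming each A_B is invertible as in the assumptions of Section 4.2): basic solutions can be enumerated by choosing m columns at a time and solving A_B x_B = b:

```python
# Enumerate basic solutions of Ax = b and flag the basic feasible ones.
import itertools
import numpy as np

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, -1.0, 0.0, 1.0]])
b = np.array([2.0, 1.0])
m, n = A.shape
for B in itertools.combinations(range(n), m):
    x = np.zeros(n)
    x[list(B)] = np.linalg.solve(A[:, list(B)], b)   # A_B x_B = b
    tag = "bfs" if (x >= -1e-9).all() else "infeasible"
    print(B, x, tag)
```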
4.2 Extreme Points and Optimal Solutions
We make the following assumptions:
(i) The rows of A are linearly independent.
(ii) Every set of m columns of A is linearly independent.
(iii) Every basic solution is non-degenerate, that is, it has exactly m non-zero entries.
Theorem 4.1. x is a bfs of Ax = b if and only if it is an extreme point of the set X(b) = {x : Ax = b, x ≥ 0}.
Theorem 4.2. If the problem has a finite optimum (that is, it is feasible and bounded), then it has an optimal solution that is a bfs.
4.3 The Simplex Tableau
Let A ∈ ℝ^{m×n}, b ∈ ℝᵐ. Let B be a basis (in the bfs sense) and x ∈ ℝⁿ such that Ax = b. Then
A_B x_B + A_N x_N = b   (4.1)
where A_B ∈ ℝ^{m×m} and A_N ∈ ℝ^{m×(n−m)} respectively consist of the columns of A indexed by B and those not indexed by B. Moreover, if x is a basic solution, then there is a basis B such that x_N = 0 and A_B x_B = b, and if x is a basic feasible solution, there is a basis B such that x_N = 0, A_B x_B = b, and x_B ≥ 0.
For every x with Ax = b and every basis B, we have
x_B = A_B⁻¹ (b − A_N x_N)   (4.2)
as we assume that A_B has full rank. Thus,
f(x) = cᵀx = c_Bᵀ x_B + c_Nᵀ x_N   (4.3)
     = c_Bᵀ A_B⁻¹ (b − A_N x_N) + c_Nᵀ x_N   (4.4)
     = c_Bᵀ A_B⁻¹ b + (c_Nᵀ − c_Bᵀ A_B⁻¹ A_N) x_N.   (4.5)
Assume we can guarantee that A_B⁻¹ b ≥ 0. Then x* with x*_B = A_B⁻¹ b and x*_N = 0 is a bfs with
f(x*) = c_Bᵀ A_B⁻¹ b.   (4.6)
Assume that we are maximizing cᵀx. There are two different cases:
(i) If c_Nᵀ − c_Bᵀ A_B⁻¹ A_N ≤ 0, then f(x) ≤ c_Bᵀ A_B⁻¹ b for every feasible x, so x* is optimal.
(ii) If (c_Nᵀ − c_Bᵀ A_B⁻¹ A_N)_i > 0 for some i, then we can increase the objective value by increasing the corresponding entry (x_N)_i.
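The optimality test in case (i) and the choice of entering variable in case (ii) amount to one reduced-cost computation; a minimal sketch (arbitrary data, with the slack columns as starting basis):

```python
# Reduced costs (4.5): check optimality or pick an entering variable
# for: maximize c^T x subject to Ax = b, x >= 0.
import numpy as np

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, -1.0, 0.0, 1.0]])
b = np.array([2.0, 1.0])
c = np.array([1.0, 0.0, 0.0, 0.0])
B, N = [2, 3], [0, 1]                      # slack basis: x_B = b >= 0
invAB = np.linalg.inv(A[:, B])
reduced = c[N] - c[B] @ invAB @ A[:, N]    # c_N^T - c_B^T A_B^-1 A_N
if (reduced <= 0).all():
    print("current bfs is optimal, value", c[B] @ invAB @ b)
else:
    print("entering variable:", N[int(np.argmax(reduced))])
```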
4.4 The Simplex Method in Tableau Form
5 Advanced Simplex Procedures

5.1 The Two-Phase Simplex Method
Finding an initial bfs is easy if the constraints have the form Ax ≤ b with b ≥ 0: adding slack variables gives
Ax + z = b, (x, z) = (0, b),   (5.1)
which is a bfs. Otherwise, one first minimizes the sum of artificial variables added to the constraints (phase one); if this sum can be driven to zero, the result is an initial bfs for the original problem (phase two).

5.2 Gomory's Cutting Plane Method
This method is used in integer programming (IP): a linear program in which, in addition, some of the variables are required to be integral.
Assume that for a given integer program we have found an optimal fractional solution x* with basis B, and let a_ij = (A_B⁻¹ A_j)_i and a_i0 = (A_B⁻¹ b)_i be the entries of the final tableau. If x* is not integral, then for some row i, a_i0 is not integral. For every feasible solution x,
x_i + ∑_{j∈N} ⌊a_ij⌋ x_j ≤ x_i + ∑_{j∈N} a_ij x_j = a_i0.   (5.2)
If x is integral, then the left hand side is integral as well, and the inequality must still hold if the right hand side is rounded down. Thus,
x_i + ∑_{j∈N} ⌊a_ij⌋ x_j ≤ ⌊a_i0⌋.   (5.3)
Then, we can add this constraint to the problem and solve the
augmented program. One can show that this procedure converges in
a finite number of steps.
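A small sketch of the cut derivation (with a hypothetical tableau row, not from the notes): given fractional entries a_ij and a_i0, the cut (5.3) is obtained by rounding everything down:

```python
# Derive the Gomory cut (5.3) from one row of the final tableau.
# Hypothetical row i: x_i + 1.5*x_3 + 0.25*x_4 = 2.75 (a_i0 fractional).
import math

a = {3: 1.5, 4: 0.25}        # a_ij for the non-basic variables j in N
a0 = 2.75                    # a_i0
cut = {j: math.floor(v) for j, v in a.items()}   # floor(a_ij)
rhs = math.floor(a0)                             # floor(a_i0)
print(f"x_i + {cut[3]}*x_3 + {cut[4]}*x_4 <= {rhs}")
# -> x_i + 1*x_3 + 0*x_4 <= 2, valid for every integral feasible x
```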
6 Complexity of Problems and Algorithms

6.1 Asymptotic Complexity
We measure complexity as a function of input size. The input of a linear programming problem, c ∈ ℝⁿ, A ∈ ℝ^{m×n}, b ∈ ℝᵐ, is represented in (n + m·n + m)·k bits if we represent each number using k bits.
For two functions f : ℝ → ℝ and g : ℝ → ℝ, write
f(n) = O(g(n))   (6.1)
if there exist c, n₀ such that for all n ≥ n₀,
f(n) ≤ c · g(n).   (6.2)
Similarly, f(n) = Ω(g(n)) if f(n) ≥ c · g(n) for all n ≥ n₀, and f(n) = Θ(g(n)) if both f(n) = O(g(n)) and f(n) = Ω(g(n)).
7 The Complexity of Linear Programming

7.1 A Lower Bound for the Simplex Method
Theorem 7.1. There exists an LP of size O(n²), a pivoting rule, and an initial bfs such that the simplex method requires 2ⁿ − 1 iterations.
Proof. Consider the unit cube in ℝⁿ, given by the constraints 0 ≤ x_i ≤ 1 for i = 1, . . . , n. Define a spanning path inductively as follows. In dimension 1, go from x₁ = 0 to x₁ = 1. In dimension k, set x_k = 0 and follow the path for dimensions 1, . . . , k − 1; then set x_k = 1 and follow the path for dimensions 1, . . . , k − 1 backwards.
Along this path, the objective x_n increases only once. Instead, consider the perturbed unit cube given by the constraints ε ≤ x₁ ≤ 1 and εx_{i−1} ≤ x_i ≤ 1 − εx_{i−1} for i = 2, . . . , n, with ε ∈ (0, 1/2). Along the corresponding path on the perturbed cube, the objective x_n increases at every step, so with a suitable pivoting rule the simplex method visits all 2ⁿ vertices, requiring 2ⁿ − 1 iterations.
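The inductive construction is reflected Gray-code order; the sketch below (illustration only) builds the spanning path and confirms that on the unit cube the objective x_n increases along only one of its 2ⁿ − 1 edges:

```python
# Build the spanning path of the k-cube by the inductive rule above.
def spanning_path(k):
    if k == 1:
        return [(0,), (1,)]
    prev = spanning_path(k - 1)
    # x_k = 0 along the path, then x_k = 1 along the reversed path
    return [p + (0,) for p in prev] + [p + (1,) for p in reversed(prev)]

path = spanning_path(3)
print(len(path))    # 2^3 = 8 vertices
increases = sum(path[i + 1][-1] > path[i][-1] for i in range(len(path) - 1))
print(increases)    # the objective x_n increases only once along the path
```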
7.2 The Idea for a New Method
Consider the primal-dual pair
min{cᵀx : Ax = b, x ≥ 0}   (7.1)
max{bᵀλ : Aᵀλ ≤ c}.   (7.2)
By strong duality, each of these problems has a bounded optimal solution if and only if the following set of constraints is feasible:
cᵀx = bᵀλ   (7.3)
Ax = b   (7.4)
x ≥ 0   (7.5)
Aᵀλ ≤ c   (7.6)
It is thus enough to decide, for a given A′ ∈ ℝ^{m×n} and b′ ∈ ℝᵐ, whether {x ∈ ℝⁿ : A′x ≥ b′} ≠ ∅.
Definition 7.2. A symmetric matrix D ∈ ℝ^{n×n} is called positive definite if xᵀDx > 0 for every non-zero x ∈ ℝⁿ. Equivalently, all eigenvalues of D are strictly positive.
Definition 7.3. A set E ⊆ ℝⁿ given by
E = E(z, D) = {x ∈ ℝⁿ : (x − z)ᵀ D⁻¹ (x − z) ≤ 1}   (7.7)
for a positive definite symmetric D ∈ ℝ^{n×n} and z ∈ ℝⁿ is called an ellipsoid with center z.
Let P = {x ∈ ℝⁿ : Ax ≥ b} for some A ∈ ℝ^{m×n} and b ∈ ℝᵐ. To decide whether P ≠ ∅, we generate a sequence {E_t} of ellipsoids E_t with centers x_t. If x_t ∈ P, then P is non-empty and the method stops. If x_t ∉ P, then one of the constraints is violated, so there exists a row j of A such that a_jᵀ x_t < b_j. Therefore, P is contained in the half-space {x ∈ ℝⁿ : a_jᵀ x ≥ a_jᵀ x_t}, and in particular in the intersection of this half-space with E_t, which we will call a half-ellipsoid.
Theorem 7.4. Let E = E(z, D) be an ellipsoid in ℝⁿ and a ∈ ℝⁿ with a ≠ 0. Consider the half-space H = {x ∈ ℝⁿ : aᵀx ≥ aᵀz}, and let
z′ = z + (1/(n+1)) · Da/√(aᵀDa)   (7.8)
D′ = (n²/(n²−1)) · (D − (2/(n+1)) · (D a aᵀ D)/(aᵀ D a)).   (7.9)
Then D′ is symmetric and positive definite, and therefore E′ = E(z′, D′) is an ellipsoid. Moreover, E ∩ H ⊆ E′ and Vol(E′) < e^{−1/(2(n+1))} Vol(E).
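A minimal sketch of the update (7.8) and (7.9) (illustration only, assuming NumPy): starting from the unit ball and cutting with a = e₁, the volume shrinks by more than the guaranteed factor:

```python
# One ellipsoid-method step for the half-space {x : a^T x >= a^T z}.
import numpy as np

def ellipsoid_step(z, D, a):
    n = len(z)
    Da = D @ a
    s = float(a @ Da)                                    # a^T D a
    z_new = z + Da / ((n + 1) * np.sqrt(s))              # (7.8)
    D_new = (n ** 2 / (n ** 2 - 1.0)) * (
        D - (2.0 / (n + 1)) * np.outer(Da, Da) / s)      # (7.9)
    return z_new, D_new

z, D = np.zeros(2), np.eye(2)        # E(z, D) is the unit ball in R^2
z2, D2 = ellipsoid_step(z, D, np.array([1.0, 0.0]))
ratio = np.sqrt(np.linalg.det(D2) / np.linalg.det(D))    # Vol(E')/Vol(E)
print(z2, ratio, np.exp(-1 / (2 * (2 + 1))))             # ratio < e^(-1/6)
```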
8 Graphs and Flows

8.1 Introduction
Consider a directed graph (network) G = (V, E), with V the set of vertices and E ⊆ V × V the set of edges. The graph is undirected if E is symmetric.
8.2 Minimal Cost Flows
Fill in this stuff from lectures
9 Transportation and Assignment Problems
10 Non-Cooperative Games
Let X and Y be the sets of mixed strategies of the row and column player, and let p(x, y) = xᵀPy denote the expected payoff to the row player.
Theorem 10.1 (von Neumann, 1928). Let P ∈ ℝ^{m×n}. Then
max_{x∈X} min_{y∈Y} p(x, y) = min_{y∈Y} max_{x∈X} p(x, y).   (10.1)
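The minimax value can be computed by linear programming; a hedged sketch (assuming SciPy, with rock-paper-scissors as example data) solves the row player's problem max v subject to (Pᵀx)_j ≥ v for all columns j, with x a probability vector:

```python
# Row player's LP for a matrix game: variables (x_1..x_m, v), minimize -v.
import numpy as np
from scipy.optimize import linprog

P = np.array([[0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])   # rock-paper-scissors payoffs
m, n = P.shape
c = np.r_[np.zeros(m), -1.0]                  # minimize -v
A_ub = np.c_[-P.T, np.ones(n)]                # v - (P^T x)_j <= 0 for all j
b_ub = np.zeros(n)
A_eq = np.r_[np.ones(m), 0.0].reshape(1, -1)  # sum_i x_i = 1
b_eq = [1.0]
bounds = [(0, None)] * m + [(None, None)]     # x >= 0, v free
res = linprog(c, A_ub, b_ub, A_eq, b_eq, bounds)
print(res.x[:m], -res.fun)   # optimal strategy (1/3, 1/3, 1/3), value 0
```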
11 Strategic Equilibrium
Definition 11.1. x ∈ X is a best response to y ∈ Y if p(x, y) = max_{x′∈X} p(x′, y). (x, y) ∈ X × Y is a Nash equilibrium if x is a best response to y and y is a best response to x.
Theorem 11.2. (x, y) ∈ X × Y is an equilibrium of the matrix game P if and only if
min_{y′∈Y} p(x, y′) = max_{x′∈X} min_{y′∈Y} p(x′, y′)   (11.1)
and
max_{x′∈X} p(x′, y) = min_{y′∈Y} max_{x′∈X} p(x′, y′).   (11.2)
Theorem 11.3. Let (x, y), (x′, y′) ∈ X × Y be equilibria of the matrix game with payoff matrix P. Then p(x, y) = p(x′, y′), and (x, y′) and (x′, y) are equilibria as well.
Theorem 11.4 (Nash, 1951). Every bimatrix game has an equilibrium.
12 Bibliography