ANDREW TULLOCH
MATHEMATICS OF OPERATIONS RESEARCH
TRINITY COLLEGE, THE UNIVERSITY OF CAMBRIDGE

Contents
1 Generalities
2 Constrained Optimization
  2.1 Lagrangian Multipliers
  2.2 Lagrange Dual
  2.3 Supporting Hyperplanes
3 Linear Programming
  3.1 Convexity and Strong Duality
  3.2 Linear Programs
  3.3 Linear Program Duality
  3.4 Complementary Slackness
4 Simplex Method
  4.1 Basic Solutions
  4.2 Extreme Points and Optimal Solutions
  4.3 The Simplex Tableau
  4.4 The Simplex Method in Tableau Form
5 Advanced Simplex Procedures
  5.1 The Two-Phase Simplex Method
  5.2 Gomory's Cutting Plane Method
6 Complexity of Problems and Algorithms
  6.1 Asymptotic Complexity
7 The Complexity of Linear Programming
  7.1 A Lower Bound for the Simplex Method
  7.2 The Idea for a New Method
8 Graphs and Flows
  8.1 Introduction
  8.2 Minimal Cost Flows
9 Transportation and Assignment Problems
10 Non-Cooperative Games
11 Strategic Equilibrium

1 Generalities

www.statslab.cam.ac.uk/~ff271/teachin/mor
[email protected]

2 Constrained Optimization

Minimize f(x) subject to h(x) = b, x ∈ X, where
- f : R^n → R is the objective function,
- x ∈ R^n is the vector of decision variables,
- h(x) = b is the functional constraint, with h : R^n → R^m and b ∈ R^m,
- x ∈ X is the regional constraint, with X ⊆ R^n.

Definition 2.1. The feasible set is X(b) = {x ∈ X : h(x) = b}.

An inequality of the form g(x) ≤ b can be written as g(x) + z = b, where z ∈ R^m is called a slack variable, with the regional constraint z ≥ 0.

2.1 Lagrangian Multipliers

Definition 2.2. Define the Lagrangian of a problem as
  L(x, λ) = f(x) − λ^T (h(x) − b),    (2.1)
where λ ∈ R^m is a vector of Lagrange multipliers.

Theorem 2.3 (Lagrangian Sufficiency Theorem). Let x ∈ X and λ ∈ R^m be such that
  L(x, λ) = inf_{x′ ∈ X} L(x′, λ)    (2.2)
and h(x) = b. Then x is optimal for P.

Proof. Since h(x′) = b for every x′ ∈ X(b),
  min_{x′ ∈ X(b)} f(x′) = min_{x′ ∈ X(b)} [f(x′) − λ^T (h(x′) − b)]    (2.3)
    ≥ min_{x′ ∈ X} [f(x′) − λ^T (h(x′) − b)]    (2.4)
    = f(x) − λ^T (h(x) − b)    (2.5)
    = f(x).    (2.6)

2.2 Lagrange Dual

Definition 2.4. Let
  φ(b) = inf_{x ∈ X(b)} f(x).    (2.7)
Define the Lagrange dual function g : R^m → R by
  g(λ) = inf_{x ∈ X} L(x, λ).    (2.8)
Then, for all λ ∈ R^m,
  inf_{x ∈ X(b)} f(x) = inf_{x ∈ X(b)} L(x, λ) ≥ inf_{x ∈ X} L(x, λ) = g(λ).    (2.9)
That is, g(λ) is a lower bound on the optimal value of our optimization problem. This motivates the dual problem: maximize g(λ) subject to λ ∈ Y, where Y = {λ ∈ R^m : g(λ) > −∞}.

Theorem 2.5 (Duality). From (2.9), the optimal value of the primal is always at least the optimal value of the dual. This is weak duality.

2.3 Supporting Hyperplanes

Fix b ∈ R^m and consider φ as a function of c ∈ R^m. Further, consider the hyperplane given by α : R^m → R with
  α(c) = β − λ^T (b − c).    (2.10)
Now, try to find φ(b) as follows:
(i) For each λ, find
  β_λ = max{β : α(c) ≤ φ(c) for all c ∈ R^m}.    (2.11)
(ii) Choose λ to maximize β_λ.

Definition 2.6. Call α : R^m → R a supporting hyperplane to φ at b if
  α(c) = φ(b) − λ^T (b − c)    (2.12)
and
  φ(c) ≥ φ(b) − λ^T (b − c)    (2.13)
for all c ∈ R^m.

Theorem 2.7. The following are equivalent:
(i) there exists a (non-vertical) supporting hyperplane to φ at b;
(ii) strong duality holds, that is, there exists λ ∈ Y with g(λ) = φ(b).

3 Linear Programming

3.1 Convexity and Strong Duality

Definition 3.1. Let S ⊆ R^n. S is convex if for all δ ∈ [0, 1], x, y ∈ S implies that δx + (1 − δ)y ∈ S. A function f : S → R is convex if for all x, y ∈ S and δ ∈ [0, 1],
  δf(x) + (1 − δ)f(y) ≥ f(δx + (1 − δ)y).
Visually, the region above the graph of a convex function (its epigraph) is a convex set.
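As a quick numerical illustration of the convexity inequality in Definition 3.1 (a sanity check only, not a proof: sampling can refute convexity but never establish it), here is a small sketch assuming Python with numpy; the helper name looks_convex is ours.

```python
# Numerical spot-check of the convexity inequality in Definition 3.1:
#   delta*f(x) + (1 - delta)*f(y) >= f(delta*x + (1 - delta)*y)
# on randomly sampled points. Sampling can refute convexity, never prove it.
import numpy as np

def looks_convex(f, dim, trials=10_000, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        x, y = rng.normal(size=dim), rng.normal(size=dim)
        delta = rng.uniform()
        lhs = delta * f(x) + (1 - delta) * f(y)
        rhs = f(delta * x + (1 - delta) * y)
        if lhs < rhs - 1e-9:       # inequality violated: f is certainly not convex
            return False
    return True                    # no violation found (consistent with convexity)

print(looks_convex(lambda x: np.sum(x ** 2), dim=3))      # True: the squared norm is convex
print(looks_convex(lambda x: np.sin(np.sum(x)), dim=3))   # False: sine is not convex
```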
Definition 3.2. x ∈ S is an interior point of S if there exists ε > 0 such that {y : ‖y − x‖_2 ≤ ε} ⊆ S. x ∈ S is an extreme point of S if for all y, z ∈ S and δ ∈ (0, 1), x = δy + (1 − δ)z implies x = y = z.

Theorem 3.3 (Supporting Hyperplane). Suppose that our function φ is convex and b lies in the interior of the set of points where φ is finite. Then there is a (non-vertical) supporting hyperplane to φ at b.

Theorem 3.4. Let X(b) = {x ∈ X : h(x) ≤ b} and φ(b) = inf_{x ∈ X(b)} f(x). Then φ is convex if X, f, and h are convex.

Proof. Let b_1, b_2 ∈ R^m be such that φ(b_1), φ(b_2) are defined. Let δ ∈ [0, 1] and b = δb_1 + (1 − δ)b_2. Consider x_1 ∈ X(b_1), x_2 ∈ X(b_2) and let x = δx_1 + (1 − δ)x_2.
By convexity of X, x ∈ X. By convexity of h,
  h(x) = h(δx_1 + (1 − δ)x_2) ≤ δh(x_1) + (1 − δ)h(x_2) ≤ δb_1 + (1 − δ)b_2 = b.
Thus x ∈ X(b), and by convexity of f,
  φ(b) ≤ f(x) = f(δx_1 + (1 − δ)x_2) ≤ δf(x_1) + (1 − δ)f(x_2).
Taking the infimum over x_1 ∈ X(b_1) and x_2 ∈ X(b_2) gives φ(b) ≤ δφ(b_1) + (1 − δ)φ(b_2).

3.2 Linear Programs

Definition 3.5. The general form of a linear program is
  min{c^T x : Ax ≥ b, x ≥ 0}.    (3.1)
The standard form of a linear program is
  min{c^T x : Ax = b, x ≥ 0}.    (3.2)

3.3 Linear Program Duality

Introduce slack variables to turn the problem P into the form
  min{c^T x : Ax − z = b, x, z ≥ 0}.    (3.3)
We have X = {(x, z) : x, z ≥ 0} ⊆ R^{m+n}. The Lagrangian is
  L((x, z), λ) = c^T x − λ^T (Ax − z − b)    (3.4)
             = (c^T − λ^T A)x + λ^T z + λ^T b,    (3.5)
and it has a finite minimum if and only if
  λ ∈ Y = {λ : c^T − λ^T A ≥ 0, λ ≥ 0}.    (3.6)
For a fixed λ ∈ Y, the minimum of L is attained when (c^T − λ^T A)x = 0 and λ^T z = 0, and thus
  g(λ) = inf_{(x,z) ∈ X} L((x, z), λ) = λ^T b.    (3.7)
We obtain the dual problem
  max{λ^T b : A^T λ ≤ c, λ ≥ 0},    (3.8)
and it can be shown (exercise) that the dual of the dual of a linear program is the original program.

3.4 Complementary Slackness

Theorem 3.6. Let x and λ be feasible for P and its dual. Then x and λ are optimal if and only if
  (c^T − λ^T A)x = 0    (3.9)
and
  λ^T (Ax − b) = 0.    (3.10)

Proof. If x and λ are optimal, then
  c^T x = λ^T b    (3.11)
by strong duality, and since λ^T b = g(λ),
  λ^T b = inf_{x′ ∈ X} (c^T x′ − λ^T (Ax′ − b)) ≤ c^T x − λ^T (Ax − b) ≤ c^T x,    (3.12)
where the last inequality uses primal and dual feasibility (λ ≥ 0 and Ax ≥ b). Since the chain starts and ends with c^T x, the inequalities must be equalities. Thus
  λ^T b = c^T x − λ^T (Ax − b) = (c^T − λ^T A)x + λ^T b, and so (c^T − λ^T A)x = 0,    (3.13)
and
  c^T x − λ^T (Ax − b) = c^T x, and so λ^T (Ax − b) = 0.    (3.14)
Conversely, if (c^T − λ^T A)x = 0 and λ^T (Ax − b) = 0, then
  c^T x = c^T x − λ^T (Ax − b) = (c^T − λ^T A)x + λ^T b = λ^T b,    (3.15)
and so by weak duality, x and λ are optimal.

4 Simplex Method

4.1 Basic Solutions

Maximize c^T x subject to Ax = b, x ≥ 0, where A ∈ R^{m×n}.
Call a solution x ∈ R^n of Ax = b basic if it has at most m non-zero entries, that is, if there exists B ⊆ {1, . . . , n} with |B| = m such that x_i = 0 if i ∉ B. A basic solution x with x ≥ 0 is called a basic feasible solution (bfs).

4.2 Extreme Points and Optimal Solutions

We make the following assumptions:
(i) the rows of A are linearly independent;
(ii) every set of m columns of A is linearly independent;
(iii) every basic solution is non-degenerate, that is, it has exactly m non-zero entries.

Theorem 4.1. x is a bfs of Ax = b if and only if it is an extreme point of the set X(b) = {x : Ax = b, x ≥ 0}.
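Theorem 4.1, together with the optimality result stated next, suggests a naive way to solve small LPs: enumerate every choice of m columns, solve for the corresponding basic solution, keep the feasible ones, and compare objective values. The sketch below assumes Python with numpy; the helper name best_bfs and the small example LP are ours, and since there are C(n, m) bases the approach is exponential and for illustration only.

```python
# Brute-force solution of  max c^T x  s.t.  Ax = b, x >= 0  by enumerating bases.
import itertools
import numpy as np

def best_bfs(A, b, c):
    m, n = A.shape
    best_x, best_val = None, -np.inf
    for B in itertools.combinations(range(n), m):
        A_B = A[:, list(B)]
        if abs(np.linalg.det(A_B)) < 1e-12:          # columns of A_B not independent
            continue
        x = np.zeros(n)
        x[list(B)] = np.linalg.solve(A_B, b)         # x_B = A_B^{-1} b, x_N = 0
        if np.all(x >= -1e-9) and c @ x > best_val:  # keep basic *feasible* solutions only
            best_x, best_val = x, c @ x
    return best_x, best_val

# max x1 + x2  s.t.  x1 + 2 x2 <= 4,  3 x1 + x2 <= 6  (slacks x3, x4 give standard form)
A = np.array([[1.0, 2.0, 1.0, 0.0],
              [3.0, 1.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([1.0, 1.0, 0.0, 0.0])
print(best_bfs(A, b, c))   # optimum x = (1.6, 1.2, 0, 0) with value 2.8
```

The simplex method of Section 4.3 avoids this enumeration by moving between adjacent basic feasible solutions, improving the objective at each step.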
Theorem 4.2. If the problem has a finite optimum (that is, it is feasible and bounded), then it has an optimal solution that is a bfs.

4.3 The Simplex Tableau

Let A ∈ R^{m×n}, b ∈ R^m. Let B be a basis (in the bfs sense) and x ∈ R^n such that Ax = b. Then
  A_B x_B + A_N x_N = b,    (4.1)
where A_B ∈ R^{m×m} and A_N ∈ R^{m×(n−m)} respectively consist of the columns of A indexed by B and those not indexed by B. Moreover, if x is a basic solution, then there is a basis B such that x_N = 0 and A_B x_B = b, and if x is a basic feasible solution, there is a basis B such that x_N = 0, A_B x_B = b, and x_B ≥ 0.
For every x with Ax = b and every basis B, we have
  x_B = A_B^{-1} (b − A_N x_N),    (4.2)
as we assume that A_B has full rank. Thus,
  f(x) = c^T x = c_B^T x_B + c_N^T x_N    (4.3)
       = c_B^T A_B^{-1} (b − A_N x_N) + c_N^T x_N    (4.4)
       = c_B^T A_B^{-1} b + (c_N^T − c_B^T A_B^{-1} A_N) x_N.    (4.5)
Assume we can guarantee that A_B^{-1} b ≥ 0. Then x* with x*_B = A_B^{-1} b and x*_N = 0 is a bfs with
  f(x*) = c_B^T A_B^{-1} b.    (4.6)
Assume that we are maximizing c^T x. There are two different cases:
(i) If c_N^T − c_B^T A_B^{-1} A_N ≤ 0, then f(x) ≤ c_B^T A_B^{-1} b for every feasible x, so x* is optimal.
(ii) If (c_N^T − c_B^T A_B^{-1} A_N)_i > 0 for some i, then we can increase the objective value by increasing the corresponding entry (x_N)_i.

4.4 The Simplex Method in Tableau Form

5 Advanced Simplex Procedures

5.1 The Two-Phase Simplex Method

Finding an initial bfs is easy if the constraints have the form Ax ≤ b with b ≥ 0: adding slack variables gives
  Ax + z = b,   (x, z) = (0, b),    (5.1)
which is a bfs.

5.2 Gomory's Cutting Plane Method

This method is used in integer programming (IP), that is, a linear program in which some of the variables are additionally required to be integral.
Assume that for a given integer program we have found an optimal fractional solution x* with basis B, and let a_{ij} = (A_B^{-1} A_j)_i and a_{i0} = (A_B^{-1} b)_i be the entries of the final tableau. If x* is not integral, then for some row i, a_{i0} is not integral.
For every feasible solution x,
  x_i + Σ_{j ∈ N} ⌊a_{ij}⌋ x_j ≤ x_i + Σ_{j ∈ N} a_{ij} x_j = a_{i0}.    (5.2)
If x is integral, then the left-hand side is integral as well, and the inequality must still hold if the right-hand side is rounded down. Thus,
  x_i + Σ_{j ∈ N} ⌊a_{ij}⌋ x_j ≤ ⌊a_{i0}⌋.    (5.3)
We can then add this constraint to the problem and solve the augmented program. One can show that this procedure converges in a finite number of steps.

6 Complexity of Problems and Algorithms

6.1 Asymptotic Complexity

We measure complexity as a function of input size. The input of a linear programming problem, c ∈ R^n, A ∈ R^{m×n}, b ∈ R^m, is represented by (n + m·n + m)·k bits if we represent each number using k bits.
For two functions f : R → R and g : R → R, write
  f(n) = O(g(n))    (6.1)
if there exist c, n_0 such that for all n ≥ n_0,
  f(n) ≤ c · g(n)    (6.2)
(similarly, Ω with ≥, and Θ when both O and Ω hold).

7 The Complexity of Linear Programming

7.1 A Lower Bound for the Simplex Method

Theorem 7.1. There exists an LP of size O(n^2), a pivoting rule, and an initial bfs such that the simplex method requires 2^n − 1 iterations.

Proof. Consider the unit cube in R^n, given by the constraints 0 ≤ x_i ≤ 1 for i = 1, . . . , n. Define a spanning path inductively as follows. In dimension 1, go from x_1 = 0 to x_1 = 1. In dimension k, set x_k = 0 and follow the path for dimensions 1, . . . , k − 1; then set x_k = 1 and follow the path for dimensions 1, . . . , k − 1 backwards. Along this path the objective x_n increases only once, so the path cannot be followed by the simplex method, which must improve the objective in every step. Instead consider the perturbed unit cube given by the constraints ε ≤ x_1 ≤ 1 and εx_{i−1} ≤ x_i ≤ 1 − εx_{i−1} for i = 2, . . . , n, with ε ∈ (0, 1/2).
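The inductive spanning path in this proof visits all 2^n vertices of the cube exactly once, moving along an edge at every step; it is the reflected Gray code order. A minimal sketch in plain Python (the function name spanning_path is ours) that builds the path as described and checks these properties:

```python
# The spanning path from the proof of Theorem 7.1: in dimension k, traverse the
# (k-1)-dimensional path with x_k = 0, then traverse it backwards with x_k = 1.
# This is the reflected Gray code order on the vertices of the n-cube.
def spanning_path(n):
    if n == 1:
        return [(0,), (1,)]
    prev = spanning_path(n - 1)
    return [v + (0,) for v in prev] + [v + (1,) for v in reversed(prev)]

path = spanning_path(4)
assert len(path) == 2 ** 4 == len(set(path))         # all 2^n vertices, each visited once
assert all(sum(a != b for a, b in zip(u, v)) == 1    # consecutive vertices differ in one
           for u, v in zip(path, path[1:]))          # coordinate, i.e. the path uses edges
print([v[-1] for v in path])   # x_n along the path: it increases only once, at the midpoint
```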
7.2 The Idea for a New Method

  min{c^T x : Ax = b, x ≥ 0}    (7.1)
  max{b^T λ : A^T λ ≤ c}    (7.2)

By strong duality, each of these problems has a bounded optimal solution if and only if the following set of constraints is feasible:
  c^T x = b^T λ    (7.3)
  Ax = b    (7.4)
  x ≥ 0    (7.5)
  A^T λ ≤ c    (7.6)
It is thus enough to decide, for a given A′ ∈ R^{m×n} and b′ ∈ R^m, whether {x ∈ R^n : A′x ≥ b′} ≠ ∅.

Definition 7.2. A symmetric matrix D ∈ R^{n×n} is called positive definite if x^T D x > 0 for every x ∈ R^n with x ≠ 0. Equivalently, all eigenvalues of the matrix are strictly positive.

Definition 7.3. A set E ⊆ R^n given by
  E = E(z, D) = {x ∈ R^n : (x − z)^T D^{-1} (x − z) ≤ 1}    (7.7)
for a positive definite symmetric D ∈ R^{n×n} and z ∈ R^n is called an ellipsoid with center z.

Let P = {x ∈ R^n : Ax ≥ b} for some A ∈ R^{m×n} and b ∈ R^m. To decide whether P ≠ ∅, we generate a sequence {E_t} of ellipsoids E_t with centers x_t. If x_t ∈ P, then P is non-empty and the method stops. If x_t ∉ P, then one of the constraints is violated, so there exists a row j of A such that a_j^T x_t < b_j. Therefore P is contained in the half-space {x ∈ R^n : a_j^T x ≥ a_j^T x_t}, and in particular in the intersection of this half-space with E_t, which we will call a half-ellipsoid.

Theorem 7.4. Let E = E(z, D) be an ellipsoid in R^n and a ∈ R^n with a ≠ 0. Consider the half-space H = {x ∈ R^n : a^T x ≥ a^T z}, and let
  z′ = z + (1/(n + 1)) · D a / √(a^T D a),    (7.8)
  D′ = (n^2/(n^2 − 1)) · (D − (2/(n + 1)) · D a a^T D / (a^T D a)).    (7.9)
Then D′ is symmetric and positive definite, and therefore E′ = E(z′, D′) is an ellipsoid. Moreover, E ∩ H ⊆ E′ and Vol(E′) < e^{−1/(2(n+1))} Vol(E).

8 Graphs and Flows

8.1 Introduction

Consider a directed graph (network) G = (V, E), with V the set of vertices and E ⊆ V × V a set of edges. The graph is undirected if E is symmetric.

8.2 Minimal Cost Flows

Fill in this stuff from lectures.

9 Transportation and Assignment Problems

10 Non-Cooperative Games

Theorem 10.1 (von Neumann, 1928). Let P ∈ R^{m×n}. Then
  max_{x ∈ X} min_{y ∈ Y} p(x, y) = min_{y ∈ Y} max_{x ∈ X} p(x, y).    (10.1)

11 Strategic Equilibrium

Definition 11.1. x ∈ X is a best response to y ∈ Y if p(x, y) = max_{x′ ∈ X} p(x′, y). (x, y) ∈ X × Y is a Nash equilibrium if x is a best response to y and y is a best response to x.

Theorem 11.2. (x, y) ∈ X × Y is an equilibrium of the matrix game P if and only if
  min_{y′ ∈ Y} p(x, y′) = max_{x′ ∈ X} min_{y′ ∈ Y} p(x′, y′)    (11.1)
and
  max_{x′ ∈ X} p(x′, y) = min_{y′ ∈ Y} max_{x′ ∈ X} p(x′, y′).    (11.2)

Theorem 11.3. Let (x, y), (x′, y′) ∈ X × Y be equilibria of the matrix game with payoff matrix P. Then p(x, y) = p(x′, y′), and (x, y′) and (x′, y) are equilibria as well.

Theorem 11.4 (Nash, 1951). Every bimatrix game has an equilibrium.
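The value and an optimal strategy in Theorem 10.1 can be computed by linear programming, which ties these chapters back to the duality theory of Chapter 3. Assuming p(x, y) = x^T P y with P the row player's payoff matrix, the row player maximizes v subject to (P^T x)_j ≥ v for every column j and x a probability vector. A sketch assuming Python with numpy and scipy (the helper name solve_zero_sum and the example game are ours):

```python
# Solve a zero-sum matrix game  max_x min_y x^T P y  by linear programming.
# Variables: the row player's mixed strategy x (length m) and the game value v.
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(P):
    m, n = P.shape
    c = np.r_[np.zeros(m), -1.0]                 # minimize -v, i.e. maximize v
    A_ub = np.c_[-P.T, np.ones(n)]               # v - (P^T x)_j <= 0 for every column j
    b_ub = np.zeros(n)
    A_eq = np.c_[np.ones((1, m)), 0.0]           # probabilities sum to one
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]    # x >= 0, v unrestricted in sign
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[m]                   # optimal mixed strategy and game value

# Matching pennies: the value is 0 and the optimal strategy is (1/2, 1/2).
P = np.array([[1.0, -1.0], [-1.0, 1.0]])
print(solve_zero_sum(P))
```

The column player's analogous LP is the dual of this one, and strong LP duality is exactly what makes the two values in (10.1) coincide.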