College of Management, NCTU — Operation Research II, Spring 2009
Jin Y. Wang

Chap12 Nonlinear Programming

General Form of Nonlinear Programming Problems
    Max  f(x)
    S.T. gi(x) ≤ bi   for i = 1, …, m
         x ≥ 0
• No algorithm is available that will solve every specific problem fitting this format.

An Example – The Product-Mix Problem with Price Elasticity
• The amount of a product that can be sold has an inverse relationship to the price charged; that is, the relationship between demand and price is described by an inverse (demand) curve.
• The firm's profit from producing and selling x units is the sales revenue xp(x) minus the production costs; that is, P(x) = xp(x) – cx.
• If each of the firm's products has a similar profit function, say Pj(xj) for producing and selling xj units of product j, then the overall objective function is
    f(x) = Σ_{j=1}^{n} Pj(xj),
  a sum of nonlinear functions.
• Nonlinearities also may arise in the gi(x) constraint functions.

An Example – The Transportation Problem with Volume Discounts
• Determine an optimal plan for shipping goods from various sources to various destinations, given supply and demand constraints.
• In actuality, the shipping costs may not be fixed. Volume discounts sometimes are available for large shipments, which leads to a piecewise linear cost function.

Graphical Illustration of Nonlinear Programming Problems
    Max  Z = 3x1 + 5x2
    S.T. x1 ≤ 4
         9x1² + 5x2² ≤ 216
         x1, x2 ≥ 0
• The optimal solution is no longer a CPF solution (sometimes it is; sometimes it isn't), but it still lies on the boundary of the feasible region.
  ➢ We no longer have the tremendous simplification used in LP of limiting the search for an optimal solution to just the CPF solutions.
• What if the constraints are linear but the objective function is not?
    Max  Z = 126x1 – 9x1² + 182x2 – 13x2²
    S.T. x1 ≤ 4
         2x2 ≤ 12
         3x1 + 2x2 ≤ 18
         x1, x2 ≥ 0
• What if we change the objective function to 54x1 – 9x1² + 78x2 – 13x2²?
• The optimal solution then lies inside the feasible region.
• That means we cannot focus only on the boundary of the feasible region; we need to look at the entire feasible region.

A local optimum need not be a global optimum — a further complication
• Nonlinear programming algorithms generally are unable to distinguish between a local optimum and a global optimum.
• It is therefore desirable to know the conditions under which any local optimum is guaranteed to be a global optimum.

If a nonlinear programming problem has no constraints, the objective function being concave (convex) guarantees that a local maximum (minimum) is a global maximum (minimum).

• What is a concave (convex) function?
• A function that is always "curving downward" (or not curving at all) is called a concave function.
• A function that is always "curving upward" (or not curving at all) is called a convex function.
• A function that does some of each is neither concave nor convex.

Definition of concave and convex functions of a single variable
• A function of a single variable f(x) is a convex function if, for each pair of values of x, say x' and x'' (x' < x''),
    f[λx'' + (1 − λ)x'] ≤ λ f(x'') + (1 − λ) f(x')
  for all values of λ such that 0 < λ < 1.
• It is a strictly convex function if ≤ can be replaced by <.
• It is a concave function if this statement holds when ≤ is replaced by ≥ (by > for strict concavity).
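The defining inequality can be checked numerically. The sketch below is an illustrative Python check, not part of the original notes: it samples pairs x' < x'' and values of λ and tests the convexity inequality for a trial function; the function name, sampling range, and tolerance are assumptions chosen only for the example.

```python
import numpy as np

def is_convex_on_samples(f, lo=-5.0, hi=5.0, n_pairs=200, n_lambdas=50, tol=1e-9):
    """Numerically test f[lam*x2 + (1-lam)*x1] <= lam*f(x2) + (1-lam)*f(x1)
    on random pairs x1 < x2 in [lo, hi]. A failure proves f is not convex;
    passing only suggests convexity on the sampled region."""
    rng = np.random.default_rng(0)
    for _ in range(n_pairs):
        x1, x2 = np.sort(rng.uniform(lo, hi, size=2))
        for lam in np.linspace(0.01, 0.99, n_lambdas):
            lhs = f(lam * x2 + (1 - lam) * x1)
            rhs = lam * f(x2) + (1 - lam) * f(x1)
            if lhs > rhs + tol:
                return False
    return True

print(is_convex_on_samples(lambda x: x**2))    # True: x^2 is convex
print(is_convex_on_samples(lambda x: -x**2))   # False: -x^2 is concave, not convex
```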
• The geometric interpretation of concave and convex functions: the line segment joining any two points on the graph of a convex function never lies below the graph, while for a concave function it never lies above the graph.

How to judge whether a single-variable function is convex or concave
• Consider any function of a single variable f(x) that possesses a second derivative at all possible values of x. Then f(x) is
  ➢ convex if and only if d²f(x)/dx² ≥ 0 for all possible values of x;
  ➢ concave if and only if d²f(x)/dx² ≤ 0 for all possible values of x.

How to judge whether a two-variable function is convex or concave
• If the derivatives exist, the following table can be used to determine whether a two-variable function is convex or concave (each condition must hold for all possible values of x1 and x2).

    Quantity                                                          Convex    Concave
    [∂²f(x1,x2)/∂x1²][∂²f(x1,x2)/∂x2²] − [∂²f(x1,x2)/∂x1∂x2]²          ≥ 0       ≥ 0
    ∂²f(x1,x2)/∂x1²                                                    ≥ 0       ≤ 0
    ∂²f(x1,x2)/∂x2²                                                    ≥ 0       ≤ 0

• Example: f(x1, x2) = x1² − 2x1x2 + x2²

How to judge whether a multivariable function is convex or concave
• The sum of convex functions is a convex function, and the sum of concave functions is a concave function.
• Example: f(x1, x2, x3) = 4x1 – x1² – (x2 – x3)² = [4x1 – x1²] + [–(x2 – x3)²]

If there are constraints, then one more condition will provide the guarantee, namely, that the feasible region is a convex set.

Convex set
• A convex set is a collection of points such that, for each pair of points in the collection, the entire line segment joining these two points is also in the collection.
• In general, the feasible region for a nonlinear programming problem is a convex set whenever all the gi(x) (for the constraints gi(x) ≤ bi) are convex.
    Max  Z = 3x1 + 5x2
    S.T. x1 ≤ 4
         9x1² + 5x2² ≤ 216
         x1, x2 ≥ 0
• What happens when just one of these gi(x) is a concave function instead?
    Max  Z = 3x1 + 5x2
    S.T. x1 ≤ 4
         2x2 ≤ 14
         8x1 – x1² + 14x2 – x2² ≤ 49
         x1, x2 ≥ 0
  ➢ The feasible region is not a convex set.
  ➢ Under this circumstance, we cannot guarantee that a local maximum is a global maximum.

Condition for a local maximum to be a global maximum (with gi(x) ≤ bi constraints)
• To guarantee that a local maximum is a global maximum for a nonlinear programming problem with constraints gi(x) ≤ bi and x ≥ 0, the objective function f(x) must be a concave function and each gi(x) must be a convex function.
• Such a problem is called a convex programming problem.

One-Variable Unconstrained Optimization
• The differentiable function f(x) to be maximized is concave.
• The necessary and sufficient condition for x = x* to be optimal (a global maximum) is df/dx = 0 at x = x*.
• It is usually not easy to solve this equation analytically.
• The One-Dimensional Search Procedure:
  ➢ Find a sequence of trial solutions that leads toward an optimal solution.
  ➢ Use the sign of the derivative to determine where to move: a positive derivative indicates that x* is greater than x, and vice versa.

The Bisection Method
• Initialization: Select ε (error tolerance). Find an initial lower bound x̲ and upper bound x̄ on x* by inspection. Set the initial trial solution x' = (x̲ + x̄)/2.
• Iteration:
  ➢ Evaluate df(x)/dx at x = x'.
  ➢ If df(x)/dx ≥ 0, reset x̲ = x'.
  ➢ If df(x)/dx ≤ 0, reset x̄ = x'.
  ➢ Select a new x' = (x̲ + x̄)/2.
• Stopping Rule: If x̄ − x̲ ≤ 2ε, so that the new x' must be within ε of x*, stop. Otherwise, perform another iteration.
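As an illustration, a minimal Python sketch of the bisection procedure is given below, assuming the derivative df is supplied by the user; the example f(x) = 12x − 3x⁴ − 2x⁶ worked out next in the notes is used as the test case.

```python
def bisection(df, lower, upper, eps=0.01):
    """One-dimensional search for the maximum of a concave f(x),
    using only the sign of its derivative df(x)."""
    while upper - lower > 2 * eps:       # stopping rule: interval within 2*eps
        x = (lower + upper) / 2          # current trial solution
        if df(x) >= 0:                   # maximizer lies to the right of x
            lower = x
        else:                            # maximizer lies to the left of x
            upper = x
    return (lower + upper) / 2

# Example from the notes: f(x) = 12x - 3x^4 - 2x^6, bounds chosen by inspection
df = lambda x: 12 - 12 * x**3 - 12 * x**5
print(bisection(df, 0.0, 2.0, eps=0.01))   # converges near x* ≈ 0.836
```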
• Example: Max f(x) = 12x – 3x⁴ – 2x⁶, with ε = 0.01 and initial bounds x̲ = 0, x̄ = 2.

    Iteration   df(x)/dx at x'     x̲           x̄          New x'       f(x')
        0                          0            2           1            7
        1          −12.00          0            1           0.5          5.7813
        2           10.12          0.5          1           0.75         7.6948
        3            4.09          0.75         1           0.875        7.8439
        4           −2.19          0.75         0.875       0.8125       7.8672
        5            1.31          0.8125       0.875       0.84375      7.8829
        6           −0.34          0.8125       0.84375     0.828125     7.8815
        7            0.51          0.828125     0.84375     0.8359375    7.8839

  The method stops after iteration 7 because x̄ − x̲ = 0.015625 ≤ 2ε, with x' = 0.8359375 as the approximation of x*.

Newton's Method
• The bisection method converges slowly.
  ➢ It takes only the information in the first derivative into account.
• The basic idea is to approximate f(x) within the neighborhood of the current trial solution by a quadratic function and then to maximize (or minimize) the approximating function exactly to obtain the new trial solution.
• This approximating quadratic function is obtained by truncating the Taylor series after the second-derivative term:
    f(x_{i+1}) ≈ f(xi) + f'(xi)(x_{i+1} − xi) + [f''(xi)/2](x_{i+1} − xi)²
• This quadratic function can be optimized in the usual way by setting its first derivative to zero and solving for x_{i+1}. Thus,
    x_{i+1} = xi − f'(xi)/f''(xi).
• Stopping Rule: If |x_{i+1} − xi| ≤ ε, stop and output x_{i+1}.
• Example: Max f(x) = 12x – 3x⁴ – 2x⁶ (the same example as for the bisection method)
  ➢ x_{i+1} = xi − f'(xi)/f''(xi) = xi − (12 − 12xi³ − 12xi⁵)/(−36xi² − 60xi⁴)
  ➢ Select ε = 0.00001 and choose x1 = 1.

    Iteration i     xi          f(xi)      f'(xi)      f''(xi)     x_{i+1}
        1           1           7          −12         −96         0.875
        2           0.875       7.8439     −2.1940     −62.733     0.84003
        3           0.84003     7.8838     −0.1325     −55.279     0.83763
        4           0.83763     7.8839     −0.0006     −54.790     0.83762

  Since |x5 − x4| ≈ 0.00001 ≤ ε, the method stops with x ≈ 0.83762.
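A minimal Python sketch of Newton's method for this one-variable case follows, assuming the first and second derivatives are supplied; it reproduces the iterates in the table above.

```python
def newton_max(df, d2f, x0, eps=1e-5, max_iter=50):
    """Newton's method for one-variable maximization: repeatedly maximize the
    quadratic (second-order Taylor) approximation, i.e. x_{i+1} = x_i - f'(x_i)/f''(x_i)."""
    x = x0
    for _ in range(max_iter):
        x_next = x - df(x) / d2f(x)
        if abs(x_next - x) <= eps:       # stopping rule from the notes
            return x_next
        x = x_next
    return x

# Example from the notes: f(x) = 12x - 3x^4 - 2x^6, starting from x1 = 1
df  = lambda x: 12 - 12*x**3 - 12*x**5
d2f = lambda x: -36*x**2 - 60*x**4
print(newton_max(df, d2f, 1.0))          # ≈ 0.83762, matching the table above
```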
Multivariable Unconstrained Optimization
• Usually, there is no analytical method for solving the system of equations obtained by setting the respective partial derivatives equal to zero.
• Thus, a numerical search procedure must be used.

The Gradient Search Procedure (for multivariable unconstrained maximization problems)
• The goal is to reach a point where all the partial derivatives are 0.
• A natural approach is to use the values of the partial derivatives to select the specific direction in which to move.
• The gradient at a point x = x' is
    ∇f(x) = (∂f/∂x1, ∂f/∂x2, …, ∂f/∂xn), evaluated at x = x'.
• The direction of the gradient is interpreted as the direction of the directed line segment from the origin to the point (∂f/∂x1, ∂f/∂x2, …, ∂f/∂xn), which is the direction of change in x that maximizes the rate of increase of f(x).
• However, normally it would not be practical to change x continuously in the direction of ∇f(x), because this series of changes would require continuously reevaluating the ∂f/∂xj and changing the direction of the path.
• A better approach is to keep moving in a fixed direction from the current trial solution, not stopping until f(x) stops increasing.
• That stopping point becomes the next trial solution, where the gradient is reevaluated; the recalculated gradient determines the new direction in which to move.
  ➢ Reset x' = x' + t*∇f(x'), where t* is the positive value of t that maximizes f(x' + t∇f(x')).
• The iterations continue until ∇f(x) = 0 within a small tolerance ε.

Summary of the Gradient Search Procedure
• Initialization: Select ε and any initial trial solution x'. Go first to the stopping rule.
• Step 1: Express f(x' + t∇f(x')) as a function of t by setting
    xj = xj' + t(∂f/∂xj) evaluated at x = x', for j = 1, 2, …, n,
  and then substituting these expressions into f(x).
• Step 2: Use the one-dimensional search procedure to find t = t* that maximizes f(x' + t∇f(x')) over t ≥ 0.
• Step 3: Reset x' = x' + t*∇f(x'). Then go to the stopping rule.
• Stopping Rule: Evaluate ∇f(x') at x = x'. Check whether |∂f/∂xj| ≤ ε for all j = 1, 2, …, n. If so, stop with the current x' as the desired approximation of an optimal solution x*. Otherwise, perform another iteration.

Example for multivariable unconstrained nonlinear programming
    Max  f(x) = 2x1x2 + 2x2 – x1² – 2x2²
    ∂f/∂x1 = 2x2 − 2x1,   ∂f/∂x2 = 2x1 + 2 − 4x2
We verify that f(x) is concave. Suppose we pick x = (0, 0) as the initial trial solution; then ∇f(0, 0) = (0, 2).
• Iteration 1: x = (0, 0) + t(0, 2) = (0, 2t), so f(x' + t∇f(x')) = f(0, 2t) = 4t − 8t², which is maximized at t* = 1/4, giving the new trial solution x' = (0, 1/2).
• Iteration 2: ∇f(0, 1/2) = (1, 0), so x = (0, 1/2) + t(1, 0) = (t, 1/2).
• Usually, we use a table for convenience:

    Iteration   x'          ∇f(x')    x' + t∇f(x')    f(x' + t∇f(x'))    t*     x' + t*∇f(x')
        1       (0, 0)      (0, 2)    (0, 2t)         4t − 8t²            1/4    (0, 1/2)
        2       (0, 1/2)    (1, 0)    (t, 1/2)        t − t² + 1/2        1/2    (1/2, 1/2)

For a minimization problem
• We move in the opposite direction; that is, x' = x' − t*∇f(x').
• The other change is that t = t* now minimizes f(x' − t∇f(x')) over t ≥ 0.
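The procedure can be sketched in Python as below. The one-dimensional search over t is done here with a crude grid search (an assumption made only for illustration; bisection or any other line search could be substituted), and the test case is the example worked above.

```python
import numpy as np

def gradient_search(f, grad, x0, eps=1e-4, max_iter=100):
    """Gradient search for an unconstrained maximum: from the current trial
    solution x', move in the direction of grad f(x') by the step t* that
    maximizes f(x' + t * grad f(x')) over t >= 0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.max(np.abs(g)) <= eps:          # stopping rule: all |df/dxj| <= eps
            break
        phi = lambda t: f(x + t * g)          # f along the current direction
        ts = np.linspace(0.0, 1.0, 1001)      # crude grid search for t*
        t_star = ts[np.argmax([phi(t) for t in ts])]
        x = x + t_star * g                    # Step 3: reset x' = x' + t* grad f(x')
    return x

# Example from the notes: f = 2*x1*x2 + 2*x2 - x1^2 - 2*x2^2 (maximum at (1, 1))
f = lambda x: 2*x[0]*x[1] + 2*x[1] - x[0]**2 - 2*x[1]**2
grad = lambda x: np.array([2*x[1] - 2*x[0], 2*x[0] + 2 - 4*x[1]])
print(gradient_search(f, grad, [0.0, 0.0]))   # approaches (1, 1)
```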
Necessary and Sufficient Conditions for Optimality (Maximization)

    Problem                        Necessary Condition                Also Sufficient if:
    One-variable unconstrained     df/dx = 0                          f(x) concave
    Multivariable unconstrained    ∂f/∂xj = 0 (j = 1, 2, …, n)        f(x) concave
    General constrained problem    KKT conditions                     f(x) concave and gi(x) convex

The Karush-Kuhn-Tucker (KKT) Conditions for Constrained Optimization
• Assume that f(x), g1(x), g2(x), …, gm(x) are differentiable functions. Then x* = (x1*, x2*, …, xn*) can be an optimal solution for the nonlinear programming problem only if there exist m numbers u1, u2, …, um such that all of the following KKT conditions are satisfied:
  (1) ∂f/∂xj − Σ_{i=1}^{m} ui ∂gi/∂xj ≤ 0, at x = x*, for j = 1, 2, …, n
  (2) xj* (∂f/∂xj − Σ_{i=1}^{m} ui ∂gi/∂xj) = 0, at x = x*, for j = 1, 2, …, n
  (3) gi(x*) − bi ≤ 0, for i = 1, 2, …, m
  (4) ui [gi(x*) − bi] = 0, for i = 1, 2, …, m
  (5) xj* ≥ 0, for j = 1, 2, …, n
  (6) ui ≥ 0, for i = 1, 2, …, m

Corollary of the KKT Theorem (Sufficient Conditions)
• Note that satisfying these conditions does not by itself guarantee that the solution is optimal.
• Assume that f(x) is a concave function and that g1(x), g2(x), …, gm(x) are convex functions. Then x* = (x1*, x2*, …, xn*) is an optimal solution if and only if all the KKT conditions are satisfied.

An Example
    Max  f(x) = ln(x1 + 1) + x2
    S.T. 2x1 + x2 ≤ 3
         x1, x2 ≥ 0
Here n = 2, m = 1, g1(x) = 2x1 + x2 is convex, and f(x) is concave. The KKT conditions are:
  1. (j = 1)  1/(x1 + 1) − 2u1 ≤ 0
  2. (j = 1)  x1 [1/(x1 + 1) − 2u1] = 0
  1. (j = 2)  1 − u1 ≤ 0
  2. (j = 2)  x2 (1 − u1) = 0
  3.  2x1 + x2 − 3 ≤ 0
  4.  u1 (2x1 + x2 − 3) = 0
  5.  x1 ≥ 0, x2 ≥ 0
  6.  u1 ≥ 0
• The values x1 = 0, x2 = 3, and u1 = 1 satisfy all of the KKT conditions, so the optimal solution is (x1, x2) = (0, 3).

How to solve the KKT conditions
• Unfortunately, there is no easy way in general.
• In the above example, each of x1, x2, and u1 is either zero or strictly positive, giving 2³ = 8 combinations; one tries each combination until a consistent one is found.
• What if there are many variables?
• Let us look at some easier (special) cases.
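Since the KKT conditions for this example are explicit, they can be checked mechanically. The sketch below (illustrative Python, with an arbitrarily chosen tolerance) verifies that the candidate x1 = 0, x2 = 3 with u1 = 1 satisfies conditions (1)–(6).

```python
# Candidate solution from the notes: x1 = 0, x2 = 3, u1 = 1
x1, x2, u1 = 0.0, 3.0, 1.0
tol = 1e-9

df_dx1 = 1.0 / (x1 + 1.0)          # ∂f/∂x1 for f = ln(x1 + 1) + x2
df_dx2 = 1.0                       # ∂f/∂x2
dg_dx1, dg_dx2 = 2.0, 1.0          # gradient of g = 2*x1 + x2

conds = [
    df_dx1 - u1 * dg_dx1 <= tol,               # (1), j = 1
    abs(x1 * (df_dx1 - u1 * dg_dx1)) <= tol,   # (2), j = 1
    df_dx2 - u1 * dg_dx2 <= tol,               # (1), j = 2
    abs(x2 * (df_dx2 - u1 * dg_dx2)) <= tol,   # (2), j = 2
    2*x1 + x2 - 3 <= tol,                      # (3)
    abs(u1 * (2*x1 + x2 - 3)) <= tol,          # (4)
    x1 >= 0 and x2 >= 0,                       # (5)
    u1 >= 0,                                   # (6)
]
print(all(conds))   # True: (0, 3) with u1 = 1 satisfies all KKT conditions
```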
Quadratic Programming
    Max  f(x) = cx − (1/2) xᵀQx
    S.T. Ax ≤ b
         x ≥ 0
• The objective function is f(x) = cx − (1/2) xᵀQx = Σ_{j=1}^{n} cj xj − (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} qij xi xj.
• The qij are the elements of Q. If i = j, then xixj = xj², so −(1/2)qjj is the coefficient of xj². If i ≠ j, then −(1/2)(qij xixj + qji xjxi) = −qij xixj, so −qij is the coefficient of the product xixj (since qij = qji).
• An example:
    Max  f(x1, x2) = 15x1 + 30x2 + 4x1x2 − 2x1² − 4x2²
    S.T. x1 + 2x2 ≤ 30
         x1, x2 ≥ 0
• The KKT conditions for this quadratic programming problem:
  1. (j = 1)  15 + 4x2 − 4x1 − u1 ≤ 0
  2. (j = 1)  x1(15 + 4x2 − 4x1 − u1) = 0
  1. (j = 2)  30 + 4x1 − 8x2 − 2u1 ≤ 0
  2. (j = 2)  x2(30 + 4x1 − 8x2 − 2u1) = 0
  3.  x1 + 2x2 − 30 ≤ 0
  4.  u1(x1 + 2x2 − 30) = 0
  5.  x1 ≥ 0, x2 ≥ 0
  6.  u1 ≥ 0
• Introduce slack variables (y1, y2, and v1) for condition 1 (j = 1), condition 1 (j = 2), and condition 3:
    1. (j = 1)  −4x1 + 4x2 − u1 + y1 = −15
    1. (j = 2)  4x1 − 8x2 − 2u1 + y2 = −30
    3.           x1 + 2x2 + v1 = 30
  Condition 2 (j = 1) can then be re-expressed as x1y1 = 0. Similarly, condition 2 (j = 2) becomes x2y2 = 0, and condition 4 becomes u1v1 = 0.
• For each of these pairs — (x1, y1), (x2, y2), (u1, v1) — the two variables are called complementary variables, because only one of them can be nonzero.
  ➢ They can be combined into one constraint, x1y1 + x2y2 + u1v1 = 0, called the complementarity constraint.
• Rewriting the whole set of conditions:
    4x1 − 4x2 + u1 − y1 = 15
    −4x1 + 8x2 + 2u1 − y2 = 30
    x1 + 2x2 + v1 = 30
    x1y1 + x2y2 + u1v1 = 0
    x1 ≥ 0, x2 ≥ 0, u1 ≥ 0, y1 ≥ 0, y2 ≥ 0, v1 ≥ 0
• Except for the complementarity constraint, these are all linear constraints.
• For any quadratic programming problem, its KKT conditions have this form:
    Qx + Aᵀu − y = cᵀ
    Ax + v = b
    x ≥ 0, u ≥ 0, y ≥ 0, v ≥ 0
    xᵀy + uᵀv = 0
• Assume the objective function of the quadratic programming problem is concave; the constraints, being linear, are convex.
• Then x is optimal if and only if there exist values of y, u, and v such that all four vectors together satisfy all of these conditions.
• The original problem is thereby reduced to the equivalent problem of finding a feasible solution to these constraints.
• These are really the constraints of an LP, except for the complementarity constraint. Why not just modify the simplex method?

The Modified Simplex Method
• The complementarity constraint implies that it is not permissible for both complementary variables of any pair to be basic variables.
• The problem reduces to finding an initial BF solution to a linear programming problem that has these constraints, subject to this additional restriction on the identity of the basic variables.
• When cᵀ ≤ 0 (unlikely) and b ≥ 0, the initial solution is easy to find: x = 0, u = 0, y = −cᵀ, v = b.
• Otherwise, introduce an artificial variable into each of the equations where cj > 0 or bi < 0, in order to use these artificial variables as initial basic variables.
  ➢ This choice of initial basic variables sets x = 0 and u = 0 automatically, which satisfies the complementarity constraint.
• Then use phase 1 of the two-phase method to find a BF solution for the real problem.
  ➢ That is, apply the simplex method (where the zj are the artificial variables) to
        Min Z = Σ_j zj
    subject to the linear programming constraints obtained from the KKT conditions, but with these artificial variables included.
  ➢ The simplex method still needs to be modified to keep the complementarity constraint satisfied.
• Restricted-Entry Rule:
  ➢ Exclude from consideration as the entering variable any nonbasic variable whose complementary variable already is a basic variable.
  ➢ Choose among the other nonbasic variables according to the usual criterion.
  ➢ This rule keeps the complementarity constraint satisfied at all times.
• When an optimal solution x*, u*, y*, v*, z1 = 0, …, zn = 0 is obtained for the phase 1 problem, x* is the desired optimal solution for the original quadratic programming problem.

A Quadratic Programming Example
    Max  15x1 + 30x2 + 4x1x2 − 2x1² − 4x2²
    S.T. x1 + 2x2 ≤ 30
         x1, x2 ≥ 0

Constrained Optimization with Equality Constraints
• Consider the problem of finding the minimum or maximum of the function f(x), subject to the restriction that x must satisfy all the equations
    g1(x) = b1, …, gm(x) = bm
• Example:
    Max  f(x1, x2) = x1² + 2x2
    S.T. g(x1, x2) = x1² + x2² = 1
• A classical method is the method of Lagrange multipliers.
  ➢ The Lagrangian function is h(x, λ) = f(x) − Σ_{i=1}^{m} λi [gi(x) − bi], where λ1, λ2, …, λm are called Lagrange multipliers.
• For feasible values of x, gi(x) − bi = 0 for all i, so h(x, λ) = f(x).
• The method reduces to analyzing h(x, λ) by the procedure for unconstrained optimization:
  ➢ Set all partial derivatives to zero:
        ∂h/∂xj = ∂f/∂xj − Σ_{i=1}^{m} λi ∂gi/∂xj = 0, for j = 1, 2, …, n
        ∂h/∂λi = −gi(x) + bi = 0, for i = 1, 2, …, m
  ➢ Notice that the last m equations are equivalent to the constraints of the original problem, so only feasible solutions are considered.
• Back to our example:
  ➢ h(x1, x2, λ) = x1² + 2x2 − λ(x1² + x2² − 1)
  ➢ ∂h/∂x1 = 2x1 − 2λx1 = 0
  ➢ ∂h/∂x2 = 2 − 2λx2 = 0
  ➢ ∂h/∂λ = −(x1² + x2² − 1) = 0

Other Types of Nonlinear Programming Problems
• Separable Programming
  ➢ A special case of convex programming with one additional assumption: the f(x) and gi(x) functions are separable functions.
  ➢ A separable function is a function in which each term involves just a single variable.
  ➢ Example: f(x1, x2) = 126x1 − 9x1² + 182x2 − 13x2² = f1(x1) + f2(x2), where f1(x1) = 126x1 − 9x1² and f2(x2) = 182x2 − 13x2².
  ➢ Such a problem can be closely approximated by a linear programming problem; please refer to Section 12.8 for details.
• Geometric Programming
  ➢ The objective and the constraint functions take the form
        g(x) = Σ_{i=1}^{N} ci Pi(x), where Pi(x) = x1^{ai1} x2^{ai2} ··· xn^{ain} for i = 1, 2, …, N.
  ➢ When all the ci are strictly positive and the objective function is to be minimized, this geometric programming problem can be converted to a convex programming problem by setting xj = e^{yj}.
• Fractional Programming
  ➢ Suppose that the objective function is in the form of a (linear) fraction:
        Maximize f(x) = f1(x)/f2(x) = (cx + c0)/(dx + d0).
  ➢ Also assume that the constraints gi(x) are linear: Ax ≤ b, x ≥ 0.
  ➢ We can transform this to an equivalent problem of a standard type for which effective solution procedures are available.
  ➢ In particular, letting y = x/(dx + d0) and t = 1/(dx + d0), so that x = y/t, transforms the problem into an equivalent linear programming problem:
        Max  Z = cy + c0t
        S.T. Ay − bt ≤ 0
             dy + d0t = 1
             y, t ≥ 0
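As an illustration of this transformation, the sketch below solves a small linear-fractional program with SciPy's linprog. The data (c, c0, d, d0, A, b) are made-up numbers, not taken from the notes, and the feasible region is assumed to satisfy dx + d0 > 0 so that t > 0 at the optimum.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data (illustration only): maximize (cx + c0)/(dx + d0)
# subject to Ax <= b, x >= 0.
c, c0 = np.array([3.0, 2.0]), 1.0
d, d0 = np.array([1.0, 1.0]), 2.0
A = np.array([[1.0, 2.0],
              [3.0, 1.0]])
b = np.array([8.0, 9.0])

# Linear-fractional -> LP via y = x/(dx + d0), t = 1/(dx + d0):
#   max  c·y + c0*t   s.t.  A y - b t <= 0,  d·y + d0*t = 1,  y >= 0, t >= 0
n = len(c)
obj = -np.concatenate([c, [c0]])                 # linprog minimizes, so negate
A_ub = np.hstack([A, -b.reshape(-1, 1)])
b_ub = np.zeros(A.shape[0])
A_eq = np.concatenate([d, [d0]]).reshape(1, -1)
b_eq = np.array([1.0])
res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * (n + 1))
y, t = res.x[:n], res.x[n]
x = y / t                                        # recover the original variables
print("optimal x:", x, "objective:", (c @ x + c0) / (d @ x + d0))
```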